A user-friendly, Object-Oriented

Multi-MEDIA Mail FIlterer

 

 

 

Craig Cockburn

 

 

 

 

 

 

 

A report submitted in partial fulfilment of the requirements of

Napier University for the degree of Master of Science in

Large Software Systems Development.

 

 

Department of Computer Studies

February 1994

 


Abstract

 

The increasing use of electronic mail and the great diversity of materials that are sent via electronic mail has resulted in people having problems managing the volume of mail and identifying important messages in their mailboxes. Until recently, electronic mail consisted of plain text. However, with proposed new standards it is now possible to send and receive sound, graphics, compound messages and many other types of mail. These new formats are likely to pose new problems to the user who wants to handle mail efficiently.

 

This M.Sc. report describes the research, design, implementation and future development issues for an innovative prototype mail filterer. An Object Oriented (OO) Analysis and Design method that was used to implement the filterer is also presented.

 

The filterer is designed to be user-friendly and to handle electronic mail messages that conform to the draft standard for Multi-Media mail written by Nathaniel Borenstein and Ned Freed. This draft standard, called Multipurpose Internet Mail Extensions, will be referred to in this document by its usual abbreviation, MIME.

 

The filterer implemented is believed to be the first user-friendly mail filterer built specifically for MIME format mail.


Abstract

Acknowledgements

1.   Introduction and background

1.1.   Aims

2.   Research

2.1.   Project History

2.2.   Prior knowledge

2.3.   Filterer research

2.4.   Existing Filterers

2.4.1.   Diagram of a mail system with filterer

2.4.2.   Text based mail filterers

2.4.3.   GUI based mail filterers

2.5.   MIME

2.6.   Development environment

3.   System specification

3.1.   User requirements

3.2.   System requirements

4.   Analysis and Design

4.1.   Research of Analysis and Design Methods

4.2.   Outcome of Analysis and Design Research

4.3.   Analysis

4.3.1.   Analysis diagram

4.4.   Design

4.4.1.   Overview design model

4.4.2.   Table of correspondences between design meta-classes

4.4.3.   Design model for problem domain and interface classes

4.4.4.   Design model for application interface classes

4.5.   HCI aspects

5.   Implementation

5.1.   HCI aspects

5.1.1.   Examples showing user interface

5.2.   Coding issues

5.2.1.   Use of wxWindows demonstration programs

5.2.2.   Discussion of code

5.3.   C++ issues

6.   Discussion

6.1.   Major problems encountered

6.2.   Evaluation of achievement

6.3.   Major Outcomes

6.3.1.   Departmental outcomes

6.3.2.   Personal outcomes

6.4.   Future Directions

6.4.1.   Development of Application

6.4.2.   Filtering trends

6.4.3.   MIME and Industry trends

6.5.   Summary

7.   References

Appendices

A   User Documentation

1.   Installation

2.   First time use

3.   Subsequent use

B   Description of program menu items

C   Sample program output

D   Project Schedule

E   Survey responses

F   Points from December review

G   Project Diary

H   Overview of wxWindows

I   Overview of procmail

J   Overview of Pine

K   Sample MIME message

L   MIME types

M   Code Samples


Acknowledgements

 

I would like to thank the following people for their assistance with this project:

 

My project supervisor Alison Crerar for her support, guidance and comments and particularly for the thorough reviews and extensive comments on this report.

Julian Smart at the Dept of Artificial Intelligence, University of Edinburgh for help with using his wxWindows tool.

Neil Rumney for his help with keeping the link between the Department of Computer Studies and the Internet running so that I could access discussion lists and for installing a considerable amount of software on the Suns.

Finally, all the people on the Internet who have replied to my mail messages and postings to news with questions related to this project.


1.  Introduction and background

 

The author of the project has been interested in the subject of Electronic Mail for some time. For several years, he has jointly run a world-wide language teaching mailing list (GAELIC-L). In 1992 he founded The UK Internet List, the first guide to the UK's Internet providers, and has been using electronic mail in various forms for over 10 years. At Digital, he used world-wide E-mail on a daily basis as part of his job. It was through seeing the usefulness of this powerful business medium that the motivation arose to research future mail developments for this project. The project was the author's own proposal.

 

As the result of being on several mailing lists, ranging from the purely technical and work-related to others of a more recreational nature, the author receives about 40 mail messages a day. Many of these are automatically generated by the software controlling the GAELIC-L list. There is a danger, when receiving many messages of differing priority, that some important ones may get lost in the volume. In just the same way that many managers have a secretary to prioritise and sort their incoming letters, so it is useful to have a program handle incoming electronic mail in a similar way. The author has previously used a simple mail filterer and found it very useful, but it was easy to enter the wrong command and find that all the incoming mail was being automatically deleted. This accidental error could prove very costly in business and so it was decided to implement a system that was much more fail-safe.

 

Although mail filterers have been in use for several years (e.g. Deliver [ER3], ELM-Filter [ER4]), none have addressed issues concerned with the draft MIME standard. It is likely that MIME will become a full standard by the end of 1994. MIME covers more than just Multi-Media, however: it can also be used to send and receive structured messages and foreign character sets.

 

Research into filterers also revealed that there were no known free filterers with a user-friendly interface. To address this issue, the filterer implemented for this project was designed to be easy to use. Two important considerations were that the filterer should provide simple access to the actions which users require most frequently, whilst allowing more specialised users access to other, less frequently used facilities.

 

Procmail (see Appendix I) was identified as a tool capable of supporting filtering based on the draft MIME standard. This tool was chosen because it was recommended as being very powerful, it is free, it runs on UNIX and it is not tied to a particular mail tool.

 

When the author started to research MIME based filterers, there was very little information available. Indeed, one of the developers of a MIME compliant mail tool wondered what the usefulness of such a tool would be. The idea of designing a filterer with MIME in mind seemed to be novel. The author is fairly sure that when the project started there were no other user-friendly free MIME filterers.

 

The project was undertaken by the author and was assisted by staff in the Department of Computer Studies. In the early stages of the project, assistance was also provided by the Computing and Communications department of the University of Washington in Seattle, who developed the MIME based mail tool, Pine (see Appendix J for more information). Pine was used to investigate Multi-media mail issues, including header contents and message structure.

 

Various people on the world-wide electronic network, the Internet, assisted with the project. Two discussion lists were particularly useful. One list was for the wxWindows product by Julian Smart at Edinburgh University (see Appendix H), which was chosen as the development environment. The other was for the procmail product by Stephen R. van den Berg at RWTH-Aachen, Germany (see Appendix I), and was used to understand how the mail filtering worked. Access to these lists has proved invaluable, particularly as the author introduced both wxWindows and procmail to the Department during the project. This meant that there was no one in the Department who knew anything about the products, and so support from experienced users elsewhere was essential. The author also pioneered student access to usenet newsgroups and became the first student at Napier University to access usenet. Access to the object oriented usenet newsgroups has also proved a valuable source of information.

 

1.1. Aims

 

The following were identified as aims of the project:

 

i.    To research traditional mail filtering.

 

ii.   To research mail tools which support MIME.

 

iii. To use a MIME based mail tool and a traditional mail filterer to implement a new prototype filterer capable of supporting MIME.

 

iv. To ensure the prototype implemented meets the needs of users, including functionality and usability.

 

v.   To apply the skills learnt during the M.Sc. to a large Software Development project and to research and solve problems associated with implementing Object Orientation.

2.  Research

 

Much of the research for the project was carried out by non-conventional means. The primary research was carried out by identifying key usenet newsgroups to ask questions in, such as comp.object for Object Oriented (OO) related questions, comp.lang.c++ for C++ related questions and comp.mail.mime for questions related to MIME. By asking questions in these forums and having electronic discussions, it was possible to research issues by locating relevant papers, files, key individuals and research establishments. Recommended papers not available on-line were obtained by inter-library loan.

 

To carry out research on subjects that are very new, it was apparent that using libraries alone was not going to yield enough current information. Therefore access to usenet newsgroups had to be obtained to ask questions, learn about announcements and participate in discussions. As Napier University was not projected to have Internet access available until Spring 94, facilities at Edinburgh University were identified as a means of conducting this on-line research. The author became the first student in Napier University to have access to such resources.

 

2.1.    Project History

 

The project has undergone several major revisions before being accepted in its present form. In October, the proposal was to port the MIME-based mailer Pine to a windows environment, to make it easier to use. However, porting Pine proved to be too big and difficult a problem, particularly as the underlying Pine code was still rapidly changing. Four Pine update releases were issued during the development of the present project. The Pine developers at the University of Washington in Seattle suggested that a filterer for Pine would be a useful self-contained addition to the product. Having used filterers before, the author could see the benefits of such a system and so agreed to take this on as the project.  This proposal was put to the Department and accepted as a project on 22nd October 1993, giving a total of 14 weeks for the project to be completed. As time was so limited, it was crucial to thoroughly plan the project and identify milestones at the outset. To ensure that these milestones were achieved, a schedule was written, and is detailed in Appendix D. A copy of this schedule was given to Alison Crerar the project supervisor, so that she could monitor progress.

 

2.2.    Prior knowledge

 

Although the author had used E-mail for some time before joining the course, this was exclusively on VAX/VMS machines. When the author started the LSSD course, he had not used any of the following systems before: PCs, UNIX, C++, Borland's Resource Workshop, procmail, wxWindows or Microsoft's Word for Windows. Although the author had previously used mail filterers, he had never heard of MIME before - this was first encountered while researching for an option taken earlier on in the M.Sc. In addition, the author had not met with Object Oriented methods previously and only had a basic knowledge of traditional analysis and design. Virtually every module on the course, and in particular the research project, has been undertaken with no prior knowledge at all. Many of the resources used on the project (wxWindows, procmail, MIME and Pine) were introduced by the author to the Department, and so there was no local access to any Departmental experts on these topics. All help had to be sought via the external network.

 

As a result of coming across so many new subjects during this project, many of which were not formally taught as part of the LSSD degree, the author has had to carry out a great deal of personal research to identify issues, relevant papers and software. Although the author wrote part of a windows application as part of the Post Graduate Diploma part of the course, much of the windows programming was handled by other team members. To overcome the considerable learning curve for so many different aspects, a development environment was chosen which would allow the rapid creation of a windows application without having to become an expert in Windows programming first.

 

2.3.    Filterer research

 

As Napier University is not fully on the Internet yet, many staff and students at Napier University are not aware of the benefits or indeed pitfalls of full Internet connection. For example, the volume of mail for most users within Napier University is manageable. The problem of excessive volume and managing large quantities of mail however becomes apparent when messages from outside the University are considered too, and when additional messages arrive as the result of postings to news, then the volume increases still further. These overloading problems are mainly due to the very large volume of users, interest groups and discussion lists on the Internet. Conservative estimates of Internet usage [ER7][1] put the amount of Internet traffic at about 60Mb a day of news.  With the rapid growth of the Internet (9% a month), and its continual development [Press93] the volume of material on the Internet is not only growing rapidly, but is becoming more complex too. A few years ago, the information on the Internet was simple text, whereas today it is possible to send sounds and graphics by electronic mail, with ease.

 

Recently, the Internet has been getting considerable publicity and with the recent publication of many books on the Internet it seems that the Internet is going to become much more a way of life. E-mail addresses are becoming more common on business cards and even non-computing publications are including articles about the Internet and mentioning it on their covers[2]. With increasing numbers of non-computing users connecting to the network [Pope94], it seems the 9% growth per month is set to increase considerably.

 

There is clearly a major problem of "information overload" that is getting worse, and tools are required to help people to manage the huge volume of material which is available. Moreover, the problem is not confined to mail: usenet news carries about 60Mb of news a day, and the only support most newsreaders have for filtering messages is simply to delete ones which match certain criteria. It is apparent, however, that news and mail technologies are merging. Some mail readers such as Pine offer an interface to usenet newsgroups, and the MIME standard also applies to news articles on usenet. Clearly any lessons and applications based around E-mail filtering could also apply to usenet news.

 

A filterer may help to correct what has been called the "productivity paradox" [Constant93]. That is, that despite the huge investments in Information Technology, the expected huge productivity improvements have not been realised. A possible reason for the paradox is stated as "IT is being used to provide managers with a greater sense of control, without actually improving their decision making". This idea is also mentioned in [Brynjolfsson93] who states:

 

      A valuable heuristic in 1960 might have been "get all readily available information before making a decision."  The same  heuristic today would lead to information overload and chaos. Indeed the rapid  speedup enabled by IT can create unanticipated bottlenecks at each human in the information processing chain. More money spent on IT will not help until these bottlenecks are addressed.

 

The aim of a filterer is ultimately to help people decide which messages to read, their priority and how they should be presented. This structuring of mail will go a long way to improving its manageability and helping people at all levels of an organisation to be better informed.

 

A filterer works by processing electronic mail and performing some action based on various properties of the mail message. This action could include forwarding the message to other users, filing the message in a mail folder, running an application, printing the message or attaching fields to the message so that it can be prioritised or handled by an advanced mail reader. Probably the most popular use for a mail filterer though is to "answer" mail while the user is away on holiday or a business trip. Such a facility sends back a standard reply containing a message, usually explaining that the person is away and when they are likely to return.

 

A mail filterer is activated by running a set of rules on the mail messages. The rules are composed of two parts, the matching criteria and actions to perform. The matching algorithm compares information about the mail message, usually stored in the mail's header fields to decide whether a rule should apply to the mail message. The action to perform specifies what the mail filterer should do with the message if the match criteria hold true for the message. Some simple filterers are batch jobs which run at regular intervals and process messages in the user's mailbox, however such filterers have the dual drawbacks of activating even if there is no mail to process and of causing delays in the processing of messages. Therefore, the filterers presented in this document are all of the kind which do not run at regular intervals, but are instead called on-demand when a message arrives for the user.
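The two-part structure of a rule described above can be illustrated with a minimal sketch. This is not the filterer implemented for the project, nor procmail's own rule syntax. The names Rule, file_in, apply_rules and the sample addresses are hypothetical and chosen purely for illustration. Each rule pairs a match criterion on a header field with an action, and the first rule that matches decides what happens to the message.

```python
import re
from dataclasses import dataclass
from email import message_from_string
from email.message import Message
from typing import Callable, List


@dataclass
class Rule:
    """One filtering rule: a match criterion plus the action to perform."""
    header: str                        # header field to inspect, e.g. "Subject"
    pattern: str                       # regular expression searched for in that field
    action: Callable[[Message], str]   # called only when the criterion holds

    def matches(self, msg: Message) -> bool:
        value = msg.get(self.header, "")
        return re.search(self.pattern, value, re.IGNORECASE) is not None


def file_in(folder: str) -> Callable[[Message], str]:
    """Hypothetical action: report that the message is filed in a folder."""
    return lambda msg: f"filed in {folder}"


def apply_rules(rules: List[Rule], raw_message: str,
                default: str = "delivered to in-box") -> str:
    """Run the first matching rule; otherwise fall through to normal delivery."""
    msg = message_from_string(raw_message)
    for rule in rules:
        if rule.matches(msg):
            return rule.action(msg)
    return default


# Illustrative rules and message (addresses are made up).
RULES = [
    Rule("To", r"gaelic-l", file_in("gaelic")),
    Rule("Subject", r"conference", file_in("conference")),
]

SAMPLE = (
    "From: someone@example.org\n"
    "To: GAELIC-L@example.org\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)

print(apply_rules(RULES, SAMPLE))
```

Because apply_rules is called once per message, the sketch corresponds to the on-demand kind of filterer rather than a batch job run at regular intervals.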

 

To fully understand how the functionality of a filterer can help users, a few examples will be given to illustrate the uses of a filterer.

 

The author has been working at home using a modem to keep in touch with Napier University and mailing lists for the last 8 weeks of this project, and is amongst the growing numbers of teleworkers[3]. A filterer is of use to teleworkers, particularly the self employed, who have to pay their own phone bill. By having a mail filterer it is possible to restrict the messages that are downloaded. The author has used a "kill file", a simple form of filtering mechanism, to prevent very large messages from being downloaded as these would tie up the phone for a long time and run up a large bill. Many people have been using kill files for news articles for a long time, but some mail systems now support kill files too. The author also inadvertently sent a 120K MIME message containing graphics to an experimental MIME based list, without realising that the list was based at a site that paid 9p per kilobyte for international E-mail. With a dozen people on the list, this message could easily have cost 9p * 12 * 120 = £129.60 to forward on to all recipients if it had not been detected. Clearly, many telecommuters would welcome a facility which could save this much money automatically.
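The size check and the cost estimate in this example can be written out as a short sketch. The function names and the 100K limit are hypothetical. The figures passed in (a 120K message, 12 recipients, 9p per kilobyte) are the ones quoted above.

```python
def forwarding_cost_pence(size_kb: int, recipients: int, pence_per_kb: int) -> int:
    """Cost in pence of relaying one message of size_kb to every recipient."""
    return size_kb * recipients * pence_per_kb


def should_download(size_kb: int, limit_kb: int) -> bool:
    """Kill-file style check: accept only messages no larger than the limit."""
    return size_kb <= limit_kb


# Figures quoted in the text; the 100K limit is an assumed example value.
cost = forwarding_cost_pence(size_kb=120, recipients=12, pence_per_kb=9)
print(f"Estimated forwarding cost: £{cost / 100:.2f}")
print("Download at home?", should_download(size_kb=120, limit_kb=100))
```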

 

Instead of simply deleting long messages, a more intelligent mechanism might be to automatically redirect long messages to an account based at work or to extract just the text portion from a message containing sound and graphics. The facilities people have at home may also be less sophisticated than the facilities available at a University or computer company. It unnecessarily adds expense to a user's phone bill if they are paying to down-load a multi-media message (often over 500K bytes) containing sound and graphics to a machine that does not have the capability to display the graphics or play the sound in the message.
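As an illustration of extracting just the text portion, the following minimal sketch uses the email package from the Python standard library. It is an assumption about how such a facility might look, not a description of any filterer discussed in this report. The function name text_parts and the sample message are hypothetical.

```python
from email import message_from_string
from email.message import Message
from typing import List


def text_parts(msg: Message) -> List[str]:
    """Return the decoded bodies of all text/plain parts, skipping sound and graphics."""
    bodies = []
    for part in msg.walk():
        # Multipart containers report a multipart/* type and are skipped here.
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "us-ascii"
            bodies.append(payload.decode(charset, errors="replace"))
    return bodies


# A made-up two-part message: one text component and one audio component.
SAMPLE = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "\n"
    "Meeting at ten.\n"
    "--XYZ\n"
    "Content-Type: audio/basic\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "AAAA\n"
    "--XYZ--\n"
)

msg = message_from_string(SAMPLE)
bodies = text_parts(msg)
for body in bodies:
    print(body)
```

Only the text component would then need to be sent down the phone line; the audio component could be left on the server or redirected to a work account.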

 

Another possible use for people with computers at home is to use a filterer at the office to forward mail which is likely to be non-work related onto their electronic address at home. This saves time in the office sorting the work related from the non-work related messages and allows non-work related messages to be replied to at leisure in personal time at home. In addition, messages that the teleworker is likely to want to read at home as well as in the office can easily be copied by a filterer so that work can continue at home if necessary.

 

Experienced users of mail filterers seem to adopt strategies for managing their mail. [Mackay89] reports three main strategies, namely:

 

i.    Keep it all

 

ii.   Move unimportant messages

 

iii. Move important messages.

 

Strategy (i) is not explained in the article, but is likely to include appending characteristics to mail messages (e.g. additional headers) or printing out particular messages. Neither of these actions has been implemented in this project, although procmail itself is capable of both.

 

Strategy (ii) moves unessential messages out of the in-box and uses the in-box as a store for unprocessed messages and things to do. Rules are used to identify low priority messages and to move them into folders or delete them. This prevents low priority messages from building up and cluttering a user's in-box.

 

Strategy (iii) involves writing rules to recognise high priority messages and moving them to a "priority" folder. Many people classify mail that is addressed personally to them (as opposed to mail received from a distribution list) as important. All these users disciplined themselves to read the "priority" folder first. The article reports that one user who was initially distrustful of mail filtering on incoming mail decided to create two rules to manage all his mail. One rule identified all personally addressed mail and moved it to a "priority" folder and another identified all mail related to a conference he was running and this rule sent all these messages to the conference administrator. This user said that this strategy was "very, very useful", that his mail before was "out of control" and that using these two rules "changed my life".

 

2.4.    Existing Filterers

 

To find out what tools are available, questions were asked on a number of usenet newsgroups and other on-line resources were located and searched. By querying the HCI Bibliography Database [ER8] under the keywords "rules" and "filter", a useful study was found on how people use mail filterers, the kind and number of rules they define and the way they are grouped [Mackay89]. This paper mentions that moving and deleting mail messages are the two most common uses for a filterer and that the majority of users prefer to have rules execute automatically as mail arrives, rather than applying rules retroactively to mail which has already been delivered.

 

In October 1993, the author spent two days in Seattle, Washington visiting the Pine development team. During discussions, it became apparent that a mail filterer could be considered as a completely different program to the Pine mailer itself. Rather than making the filterer integral, a filterer could simply run between the incoming mail daemon which accepts the messages on the system and the Pine mailer which allows the users to read and reply to mail. With many filtering actions, it is possible that the mail might never reach Pine at all. This would be the case if the message being filtered was being forwarded to another system, piped into an application or deleted. It seemed sensible therefore not to consider the filterer to be a part of Pine, but to consider writing it as a separate program which would generate output not only for Pine but for other mail systems too. To fully evaluate the implications of writing a separate filterer, two mail systems which do have integrated filterers [ER5] [Mackay89] were studied, to understand the possible disadvantages of not implementing an application integrated with Pine.

 

Comparing separate filterers with integrated filterers yielded the following key points:

 


Advantages of a combined mail reader and filterer

 

i)   The mail reader and the filterer are likely to have a consistent and integrated user interface. This is the case with Lotus's cc:Mail V2.0 [ER5] which was evaluated and which has filtering capabilities. The method of entering rules is very much with the same "look and feel" as using the rest of the mail system.

 

ii)  Many users think of rules whilst reading mail [Mackay89]. An integrated filterer is therefore likely to be quicker and more convenient for users to access and add rules to when reading messages.

 

iii) The mail reader is able to view "deleted" messages. In The Information Lens [Malone87a] [Dix93], messages that have been deleted are still accessible by the user, but they are presented with a line drawn through them. This allows the user to verify that the correct messages are being deleted and allows "deleted" messages to be retrieved if necessary. If the filterer is independent, then messages are nearly always deleted before reaching the user's mailbox and the user never sees them.

 

iv) An integrated filterer is able to access the functionality in the mail reader that deals with handling structured MIME messages and decomposing them. This allows much more sophisticated handling (e.g. deleting part of a message).

 

 

Advantages of having the mail reader and filterer as separate applications

 

i)   A separate filterer is not dependent on one mail system. It can work with whatever mail tool the user prefers to read their mail with (e.g. Pine, ream, Elm).

 

ii)  It is easier to build a separate mail filterer than it is to write an integrated one, because integration increases the dependencies on the mail reader. This is particularly true when the mail reader has bugs and is still being developed.

 

iii) A separate filterer is usually more easily ported to other platforms, as it is smaller.

 

iv) A separate filterer means that the filtering can take place on a system that the mail reader might not run on.

 

v)   A separate filterer does not require integration with the existing code. Integration can be particularly problematic if the two applications are written in different languages and under different paradigms (e.g. OO and non-OO). Research conducted in this area showed that many people who had tried to implement OO code on top of existing non-OO code had to abandon the project and rewrite everything in OO. This is a particular issue for this project as Pine was written in C and is non-OO. Interestingly, the author does not know the language which procmail is written in, as it has never been necessary to know this.

 

vi) A separate filterer does not require the same learning time to write, as it is not necessary to learn how to interface with the mail reading code.

 

Having considered the advantages and disadvantages, it was concluded that although an integrated mail filterer can potentially offer more functionality, this benefit was outweighed by the need to minimise the difficulties of integrating an OO application with a non-OO application and of having to understand the internals of a volatile mail reader. Therefore a decision was reached to write the filterer as a separate application.

 

2.4.1.   Diagram of a mail system with filterer

 

The following diagram shows where a separate mail filterer fits into the mail system and demonstrates some of its capabilities.

 

The decision to separate the filterer from Pine resulted in the project proposal detailed in section 1.1 being revised on the 8th of November. This revision was to make a separate mail filter the key deliverable of the project.

 

Writing a filterer separate from Pine also meant being no longer dependent on the Pine development team, and this lessened risks associated with the project. From the logistical point of view, there was also insufficient Internet access from Napier University to write extensions to Pine. Pine is approximately 4 Mb and the only ways of receiving updates to the code would have been to have them posted on disk, sent by FTPmail or copied to Edinburgh University and downloaded to several floppy disks from there. Having used FTPmail once to install Pine for evaluation purposes, it was evident that this was an unacceptable method of working, as the mail gateway split the files into dozens of mail messages that had to be manually edited and reassembled in the correct order. It was therefore decided that it was not feasible to use FTPmail again. Both of the disk options would also have been very time consuming.

 

A number of filterers exist already, and these have been categorised here by whether they have a Graphical User Interface (GUI) or whether they are dependent on the user editing a text file and writing the rules manually. A useful review of GUI E-mail packages was published in [Collin94]. Three of the five products reviewed in this article have integrated filterers.

 

2.4.2.   Text based mail filterers

 

i.         Deliver [ER3]. This is a tool that the author has used previously, but which only runs on VAX/VMS. Filtering is via Boolean logic and is limited to the fields in VAX/VMS mail, namely "from" "to" and "subject".

 

ii.         Procmail (see Appendix I). This is the tool chosen for the project for reasons mentioned earlier.

 

iii.       Elm Filter [ER4]. This filterer is based on the popular Elm mail tool. This filterer only allows filtering on the "to" "from" "subject" fields, the size of the message and the message content. There is no facility to understand MIME.

 

 

2.4.3.   GUI based mail filterers

 

i.    The Information Lens [Malone87a], [Mackay89]. This provides a forms based interface and allows messages to be filtered based on Boolean logic match criteria on the following header fields: "from" "to" "cc" and "subject" as well as the message contents.

 

ii.   The Andrew Message System (AMS) [ER6].  Little information was obtained about this system, however it is of particular note as it is the only mail tool known to be capable of splitting messages and processing components of a composite MIME mail message. If a user sends a message with non-text components to a non-AMS recipient, AMS can cut out the non-text and replace it with a message indicating what was removed. This intelligent processing of MIME messages could prove very useful for instance to direct just the text components of a MIME message to a home based mail address.

 

iii. BeyondMail 2.0 and BeyondRules. [Collin94] states "BeyondMail broke all the rules in its first version. It included what everyone needed but didn't realise they did- intelligent, programmable rules". This product has by far the most sophisticated rule handling of any application studied, and includes system wide rules that apply to all users, rules that become active or inactive over time, and reminder rules. The filtering interface to the program, BeyondRules, is the only known commercial filtering application. BeyondRules offers filtering capability to Microsoft Mail 3.2, which does not have rules. Mail filtering in BeyondMail is cited in [Lindholm93] as one of the three key goals necessary for a commercially successful product.

 

A key point learnt from conducting research into filterers is that it is quickly becoming the norm for commercial mail tools to have either a filterer built in, or have an optional add-on filterer. Mail tools that do not have filterers are now regarded as commercially inferior.

 

2.5.    MIME

 

MIME is a new protocol designed to address many of the shortcomings of existing mail, which is limited to sending US-ASCII characters. The MIME standard was officially published in June 1992 as RFC1341 [ER1].

 

For an example of a MIME encoded mail message, see Appendix K. The most important header in the message is Content-Type. In the example in the appendix its value is MULTIPART/MIXED. This means that the type is MULTIPART (the message has more than one component) and the subtype is MIXED (the components may be of different types). For a full list of MIME types and subtypes, see Appendix L.
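The type and subtype structure described above can be illustrated with a short sketch. It is written in Python using the standard email package rather than in the C++ used for the project, and the two-part message is a hypothetical example, not the message from Appendix K. The function name describe and the addresses are assumptions made for illustration only.

```python
from email import message_from_string

# Hypothetical two-part message (NOT the one in Appendix K).
# The base64 payload is a placeholder and is never decoded here.
RAW = """\
From: sender@example.org
To: recipient@example.org
Subject: Demonstration
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; boundary="XYZ"

--XYZ
Content-Type: text/plain; charset=us-ascii

Hello.
--XYZ
Content-Type: image/gif
Content-Transfer-Encoding: base64

R0lGODlhAQABAAAAACw=
--XYZ--
"""


def describe(message):
    """Return (type, subtype) for the message, then one pair per component."""
    pairs = [(message.get_content_maintype(), message.get_content_subtype())]
    if message.is_multipart():
        for part in message.get_payload():
            pairs.append((part.get_content_maintype(),
                          part.get_content_subtype()))
    return pairs


msg = message_from_string(RAW)
for main_type, sub_type in describe(msg):
    print(main_type + "/" + sub_type)
```

Type names are not case sensitive, so the package reports MULTIPART/MIXED in lower case. A filterer that understands MIME could match on these pairs in the same way that a conventional filterer matches on the "from" or "subject" fields.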

 

Although the implementation of MIME does not require a great leap in technology, there have been several failed attempts at introducing a MIME standard. MIME itself was designed for graceful inclusion in the Internet protocol suite. It does this not by building an entirely new protocol but by adding features to RFC822 mail. This is called a bottom-up approach by Nathaniel Borenstein, the author of the draft MIME standard. Earlier experimental models for Multi-media mail (RFC767, RFC759) took a different approach by building a new transport and document format that was not compatible with the existing mail protocol (RFC822) and would have required disposing of a popular and working model for mail. Ensuring backwards compatibility may not always result in the most academically pleasing implementation (compare the evolutionary C++ with Smalltalk). However, an evolutionary approach is usually more likely to result in a working implementation and one that is widely accepted.

 

It was considered important to investigate mail tools that supported MIME so that one could be installed in the Department to generate MIME mail messages and to provide a means of testing a MIME based filterer.

 

The main source of information for MIME based mail systems was the comp.mail.mime Frequently Asked Questions list (FAQ) [ER9]. This is nearly 80Kb of extremely useful information and has a section on commercial and freely available MIME products, including mail systems and news readers. Another excellent source was an M.Sc. report by Magnus Hedberg which covers Multi-media mail systems and Asynchronous Computer Supported Cooperative work [Magnus92].

 

From the research carried out into MIME mailer systems, the Pine system was chosen as there was no other system detailed which matched the hardware available in the Department and which was easy to use. The installation kits for the PC version and UNIX versions of Pine were obtained and Pine was installed on a PC in the Department. Pine was then configured to send messages to the Departmental Suns and the external network to evaluate the system more fully. Once Neil Rumney had installed the required software on the Sun, the UNIX version was also configured. Word of Pine's capabilities and user-friendly interface soon began to spread and it has now become the preferred mail tool of many people in the Department, particularly those who are new to electronic mail. The author now uses Pine on a daily basis, and has configured it to send and receive accented characters such as á, è, í, etc. This has made conversing in languages other than English much more convenient as accented characters can be sent and received through the mail to other MIME users without corruption.
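The way in which accented characters can survive a mail transport limited to US-ASCII can be sketched as follows. The source does not state which character set or transfer encoding Pine was configured to use, so ISO-8859-1 with the quoted-printable encoding defined by MIME is assumed here purely for illustration. The sketch is in Python and uses only the standard quopri module.

```python
import quopri

# The three accented characters mentioned above, written as escapes
# so that the sketch itself is plain ASCII: a-acute, e-grave, i-acute.
TEXT = "\u00e1 \u00e8 \u00ed"

# Assumption: the sender labels the text as ISO-8859-1 (Latin-1).
raw_bytes = TEXT.encode("iso-8859-1")

# Quoted-printable replaces each 8-bit byte with "=" and two hex digits,
# so only US-ASCII characters are placed on the wire.
encoded = quopri.encodestring(raw_bytes)

# The receiving MIME mail tool reverses both steps.
decoded = quopri.decodestring(encoded).decode("iso-8859-1")

print(encoded)
print(decoded == TEXT)
```

A mail tool that does not understand MIME would display the encoded form, which is still largely readable. This is in keeping with the backwards compatible design of MIME discussed in this section.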

 

2.6.    Development environment

 

Four commercial packages for development under Microsoft Windows were available in the Department. These were:

 

i.    Visual Basic

 

ii.   Visual C++

 

iii. Borland C++

 

iv.  Asymmetrix Toolbook

 

Visual Basic and Asymmetrix Toolbook are ideal for rapid prototyping of user interfaces; however, neither supports Object Orientation and so neither was considered suitable for a project of this size. For the purposes of the project there seemed to be no difference in suitability between Borland C++ and Visual C++, and so Borland C++ was chosen as the author had used it to develop two applications earlier in the course.

 

Rejecting Visual Basic and Asymmetrix Toolbook caused a major problem: the user interface is a major part of the program, and Borland C++ does not provide a good environment in which to develop a complex user interface quickly. An environment therefore had to be found which would allow rapid development of the user interface in the 14 weeks available to develop and document the project.

 

A request was posted to the Internet newsgroups comp.object and comp.lang.c++ to see if there were any suitable applications that could be used with Borland C++ to assist with rapid prototyping. Although these groups are distributed world-wide, the only reply received was from Julian Smart at Edinburgh University AI Department, who mentioned his wxWindows tool. wxWindows is a multi-platform C++ development environment designed to help users write portable code and to shield them from many of the difficulties of windows programming. wxWindows also allows HCI based applications to be developed quickly as it provides its own well-documented class library. Although there would be a learning curve associated with wxWindows, its demonstration programs showed that it could provide the required functionality and it seemed an ideal choice. wxWindows also had the benefit of being compatible with the Department's existing mail platform (UNIX) and the Department's proposed mail platform (PCs). A further benefit is that plans are underway to port Pine from MS-DOS to Microsoft Windows. With a MIME compatible filterer already available in Windows, the filterer could integrate well with Pine when the Windows version of Pine is complete.

 

A decision was therefore made that wxWindows would be used as the development environment. However, there were two problems with wxWindows. The first was that no one in the Department knew anything about it, so the author was dependent on the Internet for help. The second was that wxWindows was developed for use with Turbo C++. This resulted in the author becoming the first person to port wxWindows for use with Borland C++.

 

It is certain that without wxWindows, the tool would not have been developed as quickly as it has been.

 

3.  System specification

 

To specify the system that was to be built, it was necessary to investigate the requirements of potential users. This would ensure that the aim of designing a system that provided the most useful functionality was fulfilled. From these user requirements, the software and hardware required for such a system was then identified.

 

3.1.    User requirements

 

Clearly with the information overload mentioned in section 2.3, some tools are required to manage the volume of mail. Time is valuable, and the more a computer can assist people to do their job, the more productive they are likely to be. However, it was first necessary to establish the functionality that a mail processing tool should provide.

 

No one in the Department is known to use a mail filterer at the moment. Procmail has not been announced to the Department and no other filterers in the Department are known to exist. This is likely to change soon, however, when Napier University becomes fully connected to the Internet and more people start to use the Internet and participate in newsgroups. Asking questions in newsgroups and posting notes to newsgroups can generate many replies via E-mail, some of a low priority personal nature and others of a high priority work related nature. It has been interesting to note that, while this project was underway, a great increase in the use of E-mail and of the Internet has taken place within the Department.

 

To fulfil the aim of giving users access to the most frequently used filtering actions, the author conducted a survey in the Department to determine which features would be the most useful to implement. Three replies were received from members of the Department who receive large amounts of mail, and their replies are summarised in Appendix E. The action "move message to a folder" was rated as priority 5 (the highest) by all respondents. Other actions that were rated highly include auto-replying to messages and forwarding messages to other users. Whilst the results of this survey were useful, it was felt that as the people surveyed had never used a filterer before, it was necessary to carry out additional research.

 

To anticipate the needs of Departmental users once they had become experienced, surveys of actual usage by experienced filterer users were sought out. One survey [Mackay89] agreed with the responses from the Department and showed that moving messages was a popular feature: 57% of rules involved filing a message to a folder based on the recipient field. The next most common rule in [Mackay89] was deleting messages, with 28% of the rules in the sample being used to delete. This contrasts with the Departmental responses, which indicated that there was very little demand for automatically deleting mail. This difference can perhaps be explained by the fact that people who have never used a filterer before are probably reluctant to trust a filterer to delete messages, whereas people who use a filterer frequently can more readily "trust" it to delete genuinely unimportant messages.

 

[Mackay89] also reports that a sample of 13 users generated 190 rules between them, with each user having between 2 and 35 rules. With as many as 35 rules, it seemed sensible to consider whether groups of rules would be related to a particular "scenario", such as working in the office, being away on a business trip or being on holiday. If a user was on holiday, then they might want to invoke a set of rules that deleted all mail from certain distribution lists, forwarded certain work related mails to other team members and auto-replied to others. However, if a user was working in the office then they might want a different set of rules. It was therefore decided to group rules into scenarios. These scenarios could help users to manage their rules more effectively, particularly since sets of rules could be quickly activated and deactivated simply by activating or deactivating the scenario.
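The grouping of rules into scenarios can be sketched as a small data structure. The sketch is in Python rather than the C++ used for the project, and the class names, fields and example rule names are hypothetical; they are not taken from the filterer itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rule:
    """A single filtering rule, reduced here to a descriptive name."""
    name: str


@dataclass
class Scenario:
    """A named group of rules that is switched on or off as a whole."""
    name: str
    rules: List[Rule] = field(default_factory=list)
    active: bool = False


def active_rules(scenarios):
    """Return the rules of every active scenario, preserving their order."""
    return [rule for scenario in scenarios if scenario.active
            for rule in scenario.rules]


office = Scenario("In the office", [Rule("file project mail")], active=True)
holiday = Scenario("On holiday", [Rule("delete list mail"),
                                  Rule("forward work mail"),
                                  Rule("auto-reply")])
scenarios = [office, holiday]

print([rule.name for rule in active_rules(scenarios)])

# Going on holiday: two flags change and no individual rule is edited.
office.active = False
holiday.active = True
print([rule.name for rule in active_rules(scenarios)])
```

The point of the design is that the user changes one setting per scenario instead of enabling or disabling each of up to 35 rules individually.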

 

As a result of research carried out into mail filterers, particularly Deliver [ER3], it was realised that the order of rules and scenarios was important. Consider a scenario with two rules, one which saves messages from a mailing list into a file and another rule which auto-replies to mail while you are away on holiday. If you receive a message which matches both rules, then it is likely that you would not want an auto-reply going back to the mailing list and possibly hundreds of users on that list. Therefore it is important to save the message to a file and to stop processing at that point so that the auto-reply function is never called. This means that the "save to file" rule must come before the "auto-reply" rule. As a result of the importance placed on rule ordering, it was important to implement a means of examining the order of rules and scenarios and to change the order if necessary.
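The effect of rule order in the mailing list example can be shown with a minimal sketch. It is written in Python and assumes a simplified model in which each rule is a triple of a match function, an action name and a flag stating whether processing stops after the rule. This model, the function names and the address used are illustrative assumptions, not the behaviour of Deliver or procmail.

```python
def is_list_mail(message):
    """Match messages addressed to the (hypothetical) mailing list."""
    return message.get("to") == "list@example.org"


def matches_everything(message):
    """Match every message, as a holiday auto-reply rule would."""
    return True


def apply_rules(message, rules):
    """Apply rules in order and return the names of the actions performed.

    Each rule is (match function, action name, stop flag). When a matching
    rule has its stop flag set, no later rule is considered.
    """
    performed = []
    for matches, action, stop in rules:
        if matches(message):
            performed.append(action)
            if stop:
                break
    return performed


save_rule = (is_list_mail, "save to file", True)
reply_rule = (matches_everything, "auto-reply", False)

list_mail = {"to": "list@example.org"}
personal_mail = {"to": "user@example.org"}

print(apply_rules(list_mail, [save_rule, reply_rule]))
print(apply_rules(list_mail, [reply_rule, save_rule]))
print(apply_rules(personal_mail, [save_rule, reply_rule]))
```

With the "save to file" rule first, list mail is filed and nothing else happens. With the order reversed, the same message also triggers an auto-reply to the whole list, which is the outcome the ordering is intended to prevent.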

 


3.2.    System requirements

 

The hardware required to implement the project was identified as being:

 

i.          The Departmental Suns, as this is the hardware platform which people currently use for electronic mail.

 

ii.         A PC connected to the Suns via PC-NFS. The filterer was not implemented on a Sun as there were indications that people would rather use their PCs for mail.

However, the filterer implemented is designed to be easily ported to a Sun.

 

The aim was to write a system which people could run from PCs to process mail on the Suns as it arrives. However if the Department does not migrate to PCs for mail, then the application could still be used on the Suns under X without major modification.

 

 

The software requirements for this project were as follows:

 

i.    The C++ development environment was chosen as Borland C++ V3.1 and wxWindows for reasons mentioned in section 2.6.

 

ii.   An underlying system capable of filtering mail. This was identified as procmail.

 

iii. Windows 3.1 running on a PC.

 

iv.  A tool capable of generating icons and bitmaps for use with the application. This is Borland's Resource Workshop V1.02.

 

v.   A system capable of receiving mail messages from various sources. The Departmental Suns were used to fulfil this.

 

vi.  Network software to allow the PC where the application is running to modify files on the Sun, where the mail is processed. PC-NFS was used to achieve this.

 

vii. Word for Windows V2.0 to generate this report

 

viii. OMTool V2.0 from the GE Research and Development Center to generate the diagrams for the object models.

 

ix.  Paint Shop Pro V1.02a by JASC Inc, Minnetonka, Minnesota. This was used to transfer the models in OMTool into this document by screen capture.


4.  Analysis and Design

 

4.1.    Research of Analysis and Design Methods

 

The Rumbaugh method [Rumbaugh91] was the main OO analysis and design method taught on the LSSD course. This method seems to be one of the major methods in use, and offers a useful way of decomposing the problem into an object model, a dynamic model and a functional model. However, this approach has a major weakness in that it results in three models that are difficult to integrate. Michael Blaha, one of the co-authors of [Rumbaugh91] sent the current author the following message regarding integrating the models:

 

      Our integration of the three models in the book is incomplete and unsatisfactory. We openly acknowledge this. We have improved integration in our tutorials and will incorporate the new ideas in our future books. Our current understanding of integration of the models is much better than in our book, but quite honestly still has much room for improvement.

 

This weakness in the Rumbaugh method has resulted in several papers attempting to integrate the Object, Dynamic and Functional models [ER10], [Hayes91], [D'Souza93]. However, the author believes that building three models and only mentioning objects in one of them is not the most suitable method for OO Analysis and Design. In the case of this project, the Rumbaugh method was also not considered suitable as the project can be viewed as a form of translator, taking input from the user and translating it into procmail code. This means that once the objects have been identified, much of the work is of a functional nature, transferring data from one form to another. Rumbaugh, however, places the functional model last in the analysis and design phases and so places low importance on this model.

 

The process of combining three different models is noted in [Monarchi92], where it is classified as the "combinative approach" of Object Oriented Analysis and Design. This article also mentions the "pure" approach, which the author of this report favours. The Booch method [Booch94] is an example of the "pure" OO approach where, instead of developing three different models, the object is kept central to the analysis and design and aspects of that object are added to it at various stages.

 

4.2.    Outcome of Analysis and Design Research

 

The key steps were taken from the Booch method and integrated with the steps detailed in [Henderson-Sellers93]. These steps were then classified to produce the following three stage method for OO Analysis & Design:

 


A)  Object identification stage

i.    Identify candidate object classes (usually nouns in problem domain)

 

ii.   Identify class attributes

 

iii. Identify operations provided by and required of each class

 

B)  Object relationship stage

i.    Establish associations between objects. These can be established by running through interaction scenarios or "use cases" [D'Souza93]

 

ii.   Identify aggregations between objects

 

iii. Identify generalisations between objects

 

Repeat stages A and B twice. On the first repetition add meta-classes representing the solution domain and their relationships with the problem domain. These meta-classes are shown in section 4.4.1. On the second repetition add candidate objects in the meta-classes.

 

C)  Object definition stage

i.    Evaluate the outcomes of stages A) and B) and redesign and optimise as necessary.

ii.   Any recurring patterns in the code should be identified and considered for optimisation. For details of the kinds of patterns seen in Object Oriented code, see [Coad92].

 

iii. Implement the object classes

 

As a rough guide, steps A) and B) define the code that will appear in the header file, and step C) defines the code in the main code file associated with the class (usually .cc or .cpp).

 

It was this stepwise development of objects which was used to perform the analysis and design for this project. This method keeps objects central at all times and so does not result in three unrelated models, as the Rumbaugh method does. However, it does make it more difficult to optimise the implementation from the functional or dynamic view, although this is not considered to be a major problem.

 

A method consists of two parts, the "process" and the "notation". The author considers the Rumbaugh notation to be powerful and concise for the Object Model, whereas Booch's method has been strong on process. This strength is taken further in the latest book describing Booch's method [Booch94], which devotes approximately twice as much space to describing the process as the previous edition. As the author is unfamiliar with the Booch notation [ER11], and as there are no tools in the Department that support it, the Rumbaugh notation has been used to develop the graphical models illustrated in this report. As a full object model detailing every object, attribute and operation would be far too complex to draw, "layering" [Henderson-Sellers93] has been used instead to show the model at a comparatively high level of abstraction.

 

4.3.    Analysis

 

The identification of objects to model for analysis was accomplished quickly. The analysis was achieved by a "bottom-up" approach to identify the objects and their relationships. The final program is a means of generating a procmail script, and so it was the elements of a procmail script that were taken as the first candidates for analysis. A procmail script consists of match criteria and actions and so these were identified as the first objects in the analysis phase. The rule object was introduced next as a means of grouping together the match criteria and actions. Objects were then considered which would have an association with rules. These objects were then considered for associations with other objects and the process repeated until a stable model was reached. This resulted in the first iteration of phases A and B detailed in section 4.2 producing the object model for the problem domain, shown in the diagram in section 4.3.1.
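The relationship between a rule, its match criteria and its action, and the translation of a rule into procmail code, can be sketched as follows. The sketch is in Python rather than the C++ used for the project. The class and method names are hypothetical and the output is not claimed to match what the filterer generates. Only basic procmail recipe syntax is relied upon: a line beginning ":0" opens a recipe, each line beginning "*" is a condition (all conditions must match), and the final line is the action, where "!" forwards the message to an address.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MatchCriterion:
    """One condition on a header field, held as a regular expression."""
    header: str
    pattern: str

    def to_condition(self) -> str:
        # A procmail condition line starts with "*".
        return "* ^{}:.*{}".format(self.header, self.pattern)


@dataclass
class Action:
    """What to do with a matching message: file it or forward it."""
    kind: str     # "folder" or "forward" (hypothetical labels)
    target: str   # folder name or mail address

    def to_line(self) -> str:
        return "! " + self.target if self.kind == "forward" else self.target


@dataclass
class Rule:
    """Groups match criteria with the action taken when they all match."""
    criteria: List[MatchCriterion]
    action: Action

    def to_recipe(self) -> str:
        # ":0:" also requests a local lock file, which is appropriate
        # when appending to a folder; plain ":0" is used otherwise.
        opener = ":0:" if self.action.kind == "folder" else ":0"
        lines = [opener]
        lines += [criterion.to_condition() for criterion in self.criteria]
        lines.append(self.action.to_line())
        return "\n".join(lines)


file_rule = Rule([MatchCriterion("From", "napier-list")],
                 Action("folder", "lists"))
forward_rule = Rule([MatchCriterion("Subject", "urgent")],
                    Action("forward", "colleague@example.org"))

print(file_rule.to_recipe())
print(forward_rule.to_recipe())
```

Keeping the translation inside the rule, criterion and action objects reflects the "pure" approach favoured in section 4.1, in which the functional behaviour is attached to the objects and not described in a separate model.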

 

4.3.1.   Analysis diagram

        


4.4.    Design

 

Whilst analysis is concerned with modelling objects in the problem domain, design is concerned with modelling objects in the solution domain. This solution domain includes all the semantic classes in the problem domain, but also adds classes dealing with "interface", "application" and "base/utility" [Monarchi92]. Design can also cause the classes identified during analysis to be redesigned or extended if abstractions are found.

 

Using the first iteration of the Analysis and Design method described in section 4.2, the meta-classes for "application", "interface" and "base" classes were added. The outcome of this stage is shown in the following diagram.

 

4.4.1.   Overview design model

 

 

The advantage of separating the problem domain classes from Interface and Utility classes is that the Interface and Utility classes are likely to be much more dependent on the hardware or software platform used for the final implementation. As a result, by having implementation specific code in these classes, it becomes easier to swap these classes for other classes if the resulting application is to be ported to a different hardware or software platform. The implementation of the problem domain classes should be independent of the hardware or software platform chosen for the solution.

 

An interesting outcome of the overview design phase was that a 1-1-1 mapping emerged between many of the classes in the problem domain, the classes in the user interface domain, and the menu items and tool bar items in the application interface domain, as shown in the following table.

 

4.4.2.   Table of correspondences between design meta-classes

Problem domain class     User Interface domain class     Menu item

Rule                     EditRule                        Rule
Scenario                 EditScenario                    Scenario
Rule Match Criteria      EditRuleMat                     Called via "rule" menu
Rule Actions             EditRuleAct                     Called via "rule" menu
Scenarios                EditScenarios                   Scenario "list" option
Rules                    EditRules                       Rule "list" option

 

The second iteration of the OO A&D method detailed in section 4.2 resulted in the following two detailed design diagrams and the introduction of abstract classes such as "MyForm" to allow generic handling of forms for all input. The final iteration of the Analysis and Design method detailed in section 4.2 produced the detailed design diagrams that follow in sections 4.4.3 and 4.4.4.

 

4.4.3.   Design model for problem domain and interface classes

 

4.4.4.   Design model for application interface classes

4.5.    HCI aspects

 

Initial design of the user interface was achieved by testing various layout scenarios for the forms and menus on paper. These ideas formed the basis for the initial forms to be built into the first software prototype.

 

To design an effective user interface Alison Crerar, a lecturer in Human Computer Interfaces (HCI), was asked to take part in an expert walk-through of the first prototype on 16th December. This first prototype was available for evaluation approximately two weeks after coding started. It was considered important to obtain comments from a potential user as soon as possible to ensure that the interface was well designed from the user's point of view and provided suitable functionality.

 

The key points arising from this review are detailed in Appendix F together with the resolutions reached. Although Alison is highly knowledgeable about User Interfaces, she had never come across a mail filterer before and so she was representative of future users in the Department.

 

The original prototype was designed with the standard "File" menu on the menu bar. This was to try to present a consistent user interface for users who were already familiar with Windows applications. However, as a result of the review, this menu was renamed "Scenario", as most of the functions on the menu were related to scenarios. Later in development, the few functions on this menu not connected with scenarios were moved onto the "Export/Quit" menu.

 

To fulfil the user requirements detailed in section 3.1, it had to be easy for the users to access the most frequently used match criteria and the most frequently used actions based on those criteria. As a result, predefined form