Organising e-mail using an Ontology

Like most people I receive e-mail. Like some people I like to save that e-mail for future reference. Like virtually no-one I want to have a unique, somewhat formal way of storing that e-mail so that I can find it again later. The traditional approach to this has been to use folders: many people store e-mail by the sender’s name, some by when the message was sent. The downsides to this approach are that in order to find the message again you must know the single item of information used to file it, that if a message was sent to many people you may have to store multiple copies of it, and so on.

Enter tags: tags should allow you to add extra information to your filing system. Now you can record the fact that this e-mail from Clive was about the “Winter Weekend”, that one was also from Clive but it was about “sport”, and there is a group of your friends called the sports club who regularly conduct e-mail discussions. The problems with this approach are that without a controlled vocabulary the tags become too numerous to manage, that you cannot tell whether to classify something as “film”, “cinema” or “movie” (or even “movies”), and that you may not even know what the message is actually about.

Fortunately, there are some readily available resources. To determine whether to use film, movie or cinema, use WordNet (BTW the answer is “movie”, and “cinema” is a different thing, the building that shows the movie, so don’t use them interchangeably). To know where to classify the message, use SUMO: the WinterWeekend is a Meeting, possibly a SocialParty but not a FormalMeeting. And just to complete the hierarchy: a Meeting is a SocialInteraction, which is an IntentionalProcess, which is a Process, which is Physical, which is an Entity. Phew! (And I am stopping putting in the hyperlinks now: go look them up yourself.)
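To make the controlled vocabulary idea concrete, here is a minimal sketch of what tag normalisation could look like. The synonym table is written by hand purely for illustration (it is not pulled from WordNet), and the names `PREFERRED_TAG` and `normalise_tag` are my own inventions.

```python
# Hypothetical, hand-written synonym table for illustration only.
# A real system might derive it from WordNet; this sketch does not.
PREFERRED_TAG = {
    "film": "movie",
    "movies": "movie",
    "movie": "movie",
    "cinema": "cinema",  # the building, deliberately kept as a separate tag
}


def normalise_tag(raw: str) -> str:
    """Map a free-text tag to its preferred form, or reject it."""
    key = raw.strip().lower()
    if key not in PREFERRED_TAG:
        # Refusing unknown tags is what keeps the vocabulary controlled.
        raise KeyError(f"'{raw}' is not in the controlled vocabulary")
    return PREFERRED_TAG[key]
```

Rejecting unknown tags outright is one possible design choice; a friendlier tool might suggest the nearest known tag instead.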

Which all sounds fantastic. And it is, but I want more: the WinterWeekend was for a SocialGroup (called the Strollers), which is a GroupofPeople, which is a Group, which is a Collection, which is also Physical and still an Entity.

So what I need is the ability to classify my tag WinterWeekend in multiple locations in the SUMO hierarchy (or taxonomy, to give it its proper title). And I am fairly sure no e-mail system allows me to do that (on account of the fact that none of the tools I use even have the SUMO concepts built into them). And yet there is more: the O in SUMO stands for Ontology. And I want an ontology to be “a taxonomy with attributes”. So the WinterWeekend is a SocialParty, which means that it has a date, a location etc. And the WinterWeekend involves a group, so it has members, and those members who showed up (attendees), etc.
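Filing one tag in several places of the taxonomy is easy enough to sketch. The parent links below are just the two chains named in this post, with concept names spelled as I wrote them; this is not the real SUMO data, and `PARENTS`, `TAG_CLASSES`, `ancestors` and `concepts_for_tag` are hypothetical names.

```python
# Parent links copied from the two chains described in the post.
# This is a tiny illustrative fragment, not loaded from SUMO itself.
PARENTS = {
    "SocialParty": ["Meeting"],
    "Meeting": ["SocialInteraction"],
    "SocialInteraction": ["IntentionalProcess"],
    "IntentionalProcess": ["Process"],
    "Process": ["Physical"],
    "SocialGroup": ["GroupofPeople"],
    "GroupofPeople": ["Group"],
    "Group": ["Collection"],
    "Collection": ["Physical"],
    "Physical": ["Entity"],
}

# One tag, filed under more than one concept.
TAG_CLASSES = {"WinterWeekend": ["SocialParty", "SocialGroup"]}


def ancestors(concept):
    """Return every concept above `concept`, nearest first, without repeats."""
    seen = []
    queue = list(PARENTS.get(concept, []))
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(PARENTS.get(current, []))
    return seen


def concepts_for_tag(tag):
    """Return all concepts a tag falls under, across all of its classes."""
    result = []
    for cls in TAG_CLASSES[tag]:
        for concept in [cls] + ancestors(cls):
            if concept not in result:
                result.append(concept)
    return result
```

With this in place, a search for anything filed under Meeting or under Collection would both turn up WinterWeekend, which is the behaviour I am after.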

SUMO doesn’t define the attributes for each item, so I have to make them up. OK, I am a data modeller by profession so I can make a good stab at the attributes, but it shouldn’t be up to me. But if I do define the attributes then I can use my e-mail to answer the query: Find me all the messages from November 2007 that refer to Lisa. And it should find Clive’s message about the WinterWeekend because Lisa is a member of the Strollers and the WinterWeekend took place in November.
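Here is a rough sketch of how that query could work once the tag carries attributes. Everything in it is made up for illustration: the class names, the exact date in November 2007, the subject lines, and the membership list beyond Lisa are all assumptions, not real data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Group:
    name: str
    members: set


@dataclass
class Event:
    name: str
    when: date
    group: Group


@dataclass
class Message:
    sender: str
    subject: str
    tags: list


# Attributes attached to the tag once, then reused by every tagged message.
strollers = Group("Strollers", {"Lisa"})  # hypothetical membership list
EVENTS = {
    # The day of the month is invented; the post only says November 2007.
    "WinterWeekend": Event("WinterWeekend", date(2007, 11, 17), strollers),
}

MESSAGES = [
    Message("Clive", "Plans for the weekend", ["WinterWeekend"]),
    Message("Clive", "Saturday's match", ["sport"]),
]


def find_messages(messages, person, year, month):
    """Find messages whose tagged event fell in the given month and whose group includes `person`."""
    hits = []
    for msg in messages:
        for tag in msg.tags:
            event = EVENTS.get(tag)
            if event is None:
                continue  # a plain tag with no attributes attached
            in_month = (event.when.year, event.when.month) == (year, month)
            if in_month and person in event.group.members:
                hits.append(msg)
                break
    return hits
```

Calling `find_messages(MESSAGES, "Lisa", 2007, 11)` picks out Clive's WinterWeekend message even though Lisa's name appears nowhere in the message itself, because the link runs through the tag's attributes.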

Now maybe I am the only anally retentive person in the world who would try to file e-mail this way, but the point is: if I set up the tag WinterWeekend in the taxonomy, it will prompt me for the metadata (or even extract it from my Calendar), and from then on all I need to do is apply the tag and all the attached information is immediately available. To quote Mary Chapin Carpenter: “Is it too much to ask?”

IT and Helicoptering

The IT world is changing. We are starting to try to create a clever, semantic web that understands things (but why does web 2.0 seem to consist of endless Facebook applications and bad MySpace pages?). We are trying to capture tricky data (that will be pretty much all metadata) by defining the Web Ontology Language (

BTW I think I finally know what an ontology is: An ontology is a taxonomy with attributes. And a taxonomy is a controlled vocabulary organised into a hierarchical structure (

This creates a nice logical progression: controlled vocabulary -> taxonomy -> ontology. So why can’t I find a straightforward description of this anywhere on the web? I can think of two reasons:

  1. I am wrong. Eminently possible. I make mistakes all the time.
  2. The people involved with ontologies are so buried in the weeds that they are unable to abstract their passion and make it accessible.

There are more examples of this attempt to move IT into semantics and meaning or

And this brings me on to helicoptering: I work as a Solution Architect, meaning I have to understand quite a lot about data, applications, process and infrastructure (though I tend to leave the boxes and cables to other guys) and then ensure that the system I am involved with meets the business’ needs. So when articles are presented about giving data meaning (which is what users expect) or enabling users to define the system (which they try to do and invariably fail), it has to be possible to provide a logical connection between the technical detail (which is in the articles), the system specification (the sort of thing that I will create) and the business strategic leaders (the management speak).

The ability of one person to dive into the technical detail, ensure it is consistent with the overall architecture and communicate it to the strategic leaders is called helicoptering. And IT people are generally extremely bad at helicoptering.

I can’t complain too much, because part of why I get the work that I do is that I have some ability to transcend these levels. But as a larger body of people, I implore us:

Use people who can operate at multiple levels of detail. Please. Otherwise all these smart ideas will struggle to gain wider acceptance. And we do, genuinely, need them to be accepted.