Tag Archives: ontology

Classification and Ontology – UDCC Seminar 2011

I thoroughly enjoyed the third biennial International UDC Consortium seminar at the National Library of the Netherlands, The Hague, last Monday and Tuesday. The UDC conference website includes the full programme and slides, and the proceedings have been published by Ergon Verlag.

This is the first of a series of posts covering the conference.

Aida Slavic, UDC editor-in-chief, opened the conference by pointing out that classification is supposed to be an ordered place, yet classification systems and the study of them are difficult and complex. We still lack the terminology to express and discuss our work clearly. There is now an obvious need to instruct computers to use and interpret classifications, and perhaps the work of making our classifications machine-readable will also help us explain what we do to other humans.

On being the same as

Professor Patrick Hayes of the Florida Institute for Human and Machine Cognition delivered the keynote address. He pointed out that something as simple as asserting that one thing is the same as another is actually incredibly difficult, and that one of the problems facing the development of the Semantic Web is that people assert that two things are the same when they are merely similar.

He explained that the formalisms and logic underpinning the Semantic Web are all slimmed-down versions of modern twentieth-century logic, based on a particular world view and set of assumptions. This works very well in theory, but once you start applying such logics to the messy, complex real world, with real objects, processes, and ideas, they come under increasing stress.

In logic, when two things are referred to as the same, this means there are two different names for one thing, not that there are two things that are logically equivalent. So “Paris, the city of my dreams”, “Paris the administrative area”, “Paris throughout history”, and “Paris, the capital of France” are not necessarily all the same. This means that in logic we have to separate into different entities aspects of an idea that in ordinary language we think of as one thing.
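
To make this concrete in Semantic Web terms, here is a minimal sketch in Python using the rdflib library (the URIs and the population figure are invented for illustration, and the SKOS alternative is my own gloss, not Hayes's). Asserting owl:sameAs tells a reasoner that two names denote one thing, so all the properties of one attach to the other; a weaker predicate such as skos:closeMatch only records similarity, which is often what people actually mean.

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import OWL, RDFS, SKOS

    EX = Namespace("http://example.org/")
    g = Graph()

    # Two resources that ordinary language happily treats as "Paris".
    g.add((EX.ParisCity, RDFS.label, Literal("Paris, the city of my dreams")))
    g.add((EX.ParisAdmin, RDFS.label, Literal("Paris, the administrative area")))
    g.add((EX.ParisAdmin, EX.population, Literal(2000000)))  # illustrative figure

    # owl:sameAs asserts these are two names for one thing, so a reasoner
    # may attach the administrative population to the city of dreams too.
    g.add((EX.ParisCity, OWL.sameAs, EX.ParisAdmin))

    # skos:closeMatch merely records similarity and merges nothing - often
    # the honest choice when two things are similar rather than identical.
    g.add((EX.ParisCity, SKOS.closeMatch, EX.ParisAdmin))

    print(g.serialize(format="turtle"))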

He described this as the problem of “logic versus Occam” (as in Occam’s razor). Logic drives us to create complexity, in that we have to precisely define every aspect of a concept as a different entity. In order for the Semantic Web to work, we need to be very clear about our definitions so that we don’t muddle up different aspects of a concept.

Building Enterprise Taxonomies – Book Review

There aren’t many books on taxonomies, so it is good to have another on the shelf. Darin L. Stewart’s book is based on a series of lectures and provides a good introduction to key topics; that format means you can pick out the sections that are relevant to you. It has a very American student-textbook tone, with pull quotes and definitions of key concepts in information science (e.g. precision and recall), but that doesn’t mean it isn’t a useful refresher for professionals. I particularly enjoyed the sections on XML, RDF, and ontologies, as most of the coverage of these topics is either highly technical or very abstract. As the title suggests, it has a very corporate focus, so it doesn’t really cover scientific taxonomies or library classifications.

The chapters introduce the concept of findability; cover the basics of metadata and the different types of taxonomies; describe how to go about developing a taxonomy and performing a content audit; give general guidance on choosing terms and structures; and introduce some of the technical issues, covering XML, XSLT, RDF, and OWL, before summarising ontologies and folksonomies.

There are a few typos and a few slightly odd choices – for example, I found the use of “Googlewhacking” to illustrate “teleporting” confusing, and the descriptions of how to go about taxonomy work a little prescriptive. However, textbooks have to simplify the world in order to provide students with a starting point. Overall the book covers a good range of topics and concepts and is a light but informative read.

Online Information Conference – day two

Linked Data in Libraries

I stayed in the Linked Data track for day two of the Online Information Conference, very much enjoying Karen Coyle’s presentation on metadata standards – FRBR, FRSAR, FRAD, RDA – and Sarah Bartlett’s enthusiasm for using Linked Data to throw open bibliographic data to the world so that fascinating connections can be made. Sarah explained that while the physical sciences have been well mapped and a number of ontologies are available, far less work has been done in the humanities, and she encouraged humanities researchers to extend RDF and develop it.

In the world of literature the potential connections are infinite, and very little numerical analysis has been done by academics. For example, “intertextuality” is a key topic in literary criticism, and Linked Data that exposes the references one author makes to another can be analysed to show the patterns of influence a particular author had on others. (Google Ngrams is a step in this direction – part index, part concordance.)

She stressed that libraries and librarians have a duty of care to understand, curate, and manage ontologies as part of their professional role.

Karen and Sarah’s eagerness to make the world a better place by making sure that the thoughtfully curated and well-managed bibliographic data held by libraries is made available to all was especially poignant at a time when library services in the UK are being savaged.

The Swedish Union Catalogue is another library project that has benefited from a Linked Data approach. Concerned to give users more access to, and pathways into, the collections, Martin Malmsten asked whether APIs are enough. He stressed the appeal of just chucking the data out there in a quick and dirty form and making it as simple as possible for people to interact with it. However, he pointed out that licences need to be changed and updated, as copyright law designed for a print world is not always applicable to online content.

Martin pointed out that in a commercialised world, giving anything away seems crazy, but that allowing others to link to your data does not destroy your data. If provenance (parametadata) is kept and curated, you can distinguish between the metadata you assert about content and anything that anybody else asserts.
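
One common way to keep that parametadata straight is to store each party’s assertions in a named graph of its own. Here is a minimal sketch in Python using the rdflib library (the graph names and triples are invented for illustration):

    from rdflib import Dataset, Literal, Namespace, URIRef

    EX = Namespace("http://example.org/")
    ds = Dataset()

    # Our own catalogue assertions live in one named graph...
    ours = ds.graph(URIRef("http://example.org/graphs/our-catalogue"))
    ours.add((EX.book123, EX.title, Literal("Pippi Longstocking")))

    # ...and anybody else's assertions live in graphs of their own.
    theirs = ds.graph(URIRef("http://example.org/graphs/somebody-else"))
    theirs.add((EX.book123, EX.subject, Literal("Children's fiction")))

    # Every statement can still be traced back to whoever asserted it,
    # so linking adds data without destroying ours.
    for s, p, o, graph_name in ds.quads((None, None, None, None)):
        print(graph_name, s, p, o)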

During the panel discussion, provenance and traceability (parametadata) – which the W3C is now focusing on – came up again, with the observation that allowing other people to link to your data not only does no harm but often makes your data more valuable. The question of what the “killer app” for the semantic web might be was raised, as was the question of how we might create user interfaces that allow the kind of multiple-pathway browsing that can render multiple relationships and connections comprehensible to people. This could be something a bit like topic maps – but we probably need a 13-year-old who takes all this data for granted to have a clear vision of its potential!

Tackling Linked Data Challenges

The second session of day two was missing Georgi Kobilarov of Uberblic, who was caught up in the bad weather. However, the remaining speakers filled the time admirably.

Paul Nelson of Search Technologies pointed out that Google is not “free” to companies, which collectively spend billions on search engine optimisation (SEO) to present their data in the way that suits Google. Google is essentially providing a marketing service, so it is worth bearing in mind that its algorithms do not produce a neutral view of the available information resources, but a highly commercial view of the web.

John Sheridan described using Linked Data at the National Archives to open up documentation that previously had very little easily searchable metadata. Much of the documentation in the National Archives is structured – forms, lists, directories, and so on – which presents particular problems for free-text search but makes it a prime source for mashing up and querying.

Taxonomies, Metadata, and Semantics: Frameworks and Approaches

There were some sensible presentations on how to use taxonomies and ontologies to improve search results in the third session.

Tom Reamy of KAPS noted the end of the “religious fervour” about folksonomy that flourished a few years ago, now that people have realised that folksonomies have no way of getting better over time and offer little help to infrequent users of a system. They are still useful as a way of gaining insight into the kinds of search terms people use, and can be easier to analyse than search logs. A hybrid approach – a lightweight faceted taxonomy over the top of folksonomic tags – is proving more useful.

Taxonomies remain key in providing the structure on which autocategorisation and text analytics are based, so having a central taxonomy team that engages in regular and active dialogue with users is vital. Understanding the “basic concepts” (i.e. Lakoff and Rosch’s “basic categories”) that are the most familiar terms to the community of users is essential for constructing a helpful taxonomy. Labels should be as short and simple as possible, and chosen for their distinctiveness and expressiveness.

He also pointed out that adults and children have different learning strategies, which is worth remembering. I was also pleased to hear his clear and emphatic distinction between leisure and workplace search needs. It’s a personal bugbear of mine that people don’t realise that looking for a hairdresser in central London – where any one of a number will do – is not the same as trying to find a specific shot of a particular celebrity shortly after that controversial haircut a couple of years ago from the interview they gave about it on a chat show.

Tom highlighted four key functions for taxonomies:

  • knowledge organisation systems (for asset management)
  • labelling systems (for asset management)
  • navigation systems (for retrieval and discovery)
  • search systems (for retrieval)

He pointed out that text analytics needs taxonomy to underpin it, to base contextualisation rules on. He also stressed the importance of data quality, as data quality problems cause the majority of search project failures. People often focus on cool new features and fail to pay attention to the underlying data structures they need to put in place for effective searching.

He noted that the volumes of data and metadata that need to be processed are growing at a furious rate. He highlighted Comcast as a company that is very highly advanced in the search and data management arena, managing multiple streams of constantly updated data for an audience that expects instant and accurate information.

He stated that structure will remain the key to findability for the foreseeable future. Autonomy is often hailed as doing something different to other search engines because it uses statistical methods, but at heart it still relies on structure in the data.

Richard Padley made it through the snow despite a four-hour train journey from Brighton, and spoke at length about the importance of knowledge organisation to support search. He explained the differences between controlled vocabularies, indexes, taxonomies, and ontologies and how each performs a different function.

Marianne Lykke then talked about information architecture and persuasive design. She also referred to “basic categories” as well as the need to guide people to where you want them to go via simple and clear steps.

Taxonomies, Metadata, and Semantics in Action

I spoke in the final session of the day, on metadata lifecycles, asset lifecycles, parametadata, and managing data flows in complex information “ecosystems” with different “pace layers”.

Neil Blue from Biowisdom gave a fascinating and detailed overview of Biowisdom’s use of semantic technologies, in particular ontology-driven concept extraction. Biowisdom handle huge complex databases of information to do with the biological sciences and pharmaceuticals, so face very domain-specific issues, such as how to bridge the gap between “hard” scientific descriptions and “soft” descriptions of symptoms and side-effects typically given by patients.

In the final presentation of the day, Alessandro Pica outlined the use of semantic technologies by the Italian news agency AGI.

In the beginning was the word: the evolution of knowledge organisation

I was delighted to be introduced by Mark Davey to Leala Abbott on Monday. Leala is a smart and accomplished digital asset management consultant from the Metropolitan Museum of Art in New York, and we were discussing how difficult it is to explain what we do. I told her how I describe “the evolution of classification” to people, and she asked me to write it up here. So, this is my first blog post “by commission”.

word
In the beginning there was the word, then words (and eventually sentences).

list
Then people realised words could be very useful when they were grouped into lists (and eventually controlled vocabularies, keyword lists, tag lists, and folksonomies).

taxonomy
But then the lists started to get a bit long and unwieldy, so people broke them up into sections, or categories, and lo and behold – the first taxonomy.

faceted taxonomy
People then realised you could join related taxonomies together for richer information structuring and they made faceted taxonomies, labelling different aspects of a concept in the different facets.

ontology
Then people noticed that if you specified and defined the relationships between the facets (or terms and concepts), you could do useful things with those relationships too – something that becomes especially powerful when computers are used to analyse content – and so ontologies were devised.

Here is a very simple example of how these different KO systems work:

I need some fruit – I think in words – apples, pears, bananas. Already I have a shopping list and that serves its purpose as a reminder to me of things to buy (I don’t need to build a fruit ontology expressing the relationships between apples and other foodstuffs, for example).

When I get to the shop, I want to find my way around. The shop has handy signs – a big one says “Fresh fruit”, so I know which section of the shop to head for. When I get there, a smaller sign says “Apples” and even smaller ones tell me the different types of apples (Gala, Braeburn, Granny Smith…). The shop signs form a simple taxonomy, which is very useful for helping me find my way around.

When I get home, I want to know how to cook apple pie, so I get my recipe book, but I’m not sure whether to look under “Apples” or “Pies”. Luckily, the index includes Apples: Pies, Puddings and Desserts as well as Pies, Puddings and Desserts: Apples. The book’s index has used a faceted taxonomy, so I can find the recipe in either place, whichever one I look in first.

After dinner, I wonder about the history of apple pies, so I go online to a website about apples, where a lot of content about apple pies has been structured using ontologies. I can then search the site for “apple pie” and get suggestions for lots of articles related to apples and pies that I can browse through, based on the connections the people who built the ontology have put in place. For example, if the article date has been included, I can also ask more complex questions such as “give me all the articles on apple pies written before 1910”, and if the author’s nationality has been included, I can ask for all the articles on apple pies written before 1910 by US authors.
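
As a rough illustration of that last step, here is a minimal sketch in Python using the rdflib library (the vocabulary and the articles are invented for illustration) of how the “apple pie articles before 1910” question can be asked of ontology-structured content:

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, XSD

    EX = Namespace("http://example.org/")
    g = Graph()

    # Two invented articles, tagged with subject and date metadata.
    g.add((EX.article1, RDF.type, EX.Article))
    g.add((EX.article1, EX.subject, EX.ApplePie))
    g.add((EX.article1, EX.date, Literal("1905-06-01", datatype=XSD.date)))

    g.add((EX.article2, RDF.type, EX.Article))
    g.add((EX.article2, EX.subject, EX.ApplePie))
    g.add((EX.article2, EX.date, Literal("1972-03-15", datatype=XSD.date)))

    # "Give me all the articles on apple pies written before 1910."
    query = """
    PREFIX ex: <http://example.org/>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    SELECT ?article WHERE {
        ?article a ex:Article ;
                 ex:subject ex:ApplePie ;
                 ex:date ?date .
        FILTER (?date < "1910-01-01"^^xsd:date)
    }
    """
    for row in g.query(query):
        print(row.article)  # only ex:article1 matches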

People often ask me whether a taxonomy is better than a controlled vocabulary, or whether an ontology is the best of all, but the question doesn’t make sense out of context – it depends what you are trying to do. Ontologies are the most complex and sophisticated KO tools we have at the moment, but when I just want a few things from the shop, it’s a good old-fashioned list every time.

Financial sector ontologies

I went to a Semantic Web meetup event on Tuesday, where Mike Bennett of the EDM Council introduced an ontology he has been developing for managing financial sector information.

It is always reassuring to discover that people working in completely different industries are facing the same challenges. Handling multiple viewpoints, and the need to keep the provenance of terminology well defined and clear, were key themes, as terms like “equities” can mean very different things in different contexts. Mike defined his own “archetypes” and used an “upper ontology” as an umbrella to connect other ontologies. I was particularly interested in the discussion of solutions for managing synonyms, one of which involved a quite sophisticated use of RDF.

It was also interesting to hear Mike’s explanations of his use of taxonomies within the ontology and of the ingenious ways he finds to present his ideas to business people who don’t speak OWL!

Are you a semantic romantic?

The “semantic web” is an expression that has been used for long enough now that I for one feel I ought to know what it means, but it is hard to know where to start when so much about it is presented in “techspeak”. I am trying to understand it all in my own non-technical terms, so this post is aimed at “semantic wannabes” rather than “semantic aficionados”. It suggests some ways of starting to think about the semantic web and linked open data without worrying about the technicalities.

At a very basic level, the semantic web is something that information professionals have been doing for years. We know about using common formats so that information can be exchanged electronically – from SGML to HTML and then XML. In the 90s, publishers used “field codes” to identify subject areas so that articles could be held in databases and re-used in multiple publications. In the library world, metadata standards like MARC and Dublin Core were devised to make it easier to share cataloguing data. The semantic web essentially just extends these principles.

So, why all the hype?

There is money to be made and lost on semantic web projects, and investors always want to try to predict the future so they can back winning horses. The recent Pew Report (thanks to Brendan for the link) shows the huge variety of opinions about what the semantic web will become.

At one extreme, the semantic evangelists are hoping that we can create a highly sophisticated system that can make sense of our content by itself, with the familiar arguments that this will free humans from mundane tasks so that we can do more interesting things, be better informed and connected, and build a better and more intelligent world. They describe systems that “know” that when you book a holiday you need to get from your house to the airport, that you must remember to reschedule an appointment you made for that week, and that you need to send off your passport tomorrow to renew it in time. This is helpful and can seem spookily clever, but is no more mysterious than making sure my holiday booking system is connected to my diary. There are all sorts of commercial applications of such “convenience data management”, and plenty of ethical implications about privacy and data security too, but we have had those debates many times in the past.

A more business-focused example might be that a search engine will “realise” that when you search for “orange” you mean the mobile phone company, because it “knows” you are a market analyst working in telecoms. It will then work out that documents that contain the words “orange” and “fruit” are unlikely to be what you are after, and so won’t return them in search results. You will also be able to construct more complex questions, for example to query databases containing information on tantalum deposits and compare them with information about civil conflicts, to advise you on whether the price of mobile phone manufacture is likely to increase over the next five years.

Again, this sort of thing can sound almost magical, but is basically just compiling and comparing data from different data sets. This is familiar ground. The key difference is that for semantically tagged datasets much of the processing can be automated, so data crunching exercises that were simply too time-consuming to be worthwhile in the past become possible. The evangelists can make the semantic web project sound overwhelmingly revolutionary and utopian, especially when people start talking in sci-fi sounding phrases like “extended cognition” and “distributed intelligence”, but essentially this is the familiar territory of structuring content, adding metadata, and connecting databases. We have made the cost-benefit arguments for good quality metadata and efficient metadata management many times.

At the other extreme, the semantic web detractors claim that there is no point bothering with standardised metadata, because it is too difficult politically and practically to get people to co-operate and use common standards. In terms familiar to information professionals, you can’t get enough people to add enough good quality metadata to make the system work. Clay Shirky, in “Ontology is Overrated”, argued that there is no point in trying to establish commonality up front – it is just too expensive (there are no “tag police” to tidy up) – so you just have to let people tag randomly and then try to work out what they meant afterwards. This is a great way of harvesting cheap metadata, but it doesn’t help if you need to be sure you are getting a sensible answer to a question: it only takes one person to have mistagged something for your dataset to be polluted and your complex query to generate false results. Shirky himself declares that he is talking about the web as a whole, which is fun to think about, but how many of us (apart from Google) are actually engaged in trying to sort out the entire web? Most of us just want to sort out our own little corner.

I expect the semantic web to follow all other standardisation projects. There will always be a huge “non-semantic” web containing vast quantities of potentially useful information that can’t be accessed by semantic web systems, but that is no different from the situation today, where huge amounts of content can’t be found by search engines (the “invisible web” or “dark web”) – from proprietary databases to personal collections in unusual formats. No system has ever been able to include everything. No archive contains every jotting scrawled on a serviette, no bookshop stocks every photocopied fanzine, no telephone directory lists every phone number in existence. However, they contain enough to be useful for most people most of the time. No standard provides a perfect universal lingua franca, but common languages increase the number of people you can talk to easily. The adoption of XML is not universal, but for everyone who has “opted in” there are commercial benefits. Not everybody uses PDF files, but for many people they have saved hours previously spent converting and re-styling documents.

So, should I join in?

What you really need to ask is not “What is the future of the semantic web?” but “Is it worth my while joining in right now?”. The answer depends on your particular context and circumstances. It is much easier to think about a project, product, or set of services that is relevant to you than to worry about what everyone else is doing. If you can build a product quickly and cheaply using what is available now, it doesn’t really matter whether the semantic web succeeds in its current form or gets superseded by something else later.

I have made a start by asking myself very basic questions like:

  • What sort of content/data do we have?
  • How much is there?
  • What format is it in at the moment?
  • What proportion of that would we like to share (is it all public domain, do we have some that is commercially sensitive, but some that isn’t, are there data protection or rights restrictions)?

If you have a lot of data in well-structured and open formats (e.g. XML), there is a good chance it will be fairly straightforward to link your own data sets to each other, and link your data to external data. If there are commercial and legal reasons why the data can’t be made public, it may still be worth using semantic web principles, but you might be limited to working with a small data set of your own that you can keep within a “walled garden” – whether or not this is a good idea is another story for another post.

A more creative approach is to ask questions like:

  • What content/data services are we seeking to provide?
  • Who are our key customers/consumers/clients and what could we offer them that we don’t offer now?
  • What new products or services would they like to see?
  • What other sources of information do they access (users usually have good suggestions for connections that wouldn’t occur to us)?

Some more concrete questions would be ones like:

  • What information could be presented on a map?
  • How can marketing data be connected to web usage statistics?
  • Where could we usefully add legacy content to new webpages?

It is also worth investigating what others are already providing:

  • What content/data out there is accessible? (e.g. recently released UK government data)
  • Could any of it work with our content/data?
  • Whose data would it be really interesting to have access to?
  • Who are we already working with who might be willing to share data (even if we aren’t sure yet what sort of joint products/projects we could devise)?

It’s not as scary as it seems

Don’t be put off by talk about RDF, OWL, and SPARQL, how to construct an ontology, and whether or not you need a triple store. The first questions to ask are familiar ones like who you would like to work with, what could you create if you could get your hands on their content, and what new creations might arise if you let them share yours? Once you can see the semantic web in terms of specific projects that make sense for your organisation, you can call on the technical teams to work out the details. What I have found is that the technical teams are desperate to get their hands on high quality structured content – our content – and are more than happy to sort out the practicalities. As content creators and custodians, we are the ones that understand our content and how it works, so we are the ones who ought to be seizing the initiative and starting to be imaginative about what we can create if we link our data.

A bit of further reading:
Linked Data.org
Linked Data is Blooming: Why You Should Care
What can Data.gov.uk do for me?

Using taxonomies to support ontologies

What is an ontology?
Ontologies are emerging from the techie background into the knowledge organisation foreground and – as usually happens – being touted as the new panacea to solve all problems from content management to curing headaches. As with any tool, there are circumstances where they work brilliantly and some where they aren’t right for the job.

Basically, an ontology is a knowledge model (like a taxonomy or a flow chart) that describes relationships between things. The main difference between ontologies and taxonomies is that taxonomies are restricted to broader and narrower relationships whereas ontologies can hold any kind of relationship you give them.

One way of thinking about this is to see taxonomies as vertical navigation and ontologies as horizontal. In practice, they usually work together. When you add cross references to a taxonomy, you are adding horizontal pathways and effectively specifying ontological rather than taxonomical relationships.
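
In RDF terms, the contrast might look something like this minimal sketch in Python with rdflib (the usedIn relationship and the URIs are invented for illustration): SKOS broader/narrower links supply the vertical pathways, while a relationship type of our own supplies a horizontal one.

    from rdflib import Graph, Namespace
    from rdflib.namespace import SKOS

    EX = Namespace("http://example.org/")
    g = Graph()

    # Vertical, taxonomical pathways: broader/narrower only.
    g.add((EX.Gala, SKOS.broader, EX.Apples))
    g.add((EX.Apples, SKOS.broader, EX.Fruit))

    # A horizontal, ontological pathway: a cross reference expressed as
    # a relationship type we have defined ourselves.
    g.add((EX.Apples, EX.usedIn, EX.ApplePie))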

The flexibility in the types of relationship that can be defined is what gives ontologies their strength, but it is also their weakness: they are difficult to build well and can be time-consuming to manage, because there are infinite relationships you could specify, and if you are not careful you will specify ones that keep changing. Ontologies can answer far more questions than taxonomies, but if the questions you wish to ask can be answered by a taxonomy, you may find a taxonomy simpler and easier to handle.

What are the differences between taxonomies and ontologies?
A good rule of thumb is to think of taxonomies as being about narrowing down, refining, and zooming in on precise pieces of information, and ontologies as being about broadening out, aggregating, and linking information. So a typical combination would be to use ontologies to aggregate content, with taxonomies overlaid to help people drill down through the mass of content you have pulled together.

Ontologies can also be used as links to join taxonomies together. So, if you have a taxonomy of regions, towns, and villages, and a taxonomy of birds and their habitats, you could use an ontological relationship of “lives in” to show which birds live in which places. By using a taxonomy to support the ontology, you don’t have to define a relationship between every village and the birds that live there; you can link the birds’ habitats to regions via the ontology, and the taxonomy will do the work of including all the relevant villages under each region.
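
Here is a minimal sketch of that division of labour, again in Python with rdflib (the place names, the bird, and the livesIn predicate are all invented): the ontology links a bird to a region once, and a SPARQL property path walks the place taxonomy so that every village under that region is covered.

    from rdflib import Graph, Namespace
    from rdflib.namespace import SKOS

    EX = Namespace("http://example.org/")
    g = Graph()

    # The place taxonomy does the vertical work: village -> region.
    g.add((EX.LittleWittering, SKOS.broader, EX.EastAnglia))

    # The ontology only needs one "lives in" link, at region level.
    g.add((EX.Bittern, EX.livesIn, EX.EastAnglia))

    # "Which birds live around Little Wittering?" The skos:broader* path
    # walks up from the village, so the region-level link is found.
    query = """
    PREFIX ex: <http://example.org/>
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    SELECT ?bird WHERE {
        ex:LittleWittering skos:broader* ?place .
        ?bird ex:livesIn ?place .
    }
    """
    for row in g.query(query):
        print(row.bird)  # ex:Bittern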

Programmers love ontologies, because they can envisage a world where all sorts of relationships between pieces of content can be described and these diverse relationships can be used to produce lots of interesting collections of content that can’t easily be brought together otherwise. However, they leave it to other people to provide the content and metadata. Specifying all those relationships can be complicated and time-consuming so it is important to work out in advance what you want to link up and why. A good place to start is to choose a focal point of the network of relationships you need. For example, there are numerous ways you could gather content about films. You could focus on the actors so you can bring together the films they have appeared in to create content collections describing their careers, or focus on genres and release dates to create histories of stylistic developments, or you could link films that are adaptations of books to copies of those books. The choices you make determine the metadata you will need.

Know your metadata
At the moment, in practice, ontologies are typically built to string together pre-existing metadata that was collected for navigational or archival taxonomies, simply because that metadata already exists to be harvested. There is a danger in this approach that you end up making connections just because you can, not because they are useful to anybody. As with all metadata-based automated systems, you also need to be careful of the “garbage in, garbage out” problem. If the metadata you are harvesting was created for a different purpose, you need to make sure you do not build false assumptions about its meaning or quality into your ontology – for example, if genre metadata was created according to the department the commissioning editor worked for, rather than describing the content of the actual programme. That may not have been a problem when the genre metadata was used only by audience research to gather ratings information, but it does not translate properly when you want to use it in an ontology for content-defining purposes.

Feeding your ontology with accurate and clearly defined taxonomies is likely to give you better results than using whatever metadata happens to be lying about. Well-defined provenance metadata – parametadata – about your taxonomies and ontologies is becoming more and more valuable, so that you can understand what your metadata sets were built for, when they were last updated, and who manages them.

Why choose one when you can have both?
Ontologies are very powerful. They perform different work to taxonomies, but ontologies and taxonomies can support and enhance each other. Don’t throw away your taxonomies just because you are moving into the ontology space. Ontologies can be (they aren’t always – see Steve’s comment below) big, tricky, and complicated, so use your taxonomies to support them.

New method for building multilingual ontologies

“New method for building multilingual ontologies” appeared on AlphaGalileo.Org, the Internet-based news centre for European science, engineering, and technology. Researchers at the Universidad Politécnica de Madrid’s School of Computing (FIUPM) claim to have created a language-independent ontology-building tool. I think it will work very well for consistent, well-structured information – for example in catalogues and directories – but it seems to me that it is essentially an “auto-indexer” that only really works if you control linguistic forms, and perhaps even vocabulary, very tightly. That’s great – and means plenty of work for editors making sure everything is neat, tidy, and consistent to suit the system – but isn’t it going to be an awful lot of work? Or am I massively missing the point?

Applying Turing’s Ideas to Search

“Applying Turing’s Ideas to Search”, on Boxes and Arrows, applies the Turing test to the problem of understanding searches in order to provide better results. Ferrara suggests we need to revisit the parsing approach (moving on from the pattern-matching paradigm) and to develop “social ontologies” in order to get better search results. The “social ontologies” are – if I have understood correctly – wikis of relationships that can then be accessed by search engines to make semantic inferences. The ontologies would have to be socially constructed, as there is just too much information out there to put it all together any other way. It struck me that this is a bit like what SKOS is essentially hoping to do. Once upon a time I wanted to build a fully linked thesaurus of the English language, where every word was linked to every related word, so you could navigate through the entire language, following pathways of meaning, with no word left out. People thought it was a daft idea, but compared with trying to build ontologies of everything, it doesn’t seem so crazy. Just shows how times have changed!
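
For what it’s worth, that fully linked thesaurus maps quite naturally onto SKOS. A minimal sketch in Python with rdflib (the words and links are invented for illustration): each word becomes a skos:Concept, pathways of meaning become skos:related links, and navigating the language is just a walk along those links.

    from collections import deque

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, SKOS

    EX = Namespace("http://example.org/words/")
    g = Graph()

    # A few words as SKOS concepts, joined by pathways of meaning.
    for word in ("apple", "fruit", "orchard", "harvest"):
        concept = EX[word]
        g.add((concept, RDF.type, SKOS.Concept))
        g.add((concept, SKOS.prefLabel, Literal(word, lang="en")))

    g.add((EX.apple, SKOS.related, EX.fruit))
    g.add((EX.fruit, SKOS.related, EX.orchard))
    g.add((EX.orchard, SKOS.related, EX.harvest))

    # Follow pathways of meaning outward from a starting word
    # (only the stored direction of skos:related is followed here).
    def pathways(graph, start):
        seen, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            yield node
            for _, _, nxt in graph.triples((node, SKOS.related, None)):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)

    for concept in pathways(g, EX.apple):
        print(concept)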