Tag Archives: search

Applying Turing’s Ideas to Search

The Boxes and Arrows article “Applying Turing’s Ideas to Search” applies the Turing test to the problem of understanding searches in order to provide better results. Ferrara suggests we need to revisit the parsing approach (moving on from the pattern-matching paradigm) and to develop “social ontologies” in order to get better search results. The “social ontologies” are, if I have understood correctly, wikis of relationships that search engines can then access to make semantic inferences. The ontologies would have to be socially constructed, as there is just too much information out there to put it all together any other way. It struck me that this is a bit like what SKOS is essentially hoping to do. Once upon a time I wanted to build a fully linked thesaurus of the English language where every word was linked to every related word, so you could navigate through the entire language, following pathways of meaning, with no word left out. People thought it was a daft idea, but compared with trying to build ontologies of everything, it doesn’t seem so crazy. It just shows how times have changed!
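
The fully linked thesaurus idea can be pictured as a graph in which words are nodes and "related word" links are edges. What follows is only a minimal sketch under stated assumptions: the `THESAURUS` entries and the `pathway` function are hypothetical, invented for illustration, and are not taken from Ferrara's article or from SKOS. It uses a breadth-first search to find the shortest "pathway of meaning" between two words.

```python
from collections import deque

# Hypothetical toy thesaurus: each word maps to the words it is linked to.
# A real thesaurus would be far larger and would distinguish link types.
THESAURUS = {
    "search": ["find", "query"],
    "find": ["search", "discover"],
    "query": ["search", "question"],
    "discover": ["find", "learn"],
    "question": ["query"],
    "learn": ["discover", "knowledge"],
    "knowledge": ["learn"],
}


def pathway(thesaurus, start, goal):
    """Return the shortest chain of related words from start to goal, or None."""
    if start not in thesaurus or goal not in thesaurus:
        return None
    queue = deque([[start]])  # each queue entry is a path of words so far
    seen = {start}
    while queue:
        path = queue.popleft()
        word = path[-1]
        if word == goal:
            return path
        for related in thesaurus.get(word, []):
            if related not in seen:
                seen.add(related)
                queue.append(path + [related])
    return None  # the two words are not connected


print(pathway(THESAURUS, "search", "knowledge"))
# ['search', 'find', 'discover', 'learn', 'knowledge']
```

A word with no route to the goal simply returns `None`, which is exactly the "word left out" case the linked thesaurus was meant to avoid.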

Information Retrieval

The ISKO event at UCL on Thursday was fascinating. It was a real treat to hear the eminent Brian Vickery summarise the last 75 years of information retrieval developments, setting out the key questions to be answered and the challenges still to be overcome. At 90 years old he has a unique overview, having been a key member of the Classification Research Group and director of SLAIS. He pointed out that most retrieval systems have a particular user community in mind and that this affects the choice of information collected as well as the way the collection is structured. He also argued that being accepted as part of a specialist community involves use of the specialist terminology. I am very interested in the reverse of this: that lack of access to the “right” terminology is exclusionary. It’s all about shibboleths! He said that the key questions at the moment include whether the cost and effort of building expensive retrieval systems like taxonomies are justified, whether the need for harmonisation is increasing, what the future is for general ontologies, and what needs to be done to improve statistical retrieval systems.

Stephen Robertson from Microsoft Research, who developed search algorithms that still power most of the big search engines today, talked about the TREC competition, which has almost always been won by statistically based searches. He drew a distinction between general purpose search and specialised search for highly specific contexts – such as individual organisations – adding that in general specialist search is lagging behind. He also said that we need to find ways of feeding other sources of knowledge – such as taxonomies – into statistical searching because only by yoking the power of both will we get marked improvements.
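
One simple way to picture taxonomy knowledge feeding into statistical search is query expansion: terms related to the query in the taxonomy are added to it at a reduced weight before documents are scored. The sketch below is an illustration only, not the method Robertson described or anything used in TREC. The `TAXONOMY`, the `DOCS`, the `score` function and the `expansion_weight` parameter are all hypothetical, and the scoring is a bare term-frequency count rather than a real ranking function.

```python
from collections import Counter

# Hypothetical taxonomy: a term mapped to its narrower or related terms.
TAXONOMY = {"vehicle": ["car", "bicycle"]}

# Hypothetical document collection.
DOCS = {
    "d1": "the car was parked outside",
    "d2": "a vehicle registration form",
    "d3": "notes on gardening",
}


def score(query, docs, taxonomy, expansion_weight=0.5):
    """Rank documents by weighted term frequency, expanding the query via the taxonomy."""
    weights = {}
    for term in query.lower().split():
        weights[term] = 1.0  # terms the user typed count in full
        for related in taxonomy.get(term, []):
            # Taxonomy terms count for less and never override a typed term.
            weights.setdefault(related, expansion_weight)
    scores = {}
    for doc_id, text in docs.items():
        counts = Counter(text.lower().split())
        total = sum(weight * counts[term] for term, weight in weights.items())
        if total > 0:
            scores[doc_id] = total
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


print(score("vehicle", DOCS, TAXONOMY))  # [('d2', 1.0), ('d1', 0.5)]
print(score("vehicle", DOCS, {}))        # [('d2', 1.0)]
```

With the taxonomy, the document that only mentions "car" is retrieved for the query "vehicle", but it ranks below the document containing the typed term itself.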

Ian Rowlands then talked about the much-publicised JISC survey on the “Google generation”, concluding that they are much the same as other generations. In all age groups about 20% are expert users of technology and 20% are technophobes, with everyone else muddling along in the middle. The JISC project team observed that some people spend a long time looking at online navigation systems, sometimes without accessing any articles at all. It is hard to know whether this counts as success or failure. I can think of scenarios either way: often I just want to know what’s there and will return later, and sometimes it means I can rule out a source as useless (which might be a good thing if it has saved me the time of reading through irrelevant articles, or a bad thing if it means I can’t find what I need).

There was then a very interesting discussion in which people expressed concerns about information overload and the way that students find it hard to distinguish between authoritative and trivial sources. Ian lamented the fact that online you don’t have the visual clues that you had in physical libraries, where big chunky leather-bound books have an obvious “weight” and authority. Personally, I wonder how much this has been driven by the desire of publishers and teachers to make educational resources “fun”. If all your textbooks look like adverts and all your online learning resources look like pop videos, how are you going to learn which is which? It is perfectly possible to have an authoritative online style, and publishers will produce it if that is what sells best. Throughout my career I have urged “authoritativeness” in design and been told by marketing departments that it isn’t what parents, teachers and kids want, and that they’ll only buy it if it looks flashy and fluffy! Another issue is the lack of a canon in a post-modern world, but that’s another story!

Here’s a post on the event on Madi Solomon’s Taxonomy Society blog.

Just read the headlines

At UCL David Nicholas gave an interesting lecture on The Virtual Scholar. He pointed out that people are apparently now happy to make major life decisions based on the results they get from typing 1.8 words into Google (the average for a search).

The information society is not much good if no-one is actually gaining any real knowledge. Do diminishing attention spans mean that we are happy just to flick through a few headlines that confirm our existing prejudices? There’s an Eddie Izzard joke about being “thinly read”, but don’t we still need some people to read more than just the summary? I’m as guilty as everyone else of saving a PDF to read later, while feeling that I have somehow become better informed in the process.