Information Retrieval

The ISKO event at UCL on Thursday was fascinating. It was a real treat to hear the eminent Brian Vickery summarise the last 75 years of information retrieval developments, setting out the key questions to be answered and the challenges still to be overcome. At 90 years old he has a unique overview, having been a key member of the Classification Research Group and director of SLAIS. He pointed out that most retrieval systems have a particular user community in mind and that this affects the choice of information collected as well as the way the collection is structured. He also argued that being accepted as part of a specialist community involves use of the specialist terminology. I am very interested in the reverse of this: that lack of access to the “right” terminology is exclusionary. It’s all about shibboleths! He said that the key questions at the moment include whether the cost and effort of building expensive retrieval systems like taxonomies are justified, whether the need for harmonisation is increasing, what the future holds for general ontologies, and what needs to be done to improve statistical retrieval systems.

Stephen Robertson from Microsoft Research, who developed search algorithms that still power most of the big search engines today, talked about the TREC competition, which has almost always been won by statistically based searches. He drew a distinction between general purpose search and specialised search for highly specific contexts – such as individual organisations – adding that in general specialist search is lagging behind. He also said that we need to find ways of feeding other sources of knowledge – such as taxonomies – into statistical searching because only by yoking the power of both will we get marked improvements.

Ian Rowlands then talked about the much-publicised JISC survey on the “Google generation”, concluding that they are much the same as other generations. In all age groups about 20% are expert users of technology and 20% technophobes, with everyone else muddling along in the middle. The JISC project team observed that some people spend a long time looking at online navigation systems, sometimes without accessing any articles at all. It is hard to know whether this counts as success or failure. I can think of scenarios either way: often I just want to know what’s there and will return later; sometimes it means I can rule out a source as useless (which might be a good thing if it has saved me the time of reading through irrelevant articles, or a bad thing if it means I can’t find what I need).

There was then a very interesting discussion in which people expressed concerns about information overload and the way that students find it hard to distinguish between authoritative and trivial sources. Ian lamented the fact that online you don’t have the visual clues that you had in physical libraries, where big chunky leather-bound books have an obvious “weight” and authority. Personally, I wonder how much this has been driven by the desire of publishers and teachers to make educational resources “fun”. If all your textbooks look like adverts and all your online learning resources look like pop videos, how are you going to learn which is which? It is perfectly possible to have an authoritative online style, and publishers will produce it if that is what sells best. Throughout my career I have urged “authoritativeness” in design and been told by marketing departments that it isn’t what parents, teachers and kids want: they’ll only buy it if it looks flashy and fluffy! Another issue is the lack of a canon in a post-modern world, but that’s another story!

Here’s a post on the event on Madi Solomon’s Taxonomy Society blog.

Language and Social Identity

Language and Social Identity is a collection of fascinating sociolinguistic papers. Dealing with gender and ethnicity, the researchers seek to show how stereotypes often arise from simple linguistic misunderstandings. For example, one paper argues that speakers of Indian English tend to use pronouns, conjunctions, and intonation very differently to speakers of UK English. UK speakers typically fail to pick up on the Indian English speakers’ cues and assume that what they are saying is confused or incoherent. Conversely, Indian English speakers think the UK English speakers must be either daft or extremely patronising because of their apparent failure to understand very simple logic. Another paper claims that men and women typically use utterances like “mm hmm” to mean different things. Women mean simply “I’m listening”, whereas men mean emphatically “I agree”. Men then think that women keep changing their minds and women think men just aren’t listening!

The most relevant paper from a taxonomic point of view was one on the highly charged political nature of language use in Montreal. The need to cut across language differences and negotiate norms of communication when diverse groups feel they have something to lose through compromise mirrors the inter-departmental language mediation that usually needs to happen in taxonomy projects.

SAGE journals free trials

SAGE will be running a free trial to its entire portfolio of Information Science journals throughout July and August. To sign up (for access to journals such as the IFLA Journal, Journal of Information Science and Information Development) go to (from the 1st of July). Alternatively email to be informed when the trial goes live.

Sorting Things Out

Sorting Things Out: Classification and its Consequences is a joy of a book, crammed with research and insights. It is very well written but is aimed at a serious academic audience, so it is pretty dense and packed with references. Bowker and Star examine in depth the development of the International Classification of Causes of Death, going back to 17th century archives and considering how something as apparently obvious and clear-cut as death is in fact mired in political, religious, and economic biases. They go on to discuss the treatment of TB patients and the development of the Nursing Interventions Classification, both of which would appear to be “objectively measurable” but are revealed to be complex intertwinings of various pressures. They then assess South Africa’s system of apartheid from the point of view of classification, showing how the arbitrary categorisation of people added to the brutality and cruelty of the regime. The book is not just a stark warning of how dominant regimes can use classification as a tool of oppression, but is also an important investigation of the power plays involved in all categorisations.

Essential Classification

Here’s a review of Essential Classification by Vanda Broughton, a core Library Studies textbook and a very easy read. It’s a sound introduction to classification, very practical and really aimed at trainee librarians, but it included enough background and theory to keep me interested, including some pointers to the biases in the big classification systems. I was also intrigued by the assertion that people find it easier to remember numbers, so numerical shelfmarks are generally more popular than those based on letters. I always thought it was easier to remember letters, because you can make them into little phrases, but perhaps that’s just me!