There aren’t many books on taxonomies, so it is good to have another on the shelf. Darin L Stewart’s book is based on a series of lectures and provides a good introduction to key topics. As a format, that means you can pick the sections that are relevant to you. It has a very American student textbook tone, with pop quotes and definitions of key concepts in information science (e.g. precision and recall), but that doesn’t mean it isn’t a useful refresher for professionals. I particularly enjoyed the sections on XML, RDF, and ontologies, as most coverage of these topics elsewhere is either highly technical or very abstract. As the title suggests, it has a very corporate focus and so doesn’t really cover scientific taxonomies or library classifications.
The chapters introduce the concept of findability and cover the basics of metadata, types of taxonomies, how to go about developing a taxonomy and performing a content audit, and general guidance on choosing terms and structures. They also address some of the technical issues, introducing XML, XSLT, RDF, and OWL, and summarise ontologies and folksonomies.
I found a few typos and a few places that were slightly odd – for example, I found the use of “Google whacking” to illustrate “teleporting” confusing, and the descriptions of how to go about taxonomy work a little prescriptive. However, textbooks have to simplify the world in order to provide students with a starting point. Overall the book covers a good range of topics and concepts and is a light but informative read.
Public knowledge, private ignorance by Patrick Wilson is one of those fascinating books that reads as if it had been written yesterday, but in fact was written in 1977. In what struck me as a very contemporary theme, he discusses the importance of personal contacts and trusted authorities as sources of knowledge – a theme that has returned with a vengeance in the form of “social search” and leveraging social networks for recommendations etc. A wonderful example of this was given to me by a friend last night, who told me how their archives division suddenly gained recognition when the new media lot realised they needed metadata for their rapidly growing digital repositories. The new media folks were in a panic until they talked to the librarians and archivists and realised that the archivists had already worked this out – and had been assiduously cataloguing all the digital assets. Without that personal contact, the new media folk might have ended up building their own catalogue, and duplicating all that work! It’s a sadly familiar story even in these days of information abundance, when you’d think such communication would be easy, but the problem of how to make sure “public” knowledge is actually used is nothing new. Wilson quotes Lord Rayleigh at a meeting of the British Association for the Advancement of Science in 1884 (yes – eighteen eighty four – it’s not a typo) as noting how much scientific knowledge was published but unread, saying “It is often forgotten that the rediscovery in the library may be a more difficult and uncertain process than the first discovery in the laboratory”. Wilson points out that “knowledge existing only ‘in the literature’ is no different from knowledge possessed by undiscoverable or inaccessible individuals”.
Wilson also claims that “where there is knowledge, there must be a knower”, which struck me as a challenge to the notion that publishing alone – whether it be tweets or academic research papers – is enough; it is only half the battle. This reinforces for me the fundamental importance of findability and of serving the needs of the user, both within individual publications and across the whole of our “public” digital repositories. Again, this is nothing new. Way back in the 19th century Charles Cutter was anxious to serve library users better, Grace Kelley (disambiguated from the other one by the extra “e”!) in 1937 was what we would now refer to as a “usability evangelist”, and both Ranganathan (1959) and Bliss (1935) had a passion for getting the right information out to the right people.
As computer science took over much of information retrieval from the 1970s onwards, and commercial products came to be heavily marketed, I worry that this sort of passion has been lost in a blur of what you can get an algorithm to do, rather than what people actually need. As Wilson says, “we do not make knowledge available simply by making available documents in which knowledge is represented”. Google is wonderful, but that is, essentially, all it does. Of course the more sophisticated programmatic tools we have the better, but as information providers we should never be afraid to say “this gets us so far, but not far enough”. We need to keep reminding everyone that it is the minds of the knowers and potential knowers we need to be serving, and so we should not be afraid to keep demanding ever more sophisticated systems that are mixed, variable, and downright difficult to automate.