Category Archives: KO

KO

Vocab Control

Having spent years working as an editor fussing over consistency of style and orthography, I shouldn’t have been as surprised as I was to find my tags on even this little blog site, written solely by me, had already become a mess. It didn’t take too long to tidy them up, but there are only a handful of articles here so far.

I worked with some extremely clever people in my first “proper” job back in the 90s, and we used to have a “90%” rule regarding algorithm-based language processing (we mostly processed very well-structured text). However brilliant your program, you’d always have 10% of nonsense left over at the end that you needed to sort out by hand – mainly due to the vagaries of natural language and general human inconsistency. I’m no expert on natural language processing, but I get the impression that a lot of people still think 90% is really rather good. Certainly auto-classification software seems to run at a much lower success rate, even after manual training. It strikes me that there’s a parallel between folksonomies and this sort of software. Both process a lot of information cheaply, which makes possible processing on a scale that just couldn’t be done before, but you still need someone to tidy up around the edges if you want top quality.

I think the future of folksonomies depends on how this tidying-up process develops. There are various things happening to improve quality – like auto-complete predictive text. Google’s tag game is another approach, and ravelry.com use gentle human “shepherding” of taggers, personally suggesting tags and orthography (thanks to Elizabeth for pointing this one out to me).
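
To make the auto-complete idea a little more concrete, here is a minimal sketch in Python of how a tagging tool might tidy up variant spellings and then suggest existing tags as someone types. It is only an illustration under my own assumptions: the function names `normalise` and `suggest` and the sample tags are hypothetical, and it is not how any of the services mentioned above actually work.

```python
import re
from collections import Counter


def normalise(tag: str) -> str:
    """Lower-case a tag and collapse runs of spaces, hyphens and underscores into one space."""
    return re.sub(r"[\s_-]+", " ", tag.strip().lower())


def suggest(prefix: str, known_tags: list[str], limit: int = 5) -> list[str]:
    """Return the most frequently used known tags that start with the typed prefix."""
    counts = Counter(normalise(t) for t in known_tags)
    wanted = normalise(prefix)
    matches = [t for t in counts if t.startswith(wanted)]
    # Most-used tags first; ties broken alphabetically so the order is stable.
    matches.sort(key=lambda t: (-counts[t], t))
    return matches[:limit]


# Hypothetical sample of messy tags, purely for illustration.
tags = ["Taxonomy", "taxonomy", "Folksonomy", "folk-sonomy", "taxonomies", "KO"]
print(suggest("tax", tags))
```

Even this toy version shows where the leftover hand work comes from: “folksonomy” and “folk-sonomy” still end up as two different tags, so someone has to decide which one wins.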

I would really like to get hold of some percentages. If 75% is a decent showing for off-the-peg auto-categorisation/classification software, and we could get up to 90% with bespoke algorithms processing structured text, what percentages can you expect from a folksonomic approach?

Now keyword search is dead…

I can’t help thinking the information world has become very morbid. There was Green Chameleon’s Dead KM Walking debate, CMS Watch’s Taxonomies are dead punt, and now keyword search is dead, according to the Enterprise Search Center (via Taxonomy Watch).

Stephen Arnold says “Established system vendors and newcomers promise silver bullets that will kill the werewolves plaguing enterprise search. Taxonomies resonate in some vendors’ marketing spiels. Others focus on natural language processing…” This makes taxonomies sound like they are some newfangled techie trick, rather than the traditional sorting out we’re all used to. He then states that users expect “a search system to … Offer a web page that gives users specific suggestions and options with hotlinks to topics, categories, and key subjects … provide the user with point-and-click options … Allow the user to drill down or jump across topics.” Are those not taxonomies for navigation?

Science as Social Knowledge

I thoroughly enjoyed Science as Social Knowledge by the US philosopher Helen Longino. It was recommended to me by Judith Simon, a very smart researcher I met at the ISKO conference in Montreal last summer. She researches trust and social software and suggested that Longino’s analysis of objectivity would be helpful to me. It took me a while to get settled with the book, but I recognised an essentially Wittgensteinian take on the notion of shared meaning. Longino works this into a set of principles for establishing degrees of objectivity in scientific enquiry. If I have grasped it all correctly, she basically says that although there is no such thing as “ideal” objectivity – a one true perspective up in the sky – we do not have to collapse into an “anything goes” relativism. We can accept that background assumptions can be challenged and change, and embed the notion of challenge and criticism into the heart of scientific enquiry itself. That establishes a self-regulating system that is more or less objective, depending on how open it is to criticism and how responsive it is to legitimate challenges. Objectivity arises out of the process of consensus-building in an open, reflective, and self-challenging community.

Applying this to taxonomy work appears to mean that the process of taxonomy building can be more or less objective, depending on how open the process is to the community and to adapting to legitimate challenges or complaints. This seems to be very much like the practical advice offered by taxonomists expressed in terms of “get user buy-in”, “consult all stakeholders”, “ensure that you consider all relevant viewpoints”, or “ensure that you have regular reviews and updates”, so it’s reassuring to know we are basically epistemologically valid in our methods!

Organising Knowledge » What Are We?

I’ve been mulling over what to say about CMS Watch’s “Taxonomies are Dead” teaser, but defer to Patrick Lambe of Green Chameleon, who has written a very good post in response: Organising Knowledge » What Are We?.

One thought of my own is that there seems to be increasing differentiation between taxonomy creators and implementers (which I take as a sign that taxonomies are thriving rather than dying). I’ve always been on the content side of things, so I see knowledge organisation as primary, and the technology you use as secondary. However, more and more it seems to be the case that people understand the word “taxonomist” to mean someone who is a sort of SharePoint sysadmin.

Taxonomy and Records Management

Taxonomy and Records Management « Not Otherwise Categorized… is a blog post I wish I’d read a year ago when studying a records management module for my Masters. A lot of people seemed to think it was strange that I had chosen the RM option and I couldn’t understand why the records managers didn’t talk more about taxonomy. Of course, taxonomists often work on records management systems in one form or another, and are happy to discuss the differences between taxonomy as file plan, taxonomy for RM, taxonomy as classification, taxonomy for navigation, and so on.

I think it shows that there is really very little widespread understanding of what a taxonomy is. People assume it is something mysterious and technical in the heart of whichever system they encountered one in first and don’t realise that taxonomies crop up all over the place. It’s not even very easy to find an “official” definition.

Alan Gilchrist and Barry Mahon in Information Architecture: Designing Information Environments for Purpose say “TFPL takes the view that a ‘corporate taxonomy’ can be viewed as an enterprise-wide master file of the vocabularies and their structures, used or for use, across the enterprise, and from which specific tools may be derived for various purposes, of which navigation and search support are the most prominent.”

Patrick Lambe in Organising Knowledge: Taxonomies, Knowledge and Organisational Effectiveness describes taxonomies as taking many forms, including “lists, trees, hierarchies, polyhierarchies, matrices, facets, system maps” and Vanda Broughton in Essential Classification points out that taxonomy is now often taken to mean “any vaguely structured set of terms in a subject area”.

Settling on a single, popular definition of taxonomy might help promote taxonomists and taxonomy work, but as taxonomies need to do so much in so many different contexts, there just might not be a simple definition that works. Perhaps we need a taxonomy of taxonomies!

The Social Life of Information

The Social Life of Information by John Seely Brown and Paul Duguid is an info classic. It’s one of those delightful books that manages to be very erudite, cover a huge range of theory, but reads effortlessly and even had me laughing out loud from time to time. (My favourite anecdote was that BT’s answer to homeworkers’ sense of isolation was to pipe a soundtrack of canned background noise and chatter into their offices!)

Essentially, the book argues that information and information technology cannot be separated from their social context, and that ignoring the human factors in technology adoption and use leads to fundamental misunderstandings of what technology can and does do. This may mean overestimating the potential of information technology to change pre-existing institutions and practices, on both a personal and collective scale, and underestimating the ability of people to adapt technology to suit their ends rather than those envisaged by the technologists.

The authors argue that many “infoenthusiasts” miss subtleties of communication, such as the implicit social negotiations that take place in face-to-face conversations or the social meanings conveyed by a document printed on high quality paper or a book with expensive leather binding. Such nuances are easily lost when the words from such communications are removed from their original context and placed in a new environment – such as an electronic database.

Similarly, although personalisation is often touted as a great advance – you can have your own uniquely customised version of a website or a newspaper – such personalisation diminishes the power of the information source to act as a binding-point for a community. If we all have different versions of the newspaper, then we can’t assume we share common knowledge of the same stories. We then have to put additional work into reconnecting and recreating our knowledge communities, so the benefits of personalisation do not come without costs.

The importance of negotiation, collaboration, and improvisation is argued to be highly significant but extremely hard to build into automated systems. The social nature of language and the complexities of learning how to be a member of a community of practice, including knowing when to break or bend rules, are also essential to how human beings operate but extremely difficult to replicate in technological systems.

The theme of balance runs throughout the book – for example between the need to control processes while allowing freedom for innovation in companies or between the need for communication amongst companies and the need to protect intellectual property (knowledge in companies was often either seen as too “sticky” – hard to transfer and use – or too “leaky” – flowing too easily to competitors). At an institutional level, balance is needed between the importance of stability for building trust and openness to evolution (the perception of the value of a degree is bound up with the established reputation of an educational institution).

I found this very interesting, as my brother has been trying to persuade me that Daoism, with its emphasis on things moving gradually from one state to another, is a more productive way of looking at complex systems than the Aristotelian view that something can be in one category, or its opposite, but never both at once. (Here is a sisterly plug for an article he has written on the application of Daoist ideas to environmentalism). It also fits in with the idea of balancing the stability of an ordered taxonomy with the fast-flowing nature of folksonomies and of finding a way of using social media to support rather than compete with more formalised knowledge management practice. Brown and Duguid say: “For all the advantages of fluidity, we should not forget that fixity still has its charms. Most of us prefer the checks we receive and the contracts we write to remain stable”, which seems particularly apt given the global credit crisis!

The Fractal Nature of Knowledge « Not Otherwise Categorized…

The Fractal Nature of Knowledge « Not Otherwise Categorized… is Seth Earley’s response to a question about whether we “need more categories” as knowledge becomes more specialised. He points out that “categories are only meaningful given a specific scale” and that the level of abstraction you need depends on the context.

The metaphor of the fractal nature of knowledge strikes me as quite a good one in this respect – a knowledge organisation system should allow you to pan out or zoom in to get different views, but obviously there are practical limits (Borges’s map of the empire that is the same size as the empire itself) so you have to make a selection – in both breadth and depth. Seth Earley notes that “Communities of Practice can coalesce around extremely arcane branches of knowledge” and they could well need a very “fine grain” that no-one else in their organisation would ever use.

He adds that “there is no ‘standard’ way of organizing knowledge even for a specific process in a specific industry” and describes the way different organisations (businesses, libraries, universities) have different “knowledge consumers” and therefore different classification needs. He also argues that for businesses to gain maximum value from their knowledge, they should find the “sweet spot” between chaos and control – allowing people to “self-organise” while contributing to the overall goals of the business.

ISKO UK Conference 2009 – call for papers

ISKO UK Conference 2009 – call for papers. ISKO UK 2009 will provide a rare opportunity for researchers, practitioners and innovators from all sectors to share ideas on the opportunities and challenges implicit in the digitization and networking of diverse information resources. The Conference will address issues in the organization and integration of text, images, data and voice – multimedia and multilingual.

UDC Seminar 2009 – call for papers

UDC Seminar 2009 – call for papers. The “Classification at a Crossroads” conference will address the potential of classification, the Universal Decimal Classification in particular, in supporting information organization, management and resource discovery in the networked environment. It will explore solutions for better subject access control and vocabulary sharing services.