Here are some articles on folksonomy that I found in a reading list on the Indiana University School of Library and Information Science website.
Peterson, E. (2006). Beneath the metadata: some philosophical problems with folksonomy. D-Lib Magazine, 12(11).
Vander Wal, T. (2007). Folksonomy coinage and definition.
Quintarelli, E. (2005). Folksonomies: power to the people.
Mathes, A. (2004). Folksonomies: cooperative classification and communication through shared metadata.
Golder, S. A. & Huberman, B. A. (2006). Usage patterns of collaborative tagging systems. Journal of Information Science, 32(2). 198-208.
Intranet 2.0: the need for ‘lean intranets’ « manIA has some sensible advice on keeping an intranet efficient and functional. I was drawn to the section where Patrick Walsh discusses “controlled folksonomies”, a phrase he attributes to Christina Wodtke. Essentially, you let content contributors choose their own tags, but prompt them with suggestions. Presumably, people are far more likely just to use the existing tags (thus preserving the underlying controlled vocabulary) most of the time, because it is easier than making up their own. He implies that people could use terms not in the controlled vocabulary (CV), but does not say what would become of those tags. If they were added to the CV automatically, you would lose the control element, as misspellings, ambiguous terms and so on would slowly creep in. Keeping the CV tidy would require some ongoing editorial work. For one of the CMSs used at the BBC, there are rules – once a folksonomic tag has been used a certain number of times, it gets sent to the IA team, who can then add it to the core CV if they think it will be useful. Presumably, you also need someone to produce an initial CV in the first place.
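To make the workflow concrete for myself, here is a minimal Python sketch of a “controlled folksonomy” with a review rule of the kind described above. It is not the BBC's system or anything from the article: the class name, the method names and the threshold of 3 are all my own assumptions, chosen purely for illustration.

```python
from collections import Counter


class ControlledFolksonomy:
    """Hypothetical sketch: suggest CV terms, count free tags, queue popular ones for review."""

    def __init__(self, vocabulary, review_threshold=3):
        # The initial controlled vocabulary (CV) has to be supplied by someone.
        self.vocabulary = {term.strip().lower() for term in vocabulary}
        # Assumed rule: a free tag goes to the editors after this many uses.
        self.review_threshold = review_threshold
        self.free_tag_counts = Counter()
        self.review_queue = []

    def suggest(self, prefix):
        """Prompt the contributor with existing CV terms that match what they typed."""
        typed = prefix.strip().lower()
        return sorted(term for term in self.vocabulary if term.startswith(typed))

    def apply_tag(self, tag):
        """Record a tag; return 'controlled' if it is in the CV, otherwise 'free'."""
        tag = tag.strip().lower()
        if tag in self.vocabulary:
            return "controlled"
        self.free_tag_counts[tag] += 1
        popular = self.free_tag_counts[tag] >= self.review_threshold
        if popular and tag not in self.review_queue:
            # Nothing is added to the CV automatically; a person decides.
            self.review_queue.append(tag)
        return "free"

    def approve(self, tag):
        """An editor accepts a queued tag into the core CV."""
        tag = tag.strip().lower()
        if tag in self.review_queue:
            self.review_queue.remove(tag)
        self.vocabulary.add(tag)
```

The point of the design is that free tags are never promoted by the code itself: they only reach the CV through approve(), which stands in for the editorial step.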
A Journey In Social Media: A Breakthrough In Taxonomy? A discussion of the pros and cons of folksonomies and taxonomies, with the debate categorised as being between the traditionalists and the emergents. The writer is at a large company and promises updates on the taxo/folkso divide there.
Pace layering in ia is a paper by D. Grant Campbell and Karl V. Fast from the Faculty of Information and Media Studies, University of Western Ontario. They bring pace layering theories from ecology and environmental science into information architecture, viewing ia as an “ecology”. Basically, ecologists have noted that events occurring over different timescales interact to affect an environment – something like the lowering of the water table would be a slow event, but a flash flood would be a fast event. Only by looking at the ways these differently “paced layers” interact can you predict how the local environment will respond. They propose that the underlying ia of a website, with taxonomies, embedded navigation structures etc., is a slow layer but that folksonomies bubble away as a fast layer of the site, changing rapidly and responding quickly. They suggest that the most stable structure will be one that can accommodate both fast-moving and slow-moving layers and that the slow layer must be robust and flexible enough to adjust itself to pressures from the fast layer.
I don’t think I have grasped all the implications of this, but my first impression is that it fits well with the “best of both worlds” approach – encouraging social tagging but not relying on it for critical information management, while using the folksonomic tags as feedback for updating and reviewing taxonomies.
A tag counting experiment – one to add to the growing collection of investigations of folksonomies. The authors claim that over 60% of folksonomic tags are “factual” and therefore ripe for harvesting as metadata. They make no claims as to the accuracy of the tags, although they refer to a previous study that showed that folksonomic tags were more accurate than auto-tagging software. They chose a very specific field – CSS style sheets – but the number crunching is an impressive effort – they claim to have checked them all! Some typos though.
Folksonomies – Cooperative Classification and Communication Through Shared Metadata is a well-written paper that outlines some key issues in the usefulness and functioning of folksonomies. It includes some interesting ideas about “feedback loops” that motivate social tagging, ideas for further research, and a set of useful links to other articles.
I’ve now spoken to two more taxonomy consultants who both expressed the opinion that folksonomies should be embraced, but only where they really work, and that they can’t always substitute for formal systems. Would anyone entrust their child’s health to the opinions of a random crowd, rather than a thorough examination by a trained and qualified expert? On a different theme, if you want a comprehensive stock control inventory so that you know how many items to order from your wholesaler, you want to know exactly how many widgets you have in your warehouse, not how many widgets, plus doodahs, plus gizmos, you have and hope when you’ve added them up you have the right number. You want to know that whenever a shipment has arrived, it has been logged on the system as a box of widgets, and not as whatever whoever happened to log the delivery felt like calling it at the time. On the other hand, you want your customers to be able to search for widgets using any term that springs to mind, and if it helps them to add a tag to your website labelling widgets “grandma’s buttons” so they can find and order them easily another time, then let them do it!
I have just started reading Organising Knowledge: Taxonomies, Knowledge and Organisational Effectiveness by Patrick Lambe. Of course, I turned straight to the last chapter – where folksonomies are discussed. Lambe argues that folksonomies work best with large quantities of new content, where social tagging creates some way of grouping similar items quickly and cheaply, but when the users start to demand comprehensiveness and accuracy in their searches, and once the size of the collection becomes too large, some sort of formal taxonomic structure works best. Sites are starting to add traditional facets, like location, to control and focus their social tagging.
There is a counter argument that for very small well-defined communities, social tagging works well, because the users have a good understanding of the terminology, tend to think in the same way, and so tend to use very similar tags. This would explain why the folksonomic approach was so popular in the web community – a new, highly specialised community, all speaking the same jargon, were all tagging new content in very similar ways. The danger is that once the community expands, people stop using terms with such precision and the helpfulness of the social tagging gets diluted.
I’ve recently had fascinating conversations with two professional taxonomists – one at EDS and one at the BBC – and both use very different but imaginative and innovative combinations of folksonomic and traditional taxonomic procedures.
All the best taxonomists advocate consulting as much as possible with your users, which is obvious, and a folksonomy is pretty much a glorified mass user consultation exercise. But why stop with the consultation stage?
You still get an awful lot of noise to your signal in folksonomies and the best way to clear that is still to apply some trained thoughtful evaluation – the principles of taxonomy. The combined approach gives you the best of both worlds – gather the tags as a folksonomy (you still need a critical mass of taggers), and then do a bit of pruning and tidying to make them work properly. Ideal!
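As a rough illustration of what “a bit of pruning and tidying” might look like in practice, here is a small Python sketch. It assumes the tidying amounts to three things: normalising case and spacing, merging variants through a synonym map that an editor has drawn up by hand, and dropping tags used fewer than a minimum number of times. The function name, the synonym map and the min_count cut-off are my own hypothetical choices, not anyone's published method.

```python
from collections import Counter


def tidy_tags(raw_tags, synonyms=None, min_count=2):
    """Normalise raw tags, merge editor-approved variants, drop rarely used ones."""
    synonyms = synonyms or {}
    counts = Counter()
    for tag in raw_tags:
        # Normalise case and collapse stray whitespace.
        cleaned = " ".join(tag.strip().lower().split())
        if not cleaned:
            continue
        # Map known variants onto the preferred term chosen by an editor.
        counts[synonyms.get(cleaned, cleaned)] += 1
    # Treat tags below the cut-off as noise.
    return {tag: n for tag, n in counts.items() if n >= min_count}


raw = ["Taxonomy", "taxonomy ", "taxonomies", "folksonomy", "Folksonomy", "asdf"]
print(tidy_tags(raw, synonyms={"taxonomies": "taxonomy"}))
```

The synonym map is where the trained, thoughtful evaluation comes in: the code only applies decisions that a person has already made.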
First person: ‘Folksonomy’ takes power from expert librarians, an article by David Bowen of Bowen Craggs & Co in the Financial Times’s Digital Business section on November 7th, highlights some of the advantages of having a well-crafted, carefully structured taxonomy instead of relying on folksonomies. He says that folksonomies are great in some cases, but that really valuable information is by definition specialised and therefore doesn’t get read by enough people for mass social tagging to be helpful.
I think there are two key limitations to the usefulness of the folksonomic approach. Firstly, you need loads of people. If you don’t have a huge number of people actively tagging – and only huge mass market websites do – you don’t generate a large enough data set to get a decent signal-to-noise ratio. Secondly, it has to be of no consequence if chunks of your content are never found due to weird or bad tagging. This is fine for Flickr, say, where people just want any old picture, not to see all the pictures. It’s not so great if you want to make sure you have checked every one – that you’ve looked at all the relevant legislation, for example, not just the first couple of laws that happened to pop up.