I will probably be on the other side of the Atlantic when the ISKO UK conference takes place in July in London, UK. I will be sorry to miss it, because the committee have brought together a diverse, topical, and fascinating collection of speakers.
ISKO UK excels in unifying academic and practitioner communities, and the conference promises to investigate the barriers that separate research from practice and to seek out boundary objects that can bring the communities together.
This is demonstrated in person by the keynote speakers Patrick Lambe of Straits Knowledge and Martin White of Intranet Focus Ltd – both respected for their commercial as well as academic contributions to the field of Knowledge Organization.
Amidst what is already shaping up to be a very full and varied programme, the presentations by Jeremy Tarling and Matt Shearer (BBC News) and Jarred McGinnis and Helen Lippell (Press Association) will show how research in semantic techniques is now being put to practical use in managing the fast-flowing oceans of information that news organizations handle.
The programme also includes a whole session on combining ontologies with other tools, as well as papers on facet analysis and construction of controlled vocabularies. There’s even some epistemology to please pure theoreticians.
Back in the summer, I was very lucky to meet Jonah Bossewitch (thanks Sam!), an inspiring social scientist, technical architect, software developer, metadatician, and futurologist. His article The Bionic Social Scientist is a call to arms for the social sciences to recognise that technological advances have led to a proliferation of data. This is assumed to be unequivocally good, but is also fuelling a shadow science of analysis that is using data but failing to challenge the underlying assumptions that went into collecting that data. As I learned from Bowker and Star, assumptions – even at the most basic stage of data collection – can skew the results obtained, so any analysis of such data may well be built on shaky (or at the very least prejudiced) foundations. When this is compounded by software that analyses data, the presuppositions of the programmers, the developers of the algorithms, etc. stack assumption on top of assumption. Jonah points out that if nobody studies this phenomenon, we are in danger of losing any possibility of transparency in our theories and analyses.
As software becomes more complex and data sets become larger, it is harder for human beings to perform “sanity checks” or apply “common sense” to the reports produced. Results are very hard to dispute when they emerge from de facto “black boxes” of calculation, based on collections of information so huge that no lone unsupported human can hope to grasp them. The only possibility of equal debate is amongst other scientists, and probably only those working in the same field. Helen Longino’s work on science as social practice emphasised the need for equality of intellectual authority, but how do we measure that if the only possible intellectual peer is another computer? The danger is that the humans in the scientific community become even more like high priests guarding the machines that utter inscrutable pronouncements than they are currently. What can we do about this? More education, of course, with the academic community needing to devise ways of exposing the underlying assumptions and the lay community needing to become more aware of how software and algorithms can “code in” biases.
This appears to be a rather obscure academic debate about subjectivity in software development, but it strikes to the heart of the nature of science itself. If science cannot be self-correcting and self-criticising, can it still claim to be science?
A more accessible example is offered by a recent article claiming that Facebook filters and selects updates. This example illustrates how easy it is to allow people to assume a system is doing one thing with massed data when in fact it is doing something quite different. Most people think that Facebook’s “Most Recent” updates provide a snapshot of the latest postings by all your friends, and that if you haven’t seen updates from someone for a while, it is because they haven’t posted anything. The article claims that Facebook prioritises certain types of update over others (links take precedence over plain text) and updates from certain people. Doing this risks creating an echo chamber effect, steering you towards the people who behave how Facebook wants them to (essentially, posting a lot of monetisable links) in a way that most people would never notice.
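To make the gap between expectation and behaviour concrete, here is a minimal sketch in Python. It is purely illustrative and is not Facebook's actual algorithm, which the article only describes in outline: the `Update` class, the sample updates, and the `LINK_BONUS` weighting are all assumptions invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Update:
    author: str
    minutes_ago: int
    has_link: bool


# Hypothetical updates from three friends (invented data).
updates = [
    Update("alice", 5, False),
    Update("bob", 30, True),
    Update("carol", 60, False),
]


def most_recent(feed):
    """What most readers assume they are seeing: newest first."""
    return sorted(feed, key=lambda u: u.minutes_ago)


# Hypothetical weighting: a link counts as if it were posted 100 minutes later.
LINK_BONUS = 100


def prioritised(feed):
    """A feed that quietly favours updates containing links."""
    return sorted(
        feed,
        key=lambda u: u.minutes_ago - (LINK_BONUS if u.has_link else 0),
    )


print([u.author for u in most_recent(updates)])
print([u.author for u in prioritised(updates)])
```

Both functions return every update, so nothing looks missing. Only the order changes, which is exactly why a reader would be unlikely to notice the difference.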
Another familiar example is automated news aggregation – an apparently neutral process that actually involves sets of selection and prioritisation decisions. Automated aggregations used to be based on very simple algorithms, so it was easy to see why certain articles were chosen and others excluded, but very rapidly such processing has advanced to the point that it is almost impossible (and almost certainly impractical) for a reader to unpick the complex chain of choices.
In other words, there certainly is a ghost in the machine, it might not be doing what we expect, and so we really ought to be paying attention to it.
I’m still mulling over Helen Longino’s criteria for objectivity in scientific enquiry (see previous post: Science as Social Knowledge) and it occurred to me that folksonomies are not really open and democratic, but are actually obscure and impenetrable. The “viewpoint” of any given folksonomy might be an averaged-out majority consensus, or some other way of aggregating tags might have been used, so you can’t tell if it is skewed by a numerically small but prolifically tagging group. This is the point Judith Simon made in relation to ratings and review software systems at the ISKO conference, but it seems to me the problem for folksonomies is even worse, because of the echo chamber effect of people amplifying popular tags. Without some way of showing who is tagging what and why, the viewpoint expressed in the folksonomy is a mystery. This is not necessarily the case, but I think you’d need to collect huge amounts of data from every tagger, then database it along with the tags, then run all sorts of analyses and publish them in order to show the background assumptions driving the majority tags.
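As a rough illustration of how a small but prolific group can skew the apparent consensus, here is a minimal Python sketch. The tagging log, the tagger names, and both counting functions are invented for the example. It assumes the folksonomy keeps a record of who applied each tag, which is precisely the information that is usually hidden.

```python
from collections import Counter

# Hypothetical tagging log: (tagger, item, tag). All names are invented.
taggings = [
    ("power_user", "doc1", "web2.0"),
    ("power_user", "doc2", "web2.0"),
    ("power_user", "doc3", "web2.0"),
    ("power_user", "doc4", "web2.0"),
    ("alice", "doc1", "taxonomy"),
    ("bob", "doc2", "taxonomy"),
    ("carol", "doc3", "taxonomy"),
]


def counts_by_application(log):
    """Count every time a tag was applied, regardless of who applied it."""
    return Counter(tag for _tagger, _item, tag in log)


def counts_by_tagger(log):
    """Count how many distinct people used each tag at least once."""
    distinct = {(tagger, tag) for tagger, _item, tag in log}
    return Counter(tag for _tagger, tag in distinct)


by_application = counts_by_application(taggings)
by_tagger = counts_by_tagger(taggings)

print(by_application.most_common())
print(by_tagger.most_common())
```

Counted by application, "web2.0" looks like the majority view, although only one person ever used it. Counted by distinct tagger, "taxonomy" comes out on top. Without the tagger column, a reader has no way of telling which of these pictures the folksonomy is presenting.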
If the folksonomic tags don’t help you find things, who could you complain to? How do you work out whether it doesn’t help you because you are a minority, or for some other reason? With a taxonomy, the structure is open – you may not like it but you can see what it is – and there will usually be someone “in charge” who you can challenge and criticise if you think your perspective has been overlooked. In many cases the process of construction will be known too. I don’t see an obvious way of challenging or criticising a folksonomy in this way, so presumably it fails Longino’s criteria for objectivity.
You can just stick your own tags into a folksonomy and use them yourself so there is some trace of your viewpoint in there, but if the rest of the folksonomy doesn’t help you search, that means you can only find things once you have tagged them yourself, which would presumably rule out large content repositories. So, you have to learn and live with the imposed system – just like with a taxonomy – but it’s never quite clear exactly what that system is.
I thoroughly enjoyed Science as Social Knowledge by the US philosopher Helen Longino. It was recommended to me by Judith Simon, a very smart researcher I met at the ISKO conference in Montreal last summer. She researches trust and social software and suggested that Longino’s analysis of objectivity would be helpful to me. It took me a while to get settled with the book, but I recognised an essentially Wittgensteinian take on the notion of shared meaning. Longino works this into a set of principles for establishing degrees of objectivity in scientific enquiry. If I have grasped it all correctly, she basically says that although there is no such thing as “ideal” objectivity – a one true perspective up in the sky – we do not have to collapse into an “anything goes” relativism. We can accept that background assumptions can be challenged and change, and embed the notion of challenge and criticism into the heart of scientific enquiry itself. That establishes a self-regulating system that is more or less objective, depending on how open it is to criticism and how responsive it is to legitimate challenges. Objectivity arises out of the process of consensus-building in an open, reflective, and self-challenging community.
Applying this to taxonomy work appears to mean that the process of taxonomy building can be more or less objective, depending on how open the process is to the community and to adapting to legitimate challenges or complaints. This seems to be very much like the practical advice offered by taxonomists expressed in terms of “get user buy-in”, “consult all stakeholders”, “ensure that you consider all relevant viewpoints”, or “ensure that you have regular reviews and updates”, so it’s reassuring to know we are basically epistemologically valid in our methods!