
Assumptions, mass data, and ghosts in the machine


Back in the summer, I was very lucky to meet Jonah Bossewitch (thanks Sam!), an inspiring social scientist, technical architect, software developer, metadatician, and futurologist. His article The Bionic Social Scientist is a call to arms for the social sciences to recognise that technological advances have led to a proliferation of data. This proliferation is assumed to be unequivocally good, but it is also fuelling a shadow science of analysis that uses data without challenging the underlying assumptions that went into collecting it. As I learned from Bowker and Star, assumptions – even at the most basic stage of data collection – can skew the results obtained, so any analysis of such data may well be built on shaky (or at the very least prejudiced) foundations. When this is compounded by software that analyses the data, the presuppositions of the programmers, the developers of the algorithms, and so on stack assumption on top of assumption. Jonah points out that if nobody studies this phenomenon, we are in danger of losing any possibility of transparency in our theories and analyses.

As software becomes more complex and data sets become larger, it is harder for human beings to perform “sanity checks” or apply “common sense” to the reports produced. Results are very hard to dispute when they emerge from de facto “black boxes” of calculation, based on collections of information so huge that no lone, unsupported human can hope to grasp them. The only possibility of equal debate is amongst other scientists, and probably only those working in the same field. Helen Longino’s work on science as social practice emphasised the need for equality of intellectual authority, but how do we measure that if the only possible intellectual peer is another computer? The danger is that the humans in the scientific community become, even more than they are currently, like high priests guarding machines that utter inscrutable pronouncements. What can we do about this? More education, of course: the academic community needs to devise ways of exposing the underlying assumptions, and the lay community needs to become more aware of how software and algorithms can “code in” biases.

This may appear to be a rather obscure academic debate about subjectivity in software development, but it strikes at the heart of the nature of science itself. If science cannot be self-correcting and self-criticising, can it still claim to be science?

A more accessible example is offered by a recent article claiming that Facebook filters and selects updates. This example illustrates how easy it is to allow people to assume a system is doing one thing with massed data when in fact it is doing something quite different. Most people think that Facebook’s “Most Recent” updates provide a snapshot of the latest postings by all your friends, and that if you haven’t seen updates from someone for a while, it is because they haven’t posted anything. The article claims that Facebook prioritises certain types of update over others (links take precedence over plain text) and updates from certain people. Doing this risks creating an echo chamber effect, steering you towards the people who behave how Facebook wants them to (essentially, posting a lot of monetisable links) in a way that most people would never notice.

Another familiar example is automated news aggregation – an apparently neutral process that actually involves sets of selection and prioritisation decisions. Automated aggregations used to be based on very simple algorithms, so it was easy to see why certain articles were chosen and others excluded, but such processing has very rapidly advanced to the point that it is almost impossible (and almost certainly impractical) for a reader to unpick the complex chain of choices.

In other words, there certainly is a ghost in the machine, it might not be doing what we expect, and so we really ought to be paying attention to it.


Science as Social Knowledge


I thoroughly enjoyed Science as Social Knowledge by the US philosopher Helen Longino. It was recommended to me by Judith Simon, a very smart researcher I met at the ISKO conference in Montreal last summer. She researches trust and social software and suggested that Longino’s analysis of objectivity would be helpful to me. It took me a while to get settled with the book, but I recognised an essentially Wittgensteinian take on the notion of shared meaning. Longino works this into a set of principles for establishing degrees of objectivity in scientific enquiry. If I have grasped it all correctly, she basically says that although there is no such thing as “ideal” objectivity – a one true perspective up in the sky – we do not have to collapse into an “anything goes” relativism. We can accept that background assumptions can be challenged and change, and embed the notion of challenge and criticism into the heart of scientific enquiry itself. That establishes a self-regulating system that is more or less objective, depending on how open it is to criticism and how responsive it is to legitimate challenges. Objectivity arises out of the process of consensus-building in an open, reflective, and self-challenging community.

Applying this to taxonomy work appears to mean that the process of taxonomy building can be more or less objective, depending on how open the process is to the community and to adapting to legitimate challenges or complaints. This seems to be very much like the practical advice offered by taxonomists expressed in terms of “get user buy-in”, “consult all stakeholders”, “ensure that you consider all relevant viewpoints”, or “ensure that you have regular reviews and updates”, so it’s reassuring to know we are basically epistemologically valid in our methods!