To index is to translate

28th July, 2013 Fran Start a conversation
Estimated reading time 3–4 minutes

Living in Montreal means I am trying to improve my very limited French and in trying to communicate with my Francophone neighbours I have become aware of a process of attempting to simplify my thoughts and express them using the limited vocabulary and grammar that I have available. I only have a few nouns, fewer verbs, and a couple of conjunctions that I can use so far and so trying to talk to people is not so much a process of thinking in English and translating that into French, as considering the basic core concepts that I need to convey and finding the simplest ways of expressing relationships. So I will say something like “The sun shone. It was big. People were happy” because I can’t properly translate “We all loved the great weather today”.

This made me realise how similar this is to the process of breaking down content into key concepts for indexing. My limited vocabulary is much like the controlled vocabulary of an indexing system, forcing me to analyse and decompose my ideas into simple components and basic relationships. This means I am doing quite well at fact-based communication, but my storytelling has suffered as I have only one very simple emotional register to work with. The best I can offer is a rather laconic style with some simple metaphors: “It was like a horror movie.”

It is regularly noted that ontology work in the sciences has forged ahead of that in the humanities, and the parallel with my ability to express facts but not tell stories struck me. When I tell my simplified stories I rely on shared understanding of a broad cultural context that provides the emotional aspect – I can use the simple expression “horror movie” because the concept has rich emotional associations, connotations, and resonances for people. The concept itself is rather vague, broad, and open to interpretation, so the shared understanding is rather thin. The opposite is true of scientific concepts, which are honed into precision and a very constrained definitive shared understanding. So, I wonder how much of my sense that I can express facts well is actually an illusion, and it is just that those factual concepts have few emotional resonances.

A major aspect of poetry is about extending the meanings of words to their limits, to allow for the maximum emotional resonance and personal interpretation. Perhaps poetry speaks to individuals precisely because it doesn’t evoke a shared understanding but calls out new meanings and challenges the reader to think differently, to find new meanings? This is the opposite of indexing, which is about simplifying and constraining to the point at which all the fuzziness is driven away and you are left with nothing but “dead metaphors”. The only reason indexing the sciences seems easier is because so many scientific concepts have been analyzed and defined to this point already, doing much of the indexer’s work for them.

I am not sure if these musings have any practical applications. People sometimes ask me if I think my previous studies of languages and literature have helped in my current work. I have known many excellent monolingual indexers but am also aware that many people who are good at semantics speak more than one language. However, I am sure it is helpful to think of the process of indexing as a form of translation, even if the idea of removing all the poetry from language in order to create a usable, useful index is not at all romantic!

This time it’s personal data – Indiverses and Personal APIs

5th June, 2013 Fran 5 comments
Estimated reading time 3–4 minutes

Sooner or later I was bound to find some other Semanticists in Canada and on Thursday I attended a Semantic Web meetup in Montreal. The audience was small, but that led to more of a group discussion atmosphere than a formal talk. The presenter, Dr Joan Yess Kahn, has coined the term Indiverse – Individual Information Universe – to facilitate her thinking about the set of personal information and data that we accumulate through our lives.

She pointed out that some of this information is created by us, some about us, some with our knowledge and consent, some without, and our entire digital lives can be stolen and abused. She made some interesting observations about how our personal and public information spaces were essentially one and the same before the industrial revolution, when most people’s work and home lives were intertwined (e.g. artisans living in their workshops), and that changes such as the industrial revolution and public education split those apart as people left home to work somewhere else. However, in the information age more people are returning to working from home while others are increasingly using their computers at work to carry out personal tasks, such as online shopping.

This blurring of the public and private has many social and commercial implications. We discussed the potential monetary value of personal attention and intention data to advertisers, and implications for surveillance of individuals by governments and other organizations.

We also talked about information overload and information anxiety. Joan has written about ways of categorizing, indexing, and managing our personal information – our address books, calendars, to do lists, etc. – and this led us to consider ideas of how to construct sharable, standardized Personal Data Lockers (for example The Locker Project) and to take back control of our online identity and information management, for example in shifting from Customer Relations Management (CRM) to Vendor Relations Management (VRM).

In previous posts I have talked about our need to become our own personal digital archivists as well and I was sent a link by Mark to a Personal API developed by Naveen. This takes personal information curation to the data level, as Naveen is seeking an easy way to manage the huge amounts of data that he generates simply by being a person in the world – his fitness routines, diet, etc.

There is a clear convergence here with the work done by such medical innovators as Patients Know Best electronic patient health records. Moral and social implications of who is responsible for curating and protecting such data are huge and wide-ranging. At the moment doting parents using apps to monitor their babies or fitness enthusiasts using apps (such as map my run etc.) are doing this for fun, but will we start seeing this as a social duty? Will we have right-wing campaigns to deny treatment to people who have failed to look after their health data or mass class actions to sue hospitals that get hacked? If you think biometric passports are information dense, just wait until every heartbeat from ultrasound to grave is encoded somewhere in your Indiverse.

Libraries, Media, and the Semantic Web meetup at the BBC

2nd December, 2012 Fran Start a conversation
Estimated reading time 3–4 minutes

In a bit of a blog cleanup, I discovered this post languishing unpublished. The event took place earlier this year but the videos of the presentations are still well worth watching. It was an excellent session with short but highly informative talks by some of the smartest people currently working in the semantic web arena. Videos of the event are available on YouTube.

Historypin

Jon Voss of Historypin was a true “information altruist”, describing libraries as a “radical idea”. The concept that people should be able to get information for free at the point of access, paid for by general taxation, has huge political implications. (Many of our libraries were funded by Victorian philanthropists who realised that an educated workforce was a more productive workforce, something that appears to have been largely forgotten today.) Historypin is seeking to build a new library, based on personal collections of content and metadata – a “memory-sharing” project. Jon eloquently explained how the Semantic Web reflects the principles of the first librarians in that it seeks ways to encourage people to open up and share knowledge as widely as possible.

MIMAS

Adrian Stevenson of MIMAS described various projects including Archives Hub, an excellent project helping archives, and in particular small archives that don’t have much funding, to share content and catalogues.

rNews

Evan Sandhaus of the New York Times explained the IPTC’s rNews – a news markup standard that should help search engines and search analytics tools to index news content more effectively.

schema.org

Dan Brickley’s “compare and contrast” of Universal Decimal Classification with schema.org was wonderful and he reminded technologists that it is very easy to forget that librarians and classification theorists were attempting to solve search problems far in advance of the invention of computers. He showed an example of “search log analysis” from 1912, queries sent to the Belgian international bibliographic service – an early “semantic question answering service”. The “search terms” were fascinating and not so very different to the sort of things you’d expect people to be asking today. He also gave an excellent overview of Lonclass, the BBC Archive’s largest classification scheme, which is based on UDC.

BBC Olympics online

Silver Oliver described how BBC Future Media is pioneering semantic technologies and using the Olympic Games to showcase this work on a huge and fast-paced scale. By using semantic techniques, dynamic rich websites can be built and kept up to the minute, even once results start to pour in.

World Service audio archives

Yves Raimond talked about a BBC Research & Development project to automatically index World Service audio archives. The World Service, having been a separate organisation to the core BBC, has not traditionally been part of the main BBC Archive, and most of its content has little or no useful metadata. Nevertheless, the content itself is highly valuable, so anything that can be done to preserve it and make it accessible is a benefit. The audio files were processed through speech-to-text software, and then automated indexing was applied to generate suggested tags. The accuracy rate is about 70% so human help is needed to sort out the good tags from the bad (and occasionally offensive!) tags, but this is still a lot easier than tagging everything from scratch.
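The suggest-then-review workflow described above can be sketched in a few lines. This is only an illustration, not the BBC's actual system: the vocabulary, the trigger words, the `min_hits` threshold, and the function names are all assumptions made up for the example, and a plain string stands in for the speech-to-text output.

```python
import re
from collections import Counter

# Hypothetical controlled vocabulary: tag -> trigger words (not BBC data).
VOCABULARY = {
    "agriculture": {"farm", "harvest", "crop"},
    "elections": {"election", "vote", "ballot"},
    "football": {"football", "goal", "match"},
}


def suggest_tags(transcript, vocabulary=VOCABULARY, min_hits=2):
    """Return (tag, hit_count) pairs for tags whose trigger words occur
    at least min_hits times in the transcript, most frequent first."""
    words = Counter(re.findall(r"[a-z]+", transcript.lower()))
    scores = {
        tag: sum(words[word] for word in triggers)
        for tag, triggers in vocabulary.items()
    }
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(tag, hits) for tag, hits in ranked if hits >= min_hits]


def review(suggestions, approve):
    """Keep only the suggested tags that a human reviewer approves."""
    return [tag for tag, _ in suggestions if approve(tag)]


# Stand-in for speech-to-text output.
transcript = (
    "The farm harvest was poor and the farm closed. "
    "The goal of the match-funding scheme is a better crop."
)

suggestions = suggest_tags(transcript)
# The reviewer spots that "football" is a false hit and rejects it.
approved = review(suggestions, approve=lambda tag: tag != "football")
```

Here the automated step proposes both "agriculture" and a spurious "football" tag (triggered by "goal" and "match"), and the human step only has to reject the bad suggestion rather than tag the recording from scratch.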

Local is the new social – location data startups

7th October, 2012 Fran Start a conversation
Estimated reading time 4–6 minutes

A few weeks ago I attended an event by Dreamstake featuring a collection of startup companies that are using open geographical data – such as the data released by Ordnance Survey. There was much championing of the money to be made by using data that organisations release for free. This seems obvious to me – someone else has paid to do all the preparatory work so others can cash in. No-one seems concerned about the ethics of this. If UK taxpayers have paid for the OS work to be done, should they not automatically be shareholders in any company that profits from the fruits of this investment?

The companies showcased all had new twists on using location data. What I found especially interesting was the emphasis on context. When selling services, place alone is not enough. Time is important and also the circumstances. So, a businesswoman on a work trip will probably want different products and services to when she is out with her family.

The speakers were
James Pursey of Sortedapp
Sadiq Qasim of LoYakk
Craig Wareham of Viewranger
Tim Buick of Streetpin

Location-based marketing

James Pursey opened by giving a brief history of location-based marketing, pointing out that this was pioneered by the Yellow Pages (now yell.com). His company attempts to match time, place, and location and makes the consumer the advertiser and the service provider the respondent. He explained this as a “reverse eBay”. Instead of providers advertising their products and services, consumers post details of what they want, e.g. I need someone to clean my flat before my wife gets home (the data game still seems to be a man’s world!). The message is then pushed to local cleaners who have a window of time in which to respond. The app works on the location of your mobile phone, but you can alter that on a map so that you can be at home but arrange a service to be provided near your workplace, etc.

Chatting about a shared experience

Sadiq Qasim explained that LoYakk – local yakking – recognises that conversations are often focused around specific places and events. Social media links tend to be based on static lists of friends, with very little contextualisation. However, social relationships and conversations are often transient. You might want to chat to someone at a conference, but that doesn’t mean you want to become lifelong friends. By creating an app that mirrors the real world nature of such connections, people can drop in, chat to people in the vicinity and leave again. Events such as conferences, arts and sporting events, and holiday destinations are particularly well suited to this approach.

Mobile is local

Craig Wareham described Viewranger, which is an app for outdoorsy people. It combines guidebook information, a social community, and a marketplace, all based around location, and has become popular with search and rescue teams.

Tim Buick of Streetpin emphasised that about half of searches on mobiles – perhaps unsurprisingly – are for something local. However, time is very relevant – he might be near a great pub that has a special offer on beer but he doesn’t want to be told about it at 8 in the morning when he has just dropped the kids off at nursery, but in the same location 12 hours later with his mates, the offer might be just what they want. The right information, to the right person, at the right place and at the right time is what matters.
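The pub example can be expressed as a simple filter on both place and time. The sketch below is not how Streetpin works; the `Offer` fields, the coordinates, the opening hours, and the 0.5 km radius are all invented for illustration. Distance is computed with the standard haversine formula.

```python
from dataclasses import dataclass
from datetime import time
from math import asin, cos, radians, sin, sqrt


@dataclass
class Offer:
    name: str
    lat: float
    lon: float
    starts: time  # offer is shown only between starts and ends
    ends: time


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def relevant_offers(offers, lat, lon, now, radius_km=0.5):
    """Offers that are both nearby and currently in their time window."""
    return [
        offer for offer in offers
        if offer.starts <= now <= offer.ends
        and distance_km(lat, lon, offer.lat, offer.lon) <= radius_km
    ]


# Hypothetical pub offer and user position, a few metres apart.
pub = Offer("Beer offer", 51.5245, -0.0870, time(17, 0), time(23, 0))
morning = relevant_offers([pub], 51.5246, -0.0871, time(8, 0))
evening = relevant_offers([pub], 51.5246, -0.0871, time(20, 0))
```

Standing in the same spot, the user sees nothing at 8 in the morning and sees the beer offer at 8 in the evening. A real service would presumably add further context, such as who the person is with, which this sketch leaves out.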

The distinction between what is useful information and what is marketing becomes very blurred.

Place, space, maps

Thinking about this event along with the Shape of Knowledge event’s discussions of maps of cyberspace, and the Superhuman exhibition’s raising the question of the potential of transhumans to relate to space in a different way to current humans, made me wonder how location-based services will change in future. The technologically enhanced human will, presumably, need maps that make sense to computers as well as maps that make sense in real space and time. Navigation and location are most likely going to change beyond all recognition.

Mapping the transhuman

7th September, 2012 Fran Start a conversation
Estimated reading time 1–2 minutes

Last night I popped in to “We are all a cyborg”, an event held as part of the Wellcome Collection’s Superhuman exhibition. It covered the history of human enhancement from ancient Egyptian prosthetic toes to visions of a transhuman future of hybrid bioengineered-human-machines. The relationship between society, the individual and the aesthetics of the “normal” was explored too. I was also drawn to the themes of embodiedness of cognition by an artwork in which the artist had built extensions to her fingertips to enable her to experience a greater area of space. By altering the physical confines of the body, how far did she change her way of thinking about the world as well?

These ideas fitted in with the idea of maps and spaces that I had been mulling over following the Shape of Knowledge event, and so I started to think about the crossover between human and machine, the leaking of cyberspace across into “real” space, and how we map – and with what we map – these shifting worlds and worldviews.

UX field trip to Inition Studios for a 3D extravaganza

2nd September, 2012 Fran Start a conversation
Estimated reading time 2–4 minutes

I don’t manage to get to many London IA events, so I was very pleased to be able to attend a UX field trip a little while ago, arranged by the wonderful Alison Austin, UX practitioner, who has a knack for spotting interesting people doing fascinating things. She arranged a visit to Inition Studios, which gave us the opportunity to get our hands on a selection of their gadgets and devices. Inition and their sister company Holition deal with all things 3D. I wasn’t sure what 3D printing had in common with 3D film-making but a lot of the modelling, data management, and underlying software is essentially the same.

One of Inition’s researchers has a background in ergonomics and worked on systems for representing aeroplanes in virtual 3D models with the aim of devising new systems to help air-traffic controllers. Huge amounts of data need to be processed by the controllers, and combining 3D and 2D visualisations can show different aspects – for example a 3D model of the planes in the air, with 2D lists of data such as speeds etc. However, it is – thankfully – very difficult to get air traffic controllers to experiment with new devices – so it is not easy to get new systems and methods adopted.

Many of Inition’s devices just seemed to be a lot of fun. They had an infrared camera rig set up to capture movement, so you could control a footballer on screen by kicking a pretend football. There were some haptic feedback devices that felt “heavy” when you tried to pick up a virtual block on screen, 3D cameras so we could watch ourselves on 3D TV, and lovely augmented reality devices. I tried on some virtual earrings and necklaces, and picked up and “painted” a virtual car. There were elaborate 3D cityscapes that could be used by architects and skeletons that could be useful in training doctors.

Animation can be triggered by QR codes, so we saw a plinth in the real world that, when viewed on an iPad, appeared to have buildings and cars and other objects on top of it.

For me, the most enchanting was a 3D display containing two 3D worlds – one with a complex artificial robotic arm that you could manipulate and deconstruct, another with games, mirrors you could move through, and figures you could move and play with. It reminded me of a Dali dreamscape. You moved these virtual objects with a pen controller that you waved towards the screen without touching it. I am glad games were not so beautiful and sophisticated when I was a teenager as I don’t think I would have ever left the house!

Skeptical Knowledge-Seeking: Business Research in the Age of ‘Truthiness’ – SLA Chicago

14th August, 2012 Fran Start a conversation
Estimated reading time 2–4 minutes

Although I don’t work in business research at the moment, subjectivity/objectivity is one of my pet topics, so I enjoyed hearing about how “truthiness” is being affected by online publishing and social media.

[“Truthiness” is a term invented by US comedian Stephen Colbert and used in his political satire to refer to politicians who seek to persuade us that something must be true because it “feels right” rather than because of the weight of evidence or rational argument to support it.]

Beware the echo chamber

Cynthia Lesky of Threshold Information talked about the seductiveness of the “echo chamber” effect in persuading people to think that a report must be true because it is being circulated widely and cited repeatedly. The Internet has exacerbated this effect because automated online content aggregators will regurgitate content without any editorial control, so there is no differentiation between accurate and inaccurate reports. It is also very easy for PR “spin” and propaganda to be replicated via aggregators and social media sites very quickly and with little fact checking and scant opportunity for counter-arguments to be put forward.

She offered some very useful tips to avoid being duped. Firstly, the researcher should work out not what is important to them or what is the most significant point being made in an article, but what is most important to the client who has commissioned the research. This enables the researcher to target fact-checking efforts most effectively. So, for example, in a piece about the opening of a factory to sell a new product, depending on the nature of the clients’ business, some will care about the effects on the market for that product, some will care about the effect on property prices in the area near the factory, and others will care most about employment opportunities.

Understanding the ways statistics can be presented is also useful. Cynthia offered an example of a survey in which 20% of people felt that their age had been a problem for them in gaining promotion. The survey was reported in one publication as evidence of a terrible blight of ageism in the workplace, and by another publication as evidence that only a minority of older people felt that they had been affected by age discrimination while 15% of respondents saw their age as a positive advantage. Publications will do this to exploit “confirmation bias” amongst their readers. People enjoy reading something that confirms views that they already hold, so reflecting back to readers what they already believe is an easy way of pleasing an audience.
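A little arithmetic shows how the same two figures support both headlines. The sketch below assumes, purely for illustration, that each respondent fell into exactly one group, so that the remainder saw their age as neither a problem nor an advantage; the survey as described does not confirm this, and the headline wording is invented.

```python
# Figures from the example above; the "neither" group is an assumption
# (it presumes every respondent falls into exactly one group).
survey = {"age was a problem": 20, "age was an advantage": 15}
survey["neither"] = 100 - sum(survey.values())


def headline(frame):
    """Build a headline from the same data under two different framings."""
    problem = survey["age was a problem"]
    advantage = survey["age was an advantage"]
    if frame == "alarm":
        return f"{problem}% say age held back their promotion"
    return (f"{100 - problem}% report no age barrier to promotion; "
            f"{advantage}% call age an advantage")


alarm = headline("alarm")
reassure = headline("reassure")
```

Both strings are accurate summaries of the same numbers; the choice of which percentage to lead with does the persuading.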

Informed intuition

The researcher should use their “informed intuition” as a “defence against spin and error”. Researchers should also not shy away from telling their clients about problems with the research, gaps, and areas where further work ought to be undertaken. By showing the client the difficulties inherent in the work, researchers do not make themselves look unprofessional; rather, they demonstrate to the client the value of their skills and why it is worth paying for trained and experienced researchers.

In other words, if you use your own sense of “truthiness” wisely and treat it carefully, it can work to your advantage rather than leading you up the garden path.

New York Public Library and metadata

31st July, 2012 Fran Start a conversation
Estimated reading time 2–2 minutes

I spent a wonderful afternoon at the New York Public Library on July 20th, thanks to Phil Sutton, reference librarian, who was kind enough to talk to me about his work and introduce me to several of his colleagues in the NYPL Labs, website, and local history teams.

As the Library holds such vast and diverse collections, it is not surprising that the metadata work of the Labs team is varied and wide ranging. One project involves rationalising and mapping metadata across collections that use different standards, another involves creating metadata for content strategy and website navigation, while more experimental work includes looking to use Linked Data techniques to open up and cross reference data sets.

What’s on the Menu? is using crowd sourced help to transcribe the Library’s collection of restaurant menus. So far, contributors have transcribed 998,899 dishes from 14,872 menus, and the team are investigating ways of linking the data to enable researchers to make interesting connections. The data is still in a fairly raw form, but is available through an API.
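One simple way raw transcriptions might be linked is to normalise dish names and group the menus on which each dish appears. This is a sketch under stated assumptions, not the Labs team's method: the records below are invented, the record shape (menu id, year, dish) is hypothetical, and the real API is not called.

```python
import unicodedata
from collections import defaultdict

# Hypothetical transcription records: (menu id, year, dish as transcribed).
records = [
    ("menu-1", 1900, "Oyster Stew"),
    ("menu-2", 1912, "oyster  stew "),
    ("menu-2", 1912, "Consommé"),
    ("menu-3", 1935, "Consomme"),
]


def normalise(dish):
    """Fold case, strip accents, and collapse whitespace so that variant
    transcriptions of the same dish share one key."""
    decomposed = unicodedata.normalize("NFKD", dish)
    no_accents = "".join(
        char for char in decomposed if not unicodedata.combining(char)
    )
    return " ".join(no_accents.casefold().split())


def link_dishes(rows):
    """Map each normalised dish name to the (year, menu id) pairs
    on which it appears, in chronological order."""
    index = defaultdict(list)
    for menu_id, year, dish in rows:
        index[normalise(dish)].append((year, menu_id))
    return {dish: sorted(menus) for dish, menus in index.items()}


links = link_dishes(records)
```

With the records linked this way, a researcher could ask when a dish first appears or how long it stayed on menus. Real data would need far more careful matching than this normalisation provides.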

The Labs team are also working on the Library’s numerous directories, with an emphasis on helping genealogists, starting with census data from 1940 in the DirectMe project.

Previous projects have opened up collections of stereographs and maps, as well as content related to musical theatre, theatrical lighting, and the Shelley-Godwin archive.

I friend dead people – Are social media mature enough to cope with bereavement?

13th May, 2012 Fran 3 comments
Estimated reading time 4–7 minutes

This is a very personal post about topics in which I am not an expert, so I welcome comments and suggestions.

When “like” and “lol” don’t help

In February, a young man I had never met died in sad circumstances. He was a friend of a friend and I was supposed to meet him on the day he died. Completely coincidentally, within a fortnight I myself lost a dear friend, someone I had known for over 20 years.

The closeness in timing has thrown out sharp contrasts in the way that these deaths have reverberated around my social media worlds (obviously the real world impacts have been huge, but I am not going to discuss those here).

In many ways, dealing with the death of my own friend on social media has been easier. Being well known to her family and her circle of closest friends has meant that I have felt able to post messages of condolence and remembrance as I instinctively know what is appropriate, and I know that most of the people reading them will know me. It has been strange to see her name pop up as a “friend available on chat” when I know any activity in her account must be one of her family members logging in to maintain the page. Yesterday was her birthday, and the reminders in my calendar and the little birthday gift “event reminder” were bittersweet, but not unwelcome. I think of her and her family often, and do not want to forget.

Just after she died, I received a message through a social media site from someone I had never met or even heard of, who had been a schoolfriend of hers long ago, asking what had happened to our mutual friend, and I felt comfortable in answering. It helped me to talk about her with this stranger. I even flattered myself that I was doing some good, in that they clearly felt awkward about contacting her family directly while I was able to act as an “information resource” meaning the family and closest friends could focus on their own grieving.

I friend dead people

In contrast, how to cope with the loss of an almost-friend on social media has been strange and unnerving. One social media application has tactlessly and repeatedly suggested him as a friend, noting how many friends we had (have?) in common. Somehow I didn’t have the heart to click on “ignore”. I realise now I should have done just that, because I was anguished when I accidentally clicked on “confirm”. I worried that his friends and relatives might see my “friend request” and be distressed by it. Maybe they would never spot the notification, maybe they would assume it was sent at a time before his death – just another reminder of what might have been, maybe they would even be comforted by the continuation of these distant social interactions with almost-strangers. (I immediately emailed the site in question asking them to retrieve my suggestion, but received no reply.)

My uncertainty about the appropriate “social media etiquette” was no doubt increased rather than diminished by our social distance. I do not know his family and friends well enough to mention this casually in passing, to express that this had been a mistake and was not intended to distress, or even to know what sort of people they are and whether this is the sort of thing that might upset them. However, it is exactly this sort of loose “one degree of separation” relationship that online social media foster and this incident struck me as illustrating how inadequate such media are when interactions need to go beyond chirruping about the weather, saying a website is cool, or asking whether or not someone wants to go to a party.

Digital memorials

My friend’s social media pages have slipped into being a form of digital memorial, but this also raises new issues. There have been stories in the press of “trolls” deliberately desecrating memorial pages in an online equivalent of upturning flowers left on a grave or kicking over and spraying graffiti on a headstone (e.g. http://gawker.com/5868503/why-people-troll-dead-kids-on-facebook). The only way to deal with this seems to be to remove the page, which is a shame and in a way seems to mean the bullies have won. It also highlights a strange transition from personal to public. Our graveyards are either public spaces that the authorities monitor and maintain or privately curated grounds. I have previously thought of my social media pages as more like a private garden – people may peer over the wall, but it is essentially “my” space to maintain. People are starting to think more and more about their digital legacies (the British Computer Society recently held an event on this theme).

There are already “digital memorial” companies offering guarantees of “permanent” archiving and access to sites (e.g. Much Loved). Other sites offer memorial pages that allow people to make donations to charity, but presumably these are not expected to remain in place forever.

However, these sites are aimed at those who remain setting up the sites, not taking over the sites that belonged to their loved ones. The value of someone’s posts and pages changes dramatically when they become precious memories, and not just ephemeral chatter. If we (or our loved ones) want our own sites to go on after us, do we need to bequeath our passwords to trusted friends or family? How does that affect our contracts with hosts and service providers? What rights do families have to “reclaim” the pages and content if there is no such bequest? How would disputes over inheritance of such sites be decided? What recourse do we have if the site owner decides to shut down and delete the content or simply loses it?

It seems to me that such issues have the potential to cause far more distress than the strangenesses we encounter when automated reminders and friend suggestions behave as if we are all immortal.

Change, technology, understanding, and the information professions

2nd April, 2012 Fran Start a conversation
Estimated reading time 3–4 minutes

Not being a morning person, I was unsure whether a networking breakfast would suit me, but the recruitment agent Sue Hill’s event offered good food and interesting conversation, so I thought I would give it a try. I wasn’t disappointed – the food was excellent and the big round tables promoted lively group discussion.

We were a mix of information professionals from public and private sector, at different stages of our careers, but three key themes prompted the most debate.

Change management

Managing technology change and bridging the cultural and political divisions within organisations in order to bring about change were key concerns. Information professionals can contribute by explaining how new technologies work, how technologies can be catalysts of changes in behaviour, and how they mitigate or increase informational and archival risks. Even simply letting people know new technology is out there can be hugely valuable. Knowledge and information workers can help manage change on political and cultural levels by understanding the corporate culture they are working in and helping their organisation to understand itself and so make good decisions about systems procurement. Information professionals can also often help to break down cultural barriers to sharing information, for example.

Social media

Social media are now being used to differing degrees within organisations – some having embraced the technologies wholeheartedly, others seeing them as a problem or a threat. There was a general concern that technology is being adopted and used faster than we can understand its impacts and devise strategies for mitigating any risks.

Personal and cultural understanding of the divisions between the public and the private seemed to be a problematic area. Young people in particular were perceived as being vulnerable to “over exposure” as they seemed not to notice that postings about them – pictures especially – would remain available for decades to come and could compromise them in their future careers. Recruitment agents use social media to find out about potential job candidates, and notice inconsistencies between a very professional image presented in a CV or at interview and a Twitter feed that paints a picture of carelessness, foolishness, or irresponsibility.

Information literacy

Awareness of how to use and abuse social media, search engines and research tools, and data and statistics was seen as an arena in which information professionals can offer advice and mentoring, not only to young people but also to organisations. Information professionals should also set good examples of how to use social media tools, adopt new working practices, and evaluate new technologies. They should also be able to explain how search engines work, what the pitfalls of poorly planned or too narrow research strategies are, and how to research in a more efficient and effective manner.

A new area that information professionals also need to understand is data analytics and how statistics and algorithmic data mining can be used or abused. Information professionals need not be advanced mathematicians to contribute in this area – an understanding of how to interpret data, the political and cultural issues that can bias interpretations, how to frame questions to get mathematically and statistically significant results, and how to understand the importance of outliers and statistical anomalies are skills that are becoming more important every day.

Overall, I thoroughly enjoyed being woken up by such thoughtful and interesting breakfast companions and went about the rest of the day with a head full of fresh ideas.

Category Archives: culture