Serco Artemis Digital – Realising the Value of Archives and Rehabilitating Prisoners

Bruce Hellman from Serco described the work they have been doing to employ prisoners as cataloguers and transcribers. The work varied from project to project, but included adding metadata and typing up handwritten archival documents that were not suitable for OCR capture techniques, and it was very popular with prisoners.

Bruce argued that it gave them a chance to develop skills that would be useful in the workplace on their release, and allowed organisations to get work done more cheaply than by paying standard market rates.

How Metadata and Semantic Technologies will Revolutionise your Workflow

John O’Donovan of the Press Association gave an entertaining presentation about using semantic technologies to index or re-index and publish to the web content from a range of systems, including legacy systems and external feeds. He pointed out – with a series of amusing ambiguities and unintentional innuendos – that simple text search lacks context, and that newspaper headlines often contain jokes, ambiguous terms, and terms that quickly become obsolete. So, metadata is vital in assembling assets that are about the same topic.

He stressed the importance of keeping your metadata management separate from your content management, so that metadata can be changed without having to re-index assets. (An exception is rights and other non-subjective metadata that needs to be embedded in the asset for further tracking. This is not a major concern to the Press Association as they do not track assets once they are published onto the web. I wasn’t sure what would happen if you decided you wanted to repurpose your content and so needed a new set of metadata, nor how you link content and metadata, or how you manage the two within their separate stores.)
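One way to picture the separation is two stores that share nothing but an asset identifier. The sketch below is purely illustrative and is not the Press Association's implementation: the store names, the asset ID, the topics, and the functions `retag` and `find_assets` are all assumptions made for the example.

```python
# Hypothetical sketch: content and metadata live in separate stores and are
# linked only by a shared asset ID. None of these names come from the PA system.

content_store = {
    "asset-001": "Placeholder text of a news story",
}

metadata_store = {
    "asset-001": {"topics": {"economy"}},
}


def retag(asset_id, topics):
    """Replace the topics for an asset without touching the content store."""
    metadata_store[asset_id] = {"topics": set(topics)}


def find_assets(topic):
    """Return the sorted IDs of assets currently tagged with the given topic."""
    return sorted(
        asset_id
        for asset_id, meta in metadata_store.items()
        if topic in meta["topics"]
    )


# Changing the metadata leaves the stored content exactly as it was.
before = content_store["asset-001"]
retag("asset-001", ["economy", "banking"])
```

Under these assumptions, repurposing content would mean writing a new set of metadata against the same IDs, while the content itself stays where it is.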

The PA are using MarkLogic as the content repository and a BigOWLIM triplestore to handle the associated metadata. Content is fed into the content store, then out again to a suite of indexing technologies, including concept extraction and other text-processing systems, as well as facial recognition software, to create semantic metadata. Simple ontologies are used to model the content, mainly indexing people, places, and events – themes chosen as covering the most popular search terms entered by users of the website.
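As a rough illustration of how a triplestore holds this kind of metadata, each statement can be treated as a (subject, predicate, object) triple that links an asset to a person, place, or event. The sketch below uses a plain Python list rather than BigOWLIM or any real triplestore API, and every identifier in it (the asset IDs, the predicate names, the entity names, and the functions `match` and `collection_for`) is invented for the example.

```python
# Hypothetical triples of the form (subject, predicate, object).
# The predicates and entity names are invented, not taken from a PA ontology.
triples = [
    ("asset-001", "mentionsPerson", "person/alice-example"),
    ("asset-001", "mentionsPlace", "place/london"),
    ("asset-002", "mentionsPlace", "place/london"),
    ("asset-003", "mentionsEvent", "event/example-election"),
]


def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def collection_for(entity):
    """Return the sorted IDs of assets linked to an entity by any predicate."""
    return sorted({subject for subject, _, _ in match(o=entity)})


london_assets = collection_for("place/london")
```

Here a collection of associated content is assembled by asking which assets point at the same entity, without inspecting the text of the assets themselves.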

John argued that such gathering and indexing of assets in order to automatically create and publish collections of associated content was simpler and easier than ingesting diverse content and metadata into traditional search, content management, and online publishing systems.

DAM for Content Marketing, Curation, and Knowledge Organisation

Mark Davey of the DAM Foundation took us on an animated and musical tour of different perspectives on metadata, engagement, social media, and how different the “digital natives” – young people who have grown up with digital technologies – will be to previous generations. Kids of the future will be able to have an idea in the morning, go to an online website app and create their site, their brand, and their marketing strategy in the afternoon, and be engaging with their potential clients by the evening.

Mark pointed out that people have moved on from the initial narcissism of social media and self-publishing and now want compelling stories they can engage with. He pointed out that as semantic technologies advance, we are caught in a feedback loop with them – we are the ontology that is driving the machines – and so we should be aware and vigilant. As the technologies become more powerful and all-pervasive, we may lose sight of whether they are working to serve us or whether we are simply serving up information about ourselves to them.

Marketing will have to become more sophisticated. Amongst the many statistics he quoted, I noted that 84% of 25 to 34-year-olds have left a favourite website because of ads. At the same time, our networks are becoming more interconnected. In a “six degrees of separation” game, we discovered that three people in the audience had met the Dalai Lama, and we are linking to more and more people through social media sites every day.

The metaphor of information as water is a familiar one, especially in the knowledge management area, but Mark’s colleague Dave pointed out how appropriate it is when talking about a DAM/dam. The DAM system forms the reservoir of content.

(I couldn’t help comparing and contrasting the ever-changing semantic seas of information at the Press Association with the more manageable streams of content that flow within smaller organisations, and how very different approaches are needed for such different contexts. The other day I saw the metaphor used again, in an interview with – apparently – one of the LulzSec hackers who talked about their pirate boat and “copywrong” as an enemy of the seas.)

Black Holes and Revelations: DAM and a museum collection

As if to continue the water metaphor, the next speaker was Douglas McCarthy from the National Maritime Museum. However, he took the metaphor up a stage, to space ships and black holes, describing the museum’s 100,000 uncatalogued image files as content assets hidden in black holes.

Now that the images have been catalogued and the DAM system improved, the Museum’s Picture Library is showing a healthy profit. Many sales come from the “long tail” of images that no-one anticipated anyone would want. Rather than saturating the market, putting the images online has been stimulating demand, with customers calling for more collections to be made available.