Category Archives: culture

Get your Instant News in The Daily Snap

Estimated reading time 3–5 minutes

I haven’t been writing much recently, because I have been very busy getting a couple of startup projects off the ground. Building tech from scratch is very different from what I have usually done during my career, which is working with large and mature systems.

Early days

It has been a real adventure and we have had lots of twists and turns along the way. We started off rising to a challenge set by the Autorité des marchés financiers (AMF) – the financial regulator of Quebec. They sponsored the FinTech Formathon and gave us a 3-month salaried runway to get the project off the ground at Concordia University’s District 3. The main issue their analysts and researchers have with existing news services is that they don’t do enough to avoid serving up the same story – albeit often in a slightly different form – over and over again. For a busy researcher, reading the same thing twice is very annoying – a problem that “churnalism” or “copy paste” journalism has exacerbated. Our first prototype for the AMF was a news search engine that clustered similar articles together, to help researchers identify when they already knew enough about a story.
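To give a flavour of what clustering similar articles can mean in practice, here is a minimal Python sketch. It is not the prototype's actual method: it assumes articles are plain strings, compares them by the overlap of their three-word sequences (Jaccard similarity), and uses a made-up threshold of 0.5. The function names are hypothetical.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in a text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        # Very short texts are treated as a single shingle (or none at all).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_articles(articles, threshold=0.5):
    """Greedy clustering: each article joins the first cluster whose first
    member is at least `threshold` similar, otherwise it starts a new cluster.
    Returns a list of clusters, each a list of article indices."""
    signatures = [shingles(article) for article in articles]
    clusters = []
    for index, signature in enumerate(signatures):
        for cluster in clusters:
            if jaccard(signature, signatures[cluster[0]]) >= threshold:
                cluster.append(index)
                break
        else:
            clusters.append([index])
    return clusters
```

A researcher-facing interface could then show one representative per cluster and fold the near-duplicates underneath it. A real system would need something more robust than word overlap to catch rewritten copy.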

We went on to develop a more sophisticated prototype, with a more interesting UI, and we are now building out our MVP which will use some machine learning techniques to improve relevancy and article similarity detection. We will work closely with the AMF to curate sources particularly relevant to them.

News services for businesses

We have also built a news search product that can be tailored to any subject area and is quick and easy to use. This will appeal particularly to organizations who want a lightweight straightforward way to keep up with news trends and hot topics.

Instant news – The Daily Snap

We then realised that ordinary people want to keep up with the news and are just as frustrated as the professional researchers. People want to know what is going on without spending too much time reading, but the level of trust in social media to provide quality news has plummeted. The problem with social media as a news source is twofold – on the one hand, free-to-use services need advertising revenue, and so what you see is ultimately what the “advertisers” want you to see. (In old media days, advertisers were usually large retailers and corporations because TV, radio, and print media buying was a convoluted process. Now anyone from anywhere in the world can pay a social media company to promote anything at all – even if they are Macedonian teenagers).

The second problem is that social media is a huge time suck. You might just want to glance at the headlines, but once you open up your social media app, it is almost impossible not to spend longer wandering around than you intended. No one wants to be left out when everyone else is talking about a hot news story, but no one wants to lose hours of their life to trivia either. This is why we created The Daily Snap. It is “instant news” – five headlines in your email inbox, so you can keep up with what’s going on in as little time as possible.

The Daily Snap will help us understand how people interact with their daily news and will help us develop our main product – a personalized, ad free, data secure, privacy respecting, high quality news service.

It has been a lot of fun diving into dataset classification for machine learning. My taxonomy skills have certainly proved extremely useful in helping us categorize articles and I will write more about the semantic aspects of our technology and our fantastic team in a future post.

Truba.News logo

AI – a real revolution, or just more toys for the boys?

Estimated reading time 4–7 minutes

AI and ethics are hot topics again, after having been dormant for a while. The dream of creating intelligent androids to serve us runs deep – there are countless examples in mythology from the metal servants of Hephaestus to Victorian automata – but the fear of our creations gaining consciousness and turning against us runs deep too. Modern slavery and exploitation of our fellow humans show that the urge to command and control is as old as humanity, and the ability of the powerful to deny the very consciousness of the exploited is only fading gradually. Women, children, and slaves have been designated as ‘not fully human’ for most of history, and it seems there are still plenty of people around who seem to rather like it that way.

Will robots steal all the jobs?

One issue of current concern is job losses due to automation – another age-old topic. However, there is a deep irony at the heart of the issue – the more of our ‘human’ skills that can be replaced and even improved by the use of machines, the more we are forced to face the idea that our essential humanity resides in our empathy, compassion, and ability to love and treat each other with kindness. At the same time, it may turn out that emotional labour is the most difficult to automate.

Caring for the sick, the elderly, and children are the tasks that currently command the least pay – women are often expected to perform this labour not just for no pay, but actually at a cost to themselves. (Anyone who denies the existence of a ‘gender pay gap’ claims that women ‘choose’ to damage their career chances by being foolish enough to spend time ‘caring’ instead of ‘earning’, or by entering the ‘caring’ professions rather than the ‘lucrative’ ones.) Meanwhile, stockbrokers are rapidly being replaced by algorithmic traders, and lawyers, accountants, and similar highly valued ‘analytical’ workers may find large parts of their jobs are actually very easy to automate.

Calls for a universal basic income are an attempt to bridge increasing social inequality and division. If the much hyped 4th industrial revolution is truly going to be revolutionary, it needs to do something other than build tools that keep channelling money into the pockets of the already rich and powerful; it needs to make us think about what we value in ourselves and our fellow humans and reward those values.

Objectification and control

In practice, we are probably many years away from self-aware androids, but thinking about them is beneficial if it leads us to think about how we currently exploit our – obviously conscious, intelligent, and sentient – fellow human beings and animals. The granting of citizenship to an unveiled, but otherwise unthreatening, female robot in Saudi Arabia raises many issues and people have already started asking why the female robot appears to have more rights than the Kingdom’s flesh and blood women. I can’t help wondering if the lifting of the ban on Saudi women drivers is a response to the advent of driverless cars. The topic of the potential social consequences of sex robots is too vast and complex to go into here, but whose fantasies are these robots being designed to fulfil? Would anyone buy a robot that requires its full and informed consent to be obtained before it works?

Check your attitudes

Back in the 90s, the Internet was hyped as leading the way to a new utopia where racism and sexism would vanish as we communicated in the digital rather than physical realm. We failed to stop the Internet becoming a place where commercial exploitation, social abuse, and downright theft thrived, because we assumed the technology would somehow transcend our psychology and personal politics. Already AI systems are showing they default to reflecting the worst of us – GIGO now includes bad attitudes as well as bad data – and we have to make deliberate efforts to counter this tendency. Commercial organizations continue to produce racially insensitive or otherwise lazy and stereotypical advertising campaigns even in this day and age, so it seems unlikely that they can be trusted to be socially responsible when it comes to biases in datasets.

A true revolution

A true 4th industrial revolution would be one which places a premium on the best of our human values – caring, empathy, kindness, sharing, patience, love. If these become more valuable, more highly prized, more lucrative than the values of profit for the sake of profit, domination, objectification, exploitation, division, command, and control, then we will have moved towards a better world. At the moment, we are still largely building tools to enhance the profits of the already wealthy, with all the continuation of existing social problems that implies. The companies benefiting the most from advances in AI are the ones that can already afford them.

If this ‘4th industrial’ change leads us to a world in which social injustices diminish and the people who care – for each other, for the young, the old, the sick – become the most highly prized, respected, and rewarded in society, only then will it merit the title ‘revolution’.

Image: The Compassion Machine by Jonathan Belisle from the Ensemble Collective.

Data as a liquid asset and the AI future

Estimated reading time 5–8 minutes

Getting back into the swing of meetups again, last night I went to the MTLData meetup – a group of data scientists and enthusiasts who are looking to raise the profile of data science in Montreal. The event featured a panel discussion on the topic of ‘Build vs Buy?’ when considering software for data solutions.

The panellists were Marc-Antoine Ross, Director of Data Engineering at Intel Security, Maxime Leroux, consulting data scientist at Keyrus Canada, and Jeremy Barnes, Chief Architect at Element AI. The chair was Vaughan DiMarco of Vonalytics.

Data as liquid

The issues were very familiar to me from considering EDRM and DAM systems, which made me think about the way data has changed as an asset, and how management and security of data now has to include the ‘liquid’ nature of data as an asset. This adds another layer of complexity. Data still needs to be archived as a ‘record’ for many reasons (regulatory compliance, business continuity, archival value…) but for a data-driven organisation, the days of rolling back to ‘yesterday’s version of the database’ seem like ancient history. Data assets are also complex in that they are subject to many levels of continuous processing, so the software that manages the processing also has to be robust.

The metaphor of data flowing around the organisation like water seems especially telling. If there is a system failure, you can’t necessarily just turn off the tap of data, and so your contingency plans need to include some kind of ‘emergency reservoir’ so that data that can’t be processed immediately does not get lost and the flow can be re-established easily.
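The ‘emergency reservoir’ idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the class name `Reservoir` and the `process` callback are hypothetical, the backlog lives in memory, and a failure is signalled by the processing step raising an exception. A real contingency plan would need durable storage behind it.

```python
from collections import deque


class Reservoir:
    """Hold incoming records while the downstream processor is failing,
    then release them in arrival order once it recovers."""

    def __init__(self, process):
        self.process = process   # callable that handles one record
        self.backlog = deque()   # records waiting to be processed

    def submit(self, record):
        """Accept a record and try to drain the backlog.
        Returns how many records were processed by this call."""
        self.backlog.append(record)
        return self.drain()

    def drain(self):
        """Process waiting records oldest first, stopping at the first failure."""
        processed = 0
        while self.backlog:
            try:
                self.process(self.backlog[0])
            except Exception:
                break  # downstream is still failing; keep the record for later
            self.backlog.popleft()
            processed += 1
        return processed
```

The point of the design is that the tap never has to be turned off: `submit` always accepts the record, and the flow re-establishes itself the next time processing succeeds.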

Build vs Buy?

The issues highlighted by the panel included costs – available budget, restrictions from finance departments, balance between in-house and outsourced spending (again all familiar in EDRM and DAM procurement), privacy, security, ability to maintain a system, and availability of skills. Essentially, the decision comes down to balancing risks, which will be unique to each team and each business. In terms of deciding whether to build something in-house, availability of in-house resource is an obvious consideration, but Marc-Antoine stressed the importance of thinking through what added value a bespoke build could offer, as opposed to other ways the team could be spending their time. For example, if there are no off-the-shelf or open source products that match requirements, if there is value in owning the IP of a new product, if risks can be kept low, and resources are available, a build might be worthwhile.

There are risks associated with all three of the main options – a big vendor is less likely to go bust, but sometimes they can be acquired, sometimes they can stop supporting a product or particular features, and they can be very costly. Open source has the advantage of being free, but relies on ad hoc communities to maintain and update the code base, and how vibrant and responsive each specific community is, or will remain, can vary. Open source can be a good option for low risk projects – such as proof-of-concept, or for risk tolerant startups with plenty of in-house expertise to handle the open source code themselves.

AI future

The conversation diverged into a discussion of the future of AI, which everyone seemed to agree was going to become a standard tool for most businesses eventually. Jeremy noted that AI at the moment is being sought after for its scarcity value, to give early adopters an edge over the competition, while Maxime suggested that early advantage is likely to fade, just as it has with data science. Data analysis is now so ubiquitous, even small businesses are involved to a certain extent. Jeremy pointed out that it is hard to maintain a competitive edge based on the scarcity of data itself, as data can so easily be copied and distributed, but knowing how to make intelligent use of the data is a scarce commodity. Making connections and managing data in a very tailored specific way could even be a way for organisations to compete with Google, who have more data than anyone else, but are not necessarily able to answer all questions or have the most useful insights into specific problems.

The value of meaning

I was intrigued by this, as it validates the role of semantics – data without meaning is useless – and the importance of the imaginative and creative leaps that humans can make, as well as the moral and social reasoning that humans can bring. With reports of early AI systems reflecting existing biases and prejudices, and with disasters like the SimSimi chatbot causing social problems such as bullying amongst youngsters, the need for a real human heart to accompany artificial intelligence seems ever more important.

Scarcity of understanding?

Someone asked if the panel thought companies would soon need ‘Chief Intelligence Officers’ in the way that many now have ‘Chief Data Officers’. The panel did not seem particularly enthusiastic about the idea (“it sounds like something that you do with founders when you put them out to pasture”) but I think it would be a fascinating role. The BBC had someone to oversee ethics and advise on editorial ethics issues. Perhaps it is in the skills of a Chief Intelligence Officer – someone who can combine an understanding of how data, information, knowledge and wisdom interact, whether within IT systems or beyond, with an understanding of social issues and problems – that the scarcity value lies. Insight, imagination, and compassion could be the skills that will give the competitive edge. In the AI future, could a Chief Intelligence Officer make the difference between a company that succeeds by asking the right questions, not just of its data or its customers, but of itself, and one that fails?

Fake news and virtual reality – You can lie to my face, you can’t lie to my heart, can you?

Estimated reading time 6–10 minutes

On Monday I went to an in_collusion event which showcased two companies producing works in Virtual Reality, one focused on its use in marketing (Fusion Works), the other on using VR to make art (Marshmallow Laser Feast). The presentations were excellent and the demonstrations of the technology were a lot of fun, but I left feeling equally thrilled and terrified.

In a world where people seem increasingly unable to tell the difference between fact, opinion, belief, biased news, erroneous news, propaganda, and downright lies in the form of flat websites and screens, how easy will it be to manipulate people through VR?

Fake news in the olden days

The fake news debate is not new. Back in the 80s and 90s, we had biased media that worked to its own agenda, and information professionals worried about objectivity in terms of finding reliable sources, taking a neutral standpoint, and understanding statistics. Politics has always been a propaganda game, and those who were interested in objectivity attempted to achieve balance by presenting “both sides of the story”. The BBC was well funded and was required to be “unbiased” in that it could not be seen to be promoting one political party more than another. However, it could only be unbiased within the range of mainstream political viewpoints, and that was based on the assumption that views that stood outside the political mainstream did not need to be represented. In order to be part of the social consensus, a viewpoint needed some kind of representation within the existing political framework. I don’t remember much coverage of the Monster Raving Loony Party, but they were included enough that I have heard of them.

Mainstream media biases were reasonably transparent – The Telegraph and The Times were right-wing, The Mirror was left-wing. The Labour Party stood on the left, the Tories on the right, the Liberals of various forms stood in the middle. Arguments over whether the BBC was biased focused on whether one party was getting more airtime than another. It was up to the opposition to provide counter-arguments to the party in power, and “neutrality” was achieved adversarially. However, it was also pretty clear that “establishment” media did not venture far to the left or right of these parties. These were the parties that were getting the majority of votes, and so the debate about bias largely operated within this range.

There was little coverage of non-establishment viewpoints, and those publications were easily distinguished from the mainstream, largely because you could tell they didn’t have much money. Anarchist or neo-nazi or new age or other “fringe group” newsletters were obviously photocopied. Special interest news published by charities might have higher production standards, but tended to be associated directly with the group funding the publication and clearly branded because they wanted everyone to know who they were – Greenpeace for environmentalist news, for example, or Amnesty International for human rights coverage. In other words, you only had to look at the publication to be able to tell where its biases were likely to be.

All that glitters on line is gold?

Those affordances started to disappear with the advent of the web, and its “democratization” of the publishing process. I remember discussions about what we could do as Information Professionals to help people tell the difference between “mainstream” and “alternative” websites. At first, it did not seem urgent, because “mainstream” organizations with a lot of money were able to build slick, well designed websites, which looked and worked differently to websites that had been hand-crafted by individuals.

With the rise of high quality blogging software and falling costs of production technology, that gap has closed, and those differences are far more subtle. “Established” old media, such as local papers, have seen their budgets shrink, while technology has become cheaper, so anyone wanting to build a website from scratch with a limited budget can now produce a site that looks pretty much the same as an “established” one, even without the benefit of a wealthy sponsor.

So, now we have satire, websites of “old media” outlets, and new sites that all look almost the same. That’s the equivalent of your local anarchist collective being able to produce a newspaper that looks like Time magazine, and the National Enquirer looking much like The Economist. No wonder people are confused!

It’s how it feels that counts

On top of that, accuracy and fact-checking are both time-consuming and expensive. Social media demands speed above all else. Today’s news becoming tomorrow’s chip wrappers seems like aeons in a world of flickering feeds. In the past, rushing for a scoop often led to inaccuracy, but people would at least browse through an entire paper, allowing time for more reflective articles and analysis.

We have also seen a rise in “emotional” reporting. A reasonable desire to allow people to connect emotionally with what was happening to others around the world became the mainstay of 24-hour reporting. When you don’t have time to reflect, you can still grab attention by talking about people’s feelings. Watching “experts” provide reasoned analysis after an event isn’t as thrilling as watching people screaming or crying.

On line, every article has to be as appealing as every other. The “stickiness” of sites that was much discussed in the 90s and 00s was essentially an attempt to replicate the behaviour of someone who would buy a paper because of the headline, and then slowly browse the rest of the articles – which was where you tended to find the more slowly produced, reflective content. Now every article has to shout for attention, and we have a clickbait world, where screaming sensationalism and inaccuracy don’t matter, because it’s the volume of clicks that generates the revenue.

This is how we got here, now where do we go?

As Information Professionals we understand and have long debated the issues, we can understand how we arrived at this point, and we know how to verify and validate.

We must educate people to look behind the headlines to the source of the information, to understand who funded the site, and why, who the author was and what biases they are likely to have, how to unpick loaded language and selection bias, and how to understand statistics and the way they can be manipulated. We must encourage reflection, comparison, and understanding of context and intent. Fake news, propaganda, and bias are familiar concepts to us, and yet we still have people claiming to have voted for Brexit because they think the EU banned the re-use of teabags.

We have not done well at creating a reflective, media literate, statistically savvy general population, and that is just for flat websites and screens.

Virtual Reality is all engrossing, more absorbing than reading, more immersive than cinema, more immediate than television.

Whose reality is it anyway?

At the moment, the technology is in its infancy, and it is still “obvious” that it is not “IRL” – just as the sources of print media in the 80s and the early web of the 90s were “obvious”.

If we haven’t managed to create a media literate population able to distinguish honest reporting and unintentional mistakes from clickbait and propaganda, how are we going to create a VR-literate population able to distinguish the emotional impact of a “virtual” experience from a “real” one? People already tend to trust what they “feel” is right, and “facts” are not persuading them otherwise, as they would rather “trust their own experience”. How are we going to ensure political propaganda doesn’t become so emotionally absorbing that “post truth” is not just about distinguishing fact from fiction, but distinguishing “our own” experiences from virtual ones? Is it enough to know the context and intent of the makers of our virtual experiences? Do we need to teach people to develop “experiential literacy”, and do we need to start developing that very, very soon?

Image: Credit: Sean Goldthorpe. Renowned dancer Aakash Odedra, choreographer Lewis Major and the Ars Electronica Futurelab, staged at the International Dance Festival Birmingham, 2014.

Inadvertent Cruelty – Algorithmic or Organizational?

Estimated reading time 3–4 minutes

In 2013 I asked whether social media were mature enough to handle bereavement in a sensitive manner. Last week Facebook released the options either to have your account deleted when you die or to nominate a trusted legacy manager to take it on for you as a memorial (Facebook rolls out feature for users when they die).

This was in response to the distress of relatives who wished to retrieve a lost loved one’s account or did not want to undergo the eerie experience of receiving automated reminders of their birthday or seeing their name or image appear unexpectedly in advertising. The enforced “Year in Review” offerings at the end of last year brought some publicity to the issue, as they also inadvertently caused distress by failing to consider the feelings of people who had suffered bereavements during the year. The original blog post about this (Inadvertent Algorithmic Cruelty) went viral last Christmas. The author quickly called for an end to a wave of casual responses that jumped to glib conclusions about young privileged staff just not thinking about anything bad ever happening (Well, That Escalated Quickly).

A more cynical response is that the minority of people who would not want to have Year in Review posts were deliberately dismissed as ‘edge cases’ – possibly even as a coldly calculated costs v. benefits decision, as providing “opt out” options might have required additional work or been seen as dispensable mouse clicks.

I have no idea what happened at Facebook, or what discussions, processes, and procedures they go through; the public apologies from Facebook do not go into that level of detail. However, “algorithmic cruelty” may be unintentional, but it is not a new phenomenon, and there are plenty of opportunities during the design and implementation of any project to think through the potential adverse impacts or pitfalls.

David Crystal at an ISKOUK conference in 2009 talked about the problem of avoiding inappropriate automated search engine placement of advertisements, for example ads for a set of kitchen knives alongside a story about a fatal stabbing. There was a certain naivety with early automated systems, but it did not take long for the industry in general to realise that unfortunate juxtapositions are not unusual incidents. Most people who have worked in semantics have plenty of anecdotes of either cringeworthy or hilarious mismatches and errors arising from algorithmic insensitivity to linguistic ambiguity.

Facebook’s latest thoughtlessness arises more from a failure to respect their users than through lack of sophistication in their algorithm (there doesn’t seem to be anything particularly complex about selecting photos and bunging some automated captions on them). Simply offering users the choice to look or not look or giving users the tools to build their own would have spared much heartache.

The origins of UX championed by people such as Don Norman and Peter Morville and Louis Rosenfeld placed user needs front and centre. Good design was about seeing your users as real people with physical and emotional needs as human beings, and designing to help their lives go more smoothly, rather than designing to exploit them as much as possible.

We don’t know where you live…

Estimated reading time 2–4 minutes

Earlier this month the Estonian government opened its “digital borders”, allowing registrations for “e-citizenship” (there are two interesting pieces about this in New Scientist: E-citizens unite: Estonia opens its digital borders and Estonia’s e-citizen test is a test for us all).

Does the new form of citizenship mean the end of the nation state?

The Estonians appear to be creating a new category of “nationality” and the move has prompted a flurry of debate on whether or not this heralds the end of the nation state. “Nationality” has always been a problematic and somewhat fluid concept. Although people are often emotional about their nationality, in practice it is largely an artificial administrative device. Birthplace, at least in developed countries, tends to be well known and is often formally and officially recorded, so has been relatively administratively straightforward. The Estonian move is interesting because it takes away the requirement of some kind of physical presence for citizenship, so gaining a second “e-nationality” is far simpler than going to live somewhere else, making it a very attractive option.

A new category of citizen

The new category of “e-citizens” will not have the same rights (or, presumably, responsibilities) as “traditional” citizens – immediately adding a layer of complexity to information management around citizenship. According to The Economist, Estonia’s chief information officer, Taavi Kotka, has stressed that the [new form of] ID is a privilege, not a right. E-citizens can have their e-citizenship removed if they break the law, for example.

The creation of a new category of citizenship in itself should not threaten the nation state. There are six different types of British nationality, for example, as a consequence of the UK’s colonial past.

One question that may arise is how many e-citizenships a single individual can hold at once. How much will it cost Estonia to police and manage their new citizens, and will all countries want to or be able to offer such services? Any outsourcing of e-citizenship and identity management services to technology companies will have huge security and surveillance implications.

A citizenship marketplace?

A “marketplace” for e-citizenships may arise, with countries and even cities, or perhaps even other administrative entities, competing to offer the best services or biggest tax breaks to attract wealthy e-citizens. Revenues will likely flow into places like Estonia and away from wherever the new e-citizens live. The location of your e-citizenship could become more important than the place you were born, even if you still reside there. How your e-citizenship (where taxes will be paid) and your place of residence (where services like roads and schools need to be provided) interact could become highly politicized. The immediate challenge to existing nation states is how they will decide to co-operate with each other over their e-tax revenues.

Adventures in Semantic Theatre

Estimated reading time 5–8 minutes

I have been investigating the idea of using semantic techniques and technologies to enhance plays, along with the Montreal Semantic Web meetup group. There have been far fewer Semantic Web projects for the humanities than the sciences and even fewer that have examined the literary aspects of the theatre. Linked Open Data sets associated with the theatre are mostly bibliographic, library catalogue metadata, which treat plays from the point of view of simple objective properties of the artefact of a play, not its content: a play has an author, a publisher, a publication date, etc. Sometimes a nod towards the content is made by including genre, and there has been work on markup of scripts from a structural perspective – acts, characters, etc. There are obvious and sound reasons for these kinds of approaches, meeting bibliographic and structural use cases (e.g. “give me all the plays written by French authors between 1850 and 1890”; “give me the act, scene, and line references for all the speeches over ten lines long by a particular character”; “give me all the scenes in which more than three characters appear on stage at once”).
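To make the structural use cases concrete, here is a small Python sketch of how two of the example queries could be answered once a script has been marked up. The data model (`Scene`, `Speech`) and the sample play are invented for illustration and do not come from any real markup scheme or from the plays discussed here.

```python
from dataclasses import dataclass, field


@dataclass
class Speech:
    character: str
    first_line: int
    last_line: int


@dataclass
class Scene:
    act: int
    scene: int
    on_stage: set                      # names of characters present
    speeches: list = field(default_factory=list)


def crowded_scenes(play, more_than=3):
    """Scenes in which more than `more_than` characters are on stage at once."""
    return [(s.act, s.scene) for s in play if len(s.on_stage) > more_than]


def long_speeches(play, character, over=10):
    """Act, scene, and line references for a character's speeches longer
    than `over` lines."""
    return [
        (s.act, s.scene, sp.first_line, sp.last_line)
        for s in play
        for sp in s.speeches
        if sp.character == character and sp.last_line - sp.first_line + 1 > over
    ]


# Invented sample data, purely to exercise the queries.
sample_play = [
    Scene(1, 1, {"Captain", "Moon"},
          [Speech("Moon", 1, 14), Speech("Captain", 15, 18)]),
    Scene(1, 2, {"Captain", "Moon", "Sailor", "Cook"},
          [Speech("Moon", 3, 8)]),
    Scene(2, 1, {"Moon"}, [Speech("Moon", 1, 30)]),
]

print(crowded_scenes(sample_play))
print(long_speeches(sample_play, "Moon"))
```

Queries like these only need the structure of the script; nothing in the model says what a speech is about.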

Modelling literary rather than physical connections

Once we started discussing at the meetups how we could model the content itself, especially in a qualitative manner, we quickly became embroiled in questions of whether or not we needed to create entire worldviews for each play and how we could relate things in the play to their real world counterparts.

One of the plays we are working on – Ocean Opera by Alex Gelfand (to be performed at the Montreal Fringe Festival this June) – included the Moon as a character. How and by what relationships could we link the Moon of the play to the Moon in the sky, and then how could we link it to other fictional and literary Moons?

Another play we analysed – Going Back Home by Rachel Jury – was a dramatization based on real people and historical events. It seemed obvious these should be linked to their real counterparts, but would a simple “is a fictional representation of” suffice? How should we relate depictions of historical events in the play to eyewitness accounts from the time or to newspaper reports?

Should we define the world view of each play? Would it matter when defining relationships if there were events in the play that were counterfactual or scientifically impossible?

How could we capture intertextuality and references to other plays? Should there be a differentiation between quotations and overt references by the author to other texts and less explicit allusions and shared cultural influences?

Artistic Use Cases

One of the most appealing aspects of this project to me is that we have no strict commercial or business requirements to meet. A starting point was the idea of a “literary search engine” that ranked relevance not according to information retrieval best practice, but under its own terms as art, or perhaps even defined its own “relevance within the world of the play”. In other words, we would be trying to produce results that were beautiful rather than results that best answered a query.

However, there are also a number of very practical use cases for modelling the literary world of a play, rather than just modelling a play as an object.

Querying within a play

One set of use cases involves navigating within the text by answering queries such as ‘in which scenes do these two characters appear together’. The BBC’s Mythology Engine was designed to help users find their way around a large number of brands, series, and episodes, with characters and events modelled as the central concepts.

An equivalent set of queries for literary aspects would be “how many scenes feature metaphors for anger and ambition” or “which monologues include references to Milton”.
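Both kinds of query can be sketched over the same scene records, assuming each scene has already been annotated by hand with its characters, metaphor themes, and references. The Scene structure, the field names, and the sample data below are hypothetical and are not drawn from any real play or from the project's model.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    label: str
    characters: set = field(default_factory=set)
    metaphors: set = field(default_factory=set)   # hand-assigned themes
    references: set = field(default_factory=set)  # authors or works cited


# Invented sample annotations for illustration only.
scenes = [
    Scene("1.1", {"A", "B"}, {"anger"}, set()),
    Scene("1.2", {"A"}, {"ambition"}, {"Milton"}),
    Scene("2.1", {"A", "B", "C"}, {"anger", "ambition"}, set()),
]


def scenes_with_all(characters, scenes):
    """Structural query: scenes in which all the given characters appear."""
    wanted = set(characters)
    return [s.label for s in scenes if wanted <= s.characters]


def scenes_with_metaphor(theme, scenes):
    """Literary query: scenes annotated with a metaphor for the theme."""
    return [s.label for s in scenes if theme in s.metaphors]


def scenes_referencing(source, scenes):
    """Literary query: scenes annotated with a reference to the source."""
    return [s.label for s in scenes if source in s.references]
```

The point of the sketch is that the literary queries have the same shape as the structural ones; the hard part is producing the annotations, not querying them.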

Querying across many plays

If you extend such use cases across a body of plays, recommendation scenarios become possible. For example, “if you liked this play which frequently references Voltaire and includes nautical metaphors, then you might also like this play…” and there are clear commercial implications for the arts in terms of marketing and promotion, finding new audiences, and even in planning new work.
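One simple way to sketch such a recommendation, assuming each play has been reduced to a set of annotated features, is to rank other plays by how much their feature sets overlap. Jaccard similarity is used here purely as an illustrative choice, and the play titles and feature labels are invented.

```python
def jaccard(a, b):
    """Size of the shared features divided by the size of all features."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical plays and hand-assigned features.
plays = {
    "Play A": {"ref:Voltaire", "metaphor:nautical", "theme:exile"},
    "Play B": {"ref:Voltaire", "metaphor:nautical"},
    "Play C": {"ref:Milton", "metaphor:anger"},
}


def recommend(liked, plays, top_n=3):
    """Rank other plays by feature overlap with the liked play."""
    target = plays[liked]
    scored = [(jaccard(target, feats), title)
              for title, feats in plays.items() if title != liked]
    # Drop plays with nothing in common, then sort by score, then title.
    scored = [(score, title) for score, title in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_n]]
```

A set overlap treats every feature as equally important; weighting rare references more heavily than common ones would be an obvious refinement.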

These kinds of “metaphorical use cases” could also serve as a rich seam for generating interesting user journeys through a literary archive and as a way of promoting serendipitous discovery for students and researchers.

Storyline use cases

A lot of work that has been done at the BBC has been based around the concept of an ‘event’, and the relationship of events to storylines. This is particularly relevant for many practical and creative aspects of writing, compiling, broadcasting, archiving, and re-using content. For example, it is useful to be able to distinguish the name of the journalist from the names of people who are mentioned within the story, and to distinguish between more and less significant people within a story according to whether they are mentioned as part of the main event or only in association with consequent or secondary events.

Literary and metaphorical use cases might take a similar approach but decompose the events in a story in terms of the emotional development of the characters.

Fictional worlds use cases

One of the ideas that I find the most appealing, but that is the hardest to pin down, is the idea of modelling the internal ontological world of a work of fiction. In a fictional ontology, you can have relationships that make no sense in the ‘real’ world, so modelling them cannot rely on the kind of sense-testing and meeting of requirements that we use so much in commercial contexts.

In discussions, some people reacted very strongly against the idea of even attempting to model fictional worlds, which I found fascinating, while others immediately saw the idea as just another aspect of literary creation – an artistic endeavour in its own right.

There is an epistemological tangent in ontological thinking that goes into a debate about realism versus anti-realism that I haven’t fully got to grips with yet.

Where next?

I am at the very early stages of thinking through all this, and not sure where it will go, but am enjoying starting to gather a community of interest. If you would like to know more, I have written in more detail about it all on the project blog:

The value of forgetting

Estimated reading time 3–4 minutes

Two years ago I was thinking a lot about social media and bereavement and I wrote a post (I friend dead people – Are Social Media Mature Enough to Cope with Bereavement?). Today, by strange coincidence, I happened upon this post: AI resurrector lets people Skype their dead relatives. As the post points out, this appears to be an incarnation of an episode of Black Mirror by Charlie Brooker, so apart from worrying about which other of his dystopias people are going to invoke next, I was again prompted to think about forgetting and remembrance as information processes.

In the past, human beings have found it very easy to forget and have struggled to remember. Oral histories and stories preserved by poets and carvings in stone to record conquests and kings were early memorializations and were important precisely because so little was recorded. Pre-Renaissance librarians and archivists were often more concerned with gathering and preserving scant records than with information overload, or even systematic organization of knowledge, simply because the volume of materials they had to work with was limited. As printing technologies developed and more informational records were paper-based, archivists had to balance the urge to preserve with practical considerations such as the costs of space required to store documents. During the 20th century, the massive surge in the volume of paper documents generated meant that we had to start thinking carefully about what we would deliberately forget.

The digital age seemed to suggest that storage would become so cheap and search engines so intelligent that we would be able to save everything and find it again without a worry. Many people seemed to see this as a good thing – archival management becomes a lot easier if you do not bother to select and manage a collection. Professional archivists have pointed out the pitfalls of this attitude, and on a personal level, so far, we have – by and large – used our PCs, cameraphones, scanners, etc., to generate huge unmanageable collections of data without regard for what we want to remember and what we want to forget. The urge to “just keep everything” is strong. Charlie Brooker’s dystopias are valuable in showing us the psychological pressures we will have to deal with in this new world.

Our traditions of marking anniversaries, building memorials, and remembering our past have led us to equate memorializations with respect and love, against a background where most things get forgotten. However, as humans we need to forget pain and grief; we need to “let go” and “move on”, otherwise we cause ourselves psychological problems. We therefore need to be careful with our digital memorializations as extensions of our social networks (for example Facebook video memorials). They may seem like works of love and respect, but there is a danger they will lead people into unhealthy obsession with the past. We have, after all, never before lived in a world where it is harder to forget than it is to remember.

Update: More Google ‘forget’ requests emerge after EU ruling

For Claire – not forgotten.

The Information Master – Louis XIV’s Knowledge Manager

Estimated reading time 4–6 minutes

I recently read The Information Master: Jean-Baptiste Colbert’s Secret State Intelligence System by Jacob Soll. It is a very readable but scholarly book that tells the story of how Colbert used the accumulation of knowledge to build a highly efficient administrative system and to promote his own political career. He seems to have been the first person to seize upon the notion of “evidence-based” politics and the idea that knowledge, information and data collection, and scholarship could be used to serve the interests of statecraft. In this way he is an ancestor of much of the thinking that is commonplace not only in today’s political administrations but also in all organizations that value the collection and management of information. The principle sits at the heart of what we mean by the “knowledge economy”.

The grim librarian

Jean-Baptiste Colbert (1619-83) is depicted as ruthless, determined, fierce, and serious. He was an ambitious man and saw his ability to control and organize information as a way of gaining and then keeping political influence. By first persuading the King that an informed leadership was a strong and efficient leadership, and then by being the person who best understood and knew how to use the libraries and resources he collected, Colbert rose to political prominence. However, his work eventually fell victim to the political machinations of his rivals and after his death his collection was scattered.

Using knowledge to serve the state

Before Colbert, the scholarly academic tradition in France had existed independently from the monarchy, but Colbert brought the world of scholarship into the service of the state, believing that all knowledge – even from the most unlikely of sources – had potential value. This is very much in line with modern thinking about Big Data and how that can be used in the service of corporations. Even the most unlikely of sources might contain useful insights into customer preferences or previously unseen supply chain inefficiencies, for example.

Colbert’s career was caught up with the political machinations of the time. He worked as a kind of accountant for Cardinal Mazarin, but when Mazarin’s library was ransacked by political rivals and his librarian fell out of favour, Colbert restored the library and built a unified information system based on the combination of scholarship and administrative documentation, ending the former division between academia and government bureaucracy.

Importance of metadata

Colbert also instinctively grasped the importance of good metadata, cataloguing, and an accurate network of links and cross references in order to be able to obtain relevant and comprehensive information quickly, issues that are more urgent than ever given the information explosions modern organizations – and indeed nations – face. This enabled him to become a better administrator than his rivals, and by becoming not only the source of politically expedient information but also the person who knew how to use the information resources most effectively, he was able to gain political influence and become a key minister under Louis XIV.

A personal vision

I was struck by how much his vast library, archive, and document management system was the result of his own personal vision, how it was built on the dismantling and rebuilding of work of predecessors, but also how, after his death, the system itself fell victim to political changes and was left to collapse. This pattern is repeated frequently in modern information projects. So often the work of the champion of the original system is wasted as infighting that is often not directly connected to the information project itself leads to budget cuts, staff changes, or other problems that lead to the system decaying.

Soll argues that the loss of Colbert’s system hampered political administration in France for generations. Ironically, it was Colbert’s own archives that enabled successive generations of political rivals to find the documents with which to undermine the power of the crown, showing the double-edged nature of information work. It is often the same collections that can both redeem and condemn.

Secrecy or transparency?

Another theme that ran throughout Colbert’s career, with both political and practical implications, was the tension between demands for transparent government and the desire for a secret state. Much of the distinction between public and private archives was simply a matter of who was in control of them and who had set them up, so the situation in France under the monarchy was different to the situation in England where Parliament and the Monarchy maintained entirely separate information systems. In France, an insistence on keeping government financial records secret eventually undermined trust in the economy. Throughout his career Colbert was involved in debates over which and how much government information should be made public, with different factions arguing over the issue – arguments that are especially resonant today.

On being the only girl in the room

Estimated reading time 3–5 minutes

Perhaps it is because I am settling into a new culture, or perhaps it is because my new time zone has altered the nature of what I see in my Twitter feed, but there seems to have been a spate of articles lately about sexism faced by women working in technology, which makes me very sad. This was on my mind when I received from a former colleague a copy of a report we had co-authored. As I read the list of names, I was struck by how wonderful a group of guys they were, how intelligent, creative, and technically knowledgeable, and what a pleasure it had been to be the only girl in the room. Those guys were utterly supportive, thoughtful, generous of spirit, and full of interest in and encouragement of my contributions.

I am from an editorial background and I don’t really write code, but never once in that group did I experience any kind of tech snobbery. Whenever there was something that I didn’t know about, or unfamiliar acronyms or jargon, someone would provide a clear explanation, without ever being patronising, appearing bored or impatient, or making any assumptions about what anyone “ought” to know. I was never made to feel I had asked a stupid question, said something foolish, or that I did not belong. At the same time, these men were always keen and interested to hear my perspectives, and to learn from my experiences. The group dynamic was one of free and open exchange of ideas and of working collaboratively to find solutions to problems. All contributions were valued and everything was considered jointly and equally authored.

I didn’t remain the only girl in the room. I was learning so much in the meetings that I invited my (now former) colleague to join us, bringing a new set of expertise and skills that were welcomed. I had not a moment of concern about inviting a younger and even less technical female colleague into the group, because I knew she would be made welcome and would have a fantastic opportunity to learn from some brilliant minds.

Of course I have encountered much sexism in my career, but it is not necessary and it is not inevitable. I hate the thought of young women being put off technology as a career because of fears of sexism and discrimination. I know this happens a lot – it happened to me, although I found my way into tech eventually. I do not know whether there is “more” sexism in technology – a charge some of the posts I have read have levelled – than there is anywhere else, but I do know that there is sexism in all industries, so you might as well ignore it as a factor and choose a career based on aspects like intellectual stimulation or good career prospects. Technology certainly offers those. I personally have encountered sexism in so-called “female friendly” industries such as publishing and teaching, and I am quite sure it is suffered by nurses, waitresses, actresses, pop singers…. Since I have been working in technology, I have often been the only girl in the room but almost always that room has been a fascinating, welcoming, and inspiring place to be.

This is not primarily written for the specific individuals nor for all the other fantastic guys in tech I have met or worked with (there are so many I can’t possibly name them all), although I hope they enjoy it. This post is intended to promote positive male role models and examples of decent male behaviour for boys and young men to follow, and as a mythbuster for anyone who thinks sexism and geekiness are somehow intrinsically linked.

It is also written for women, as a reminder that although we must speak out against sexist and otherwise toxic behaviour when we encounter it, approval and affirmation are very powerful motivators of change, so we also help by shouting about and celebrating when we find fabulous guys in tech and in life who are getting it right.