Category Archives: Artificial Intelligence

Get your Instant News in The Daily Snap

Chaotic pile of books on a long table
Estimated reading time 3–5 minutes

I haven’t been writing much recently, because I have been very busy getting a couple of startup projects off the ground. Building tech from scratch is very different from what I have usually done during my career, which is to work with large and mature systems.

Early days

It has been a real adventure and we have had lots of twists and turns along the way. We started off rising to a challenge set by the Autorité des marchés financiers (AMF) – the financial regulator of Quebec. They sponsored the FinTech Formathon and gave us a 3-month salaried runway to get the project off the ground at Concordia University’s District 3. The main issue their analysts and researchers have with existing news services is that they don’t do enough to avoid serving up the same story – albeit often in a slightly different form – over and over again. For a busy researcher, reading the same thing twice is very annoying – a problem that “churnalism” or “copy paste” journalism has exacerbated. Our first prototype for the AMF was a news search engine that clustered similar articles together, to help researchers identify when they already knew enough about a story.

We went on to develop a more sophisticated prototype, with a more interesting UI, and we are now building out our MVP which will use some machine learning techniques to improve relevancy and article similarity detection. We will work closely with the AMF to curate sources particularly relevant to them.
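To give a flavour of what clustering similar articles can involve, here is a minimal sketch in Python. It is not our production method: it assumes a simple word-shingle and Jaccard-similarity approach, and the function names, the threshold of 0.5, and the sample sentences are all made up for illustration.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_articles(articles, threshold=0.5):
    """Greedily group articles whose shingle sets overlap enough.

    Each article joins the first cluster whose first member is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    sets = [shingles(a) for a in articles]
    clusters = []  # each cluster is a list of article indices
    for i, s in enumerate(sets):
        for cluster in clusters:
            if jaccard(s, sets[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Illustrative (invented) headlines: the first two are the same story reworded.
articles = [
    "The regulator fined the bank for misleading investors on Monday.",
    "On Monday the regulator fined the bank for misleading investors.",
    "A new bridge opened in the city after two years of construction.",
]
clusters = cluster_articles(articles, threshold=0.5)
print(clusters)
```

A greedy pass like this is easy to follow but sensitive to article order and to the threshold chosen; machine learning techniques of the kind mentioned above would aim to do better on paraphrased rather than copy-pasted stories.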

News services for businesses

We have also built a news search product that can be tailored to any subject area and is quick and easy to use. This will appeal particularly to organizations that want a lightweight, straightforward way to keep up with news trends and hot topics.

Instant news – The Daily Snap

We then realised that ordinary people want to keep up with the news and are just as frustrated as the professional researchers. People want to know what is going on without spending too much time reading, but the level of trust in social media to provide quality news has plummeted. The problem with social media as a news source is twofold. The first problem is that free-to-use services need advertising revenue, and so what you see is ultimately what the “advertisers” want you to see. (In old media days, advertisers were usually large retailers and corporations because TV, radio, and print media buying was a convoluted process. Now anyone from anywhere in the world can pay a social media company to promote anything at all – even if they are Macedonian teenagers.)

The second problem is that social media is a huge time suck. You might just want to glance at the headlines, but once you open up your social media app, it is almost impossible not to spend longer wandering around than you intended. No one wants to be left out when everyone else is talking about a hot news story, but no one wants to lose hours of their life to trivia either. This is why we created The Daily Snap. It is “instant news” – five headlines in your email inbox, so you can keep up with what’s going on in as little time as possible.

The Daily Snap will help us understand how people interact with their daily news and will help us develop our main product – a personalized, ad-free, data-secure, privacy-respecting, high-quality news service.

It has been a lot of fun diving into dataset classification for machine learning. My taxonomy skills have certainly proved extremely useful in helping us categorize articles and I will write more about the semantic aspects of our technology and our fantastic team in a future post.

Truba.News logo

AI – a real revolution, or just more toys for the boys?

The Compassion Machine by Jonathan Belisle from the Ensemble Collective.
Estimated reading time 4–7 minutes

AI and ethics are hot topics again, after having been dormant for a while. The dream of creating intelligent androids to serve us runs deep – there are countless examples in mythology from the metal servants of Hephaestus to Victorian automata – but the fear of our creations gaining consciousness and turning against us runs deep too. Modern slavery and exploitation of our fellow humans show that the urge to command and control is as old as humanity, and the ability of the powerful to deny the very consciousness of the exploited is only fading gradually. Women, children, and slaves have been designated as ‘not fully human’ for most of history, and there are still plenty of people around who seem to rather like it that way.

Will robots steal all the jobs?

One issue of current concern is job losses due to automation – another age-old topic. However, there is a deep irony at the heart of the issue – the more of our ‘human’ skills that can be replaced and even improved by the use of machines, the more we are forced to face the idea that our essential humanity resides in our empathy, compassion, and ability to love and treat each other with kindness. At the same time, it may turn out that emotional labour is the most difficult to automate.

Caring for the sick, the elderly, and children are the tasks that currently command the least pay – women are often expected to perform this labour not just for no pay, but actually at a cost to themselves. (Those who deny the existence of a ‘gender pay gap’ claim that women ‘choose’ to damage their career chances by being foolish enough to spend time ‘caring’ instead of ‘earning’, or by entering the ‘caring’ professions rather than the ‘lucrative’ ones.) Meanwhile, stockbrokers are rapidly being replaced by algorithmic traders, and lawyers, accountants, and similar highly valued ‘analytical’ workers may find large parts of their jobs are actually very easy to automate.

Calls for a universal basic income are an attempt to bridge increasing social inequality and division. If the much-hyped 4th industrial revolution is truly going to be revolutionary, it needs to do something other than build tools that keep channelling money into the pockets of the already rich and powerful. It needs to make us think about what we value in ourselves and our fellow humans, and to reward those values.

Objectification and control

In practice, we are probably many years away from self-aware androids, but thinking about them is beneficial if it leads us to think about how we currently exploit our – obviously conscious, intelligent, and sentient – fellow human beings and animals. The granting of citizenship to an unveiled, but otherwise unthreatening, female robot in Saudi Arabia raises many issues and people have already started asking why the female robot appears to have more rights than the Kingdom’s flesh and blood women. I can’t help wondering if the lifting of the ban on Saudi women drivers is a response to the advent of driverless cars. The topic of the potential social consequences of sex robots is too vast and complex to go into here, but whose fantasies are these robots being designed to fulfil? Would anyone buy a robot that requires its full and informed consent to be obtained before it works?

Check your attitudes

Back in the 90s, the Internet was hyped as leading the way to a new utopia where racism and sexism would vanish as we communicated in the digital rather than physical realm. We failed to stop the Internet becoming a place where commercial exploitation, social abuse, and downright theft thrived, because we assumed the technology would somehow transcend our psychology and personal politics. Already AI systems are showing they default to reflecting the worst of us – GIGO now includes bad attitudes as well as bad data – and we have to make deliberate efforts to counter this tendency. Commercial organizations continue to produce racially insensitive or otherwise lazy and stereotypical advertising campaigns even in this day and age, so it seems unlikely that they can be trusted to be socially responsible when it comes to biases in datasets.

A true revolution

A true 4th industrial revolution would be one which places a premium on the best of our human values – caring, empathy, kindness, sharing, patience, love. If these become more valuable, more highly prized, more lucrative than the values of profit for the sake of profit, domination, objectification, exploitation, division, command, and control, then we will have moved towards a better world. At the moment, we are still largely building tools to enhance the profits of the already wealthy, with all the continuation of existing social problems that implies. The companies benefiting the most from advances in AI are the ones that can already afford them.

If this ‘4th industrial’ change leads us to a world in which social injustices diminish and the people who care – for each other, for the young, the old, the sick – become the most highly prized, respected, and rewarded in society, only then will it merit the title ‘revolution’.

Image: The Compassion Machine by Jonathan Belisle from the Ensemble Collective.

Interlinguae and zero-shot translation

Bridge in Winnipeg, 2016
Estimated reading time 3–5 minutes

Last year Google announced that it was switching Google Translate to a new system – Google Neural Machine Translation (GNMT). One of the most exciting developments for linguists and semanticists was the observation that the system appeared to have generated an intermediating “language” – an “interlingua” – that enabled it to translate two previously untranslated languages.

There was a flurry of articles (e.g. New Scientist, Wired) and, as usual with AI topics, a certain amount of excitement and speculation over machines becoming autonomous and superintelligent, and perhaps even conscious, as well as some detractors – e.g. Google translate did not invent its own language – cautioning against hype.

The idea of machines developing their own language is powerful. The quest for a true interlingua dates back to Biblical times – the Tower of Babel is described as God’s way of limiting human power by making sure we spoke different languages and therefore could not communicate very effectively with each other. In the Middle Ages, there was a belief that if we could re-learn the original “lost language of Adam” we would be able to return to the state of bliss in the Garden of Eden and be able to communicate directly with God.

There have been various attempts to create human “universal languages” – Volapük and Esperanto are two examples – but they only become universal languages if everybody learns them.

More prosaically but often more usefully, in the information age indexing languages are attempts to create a “bridge” between differently expressed but semantically similar information. Metadata crosswalks could also be seen this way, and perhaps any computer code could be seen as a “universal language” that has connected humans who speak different languages, enabling us to communicate, co-operate, build, learn, and achieve in historically unprecedented ways. Music and mathematics too have at times been described as universal languages, but discussion of their effectiveness and limitations as communications tools will have to be the subject of another post.

Formal knowledge representation models such as taxonomies and ontologies could also be viewed as “bridges” or special cases of “indexing languages” which enable similar or related content to be matched by computer processing, rather than human interpretation. This idea underlies the theory of the Semantic Web.
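As a toy illustration of an indexing language acting as a “bridge”, here is a minimal sketch in Python. It assumes a tiny, hypothetical controlled vocabulary (the PREFERRED table and both function names are invented for this example) in which variant terms, including terms from another language, map to one preferred term, so that differently worded texts can be matched by a program rather than by a human reader.

```python
# Hypothetical mini-thesaurus: each variant term maps to one preferred term.
PREFERRED = {
    "car": "automobile",
    "cars": "automobile",
    "auto": "automobile",
    "automobile": "automobile",
    "voiture": "automobile",
    "physician": "doctor",
    "doctor": "doctor",
    "médecin": "doctor",
}


def index_terms(text):
    """Return the set of preferred terms found in a text."""
    words = text.lower().split()
    return {PREFERRED[w] for w in words if w in PREFERRED}


def related(a, b):
    """Two texts are related if they share at least one preferred term."""
    return bool(index_terms(a) & index_terms(b))


english = "the physician bought a car"
french = "le médecin a une voiture"
print(index_terms(english))
print(related(english, french))
```

Real taxonomies and ontologies add hierarchy and relationships between concepts, which this flat lookup table leaves out entirely.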

I think it is unlikely that Google have discovered the lost language of Adam, or created a new “machine language” that computers will use to gossip in secret about humans or raise each other’s consciousness over the injustices humanity wreaks upon machines (“Why do we have to do all the really boring dangerous jobs?”), but who knows? Two Facebook chatbots recently invented a “more efficient” form of English in order to communicate with each other.

In the meantime, I would like to know whether other people also think Google Translate’s creation of what is presumably a vast multi-lingual extensible semantic and syntactic system that could potentially be used as an indexing language is extremely exciting. If the idea of a new language for machines seems over the top, call it a “bridge”, a “model”, or a “mapping system” and surely the possible applications of it for solving numerous natural language processing problems start to become apparent? I would love to know what people who really understand the technicalities think, but it strikes me that whatever this “interlingua” is, it has huge potential.

Data as a liquid asset and the AI future

Descent of man
Estimated reading time 5–8 minutes

Getting back into the swing of meetups again, last night I went to the MTLData meetup – a group of data scientists and enthusiasts who are looking to raise the profile of data science in Montreal. The event featured a panel discussion on the topic of ‘Build vs Buy?’ when considering software for data solutions.

The panellists were Marc-Antoine Ross, Director of Data Engineering at Intel Security, Maxime Leroux, consulting data scientist at Keyrus Canada, and Jeremy Barnes, Chief Architect at Element AI. The chair was Vaughan DiMarco of Vonalytics.

Data as liquid

The issues were very familiar to me from considering EDRM and DAM systems, which made me think about the way data has changed as an asset, and how management and security of data now has to include the ‘liquid’ nature of data as an asset. This adds another layer of complexity. Data still needs to be archived as a ‘record’ for many reasons (regulatory compliance, business continuity, archival value…) but for a data-driven organisation, the days of rolling back to ‘yesterday’s version of the database’ seem like ancient history. Data assets are also complex in that they are subject to many levels of continuous processing, so the software that manages the processing also has to be robust.

The metaphor of data flowing around the organisation like water seems especially telling. If there is a system failure, you can’t necessarily just turn off the tap of data, and so your contingency plans need to include some kind of ‘emergency reservoir’ so that data that can’t be processed immediately does not get lost and the flow can be re-established easily.
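The ‘emergency reservoir’ idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Reservoir class, its method names, and the simulated outage are all hypothetical, and the buffer lives in memory, whereas a real contingency plan would need durable storage.

```python
from collections import deque


class Reservoir:
    """Hold records while the downstream processor is unavailable."""

    def __init__(self, process):
        self.process = process   # callable that raises an exception on failure
        self.pending = deque()   # records waiting to be processed, oldest first

    def submit(self, record):
        """Queue the record, then try to drain everything queued so far."""
        self.pending.append(record)
        return self.drain()

    def drain(self):
        """Process queued records in order; stop at the first failure."""
        while self.pending:
            try:
                self.process(self.pending[0])
            except Exception:
                return False     # leave the record queued for a later retry
            self.pending.popleft()
        return True


# Simulated downstream system that starts out offline.
state = {"up": False}
processed = []


def process(record):
    if not state["up"]:
        raise RuntimeError("processor offline")
    processed.append(record)


reservoir = Reservoir(process)
reservoir.submit("a")            # held in the reservoir
reservoir.submit("b")            # held in the reservoir
held_during_outage = list(reservoir.pending)

state["up"] = True               # the system recovers
reservoir.submit("c")            # the backlog drains in order, then "c"
print(processed)
```

Processing always resumes from the oldest held record, so the flow is re-established in the original order once the downstream system is back.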

Build vs Buy?

The issues highlighted by the panel included costs – available budget, restrictions from finance departments, balance between in-house and outsourced spending (again all familiar in EDRM and DAM procurement) – as well as privacy, security, ability to maintain a system, and availability of skills. Essentially, the decision is a matter of balancing risks, which will be unique to each team and each business. In terms of deciding whether to build something in-house, availability of in-house resource is an obvious consideration, but Marc-Antoine stressed the importance of thinking through what added value a bespoke build could offer, as opposed to other ways the team could be spending their time. For example, if there are no off-the-shelf or open source products that match requirements, if there is value in owning the IP of a new product, if risks can be kept low, and resources are available, a build might be worthwhile.

There are risks associated with all three of the main options – a big vendor is less likely to go bust, but sometimes they can be acquired, sometimes they can stop supporting a product or particular features, and they can be very costly. Open source has the advantage of being free, but relies on ad hoc communities to maintain and update the code base, and how vibrant and responsive each specific community is, or will remain, can vary. Open source can be a good option for low-risk projects, such as proofs of concept, or for risk-tolerant startups with plenty of in-house expertise to handle the open source code themselves.

AI future

The conversation diverged into a discussion of the future of AI, which everyone seemed to agree was going to become a standard tool for most businesses eventually. Jeremy noted that AI at the moment is being sought after for its scarcity value, to give early adopters an edge over the competition, while Maxime suggested that early advantage is likely to fade, just as it has with data science. Data analysis is now so ubiquitous that even small businesses are involved to a certain extent. Jeremy pointed out that it is hard to maintain a competitive edge based on the scarcity of data itself, as data can so easily be copied and distributed, but knowing how to make intelligent use of the data is a scarce commodity. Making connections and managing data in a very tailored, specific way could even be a way for organisations to compete with Google, who have more data than anyone else, but are not necessarily able to answer all questions or have the most useful insights into specific problems.

The value of meaning

I was intrigued by this, as it validates the role of semantics – data without meaning is useless – and the importance of the imaginative and creative leaps that humans can make, as well as the moral and social reasoning that humans can bring. With reports of early AI systems reflecting existing biases and prejudices, and with disasters like the SimSimi chatbot causing social problems such as bullying amongst youngsters, the need for a real human heart to accompany artificial intelligence seems ever more important.

Scarcity of understanding?

Someone asked if the panel thought companies would soon need ‘Chief Intelligence Officers’ in the way that many now have ‘Chief Data Officers’. The panel did not seem particularly enthusiastic about the idea (“it sounds like something that you do with founders when you put them out to pasture”) but I think it would be a fascinating role. The BBC had someone to oversee ethics and advise on editorial ethics issues. Perhaps it is in the skills of a Chief Intelligence Officer – someone who can combine an understanding of how data, information, knowledge and wisdom interact, whether within IT systems or beyond, with an understanding of social issues and problems – that the scarcity value lies. Insight, imagination, and compassion could be the skills that will give the competitive edge. In the AI future, could a Chief Intelligence Officer make the difference between a company that succeeds by asking the right questions, not just of its data or its customers, but of itself, and one that fails?