Date: 2020-06-12

Protester in California last May. SOPA Images / LightRocket via Gett

The false and the true each follow certain patterns, something like a unique code. The problem is that these patterns are so complex, particularly in fake news, that the false can be mistaken for the real, and it is increasingly difficult to tell the two apart. According to a report by the consulting firm Gartner, in 2022 we will consume more hoaxes than true information. However, some algorithms have already managed to trace this kind of magic formula and pin down certain characteristics. One example is research carried out by the University of Granada and Imperial College London, which has taught artificial intelligence to understand the emotions that language conveys and the sociological impact of a tweet.

Juan Gómez, a member of the research team and professor of Computer Science at the University of Granada, acknowledges that the complexity of the messages makes it difficult to find these structures of truthfulness and falsity. “There are simple and eye-catching visual resources, such as emoticons and capital letters, that are relevant clues for identifying fake news; but its engineering also evolves. In other words, the training data we used in a given context may no longer apply now.” As the capabilities of artificial intelligence evolve, the hoax machinery evolves even faster.
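To make the kind of clue Gómez mentions concrete, here is a minimal sketch in Python of how surface signals such as capital letters and emoticons could be counted in a message. It is an illustration only, not the researchers' method: the function name `surface_features`, the emoticon pattern and the choice of three features are all assumptions made for this example.

```python
import re

# Hypothetical pattern (an assumption for this sketch): a few ASCII emoticons
# such as :) ;-) =D, not glued to a preceding word, plus common emoji ranges.
EMOTICON_RE = re.compile(
    r"(?<!\w)[:;=][\-']?[()DPpOo|]"
    r"|[\U0001F300-\U0001FAFF\u2600-\u27BF]"
)


def surface_features(text: str) -> dict:
    """Count simple visual clues in a message (illustrative only)."""
    letters = [ch for ch in text if ch.isalpha()]
    upper = sum(1 for ch in letters if ch.isupper())
    return {
        # Share of alphabetic characters written in capitals.
        "upper_ratio": upper / len(letters) if letters else 0.0,
        # Number of emoticons or emoji matched by the pattern above.
        "emoticons": len(EMOTICON_RE.findall(text)),
        # Exclamation marks, another eye-catching resource.
        "exclamations": text.count("!"),
    }


if __name__ == "__main__":
    print(surface_features("SHOCKING news!!! They LIED :) 😱"))
```

Counts like these would only be inputs to a classifier trained on labeled examples. As Gómez notes, such clues shift over time, so a model trained on them in one context may not transfer to another.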

Against this backdrop, Claire Wardle, research director of First Draft, rejects the idea of a single concept of disinformation. In her view, we face at least seven different scenarios, ranging from fabricated or manipulated news to satire, which is not intended to harm but has a high potential to deceive. “If we are really going to tackle the problem we are in, we must understand its seriousness and we must understand what we are fighting against,” she says. This is the internal struggle between machine learning systems and the programmers who train them: supplying richer information and more variables so that the models can reach that universal code of lies.

Metadata, content, thematic organization, context and coherence are some of the signals that Ricardo Baeza-Yates, director of Data Science at Northeastern University and professor of Informatics at the Universitat Pompeu Fabra, has built into algorithms to counter disinformation. He tries to get machine learning to determine whether a text is semantically congruent, whether the facts it mentions exist, and whether the whole holds together logically. Tracking bots and authorship alone is not enough. Accuracy is another matter. “We can accept between 60% and 80%. I think that is a reasonable percentage. If you ask 20 different people which news item is more credible, there will be no unanimity among them either,” he concludes.

The researchers insist that it is unwise to place the responsibility for verification solely on technology. The algorithms' main advantage is their greater detection capacity. Baeza-Yates gives a basic example: the HTML code. “It is a valuable signal for identifying these false structures, and one that is not exactly within everyone's reach.” Even an excess of coherence is a telling marker, and these algorithms immediately sound the alarm when they see it. As he explains, noise and inconsistency are characteristic of human beings.

The era of deep learning

A study by the MIT Initiative on the Digital Economy, which analyzed some 126,000 Twitter threads, found that the truth takes approximately six times as long as a lie to reach 1,500 people. Falsehood spreads further and faster. To improve the algorithms' ability to trace fake news, at least as Gómez sees it, the time has come for deep learning to shine. “It may hold the key to some more solid structures. We have found that deep learning techniques, such as those that process natural language, improve the statistics.”

This scenario may be closer than expected. About 10 years ago, spam was overwhelming inboxes; now it is better controlled thanks to improved filters, which have evolved with the help of deep learning. The problem is that, even as artificial intelligence becomes more effective against hoaxes, the creators of this information will keep refining their technique. In the words of Baeza-Yates, it will be like computer viruses: a new one appears year after year and we do not know how to treat it. “It is an eternal battle between bad and good. As with tax evasion. There is always a loophole through which disinformation will end up creeping in.”

The algorithms still have plenty of room for improvement, even with all the recent progress, although Baeza-Yates's observations suggest that room is narrower than it seems. Their success rate depends on the data, so someone has to be better than the machine in order to teach it. “If we are unable to find more complex articles, we will not be able to train the algorithm to detect more and more sophisticated hoaxes,” he concludes.

The evolution of fake news itself also narrows the prospects for machine learning. Gómez points out that fake news was originally created to change opinion about a fact. From there, the shift has been toward keeping a community on alert and securing the loyalty of sympathizers. “Many texts are for internal consumption. How do we control this? How does artificial intelligence manage to learn it?” he asks. Nobody claims that technology alone should be responsible for telling false from true, but it is a tool that helps us decide. Little by little, it lets itself be fooled less often, however hard we make it.

