The virtual assistant that turns on the television or plays music at a voice command is not just the work of engineers and computer scientists. The same goes for the automated voice that answers every time you call your bank or insurer. Both tools, like many others based on interaction between user and machine through spoken or written human language, also depend on another discipline, still little known but increasingly relevant in the booming technology industry: computational linguistics.
In this field, specialists in artificial intelligence, big data and other branches of engineering work closely with philologists and translators. The latter contribute specific skills for handling, and conveying to machines, aspects of language that are complex and difficult to reproduce in programming code, such as understanding an emotion or a context. That is, what allows us to tell, for example, a compliment from an insult, or a joke from a reproach. More and more companies and institutions realize that they need to bring these profiles into their teams. “They are vitally important,” explains Luis Alfonso Ureña, president of the Spanish Society for Natural Language Processing.
The sector is growing: according to a study promoted by the Secretariat of State for Digital Advancement (SEAD) in 2018, three out of four companies dedicated to language technologies in Spain had hired staff in the previous 12 months, and more than half had increased their customer volume. Experts consulted for this report say that computational linguistics can also open up new job opportunities for recent humanities graduates.
Carmen Torrijos finished her degree in Translation in 2010. “I didn’t even know that computational linguistics existed,” she says. Now this sector is her usual field of work. She is currently employed as a linguist at the Institute of Knowledge Engineering, a private R&D center located at the Autonomous University of Madrid, where she has been for almost six years. “I was a translator specialized in technology. I came here to translate texts,” she says. Then, “a little by chance,” she began working on projects focused on language technologies. And she discovered that in her sector the tasks can be varied and useful for very different companies and organizations.
One of these tasks is training the algorithms that govern voice assistants, so that they recognize more and more phrases and respond correctly to the requests those phrases contain. But there are others, such as designing chatbots or categorizing linguistic resources, that is, the parts that make up a text, such as verbs and adjectives, so that computers can detect them and capture their structure and meaning.
Torrijos, who has also held a degree in Hispanic Philology since 2018, works mainly with linguistic corpora. In other words, sets of texts from which valuable statistical information can be extracted if machines are given the rules to understand them, such as “the clinical narrative that doctors collect on cancer patients,” she explains.
In the day-to-day work of professionals like her, the border between the humanities and the sciences dissolves completely. “The specificity of the sector lies in the need to find mixed profiles,” reads the SEAD study. However, the companies consulted for the report indicate that such profiles are still scarce. Torrijos says that she adapted on the go, teaching herself, although she acknowledges that “a little training in programming helps a lot and is necessary.”
“More bytes and fewer bricks”
Suitable places to get that training already exist. Professor Amelia Sanz, coordinator of the official master’s degree in Digital Letters at the Complutense University of Madrid, explains that this course, taught in equal parts by professors from the Faculties of Philology and Computer Science, serves precisely so that “students become trujamanes [interpreters], the new bilinguals capable of understanding programming languages and specialists in natural languages and their cultures”.
The professor says that the employment rate among graduates of the master’s degree, launched in 2014, is close to 100%. “Of course, the area of computational linguistics that develops conversational agents (chatbots) is one of those offering the most opportunities,” she points out. But the possibilities may be even greater at publishers interested in the digital conversion of their products, or at companies that create and design materials for online teaching. And there is also room in literary, artistic and historical research, as well as in museology. “Now all cultural objects like books or paintings are studied, viewed and read on the screen: they are digital.”
Sanz says that demand for profiles of this type is so high that the current number of students per course (between 20 and 30 each year) is not enough to meet the needs of all the companies and institutions that request them. She says that more initiatives like the Complutense’s are emerging (there are already other master’s programs along these lines, for example at the universities of Barcelona, the Basque Country and Pablo de Olavide in Seville), but she believes that Spain should commit more firmly to this sector. “This country needs more bytes and fewer bricks,” she says.
Losing the fear
Working with technological tools and programs alongside professionals such as engineers and computer scientists is enriching, according to the linguists consulted. “They have a very different way of thinking from ours, and I like that,” says María José García, who works at the company Meaning Cloud and is particularly dedicated to “extracting information and meaning from unstructured content that is relevant for companies”, such as social media conversations, articles, comments or files.
“[Engineers] are able to simplify and structure, much more logically, things that we make complex. That way of thinking has helped me a lot, not only at work, but in life,” adds García, a philologist by training, laughing. For Torrijos, “you have to learn to understand each other”, which at first “is not easy at all”, but then “a very interesting exchange” develops and “you learn a lot from each other”.
Both encourage aspiring translators and philologists to consider following in their footsteps and not to be afraid of tackling subjects that might seem complicated to some, such as programming. “We have to shed some of the complexes that the humanities so often have when faced with science and technology,” says Torrijos.
Like them, Professor Amelia Sanz is convinced that the sector has a long road ahead. “Literatures will be digital or they will not be at all”, she maintains. In her opinion, innovation in the humanities professions will be key to preserving them. “We have to bring Federico García Lorca to all screens and in all ways. We need it. And our students know how to do it”.
A sector that speaks in the feminine
Computational linguistics is a sector that can open a door for women into technology, a heavily male-dominated industry, according to the specialists consulted for this report. The SEAD study indicates that in 2017 there were 16% more men than women working in language technologies. But the perception among companies and research centers is that the gender gap is smaller in this field than in others.
Women’s presence is increasingly significant, several professionals say. “That is fundamental, because it means that we are going to make machines, software and artificial intelligence speak in the feminine,” says Professor Amelia Sanz. In the master’s degree in Digital Letters that she directs at the Complutense University of Madrid, “90%” of the students are women. “These young women are going to reach the decision-making bodies. They will feminize and humanize computing and management,” she predicts.
Correction
An earlier version of this article stated that the Institute of Knowledge Engineering is at the Complutense University of Madrid. It is actually at the Autonomous University of Madrid.