AI translation no final solution for non-English researchers. Tech still needs human touch
Human communication has evolved rapidly in the last century. Science, too, has adopted various technologies to communicate with diverse people. Researchers aspire to have their work reach every corner of the world, and the robust infrastructure of online servers, open-access journals, and social media facilitates that. But many researchers, especially non-native English speakers, still struggle to communicate effectively and to develop reasonable proficiency in English, the de facto global language of science.
‘Language tax’ is a term used in the context of global economics. It refers to the cost incurred to compensate other countries for translating and teaching English. Extrapolated to the research world, language tax could take different forms: The time researchers spend learning English as a second language or using translation tools, or the money they spend on editing and proofreading services. Globally, half of all translation services purchased involve the English language, with a market size of more than $30 billion, and that market has been growing steadily since 2020.
Why publish in reputed English journals?
Over the past decade, confidence in AI translation has grown tremendously as the technology has leapt ahead in accuracy. Think back to the era before real-time translation apps and robust mobile keyboard plugins. This progress is significant because more than 60 per cent of the world’s research output comes from countries where English is not the primary language, such as India, Japan, Korea, and China, and much of the Global South.
While most research is now reported and published in English, there have been key exceptions. In the early days of the Covid pandemic, research and clinical reports were published in local languages in countries such as Italy to spread awareness quickly. But publishing research in a language understood by all amplifies its impact. Researchers have more reasons to publish high-quality papers in reputed English-language journals: To disseminate knowledge, further careers, protect intellectual property, and build a reputation in their fields.
To do this, non-native speakers either collaborate with colleagues fluent in English or seek help from professional editing services. A professor in Korea, for instance, may rely heavily on professional writing and editing services, which are expensive, especially for larger manuscripts of more than 10,000 words. Cost thus becomes an important reason why some research does not get due recognition and languishes in an inferior journal.
Lost in translation
Translation makes research accessible. We now live in an era in which companies build AI models and make them publicly available to everyone. Take OpenAI’s chatbot ChatGPT, for example. You can ask it to write an article or give inputs on a totally unknown topic. Nevertheless, it still has quite a few drawbacks, and if you follow the conversations on Twitter, you will find that researchers are sceptical of such systems when it comes to reporting and working with facts.
Low confidence in these systems’ capabilities, and the lack of a clear understanding of the English language itself by humans and AI systems alike, deter their adoption. Researchers may still prefer to have their manuscripts proofread by a human editor after using translation services. This is understandable, given that a researcher’s primary focus is to complete their study, achieve quality results, and report their findings accurately in a research paper.
Concerns that technology will fail to understand and do justice to semantics and context continue to be the biggest deterrents to its widespread adoption. The concerns are legitimate because these platforms are not deterministic systems with zero error rates. However, things are set to get better over time, since the language pairs and data on which these models are trained have undergone considerable pre-processing and manual annotation.
Tech as an enabler beyond translation
It is important to understand that translation is just one piece of technology within a larger ecosystem of natural language understanding. Where do we go from here? Many tools are becoming popular in the industry ever since the world was thrust into the golden age of AI.
Grammar checkers, software programmes such as Grammarly, help identify and correct grammar and syntax errors in writing. Consulting style guides such as the Chicago Manual of Style ensures that writing adheres to the conventions and standards of academic or professional writing.
While these tools and methods are certainly helpful, they should not be relied upon blindly.
Technologists and researchers across the globe are showing a lot of interest in understanding language better and are working with dedication toward creating robust models that make writing easier. The improvement in accuracy, and therefore in the adoption of these models, could well signal a considerable drop in the language tax in the next decade.
Nishchay Shah is Chief Technology Officer at Cactus Communications. Views are personal.
(Edited by Humra Laeeq)