How Accurate Is Google Translate: A Look at Research and Practice

Google Translate is a key player in machine translation, but how accurate is it? Let’s look at research studies and best practices to help you make better use of it.

Since the dawn of time (or since the fall of the Tower of Babel, for those who believe in it), people have needed to convey ideas in languages they haven’t mastered. Rendering text from one language into another in this way is known as “translation.”

For that, one would usually rely on someone else who had not only mastered both the “source” (original) language and the “target” (translated) language but could also be trusted to convey the meaning of a message as it was originally intended.

This process can be costly and often impractical: reliable translators are hard to find, costs can soar, and, above all, productivity may be low. Still, it worked for centuries because no alternative solutions were available. The advent of computing tools in the 20th century changed everything.

From statistical models to deep learning in translation technology

Computers brought enormous improvements in translation thanks to their ability to store previously translated sentences and match them against new texts to translate. Computer-assisted translation (CAT) tools built on this ability significantly improved productivity. Nonetheless, even with heavy use of a CAT tool, matching pairs had to be reviewed in context, and non-matching pairs still needed translation.
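To make the idea of storing translated sentences and matching them against new text more concrete, here is a minimal Python sketch. It is an illustration only: the `TRANSLATION_MEMORY` contents, the `best_match` helper, and the 0.75 threshold are assumptions made for this example, and real CAT tools use more sophisticated matching.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source sentence -> stored translation.
TRANSLATION_MEMORY = {
    "The file could not be saved.": "No se pudo guardar el archivo.",
    "Click the button to continue.": "Haga clic en el botón para continuar.",
}


def best_match(segment, memory, threshold=0.75):
    """Return (stored_source, stored_translation, score) for the closest
    stored sentence, or None if no entry reaches the threshold."""
    best = None
    for source, target in memory.items():
        # Character-level similarity between 0.0 and 1.0.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[2]:
            best = (source, target, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


exact = best_match("The file could not be saved.", TRANSLATION_MEMORY)
fuzzy = best_match("The file could not be opened.", TRANSLATION_MEMORY)
missing = best_match("Payment received, thank you!", TRANSLATION_MEMORY)
```

In this sketch, `exact` is a perfect match, `fuzzy` is a close match whose stored translation would be wrong if reused unchanged (which is why matches still need review in context), and `missing` is `None`, meaning the segment needs translation from scratch.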

Machine translation (MT), which automatically translates non-matching pairs without the need for a professional translator, has taken translation capabilities even further.

Also referred to as automatic translation, MT quickly became available to anyone thanks to the internet, and many technology vendors began to offer machine translation services for free. It seemed to solve all the problems posed previously—instant translation done for free.

However, there are still some questions swirling around the use of machine translation software. How best to implement machine translation into translation workflows? How much can you trust the translation produced by a machine? To what extent does MT properly convey the original meaning of the content? In other words, is it accurate?

What is Google Translate?

Google, one of the leading MT providers today, has had a critical impact on making MT a viable productivity tool for both personal and business use. Its core machine translation service, Google Translate, is one of the most used free MT tools worldwide.

Google launched its public translation service in 2006, supporting only a small number of language pairs. Originally based on statistical machine translation (SMT) technology, it used translated material from the United Nations and European Parliament to build its database of language pairs.

Fast forward to today, and Google Translate is used by around 500 million people, translating around 100 billion words every day.

What technology does Google Translate use in the backend?

Initially, Google Translate relied on statistical MT technology, which uses a set of existing translations (corpora) to create statistical models for translating specific words within sentences. In practice, the frequency of word pairs between two languages served as the database for its translation results.
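The idea of deriving translations from word-pair frequencies in existing translations can be illustrated with a toy Python sketch. Everything here is an assumption for illustration: the four-sentence `PARALLEL_CORPUS`, the helper names, and the Dice-style score. Production SMT systems rely on far larger corpora and on word-alignment and phrase-based models, not on this simplified counting.

```python
from collections import Counter

# Tiny hypothetical parallel corpus (English, Spanish).
PARALLEL_CORPUS = [
    ("the house", "la casa"),
    ("the car", "el coche"),
    ("a house", "una casa"),
    ("a car", "un coche"),
]


def build_word_tables(corpus):
    """Count in how many sentence pairs each word and each word pair occurs."""
    src_counts, tgt_counts, pair_counts = Counter(), Counter(), Counter()
    for src_sentence, tgt_sentence in corpus:
        src_words = set(src_sentence.split())
        tgt_words = set(tgt_sentence.split())
        src_counts.update(src_words)
        tgt_counts.update(tgt_words)
        pair_counts.update((s, t) for s in src_words for t in tgt_words)
    return src_counts, tgt_counts, pair_counts


def most_likely_translation(word, tables):
    """Pick the target word that co-occurs most consistently with `word`,
    using a Dice-style score: 2 * pair count / (source count + target count)."""
    src_counts, tgt_counts, pair_counts = tables
    candidates = {
        t: 2 * n / (src_counts[s] + tgt_counts[t])
        for (s, t), n in pair_counts.items()
        if s == word
    }
    return max(candidates, key=candidates.get) if candidates else None


tables = build_word_tables(PARALLEL_CORPUS)
house = most_likely_translation("house", tables)
car = most_likely_translation("car", tables)
unknown = most_likely_translation("boat", tables)
```

With this corpus, "house" pairs most consistently with "casa" and "car" with "coche", while a word that never appears in the corpus has no translation at all. That last case hints at why building large corpora was such a cost factor for SMT.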

While efficient, SMT had shortcomings, above all the cost of creating corpora and the low-quality results for specific language pairs. These led Google to introduce Google Neural Machine Translation (GNMT) in 2016. Neural machine translation (NMT) is the latest MT technology, and by 2022 Google Translate supported over 100 languages.

Instead of running a set of predefined rules from the start, neural networks—inspired by the way human brains work—can handle complete sentences as examples of inputs (source text) and outputs (translated text) to predict the translation result. This has led to improved accuracy, customization, cost efficiency, and scalability.

Does Google Translate use English as an intermediary step?

While using SMT, Google Translate had to use an intermediary language, also known as a “pivot language,” between the source and target text. Apart from a few exceptions, Google used English for this purpose.

Since switching to GNMT, Google Translate is able to translate directly from one language into another, without the use of an intermediary language.
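A pivot language can be pictured as chaining two translation steps. The Python sketch below is a word-level toy, not a description of Google's system: the dictionaries `FR_TO_EN` and `EN_TO_ES` and the `pivot_translate` helper are hypothetical. It also shows one way pivoting can lose information, because French distinguishes informal "tu" from formal "vous" while English has only "you".

```python
# Hypothetical toy dictionaries for French -> English and English -> Spanish.
FR_TO_EN = {"chat": "cat", "chien": "dog", "tu": "you", "vous": "you"}
EN_TO_ES = {"cat": "gato", "dog": "perro", "you": "tú"}


def pivot_translate(word, source_to_pivot, pivot_to_target):
    """Translate via the pivot language; return None if either step fails."""
    pivot_word = source_to_pivot.get(word)
    if pivot_word is None:
        return None
    return pivot_to_target.get(pivot_word)


cat_es = pivot_translate("chat", FR_TO_EN, EN_TO_ES)
informal_es = pivot_translate("tu", FR_TO_EN, EN_TO_ES)
formal_es = pivot_translate("vous", FR_TO_EN, EN_TO_ES)
```

Here both "tu" and "vous" come out as "tú", although a direct French-to-Spanish translation could have rendered the formal "vous" as "usted". The distinction disappears at the English step.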

How has Google Translate’s accuracy improved over time?

You’re here because you want to know how accurate Google Translate is. A research study found that Google Translate worked well for many European languages but was not as accurate for some languages spoken in Asia. The top 10 languages for translation accuracy with English in Google Translate were (from best to worst):

  • German
  • Afrikaans
  • Portuguese
  • Spanish
  • Danish
  • Greek
  • Polish
  • Hungarian
  • Finnish
  • Chinese

A couple of years later, a reevaluation using the same input as the original study showed a 34% improvement in Google Translate’s accuracy.

How do you assess Google Translate’s performance?

At Phrase, we believe that machine translation is a powerful productivity tool that can help businesses reach global audiences quickly and efficiently. In our quarterly Machine Translation Report, we bring together the latest performance data for all of the major MT engines used in real workflows in Phrase TMS, our enterprise-ready translation management system.

Google Translate has consistently been at the top of the list of most used machine translation engines that don’t require a setup, i.e., are fully managed in Phrase TMS. When looking at how Google Translate performs in terms of accuracy and quality, there are two criteria to consider from the start: language pair (source vs. target language) and content type (domain).

According to the latest MT Report, the top 3 language pairs used in machine translation projects in Phrase TMS are:

  • English-Spanish
  • English-French
  • English-German

When it comes to content type per language pair, Google Translate has achieved the highest performance scores in the following domains:

  • Medical for English-Spanish
  • Medical for English-French
  • Software development for English-German

The result is based on anonymized machine translation post-editing data collected over a period of six months. To gather precise MT quality results, we’ve filtered translation segments to reflect the required post-editing effort as closely as possible: either MT was used and post-edited, or the linguist translated from scratch despite the availability of MT, which suggests that the MT quality was too low for post-editing.
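One common way to approximate post-editing effort is to measure how much a linguist changed the MT output. The Python sketch below uses a word-level edit distance normalized by segment length. It is an assumption made for illustration and not necessarily the metric used in the MT Report; the function names are hypothetical.

```python
def word_edit_distance(a, b):
    """Levenshtein distance between two lists of words
    (insertions, deletions, and substitutions each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def post_editing_effort(mt_output, post_edited):
    """Return a score from 0.0 (MT kept unchanged) to 1.0 (fully rewritten)."""
    mt_words = mt_output.split()
    pe_words = post_edited.split()
    if not mt_words and not pe_words:
        return 0.0
    distance = word_edit_distance(mt_words, pe_words)
    return distance / max(len(mt_words), len(pe_words))


untouched = post_editing_effort("the cat sat", "the cat sat")
light_edit = post_editing_effort("the cat sat on mat", "the cat sat on the mat")
rewritten = post_editing_effort("a b c", "x y z")
```

A score near 0 suggests the MT output was usable almost as is, while a score near 1 suggests the linguist effectively translated from scratch.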

Can you trust Google Translate?

As language and translation are both dynamic categories that intrinsically reflect processes—not static phenomena—accuracy should also be seen as a relative concept. Accuracy in translation will depend on the original intention of the author and the destination of the message. For example, expectations of accuracy in grammar, style, and register for an email will greatly vary from the expectations of accuracy for a novel.

A frequently heard opinion is that Google Translate’s free MT service is accurate enough for most users because they need to translate simple messages—and what matters most is that the audience is able to grasp the sense of it rather than the complete “native” message. It can then be considered accurate enough because expectations are low.

As a rule of thumb, Google Translate’s free MT tool mostly lacks accuracy:

  • When used as a dictionary to translate single words: Google Translate struggles to produce an accurate result, i.e., as intended by the author, because of the many meanings a single word can have; this is true for English as well as for other widely spoken languages
  • When translating familiar expressions that don’t have a direct equivalent in the target language
  • When non-verbal expressions are an important part of the message, e.g., when being ironic
  • When grammatical rules aren’t properly used in the source language or are used differently in the target language, such as the subjunctive mood in English

For business purposes, when a large amount of content needs translation across domains, Google offers its Cloud Translation service. Companies can either set it up themselves or rely on a translation management system (TMS) to fully manage it from day one. Cloud Translation offers customization features for domain- and context-specific terms as well as the possibility to train custom translation models.

Google states officially that Cloud Translation doesn’t use any content submitted for translation for any purpose other than providing the translation service. It’s unclear, however, how the company uses the information submitted to the free version of Google Translate, or whether that data influences business decisions in any way.

Does Google Translate have any major competitors?

While Google Translate may be the first name that will pop up when discussing machine translation, there are several competing machine translation tools on the market—each of them offering a specific approach to MT.

Here are some of Google Translate’s major competitors to consider when looking for the best machine translation engine:

Amazon Translate

Amazon Translate is part of Amazon Web Services (AWS), a subsidiary of Amazon that provides on-demand cloud computing platforms and APIs for both individuals and businesses. It’s based on NMT technology as well.

Amazon Translate supports translation between 75 languages.

DeepL

DeepL is a German-based online MT service that was launched in 2017. It uses a proprietary algorithm with NMT technology and can process DOCX, PPTX, and PDF files while retaining footnotes, formatting, and embedded images.

DeepL supports 26 languages, forming 650 source-to-target combinations.
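The 650 figure follows from simple counting: each of the 26 languages can be translated into any of the other 25. A short Python check, with a hypothetical helper name and assuming every language can be paired with every other one in both directions:

```python
from itertools import permutations


def directed_language_pairs(num_languages):
    """Number of ordered (source, target) pairs among distinct languages."""
    return num_languages * (num_languages - 1)


pair_count = directed_language_pairs(26)

# Cross-check by enumerating every ordered pair of 26 placeholder languages.
enumerated = len(list(permutations(range(26), 2)))
```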

Systran Translate

Systran is a translation technology company founded in 1968 by a researcher at the California Institute of Technology. One of the first companies to develop MT software, it originally set out to improve the translation of Russian into English. Starting with rule-based MT (RbMT) technology, it later developed hybrid RbMT/SMT technology and has since switched to NMT.

With Systran Translate, you can translate into 50 languages.

Microsoft Translator

Microsoft Translator is a multilingual MT cloud service provided by Microsoft. As part of Microsoft Cognitive Services, it’s integrated with multiple consumer, developer, and enterprise products.

Microsoft Translator supports over 100 languages.

Tencent

Tencent Machine Translation is the main MT offering by Chinese technology giant Tencent. The solution combines both NMT and SMT models.

Tencent Machine Translation supports over 160 different language pairs.

How does Amazon Translate compare to Google Translate?

Both Amazon Translate and Google Translate are based on NMT technology. According to various comparisons, Google Translate often tends to be slightly more accurate. Nonetheless, the differences are negligible.

Since there are no professional translators involved in the MT process, both translation tools have their limitations—it all depends on the type of content you want to translate, your language pairs, as well as the specific requirements you have for the tool.

Will Google Translate ever be perfect?

Translation is not just about converting words from one language into another. If it were so, a dictionary would be the only necessary tool of the trade, and we all have seen the very poor (and sometimes very funny) results of working that way. This is because a message isn’t only made of words—it also contains context, intention, non-verbal aspects, etc.

Google Translate has advanced rapidly over the years, but there is still a lot that human translators can do and it can’t:

  • Ask questions
  • Understand context
  • Catch irony
  • Translate creatively
  • Make considered choices
  • Do research
  • Observe consistency
  • Guarantee completeness
  • Deliberately leave out or include information
  • Add glosses/notes

No one knows if or when technology will reach a human level of semantic acuity, but that’s exactly the goal for many. Quantum computing, for example, aims to increase the number of operations and the amount of data that can be processed, so one day machines may be able to learn without human interaction and gain a better understanding of how language is created.

How best to use Google Translate?

Google Translate has grown into a strong productivity tool that can save you time and spare you the hassle of looking for a good translator. Generally speaking, you can use Google Translate for texts that don’t need to be perfect in terms of style and consistency, i.e. for anything that won’t make or break your brand:

  • Low-visibility or low-traffic content, such as internal documentation, website footers, social media posts for sentiment analysis, etc.
  • Repetitive technical content that only needs to be actionable, like instruction manuals, for end-users to access key information to solve a problem
  • User-generated content like product reviews, for which consumers generally don’t expect high quality
  • Quickly perishable content, like chat or email support messages, customer enquiries, etc.
  • Large bulks of content with a short turn-around, such as hundreds of product descriptions that need to go live quickly
  • Frequently amended content like feature and information updates

Nevertheless, if you decide to rely exclusively on Google Translate, you may run a considerable risk of your translation lacking important information, meaning, or grammar. To avoid those pitfalls, it’s key to review and adjust your MT output. This process is known as machine translation post-editing (MTPE). Depending on the level of accuracy you want to achieve, you can apply light or full post-editing. Both approaches will give you the benefits of using MT output while ensuring that your message reaches the intended goal from the start.

As a general guideline, the following cases require machine translation post-editing:

  • Product titles: They are highly informative and concise, they tend to contain proper names and polysemous words, and their word order is usually relatively free, which can cause ambiguity.
  • Translations between language pairs of dissimilar syntax, like Japanese and Spanish, because the reordering of words and phrases to well-formed sentences becomes more challenging for machine translation engines.
  • Product descriptions: They need to be well-crafted and clearly state the product’s features or benefits without room for ambiguity.
  • Content of medium visibility that needs to be as accurate as possible: knowledge base, FAQs, alerts, etc.
  • Back-end SEO meta information such as image alt texts and captions: While their visibility is low, a human needs to ensure that the target-language keywords are present.

All in all, like all other free MT services, Google Translate’s free MT tool is quite handy when you want to translate relatively simple pieces of text quickly. However, for an accurate translation that properly conveys the original meaning, you’ll want to consider post-editing as the most effective way to use machine translation in the long run.