Which Machine Translation Provider Is Best for Which Language Pair?

Machine translation is widely used to increase efficiency in translation projects. Linguists can save a significant amount of time by post-editing auto-translated content instead of translating from scratch. How can you determine which machine translation provider is best for which language pair? Here’s how to find out.

Machine translation (MT) and post-edited MT have become increasingly popular, both with localization managers trying to speed up their software translation process and with translators themselves. Since translators are often paid by the word, higher output at consistently high translation quality is always welcome.

Phrase customers can use the Autofill feature to translate new content automatically and set up a review process for manual post-editing by human translators.

If you are managing your organization’s localization process in Phrase, you can define which machine translation provider is used for which language pair. That leaves one question: how do you choose the best MT provider for each language pair? In this article, we have gathered information and recommendations that can help with this choice.

Which Machine Translation Providers Offer Which Languages?

(As of July 2020)

DeepL: Currently supports 11 languages: English, German, French, Spanish, Portuguese, Dutch, Italian, Polish, Russian, Japanese, and Chinese.

Amazon Translate: Currently available in 55 languages.

Microsoft Translator: Currently available in 74 languages.

Google Translate: Currently available in 109 languages.

How Do You Judge Machine Translation Provider Quality?

Let’s imagine you are translating a language pair that is supported by all four machine translation providers. You still need to find out which service provides the best results, that is, translations that are semantically correct, match your tone of voice, and are the fastest for human translators to post-edit.

Unfortunately, there is no clear winner. No language pair is always translated best by DeepL, Amazon, Microsoft, or Google Translate. Which machine translation provider delivers the highest quality depends strongly on your source copy: the vocabulary used (e.g. legal documents vs. marketing material, travel industry vs. logistics), the tone of voice (formal or informal), and other factors. Here are two ways to test your content with different MT providers. The first is to let translators evaluate machine translations in a blind test. The second is to analyze your post-editing score.

Let Translators Evaluate Machine Translations

Prepare a set of test content from your product. Provide the source content and the machine translations from all available MT providers and let translators evaluate these translations in a blind test to determine which one provides the best basis for their post-editing process.
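A blind test like this can be prepared and evaluated with a short script. The following is a minimal sketch under stated assumptions, not a description of how any particular evaluation was run: the function names make_blind_item and mean_ranks are hypothetical, each translator is assumed to return a best-first list of labels per segment, and providers are compared by their average rank.

```python
import random
import string
from collections import defaultdict


def make_blind_item(translations: dict[str, str], rng: random.Random):
    """Shuffle one segment's translations and hide provider names behind labels.

    translations maps provider name -> machine translation of the segment.
    Returns (shown, key): shown maps label -> translation (given to the
    translator), key maps label -> provider (kept by the test organizer).
    """
    providers = list(translations)
    rng.shuffle(providers)
    labels = string.ascii_uppercase
    shown = {labels[i]: translations[p] for i, p in enumerate(providers)}
    key = {labels[i]: p for i, p in enumerate(providers)}
    return shown, key


def mean_ranks(rankings: list[list[str]], keys: list[dict[str, str]]) -> dict[str, float]:
    """Average rank per provider (1 = favorite).

    rankings[i] lists the labels of segment i from favorite to least favorite;
    keys[i] is the label -> provider key for that segment.
    """
    ranks = defaultdict(list)
    for ranking, key in zip(rankings, keys):
        for position, label in enumerate(ranking, start=1):
            ranks[key[label]].append(position)
    return {provider: sum(r) / len(r) for provider, r in ranks.items()}
```

Shuffling per segment keeps translators from learning which label belongs to which provider. The provider with the lowest average rank is the one translators preferred most often.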

We applied this method to evaluate MT providers for our own content on phrase.com. In a blind test, we asked our in-house translators for German, French, and Russian to rank the machine translation results from their favorite to their least favorite. Here are our results:

Analyze Your Post-Editing Score

There is a wide range of approaches to calculating the quality of machine-translated content: BLEU, TER, and HTER, to mention only a few. A rather simple one that you can apply to your own content in no time is the post-editing score method, which measures the share of content that was translated correctly and remained unchanged during post-editing. The higher the post-editing score, the higher the quality of the machine translation.

Just compare the machine-translated copy with the post-edited, final copy and count the characters or words that were edited. There are several free tools (e.g., Diffchecker, countwordsfree) that you can use for this kind of text comparison and difference calculation. The share of edited content can be calculated either by character or by word. In our examples, the post-editing score is based on words.
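The word-based comparison can also be scripted. The following is a minimal sketch, not the tooling used for the scores in this article: the function name post_editing_score is made up for illustration, words are split on whitespace only, and the score is assumed to be the number of machine-translated words that survive post-editing divided by the total number of machine-translated words. The two example sentences are invented.

```python
from difflib import SequenceMatcher


def post_editing_score(machine_translation: str, post_edited: str) -> float:
    """Share of machine-translated words left unchanged by post-editing (0.0 to 1.0).

    Assumption: words are whitespace-separated tokens, and a word counts as
    unchanged if it is part of a matching block found by difflib.
    """
    mt_words = machine_translation.split()
    pe_words = post_edited.split()
    if not mt_words:
        return 0.0
    matcher = SequenceMatcher(a=mt_words, b=pe_words, autojunk=False)
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    return unchanged / len(mt_words)


# Invented example: one of five words was changed during post-editing.
mt = "Das ist ein kleiner Test"
final = "Das ist ein kurzer Test"
print(post_editing_score(mt, final))  # prints 0.8
```

For a character-based score, the same approach works on the raw strings instead of the word lists. Punctuation attached to a word counts as part of that word here, so a more careful tokenizer may give slightly different numbers.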

We calculated the post-editing score for our website content on phrase.com for the language pairs: English to German, English to French, and English to Russian.

For our content and languages, the scores of the different MT providers are very close. As mentioned above, these results heavily depend on your source content. The post-editing scores for your content could look very different.

Summary: How to Pick the Best MT Provider for Each Language Pair

  1. Check which MT providers are available for your language pair.
  2. If you work with internal translators or have a fixed team of external translators, ask them to evaluate which MT provider they prefer. It is the translators who need to work with the machine-translated content, so adopting their preferred provider will save time for everyone.
  3. If you are working with a changing set of translators, calculate the post-editing score for your language pairs and choose the provider with the highest score.
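The last step of the list above, choosing the provider with the highest post-editing score per language pair, can be sketched as follows. This is an illustrative sketch only: the function name pick_providers, the provider names provider_a and provider_b, and all numbers are placeholders and not measured results.

```python
def pick_providers(scores: dict[str, dict[str, float]]) -> dict[str, str]:
    """Return the highest-scoring provider for each language pair.

    scores maps language pair -> {provider: post-editing score}.
    Language pairs without any score are skipped. On a tie, the provider
    listed first for that pair wins.
    """
    return {
        pair: max(by_provider, key=by_provider.get)
        for pair, by_provider in scores.items()
        if by_provider
    }


# Placeholder numbers for illustration only, not real measurements.
example_scores = {
    "en-de": {"provider_a": 0.82, "provider_b": 0.79},
    "en-fr": {"provider_a": 0.70, "provider_b": 0.75},
}
print(pick_providers(example_scores))
# prints {'en-de': 'provider_a', 'en-fr': 'provider_b'}
```

When the scores for a language pair are very close, translator preference is probably the better tie-breaker than a small difference in score.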

Next Step: Update Your Machine Translation Settings in Phrase

Update your machine translation settings in Phrase according to your findings. Just add the language pairs and select the best-performing MT provider.
