Machine translation

Which Machine Translation Provider Is Best for Which Language Pair?

Learn how to determine the most suitable machine translation engine for a specific language pair in your translation project.

Machine translation (MT) and post-edited MT have become increasingly popular, both with localization managers trying to speed up their software translation process and with translators themselves. Since translators are often paid by the word, increased output at a consistently high translation quality is always welcome.

Phrase customers can use the pre-translation feature to translate new content automatically and set up a review process for manual post-editing by human translators.

If you are managing your organization’s localization process in Phrase, you can define which machine translation provider is used for which language pair. That leaves only one question: how do you choose the best MT provider for each language pair? In this article, we have gathered information and recommendations to help with this choice.

Which machine translation providers offer which languages?

(As of July 2020)

DeepL: Currently supports the following 11 languages: English, German, French, Spanish, Portuguese, Dutch, Italian, Polish, Russian, Japanese, and Chinese.

Amazon Translate: Currently available in 55 languages.

Microsoft Translator: Currently available in 74 languages.

Google Translate: Currently available in 109 languages.

How do you judge machine translation provider quality?

Let’s imagine you are translating a language pair that is supported by all four machine translation providers. You still need to find out which service will provide the best results, meaning which translations are semantically correct, match your tone of voice, and are the fastest for human translators to post-edit.

Unfortunately, there is no clear winner. There is no way to say that one language pair is always best translated by DeepL, Amazon, Microsoft, or Google Translate. Which machine translation provider delivers the highest quality strongly depends on your source copy: the vocabulary that is used (e.g. legal documents vs. marketing material, travel industry vs. logistics), the tone of voice (formal, informal), and other factors. Here are two suggestions for how to test your content with different MT providers. The first would be to let translators evaluate machine translations in a blind test. The second is to analyze your post-editing score.

Let translators evaluate machine translations

Prepare a set of test content from your product. Provide the source content and the machine translations from all available MT providers and let translators evaluate these translations in a blind test to determine which one provides the best basis for their post-editing process.
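Once the rankings are in, you need a simple way to aggregate them across translators. The sketch below (with hypothetical translator names and rankings, not our actual test data) averages each provider's rank, where a lower average rank means a more preferred provider:

```python
from collections import defaultdict

# Hypothetical blind-test results: each translator ranks the anonymized
# MT outputs from best (1) to worst (4). Provider names are revealed
# only after ranking so the test stays blind.
rankings = {
    "translator_de": {"DeepL": 1, "Google": 2, "Microsoft": 3, "Amazon": 4},
    "translator_fr": {"Google": 1, "DeepL": 2, "Amazon": 3, "Microsoft": 4},
    "translator_ru": {"Google": 1, "DeepL": 2, "Microsoft": 3, "Amazon": 4},
}

def average_ranks(rankings):
    """Average each provider's rank across translators (lower is better)."""
    per_provider = defaultdict(list)
    for ranks in rankings.values():
        for provider, rank in ranks.items():
            per_provider[provider].append(rank)
    return {p: sum(r) / len(r) for p, r in per_provider.items()}

scores = average_ranks(rankings)
best = min(scores, key=scores.get)
print(scores)
print(f"Preferred provider: {best}")
```

Averaging ranks keeps the aggregation robust even when translators for different target languages disagree; with a larger translator pool, you could also count first-place votes per provider instead.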

We applied this method to evaluate MT providers for our own content: we asked our in-house translators for German, French, and Russian to rank the different machine translation results from their favorite to their least favorite in a blind test.

Analyze your post-editing score

There is a wide range of approaches to measuring the quality of machine-translated content: BLEU, TER, and HTER, to mention only a few. A rather simple one that you can apply to your own content in no time is the post-editing score, which measures the share of content that was translated correctly and remained unchanged during post-editing. The higher the post-editing score, the higher the quality of the machine translation.

Just compare the machine-translated copy with the post-edited, final copy and count the characters or words that were edited. There are several free tools (e.g., Diffchecker, countwordsfree) that you can use for this kind of text comparison and difference calculation. The share of edited content can be calculated either by character or by word. In our examples, we refer to the post-editing score based on words.
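Instead of a diff tool, you can also script the comparison. The sketch below (a simplified word-level approximation using Python's standard `difflib`, not the exact method behind any particular tool) counts the MT words that survived post-editing unchanged:

```python
import difflib

def post_editing_score(mt_output: str, post_edited: str) -> float:
    """Share of MT words left unchanged by post-editing, as a percentage."""
    mt_words = mt_output.split()
    pe_words = post_edited.split()
    if not mt_words:
        return 0.0
    matcher = difflib.SequenceMatcher(a=mt_words, b=pe_words)
    # Sum the lengths of all word runs common to both versions.
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    return 100 * unchanged / len(mt_words)

mt = "The quick brown fox jumps over a lazy dog"
pe = "The quick brown fox jumps over the lazy dog"
print(round(post_editing_score(mt, pe), 1))  # 8 of 9 words unchanged -> 88.9
```

Run this over a representative sample of segments per provider and average the results; a word-based score like this one is less sensitive to minor punctuation fixes than a character-based score.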

We calculated the post-editing score for our website content for the following language pairs: English to German, English to French, and English to Russian.

[Figure: Post-editing score (words), English to German]

[Figure: Post-editing score (words), English to French]

[Figure: Post-editing score (words), English to Russian]

For our content and languages, the scores of the different MT providers are very close. As mentioned above, these results heavily depend on your source content. The post-editing scores for your content could look very different.

Summary: How to pick the best MT provider for each language pair

  1. Check which MT providers are available for your language pair.
  2. If you work with internal translators or have a fixed team of external translators, ask them to evaluate which MT provider they prefer. It is the translators who need to work with the machine-translated content, so adopting their preferred provider will save time for everyone.
  3. If you are working with a changing set of translators, calculate the post-editing score for your language pairs and choose the provider with the highest score.

Next step: Update your machine translation settings in Phrase

Update your machine translation settings in Phrase according to your findings. Just add the language pairs and select the best-performing MT provider.