
From MT to market: How Phrase supercharges DeepL with quality, automation, and AI

See how Phrase and DeepL work together to turn fast machine translation into quality, production-ready content. This blog highlights five key capabilities—from automation to AI review—that help teams scale MT without sacrificing control.

Machine translation (MT) is evolving fast. DeepL has quickly become a preferred engine for many localization teams thanks to its fluency, accuracy, and ease of use. But while high-quality MT is a powerful starting point, it’s only one part of a successful localization workflow.

As more enterprises rely on AI-assisted translation to meet growing content demands, the question is no longer whether MT is good enough. It’s now about how to scale it, govern it, and ensure it consistently meets brand and quality standards across markets.

We recently teamed up with DeepL for a joint webinar to show how teams can get the best out of DeepL within the Phrase Localization Platform and turn raw MT into production-ready content at scale.

Hosted by DeepL’s Director of Localization, Morana Perić, the session featured Phrase’s Director of Localization, Francesca Sorrentino, and Solutions Architect Giorgio Vassallo, alongside DeepL’s Localization Operations Specialist, Mattia Doneda.

Together, they explored how Phrase and DeepL work in tandem to bring speed, scale, and control to multilingual content workflows.

Our webinar hosts

DeepL shared how to set up for success, from MT profile configuration to tone-of-voice settings and glossary management. The Phrase team focused on operationalizing MT output across large-scale localization workflows, with built-in automation, quality checks, and AI-powered refinement.

In this article, we’ll share highlights from Phrase, including five capabilities that help transform machine output into business-ready content.

View the full webinar now

Using Phrase to govern and scale MT

At Phrase, we localize Phrase content…using Phrase. It’s a hands-on approach we refer to internally as “Drink Your Own Champagne.” This mindset ensures that every part of the platform is tested, optimized, and refined through daily use.

As Francesca Sorrentino, Director of Localization at Phrase, explained during the webinar:

“One of our main focuses is to make the most of the platform, so beta test everything, all the new capabilities, push the boundaries of the product, and create compelling use cases for our product teams and to share with our customers.”

Francesca Sorrentino, Director of Localization, Phrase

The internal localization program covers a wide range of content, from product UI strings and help center documentation to Phrase’s own website. The team uses Phrase Strings to manage and translate software content, Phrase TMS for larger file-based projects, and Phrase Orchestrator to automate workflows between systems like GitHub, WordPress, and Zendesk.

That deep integration across Phrase tools has surfaced real performance gains.

The team also uses DeepL extensively, particularly for high-volume content where speed and consistency matter (such as dynamic support articles or product interface updates). 

By combining DeepL with Phrase’s automation and quality features (including Phrase QPS, Auto Adapt, and Auto LQA), they’re able to move faster while maintaining the quality bar required for customer-facing content.

This in-house testing ground not only validates how Phrase features work together, it also helps shape them. Many of the capabilities demonstrated in the webinar have been directly informed by real use cases within the company.

Introducing Phrase QPS: A confidence score for MT output

One of the key challenges with machine translation isn’t generating it, but knowing whether you can trust it. That’s where Phrase QPS (Phrase Quality Performance Score) comes in.

Phrase QPS is our built-in confidence score for machine-translated segments. Calculated automatically during pre-translation, it reflects how closely a given segment matches quality expectations, based on model predictions, linguistic patterns, and previous performance data.

Phrase displays the score directly inside the editor, right alongside each translated segment, so linguists don’t have to guess whether the machine output is usable.

Each segment is scored on a scale from 0 to 100, giving translators an at-a-glance sense of the expected post-editing effort. A high score suggests that the MT output is likely accurate, fluent, and ready for delivery. A low score indicates that the segment may need closer review or revision.

“It’s essentially Phrase telling you how good it thinks the provided machine translation is. Similar to how you get fuzzy matches or 100% matches from a translation memory… the higher the number, the better the translation is.” – Giorgio Vassallo, Solutions Architect at Phrase

This scoring system helps linguists prioritize their time and attention. Segments scoring above a configured threshold—often set around 70 or higher—may only require light editing or quick validation. Scores below that threshold suggest more substantive issues, helping editors quickly identify what needs a deeper look.

Importantly, Phrase QPS isn’t just there for visibility, but also to drive automation. Teams can define actions based on score ranges, such as locking high-quality segments to prevent unnecessary edits or routing lower-quality segments into human review workflows.

“Right off the bat, it can give the linguist an idea of how much work is expected of them to post-edit the segment,” Giorgio added.
“But another thing we can do is called QPS routing…”

That next step—automating what happens based on Phrase QPS—is where the real power comes in.

QPS routing via Phrase Orchestrator

While Phrase QPS provides visibility into MT quality, its true value emerges when combined with automation. Using Phrase Orchestrator, teams can act on QPS scores automatically, without the need for manual triage or intervention.

Orchestrator is Phrase’s no-code automation engine. It allows users to define event-based workflows that respond dynamically to content properties, such as file type, language, or in this case, QPS scores. By integrating Phrase QPS into these workflows, teams can optimize post-editing effort and ensure that human time is only spent where it’s truly needed.

“What this does is it basically locks the segments with a certain score—say anything with 70 or above, so that the linguists don’t touch them,” Giorgio explained.

“However, for anything below that, it leaves them open. So this way, I can make sure that my linguists are only working on the segments that really need to be fixed, without having to work on the entire document.”

This simple rule—lock high, edit low—has a significant impact on throughput. Instead of manually reviewing every single segment, linguists can focus exclusively on low-confidence translations. It’s especially effective for high-volume or frequently updated content, where time is limited and full post-editing isn’t realistic.
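The “lock high, edit low” rule can be expressed in a few lines of plain code. The sketch below is purely illustrative: the segment dictionaries, the `route_by_qps` function, and the data shapes are assumptions for this example, not Phrase Orchestrator’s actual API. Only the 70-point threshold comes from the webinar.

```python
# Illustrative sketch of QPS-based routing: lock segments at or above a
# confidence threshold, leave the rest open for post-editing.
# The data shapes here are hypothetical, not the Phrase Orchestrator API.

QPS_THRESHOLD = 70  # example threshold mentioned in the webinar

def route_by_qps(segments, threshold=QPS_THRESHOLD):
    """Split MT segments into locked (high confidence) and open (needs editing)."""
    locked, open_for_editing = [], []
    for seg in segments:
        if seg["qps"] >= threshold:
            locked.append(seg)            # trusted as-is; linguists won't touch it
        else:
            open_for_editing.append(seg)  # routed to human post-editing
    return locked, open_for_editing

segments = [
    {"id": 1, "target": "Willkommen zurück", "qps": 92},
    {"id": 2, "target": "Klicken Sie auf Speichern", "qps": 64},
]
locked, open_segs = route_by_qps(segments)
```

In a real workflow, Orchestrator applies this kind of rule automatically during pre-translation, so linguists only ever see the segments that fall below the threshold.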

In combination with other automation features like Auto Project Creation (APC) and MT profile-based pre-translation, Phrase QPS routing helps teams scale faster without compromising the quality thresholds they’ve set.

Want to see how this works in a live setup? DeepL shares how we use this approach in our Zendesk dynamic content workflow (check out DeepL’s own blog for the details).

Built-in QA checks for source–target consistency

When large volumes of content move through machine translation and post-editing, small issues can easily compound: misplaced tags, missing numbers, broken formatting… To help surface these kinds of inconsistencies, Phrase includes a set of configurable quality assurance (QA) checks that run automatically during the editing process.

The system flags potential issues inline within the Phrase editor, allowing linguists to review and address them as part of their normal workflow. Out of the box, the QA engine checks for:

  • Tag mismatches
  • Empty or untranslated segments
  • Character limit violations
  • Inconsistent punctuation or spacing
  • Regex-defined patterns like code snippets, links, or brand terms

“QA checks detect errors in translation by comparing the source and target segments and finding patterns in them… we can set it up so that it looks, for example, at character limitations, missing numbers, empty targets, or anything really.” – Giorgio Vassallo, Solutions Architect at Phrase

These checks are particularly useful when automation is layered into a workflow, especially when combining machine translation, custom glossaries, and AI-driven post-editing. Errors that might not be visible at a glance can be caught programmatically before they reach reviewers or customers.

Phrase also supports the creation of custom QA categories using regular expressions, offering a way to tailor checks for specific content types or technical rules.

“If there’s a category here that is not listed that you need, then you can create your own using regular expressions… It looks for a specific pattern in the source, and if it matches a specific pattern in the target, then it’ll throw an error.” – Giorgio Vassallo, Solutions Architect at Phrase
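To make the idea concrete, here is a standalone sketch of one such source–target check: flagging numbers that appear in the source but are missing from the target. The regex, the `missing_numbers` helper, and the sample strings are assumptions for illustration; Phrase’s built-in QA engine and its regex categories are configured in the platform, not written as code like this.

```python
import re

# Illustrative regex-based QA check: flag segments where a number present
# in the source is missing from the target. A standalone sketch of the
# idea, not Phrase's built-in QA engine.

NUMBER = re.compile(r"\d+(?:[.,]\d+)?")  # integers and simple decimals

def missing_numbers(source: str, target: str) -> list[str]:
    """Return numbers that appear in the source but not in the target."""
    src_nums = NUMBER.findall(source)
    tgt_nums = set(NUMBER.findall(target))
    return [n for n in src_nums if n not in tgt_nums]

issues = missing_numbers(
    "The trial lasts 30 days and costs 9.99 EUR.",
    "Die Testphase dauert 30 Tage.",
)
# `issues` now lists the numbers the translation dropped
```

The same pattern generalizes to tags, links, code snippets, or brand terms: extract matches from both sides, compare, and flag anything that went missing.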

This gives teams an additional layer of verification, which is especially valuable when working across multiple languages, formats, and systems.

Auto Adapt: Tailoring MT output with validated prompts

High-quality MT may get the message across, but when brand tone, formatting, or inclusivity standards are on the line, even small details matter. 

Auto Adapt is a post-MT refinement layer powered by large language models (LLMs). It applies validated prompts to each segment, allowing teams to automatically tweak output for things like tone, phrasing, or structural consistency (and importantly, does so without manual editing).

“You can use Auto Adapt to instruct the system to apply a specific style to the translation,” Giorgio Vassallo explained.
“For example, fixing formatting, making the language inclusive, or following brand guidelines.”

These prompts can be as simple or specific as needed—from “formalize contractions” to “apply UK spelling” to “enforce non-gendered language.” 

Once configured, Auto Adapt runs in the background as part of the workflow, ensuring consistency at scale across all content types.

This is especially useful when working with MT profiles that already define tone or glossary use. Auto Adapt becomes the final layer, refining the machine output to meet the organization’s expectations before it reaches a human or goes live.
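Conceptually, prompt-based refinement boils down to wrapping each segment in a validated instruction before it goes to an LLM. The sketch below shows that shape; the rule wording, the `build_prompt` helper, and the prompt format are assumptions for this example and not how Auto Adapt is actually configured inside Phrase.

```python
# Illustrative sketch of prompt-based post-MT refinement in the spirit of
# Auto Adapt: each segment is wrapped in validated style instructions
# before being sent to an LLM. The rule texts and build_prompt helper are
# hypothetical, not Phrase's API.

STYLE_RULES = [
    "Apply UK spelling.",
    "Use non-gendered language.",
    "Expand contractions for a formal tone.",
]

def build_prompt(segment: str, rules: list[str]) -> str:
    """Compose a refinement prompt from validated style rules and a segment."""
    instructions = "\n".join(f"- {rule}" for rule in rules)
    return (
        "Revise the translation below, changing only what the rules require.\n"
        f"Rules:\n{instructions}\n"
        f"Translation: {segment}"
    )

prompt = build_prompt("The color settings can't be changed.", STYLE_RULES)
# `prompt` would then be sent to an LLM; the call itself is deliberately omitted
```

Because the rules are fixed and validated up front, every segment gets the same treatment, which is what makes this approach consistent at scale.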

Auto LQA: AI review for enterprise-grade quality assurance

While Auto Adapt adjusts MT output based on defined prompts, Auto LQA takes the next step: reviewing that output to assess its overall quality.

Auto LQA uses AI to evaluate translated content against configurable quality criteria like accuracy, grammar, punctuation, or style, and returns both an overall score and issue-specific annotations. It acts as a fast, scalable complement to manual review.

“Auto LQA gives you a final quality score based on what it found in the segment. If there are no issues, it passes. If there are problems, it can flag them and even categorize them as major or minor.” – Giorgio Vassallo, Solutions Architect at Phrase

Rather than asking reviewers to read every sentence, Auto LQA helps focus attention where it’s most needed. Segments that pass can be approved automatically; those with flagged issues can be routed for review, or addressed via prompt-based revision.
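The pass-or-review decision described above can be sketched as a simple routing function. The result shape (a score plus issues tagged major or minor) and the 90-point pass threshold are assumptions modelled on the Auto LQA description in this article, not its actual output format.

```python
# Illustrative routing on top of AI review results: auto-approve clean
# segments, send flagged ones to human review. The result shape and the
# pass threshold are assumptions, not Auto LQA's real output format.

def route_review(result: dict, pass_score: int = 90) -> str:
    """Decide the next step for a segment based on an AI review result."""
    has_major = any(i["severity"] == "major" for i in result["issues"])
    if result["score"] >= pass_score and not has_major:
        return "auto-approve"   # no issues worth a human's time
    return "human-review"       # flagged issues need a closer look

decision = route_review({"score": 95, "issues": []})
flagged = route_review({"score": 95, "issues": [{"severity": "major"}]})
```

Treating any major issue as an automatic fail, regardless of the overall score, is one common design choice; teams could equally route minor-only segments to a lighter review queue.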

Combined with Phrase QPS and QA checks, Auto LQA completes the quality stack, ensuring that teams can monitor translation quality at scale without adding bottlenecks.

Why it matters: Putting MT to work at scale

DeepL delivers exceptional MT quality, but production-ready content requires more than just fluent output. Phrase makes it possible to take that raw MT and shape it to the needs of real-world localization: reviewed, adjusted, verified, and delivered across teams, tools, and markets.

Together, capabilities like Phrase QPS scoring, segment routing, QA checks, Auto Adapt, and Auto LQA give teams the confidence to use MT at scale while staying in control of style, quality, and consistency. The result is less manual effort, faster turnaround, and a localized experience that still sounds like you.

Explore more

Want to see how these capabilities fit into your workflows?

Learn more about advanced machine translation management features within Phrase, and explore how you can adapt, review, and scale MT with greater confidence.

For setup guidance, Zendesk-specific workflows, and more on MT profiles and tone settings, head to DeepL’s website to read more.

Watch the full webinar