What’s the Best Database Structure to Keep Multilingual Data?

There are many ways of designing databases to accommodate localized, multilingual data. Let’s explore the most common ones and try to choose the best one.

There are many ways to design our databases to accommodate localized data. Modeling can be simple or sophisticated depending on our app’s needs, and there are general considerations for any database that works with a localized app.

A Simple Approach: Localized Columns on the Model

The most basic approach to localizing relational database data is localizing within the model itself. This is as simple as having title_en and title_fr columns instead of a title column, for example.

CREATE TABLE `articles_i18n_simple` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `published_at` datetime NOT NULL,
  `title_en` varchar(2000) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `title_fr` varchar(2000) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `body_en` text COLLATE utf8mb4_unicode_ci,
  `body_fr` text COLLATE utf8mb4_unicode_ci,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

We could then query our table like normally do (without i18n) and then refine our data in the application layer, extracting the title and body that match the active locale in the app. This is a good solution for small apps, especially ones where we’re sure that our supported locales won’t change.

A Sophisticated Approach: Separate Translation Tables

While our simple approach above is quick and easy to use, it’s a bit inflexible. If we added Hebrew to the above table, we would have to add new Hebrew versions for every localizable column. A solution that scales better is having translations in a separate table per model.

CREATE TABLE `articles_i18n` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `published_at` datetime NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE `articles_i18n_translations` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `article_id` int(11) NOT NULL,
  `locale` varchar(5) COLLATE utf8mb4_unicode_ci NOT NULL,
  `title` varchar(2000) COLLATE utf8mb4_unicode_ci NOT NULL,
  `body` text COLLATE utf8mb4_unicode_ci,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

Of course, this makes our queries a bit more complex and expensive, since we may have to join or perform two queries to get all our model data.

SElECT articles_i18n.id AS id, published_at, title, body
FROM articles_i18n
INNER JOIN articles_i18n_translations ON article_id = articles_i18n.id
WHERE locale = "fr";

With that complexity, however, comes flexibility: we can have as many locales as we want, and it wouldn’t change the basic structure of our models.

General Considerations for Database i18n

Unicode

Notice that the tables and text fields above are defined with utfmb4_unicode_ci. This collation gives us full, case-insensitive Unicode – the storage and sorting of our text data will include all characters in all languages supported by the Unicode standard. Other collations can cause problems with certain characters, so do make sure you know which locales your app is supporting when selecting a collation if you’re not going with utfmb4_unicode_ci.

Dates and Times

It’s a good idea to store dates and times in UTC, and when reading the data, add or subtract a time zone offset to get the date or time in the user’s time zone. Optionally, if you want to log the time zone the data was stored in, include a second time zone offset column for each date/time column, while still keeping the date and time column themselves in UTC.

Further reading

If you’re interested in how to speed up your ActiveRecord SQL queries for your internationalized database, have a look at our guide.
For a relaxed i18n experience, check out Phrase. Built by developers for developers, Phrase is a one-stop-shop for all your i18n needs. With its GitHub, GitLab, and Bitbucket sync, and flexible CLI and API, you can tailor your workflow to automatically push translation strings to Phrase. Your translation team picks up the strings in a well-designed web interface, localizes them, and saves them. You can then seamlessly pull these translations down to your dev machine and get back to the creative code you love. Phrase supports a comprehensive collection of platforms. And if you happen to be working in iOS or Android, you can push your translations over-the-air (OTA) without app reviews. Yes, that’s your developer’s heart smiling, and that’s why we’re here. Check out all of Phrase’s features and try Phrase for free.

Keep exploring

Photo-realistic sheet music featuring developer-style translation code in place of musical notes. The staff lines show snippets like t('auth.signin.button') and JSON structures, combining the aesthetics of musical notation with programming syntax to illustrate the idea of “composable localization.”

Blog post

Localization as code: a composable approach to localization

Why is localization still a manual, disconnected process in a world where everything else is already “as code”? Learn how a composable, developer-friendly approach brings localization into your CI/CD pipeline, with automation, observability, and Git-based workflows built in.

A woman in a light sweater sits in a home office, focused on her laptop, representing a developer or content manager working on WordPress localization tasks in a calm, professional environment.

Blog post

How to build a scalable WordPress i18n workflow

WordPress powers the web, but translating it well takes more than plugins. Discover how to build a scalable localization workflow using gettext, best practices, and the Phrase plugin.

Blog post

Localizing Unity games with the official Phrase plugin

Want to localize your Unity game without the CSV chaos? Discover how the official Phrase Strings Unity plugin simplifies your game’s localization workflow—from string table setup to pulling translations directly into your project. Whether you’re building for German, Serbian, or beyond, this guide shows how to get started fast and localize like a pro.

Blog post

Internationalization beyond code: A developer’s guide to real-world language challenges

Discover how language affects your UI. From text expansion to pluralization, this guide explores key i18n pitfalls and best practices for modern web developers.

A digital artwork featuring the Astro.js logo in bold yellow and purple tones, floating above Earth's horizon with a stunning cosmic nebula in the background. The vibrant space setting symbolizes the global and scalable nature of Astro’s localization capabilities, reinforcing the article’s focus on internationalization in web development.

Blog post

Astro.js localization part 2: dynamic content localization

Learn how to localize your Astro.js website with static and dynamic content translation. Explore Astro’s built-in i18n features and Paraglide for handling UI elements, navigation, and dynamic text seamlessly.