Detecting a User’s Locale in a Web App
Whether we're developing a simple blog or a sophisticated single-page application (SPA), internationalizing (i18n) a web app soon raises an important question: how do we detect a user's language preferences? This matters because we want to provide the best user experience: if the user has defined a set of preferred languages in the browser, we should do our best to present our content in those languages.
In this article, we'll go through three different ways of detecting a user's locale: through the browser's navigator.languages object (on the client), through the Accept-Language HTTP header (on the server), and through geolocation using the user's IP address (on the server).
Client-side: The navigator.languages Object
Modern browsers provide a navigator.languages object that we can use to get all the preferred languages the user has set in the browser.
The language settings in Firefox
Given the settings above, if we were to open the Firefox console and check the value of navigator.languages, we would get the following:
The codes for the locales match the ones in our browser settings
navigator.languages is available in all modern web browsers and is generally safe to rely on. So let's write a reusable JavaScript function that tells us the preferred language(s) of the current user.
function getBrowserLocales(options = {}) {
  const defaultOptions = {
    languageCodeOnly: false,
  };

  const opt = {
    ...defaultOptions,
    ...options,
  };

  const browserLocales =
    navigator.languages === undefined
      ? [navigator.language]
      : navigator.languages;

  if (!browserLocales) {
    return undefined;
  }

  return browserLocales.map(locale => {
    const trimmedLocale = locale.trim();

    return opt.languageCodeOnly
      ? trimmedLocale.split(/-|_/)[0]
      : trimmedLocale;
  });
}
getBrowserLocales() checks the navigator.languages array, falling back on navigator.language if the array isn't available. It's worth noting that in some browsers, like Chrome, navigator.language will be the UI language, which is likely the language the operating system is set to. This is different from navigator.languages, which has the user-set preferred languages in the browser itself.
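✋🏽 Heads up » If you're supporting Internet Explorer, you will need to use the navigator.userLanguage and navigator.browserLanguage properties. Of course, you will also need to replace all instances of const with var in the code above, and rewrite or transpile its other ES2015+ syntax (the default parameter, the object spread, and the arrow function), which Internet Explorer doesn't support.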
Our function also has a convenient languageCodeOnly option, which will trim off the country codes of locales before it returns them. This can be handy when our app isn't really handling the regional nuances of a language, e.g. we just have one version of English content.
With languageCodeOnly: true, we get the languages without countries
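One thing to watch for when trimming country codes: two regional variants of the same language (say, en-CA and en-US) collapse into the same language code, so the result can contain duplicates. The following is a minimal sketch of one way to handle that. The helper name toUniqueLanguageCodes is hypothetical and not part of getBrowserLocales(); it takes the locale list as a plain array so it can be tried outside the browser.

```javascript
// Hypothetical helper (an assumption, not part of the original function):
// collapses locale codes to language codes and removes duplicates,
// keeping the order in which each language first appears.
function toUniqueLanguageCodes(locales) {
  const languageCodes = locales.map((locale) => locale.trim().split(/-|_/)[0]);

  // A Set preserves insertion order, so the user's priority order survives.
  return [...new Set(languageCodes)];
}

// Example input standing in for the output of getBrowserLocales()
const uniqueLanguages = toUniqueLanguageCodes(["en-CA", "en-US", "ar-EG"]);
console.log(uniqueLanguages); // [ 'en', 'ar' ]
```

In a browser, the same helper could be fed the result of getBrowserLocales() directly.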
Server-side: The Accept-Language HTTP Header
If the user sets language preferences in a modern browser, the browser will, in turn, send an HTTP header that relays these language preferences to the server with each request. This is the Accept-Language header, and it often looks something like this: Accept-Language: en-CA,ar-EG;q=0.5.
The header lists the user's preferred languages, each with a weight defined by a q value. When an explicit q value is not specified, a default of 1.0 is assumed. So, in the above header value, the client is indicating that the user prefers Canadian English (with a weight of q = 1.0), then Egyptian Arabic (with a weight of q = 0.5).
We can use this standard HTTP header to determine the user's preferred locales. Let's write a class called HttpAcceptLanguageHeaderLocaleDetector to do this. We'll use PHP here, but you can use any language you like; the Accept-Language header should be the same (or similar enough) in all environments.
<?php

class HttpAcceptLanguageHeaderLocaleDetector
{
    const HTTP_ACCEPT_LANGUAGE_HEADER_KEY = 'HTTP_ACCEPT_LANGUAGE';

    public static function detect()
    {
        $httpAcceptLanguageHeader = static::getHttpAcceptLanguageHeader();

        if ($httpAcceptLanguageHeader == null) {
            return [];
        }

        $locales = static::getWeightedLocales($httpAcceptLanguageHeader);
        $sortedLocales = static::sortLocalesByWeight($locales);

        return array_map(function ($weightedLocale) {
            return $weightedLocale['locale'];
        }, $sortedLocales);
    }

    private static function getHttpAcceptLanguageHeader()
    {
        if (isset($_SERVER[static::HTTP_ACCEPT_LANGUAGE_HEADER_KEY])) {
            return trim($_SERVER[static::HTTP_ACCEPT_LANGUAGE_HEADER_KEY]);
        }

        return null;
    }

    private static function getWeightedLocales($httpAcceptLanguageHeader)
    {
        if (strlen($httpAcceptLanguageHeader) == 0) {
            return [];
        }

        $weightedLocales = [];

        // We break up the string 'en-CA,ar-EG;q=0.5' along the commas,
        // and iterate over the resulting array of individual locales. Once
        // we're done, $weightedLocales should look like
        // [['locale' => 'en-CA', 'q' => 1.0], ['locale' => 'ar-EG', 'q' => 0.5]]
        foreach (explode(',', $httpAcceptLanguageHeader) as $locale) {
            // separate the locale key ("ar-EG") from its weight ("q=0.5"),
            // trimming any whitespace that follows a comma in the header
            $localeParts = explode(';', trim($locale));

            $weightedLocale = ['locale' => trim($localeParts[0])];

            if (count($localeParts) == 2) {
                // explicit weight e.g. 'q=0.5'
                $weightParts = explode('=', $localeParts[1]);

                // grab the '0.5' bit and parse it to a float
                $weightedLocale['q'] = floatval($weightParts[1]);
            } else {
                // no weight given in string, i.e. implicit weight of 'q=1.0'
                $weightedLocale['q'] = 1.0;
            }

            $weightedLocales[] = $weightedLocale;
        }

        return $weightedLocales;
    }

    /**
     * Sort by high to low `q` value
     */
    private static function sortLocalesByWeight($locales)
    {
        usort($locales, function ($a, $b) {
            // usort will cast float values that we return here into integers,
            // which can mess up our sorting. So instead of subtracting the `q`
            // values and returning the difference, we compare the `q` values and
            // explicitly return integer values.
            if ($a['q'] == $b['q']) {
                return 0;
            }

            if ($a['q'] > $b['q']) {
                return -1;
            }

            return 1;
        });

        return $locales;
    }
}
This long bit of code is actually not very complicated. In the only public method, detect(), our class does the following:
- Gets the raw string value of the Accept-Language header, e.g. "en-CA,ar-EG;q=0.5".
- Uses the helper method getWeightedLocales() to parse the header string into an array that looks like [['locale' => 'en-CA', 'q' => 1.0], ['locale' => 'ar-EG', 'q' => 0.5]].
- Uses the helper method sortLocalesByWeight() to sort the above array from highest to lowest q value.
- Plucks the locale values from the sorted array, returning an array that looks like ['en-CA', 'ar-EG'].
We can now use our new class to get a nice, consumable array of locale codes based on the Accept-Language HTTP header.
<?php

$locales = HttpAcceptLanguageHeaderLocaleDetector::detect();
// => ['en-CA', 'ar-EG']
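Since the header looks the same in every environment, the same parse, sort, and pluck steps carry over to other server-side languages. Here is a rough JavaScript sketch for readers on a Node.js backend. The function name parseAcceptLanguage is an assumption for illustration, and the choice to give an unparsable q value a weight of 0 is a design decision of this sketch rather than something the PHP class above does.

```javascript
// Sketch: parses an Accept-Language header value into an array of locale
// codes sorted from highest to lowest q value.
function parseAcceptLanguage(headerValue) {
  if (typeof headerValue !== "string" || headerValue.trim() === "") {
    return [];
  }

  const weightedLocales = headerValue
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry !== "")
    .map((entry) => {
      // Separate the locale ("ar-EG") from its parameters ("q=0.5")
      const [locale, ...params] = entry.split(";").map((part) => part.trim());
      const qParam = params.find((param) => param.startsWith("q="));

      // No explicit weight means an implicit q of 1.0
      const parsedQ = qParam === undefined ? 1.0 : parseFloat(qParam.slice(2));

      // Assumption of this sketch: an unparsable q value ranks last
      return { locale, q: Number.isNaN(parsedQ) ? 0 : parsedQ };
    });

  // Array.prototype.sort is stable in modern engines, so locales with equal
  // weights keep the order in which the header listed them.
  return weightedLocales
    .sort((a, b) => b.q - a.q)
    .map((weightedLocale) => weightedLocale.locale);
}

console.log(parseAcceptLanguage("en-CA,ar-EG;q=0.5")); // [ 'en-CA', 'ar-EG' ]
console.log(parseAcceptLanguage("fr;q=0.3, de;q=0.9, en")); // [ 'en', 'de', 'fr' ]
```

In a Node.js HTTP handler, the raw value would typically come from the request's accept-language header.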
Server-side: Geolocation by IP Address
Sometimes the Accept-Language header won't be present in the requests to our server. In these cases we might want to use the user's IP address to determine the user's country, and infer the locale or language from this country.
✋🏽 Heads up » Geolocation should be used as a last resort when detecting the user's locale, as it can often lead to an incorrect locale determination. For example, if we see that our user is coming from Canada, do we assume that the user's preferred language is English or French? Both are official and widely used languages in the country. And, of course, the user could belong to an Arabic-speaking minority, or be a Spanish-speaking visitor.
Using MaxMind for Geolocation
In order to determine the user's country by the request's IP address, we'll use the MaxMind PHP API and the MaxMind geolocation database. MaxMind is a company that offers a few IP-related products, and among them are two that are of interest to us here:
- The GeoIP2 Databases: these are MaxMind's commercial geolocation databases and are low-latency and subscription-based. You may want to upgrade to these if you want more up-to-date or faster databases.
- The GeoLite2 Databases: these are MaxMind's free geolocation databases, and while reportedly less accurate than their commercial counterparts, they're more than enough to get started with. We'll be using a GeoLite2 database here. Do note that you will need to credit MaxMind on your public web page and link back to their site if you use one of their free databases.
To install the database, just sign up for a free MaxMind account. You'll receive an email with a sign-in link. Follow the link and sign in. Once you do, you should land on your Account Summary page.
Click the Download Databases link on the Account Summary page
This will take you to a page with the list of free GeoLite2 databases. Grab the country binary database from there.
We want the country binary database for our purposes
Place the file you downloaded somewhere in your project.
We'll also need the MaxMind PHP API to work with the database. We can install that with Composer.
composer require geoip2/geoip2:~2.0
Peter Kahl's Country-to-Locale Package
We'll need one more package before we get to our code. In order to determine the locales or languages of a country, we'll use Peter Kahl's country-to-locale package. We can install it using Composer as well.
composer require peterkahl/country-to-locale
The IP Address Locale Detector Class
With our setup in place, we can get to our own class, IpAddressLocaleDetector.
<?php

require '../vendor/autoload.php';

use GeoIp2\Database\Reader;
use peterkahl\locale\locale;

class IpAddressLocaleDetector
{
    const MAX_MIND_DB_FILEPATH = __DIR__ . '/GeoLite2-Country_20200121/GeoLite2-Country.mmdb';

    private static $maxMindDbReader;

    public static function detect()
    {
        $ipAddress = static::getIpAddress();

        try {
            $record = static::getMaxMindDbReader()->country($ipAddress);
            $locales = locale::country2locale($record->country->isoCode);
            $normalizedLocales = str_replace('_', '-', $locales);

            return explode(',', $normalizedLocales);
        } catch (Exception $ex) {
            // Geolocation failed (e.g. the address isn't in the database), so
            // return an empty array, like HttpAcceptLanguageHeaderLocaleDetector
            // does, which keeps count() checks on the result safe.
            return [];
        }
    }

    private static function getIpAddress()
    {
        return $_SERVER['REMOTE_ADDR'];
    }

    private static function getMaxMindDbReader()
    {
        if (static::$maxMindDbReader == null) {
            static::$maxMindDbReader = new Reader(static::MAX_MIND_DB_FILEPATH);
        }

        return static::$maxMindDbReader;
    }
}
Our class is relatively straightforward. Much like HttpAcceptLanguageHeaderLocaleDetector, it has one public method, detect(), which does the following:
- Gets the request's IP address from the global $_SERVER array.
- Feeds this IP address to the MaxMind database Reader's country() method, which attempts to geolocate a country based on the IP address.
- Uses Peter Kahl's locale::country2locale() to get the locales of the given country.
- Normalizes the acquired locales, so that "en_CA,ar_EG" becomes "en-CA,ar-EG".
- Returns the locales it normalized as an array, e.g. ["en-CA", "ar-EG"].
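The normalization step is worth isolating, because it's what makes the geolocation results interchangeable with the hyphenated codes that browsers report. As a small illustration in JavaScript, here is one way to express it. The function name normalizeLocaleList is hypothetical, and the input format is simply the "en_CA,ar_EG" string shown above.

```javascript
// Hypothetical helper: converts a comma-separated, underscore-delimited
// locale list into an array of hyphen-delimited locale codes.
function normalizeLocaleList(localeList) {
  return localeList
    .split(",")
    .map((locale) => locale.trim().replace(/_/g, "-"))
    .filter((locale) => locale !== ""); // drop empty entries, e.g. from ""
}

console.log(normalizeLocaleList("en_CA,ar_EG")); // [ 'en-CA', 'ar-EG' ]
```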
📖 Go deeper » The MaxMind Reader has many more methods. Check out the official API documentation if you want to dive a bit deeper into the info available in the MaxMind databases.
Server-side: Cascading Locale Detection
Given the two server-side detection strategies we covered above, we can write a little detect_user_locales() function that attempts the HTTP header strategy first and only then falls back on IP geolocation.
<?php

require './HttpAcceptLanguageHeaderLocaleDetector.php';
require './IpAddressLocaleDetector.php';

function detect_user_locales()
{
    $locales = HttpAcceptLanguageHeaderLocaleDetector::detect();

    // empty() covers both an empty array and a null result
    if (empty($locales)) {
        $locales = IpAddressLocaleDetector::detect();
    }

    if (empty($locales)) {
        // fall back on some default locale, English in this case
        $locales = ['en'];
    }

    return $locales;
}
If HTTP header detection fails, detect_user_locales() will try IP geolocation detection. If the latter bears no fruit, the function will fall back on some default locale.
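The cascade generalizes to any number of strategies: try each detector in priority order and stop at the first one that returns something. A minimal JavaScript sketch of that idea follows. The name detectWithFallback and the stand-in detector functions are assumptions for illustration, not part of the PHP code above.

```javascript
// Sketch: tries each detector in order and returns the first non-empty
// array of locales; if none produces a result, returns the default locales.
function detectWithFallback(detectors, defaultLocales) {
  for (const detect of detectors) {
    const locales = detect();

    if (Array.isArray(locales) && locales.length > 0) {
      return locales;
    }
  }

  return defaultLocales;
}

// Stand-in detectors: header detection finds nothing, geolocation succeeds.
const fromHeader = () => [];
const fromIpAddress = () => ["fr-CA", "en-CA"];

console.log(detectWithFallback([fromHeader, fromIpAddress], ["en"])); // [ 'fr-CA', 'en-CA' ]
console.log(detectWithFallback([fromHeader, () => null], ["en"])); // [ 'en' ]
```

Passing the detectors as a list keeps the priority order explicit and makes it easy to add or drop a strategy later.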
If handled carefully, detecting the user's locale can help provide a better user experience in our web apps. Thankfully, the navigator.languages object and Accept-Language HTTP header are available to reduce our guesswork when it comes to locale detection.
If you and your team are working on an internationalized web app, check out Phrase for a professional, developer-friendly i18n platform. Featuring a flexible CLI and API, translation syncing with GitHub and Bitbucket integration, over-the-air (OTA) translations, and much more, Phrase has your i18n covered, so you can focus on your business logic.
Check out all Phrase features for developers and see for yourself how it can streamline your software localization workflows.