12 Reasons Why Software Localization Can Be Hard

Multiple languages, geographic regions and Unicode are just some of the reasons why software localization (L10n) is so dang hard. Check out all 12 here.

Are you about to embark on your software localization voyage? Or perhaps you’re already in the middle of assigning locales and hreflang to your internal pages. Whatever stage you’re at in your project, there will be a few moments when you feel overwhelmed. That’s because rolling out sophisticated software to multiple languages and regions isn’t easy. In fact, it’s tough. Real tough. Here are at least 12 reasons why software localization (L10n) is so dang hard:

1. There Are a Lot of People in the World

Just contemplating the global population can make any product manager’s head spin at the enormity of the task before them. As globalization increases, the number of internet subscribers rises. It currently stands at over 3 billion, and that’s a lot of potential customers.

Mobile phone subscribers have already overtaken that number and are set to reach 4.77 billion by 2017. That means that well over half the world’s population has access to the internet, and potentially to your software. Not only will you need to target them in the right language and regional dialect, but you’ll also need to reach them on the appropriate platform and through the most popular search engines where they live.

While English is undisputedly the most popular language on the web, just 26.3% of all internet transactions are carried out in English. If you don’t have a plan in place for the other three quarters of internet users, you’re missing out on a huge market share.

2. Who Speak Thousands of Languages

Just watching the Olympics opens your eyes to the number of countries there are in the world, many of which you never hear anything about for another four years. There are generally agreed to be 195 countries in the world, if you don’t count certain disputed territories. But that doesn’t mean there are 195 languages or dialects. Or even double that number. At the last count, according to the BBC, there were an estimated 7,000 languages in the world, although not all of them are widely spoken.

Even if you take a popular language like Spanish, you still can’t offer a one-size-fits-all solution to your L10n rollout. In fact, specifying a language but not a country is one of the most common software localization (L10n) mistakes.

Why? Because the Spanish spoken in Buenos Aires is different from that of Barcelona, which is different from that of Bogota. Developers must therefore be as precise as they can and use a full locale instead of just a language. This must contain both the language code and the code of the country where it’s spoken, such as en-GB or fr-FR.
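To make that concrete, here is a minimal Python sketch of splitting a locale into its language and country parts and building a fallback list. The names `parse_locale` and `fallback_chain` and the `en-US` default are assumptions made for illustration, and the sketch only handles simple two-part tags (no script subtags such as `zh-Hant-TW`).

```python
from typing import NamedTuple, Optional


class Locale(NamedTuple):
    language: str           # language code, e.g. "es"
    region: Optional[str]   # country code, e.g. "AR", or None if missing


def parse_locale(tag: str) -> Locale:
    """Split a simple 'language-REGION' tag; '-' and '_' are both accepted."""
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 else None
    return Locale(language, region)


def fallback_chain(tag: str, default: str = "en-US") -> list[str]:
    """Most specific locale first, then the bare language, then a default."""
    loc = parse_locale(tag)
    chain = []
    if loc.region:
        chain.append(f"{loc.language}-{loc.region}")
    chain.append(loc.language)
    if default not in chain:
        chain.append(default)
    return chain
```

With this in place, a request for `es-AR` would first look for Argentinian Spanish strings, then generic Spanish, then the default, rather than treating every Spanish speaker the same.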

3. Different Regional Dialects

Let’s go back to the example of Spanish in Barcelona. You can get away with using a standard “es” for Spain, as 98% of the population there speak Castilian Spanish. But if you want to localize further, in Barcelona around 60% of the population also speak Catalan.

The key to creating trust, and thus winning more buyers, lies in speaking to your customers in their own language. Catalonia has long been seeking independence from Spain. It’s a proud territory whose people like speaking their mother tongue. Fail to fully localize your software into Catalan, and you’re fighting against local competition that speaks your customers’ language, as well as international giants who are running their own L10n projects well.

4. Not All Languages Take Up the Same Amount of Space

Any good web designer is aware of the importance of space: how necessary it is that a website looks clean, is easy to navigate and doesn’t overwhelm your audience with too much information. White space in the right places can make your website breathe and zero the user in on a particular point of interest, making it easier for them to Buy Now, Click Here, Request a Quote or View More.

But here’s the thing about space when it comes to software localization (L10n). Not all languages take up the same amount of it. So, when you start localizing your software, you’ll need to keep in mind the necessary physical space that other languages need. Your software must be as flexible as possible to accommodate large-scale international localization (L10n).

Let’s take an example. A sentence as common as “try our new product line” is five simple words in English. It becomes “私たちの新しい製品ラインを試してみてください” in Japanese, or “essayez notre nouvelle gamme de produits” in French. If you’re working with limited space between headers, navigation bar and images, your translated words won’t fit within the framework of your design, which means that you won’t deliver an optimal user experience (UX).
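You can get a rough feel for the difference before a designer ever sees it. The sketch below is only an approximation: it counts East Asian wide characters as two columns and everything else as one, which is how monospaced terminals behave, not how a real browser font renders. The helper name `display_width` is an assumption for this example.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate width in columns: wide and fullwidth characters count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


# The three versions of the sentence from the example above.
samples = {
    "en": "try our new product line",
    "fr": "essayez notre nouvelle gamme de produits",
    "ja": "私たちの新しい製品ラインを試してみてください",
}

widths = {lang: display_width(text) for lang, text in samples.items()}

for lang, width in widths.items():
    print(lang, width)
```

Even by this crude measure, both translations need well over one and a half times the room of the English original, so a button sized to fit the English text will overflow.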

5. Some Languages Read from Right to Left or Vertically

Let’s take Arabic and Hebrew to start with. Both these languages read from right to left. Many East Asian languages, meanwhile, use characters that are often displayed vertically. What does this mean for your software localization (L10n) project? That you’ll need to be prepared to accommodate complex text flow and have a viable plan for these differences.

How do you do that? Your developers will need layouts that allow single characters to be placed under one another for vertical text, or a direction setting that loads a different stylesheet based on the language of the locale.
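One way the direction approach might look is sketched below in Python. The list of right-to-left languages is deliberately short and incomplete, and the stylesheet file names are hypothetical; a real project would more likely take direction data from a localization library.

```python
# Illustrative subset only: Arabic, Hebrew, Persian, Urdu.
RTL_LANGUAGES = {"ar", "he", "fa", "ur"}


def text_direction(locale_tag: str) -> str:
    """Return 'rtl' or 'ltr' based on the language part of the locale."""
    language = locale_tag.replace("_", "-").split("-")[0].lower()
    return "rtl" if language in RTL_LANGUAGES else "ltr"


def stylesheet_for(locale_tag: str) -> str:
    """Pick a stylesheet by direction (file names are made up for the example)."""
    return f"styles.{text_direction(locale_tag)}.css"


def html_open_tag(locale_tag: str) -> str:
    """Build an opening <html> tag carrying both the locale and its direction."""
    return f'<html lang="{locale_tag}" dir="{text_direction(locale_tag)}">'
```

Calling `stylesheet_for("ar-SA")` would select the right-to-left stylesheet, while `stylesheet_for("en-GB")` keeps the default left-to-right one.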

6. Creating Strings is Difficult

On that note, another reason that software localization (L10n) is so dang hard is the fact that creating strings is far from easy. Especially when contemplating these language variations. The problem is often made more difficult by the fact that not many developers are familiar with how translators work.

If they create concatenated strings that split sentences into several keys, your translators will have to guess at what’s coming next. Since translations can’t be done without context or carried out word for word, that will seriously slow your project down as translators go back and correct their work.

On top of that, the sentence structure is often completely different in other languages, so it’s always best to create strings that are complete sentences.
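Here is a small Python sketch of the difference. The message key, the `translate` helper and the German wording are illustrative assumptions, not taken from any particular framework.

```python
# One complete sentence per key, with named placeholders the translator can move.
MESSAGES = {
    "en": {"invite": "{user} invited you to {project}"},
    "de": {"invite": "{user} hat Sie zu {project} eingeladen"},
}


def translate(locale: str, key: str, **params: str) -> str:
    """Look up a full-sentence template, falling back to English if missing."""
    template = MESSAGES.get(locale, {}).get(key, MESSAGES["en"][key])
    return template.format(**params)


def concatenated_invite(user: str, project: str) -> str:
    """Anti-pattern: the word order is frozen in code and cannot be translated."""
    return user + " invited you to " + project


print(translate("en", "invite", user="Ana", project="Atlas"))
print(translate("de", "invite", user="Ana", project="Atlas"))
```

In the German version the verb moves to the end of the sentence, after the project name. A translator can only do that when they receive the whole sentence with its placeholders, not three separate fragments.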

7. There’s The Cultural Issue

Once you have the different language versions of your site using the right locales with the language and the country code, you haven’t quite finished the job. Software localization (L10n) is about more than just international wordsmiths and a spectacular team of developers behind the scenes. Real, targeted localization requires thorough research of your destination market, which often means having a local team or consultant on the ground.

Local testers will ensure that your color scheme is optimal for your target audience and that your images and units of measure are appropriate. For example, a picture of holidaymakers enjoying the beach may work well in Germany, but could be prohibited in Saudi Arabia. Dates and measurements are not displayed the same way from England to the US to Australia.
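Dates are an easy place to see this in code. The sketch below hard-codes a handful of common numeric date patterns purely for illustration; the pattern table and the `format_date` helper are assumptions, and a production system would normally rely on a locale data library instead of a hand-written dictionary.

```python
from datetime import date

# Common numeric date conventions, hard-coded here for illustration only.
DATE_PATTERNS = {
    "en-US": "%m/%d/%Y",   # month first
    "en-GB": "%d/%m/%Y",   # day first
    "en-AU": "%d/%m/%Y",   # day first
    "de-DE": "%d.%m.%Y",   # day first, dot separated
    "ja-JP": "%Y/%m/%d",   # year first
}


def format_date(value: date, locale_tag: str) -> str:
    """Format a date for a locale, using ISO 8601 when the locale is unknown."""
    return value.strftime(DATE_PATTERNS.get(locale_tag, "%Y-%m-%d"))


launch = date(2024, 3, 4)
for tag in DATE_PATTERNS:
    print(tag, format_date(launch, tag))
```

The same day comes out as 03/04/2024 for a US reader and 04/03/2024 for a British one, which is exactly the kind of ambiguity local testers catch.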

Your customers live in different hemispheres with different climates, national holidays, ideals and beliefs. You’ll need to target your message taking into account local vocabulary and nuances, as well as seasonal promos and local offers.

Let’s go back to the sticky issue of images. It’s not only about choosing culturally appropriate ones. What happens with images that contain text? If you can’t separate the text from the images, you’ll need to pay your translators extra for taking on this task. So, if possible, avoid using images that contain text.

8. Lack of Unity in Programming Software

Preparing a software product to display different languages is commonly referred to as internationalization (i18n) and is a well-understood software engineering issue. Most development environments therefore now support i18n. However, there is little agreement across programming languages and tools on the best way to do this.

Every programming language relies on a different form of i18n, which can lead to more work for your developers. While some languages avoid this altogether and leave it up to libraries and frameworks to solve, others require a different approach. If you’re using a PHP framework, for example, you may well find yourself using PO files to manage your translations.
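If you haven’t met a PO file before, it is a plain-text list of source strings (`msgid`) paired with their translations (`msgstr`). The Python sketch below reads a tiny sample. It is a simplified illustration that only understands single-line entries and ignores plurals, contexts and escape sequences, so treat it as a way to see the format, not a replacement for real gettext tooling. The sample strings and the `parse_po` helper are assumptions for the example.

```python
import re

# A tiny, made-up PO sample with two single-line entries.
PO_SAMPLE = '''\
# Translator comments start with a hash sign
msgid "Add to cart"
msgstr "Ajouter au panier"

msgid "Checkout"
msgstr "Paiement"
'''

# Matches lines such as: msgid "some text"
ENTRY = re.compile(r'^(msgid|msgstr) "(.*)"$')


def parse_po(text: str) -> dict[str, str]:
    """Collect msgid -> msgstr pairs from single-line PO entries."""
    catalog: dict[str, str] = {}
    current_id = None
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if not match:
            continue  # skip comments and blank lines
        keyword, value = match.groups()
        if keyword == "msgid":
            current_id = value
        elif current_id is not None:
            catalog[current_id] = value
            current_id = None
    return catalog


catalog = parse_po(PO_SAMPLE)
```

Other ecosystems keep the same idea but change the container, using formats such as JSON, YAML or XLIFF, which is where the extra work for your developers comes from.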

9. Not Everyone Supports Unicode

If you’re starting your software localization (L10n) project out from scratch, you’ll adopt Unicode from the start. But not every site was designed with full scale localization in mind and not every platform supports Unicode.

What does this mean for your translations? If your source code stores strings in a datatype that doesn’t support Unicode, your translations will likely break as soon as they contain a character the encoding can’t represent. If your server is configured for an English-only encoding and your customers are searching in Chinese, their characters will become corrupted.
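The corruption is easy to reproduce. In this Python sketch, text is written as UTF-8 by one layer and read back by another layer that wrongly assumes Latin-1. The variable names and the `fits_in_ascii` helper are made up for the example.

```python
original = "中文"  # the word for "Chinese (language)"

# One layer writes the text as UTF-8 bytes...
utf8_bytes = original.encode("utf-8")

# ...and another layer wrongly assumes the bytes are Latin-1.
garbled = utf8_bytes.decode("latin-1")

# The damage is reversible only if you know which wrong decoder was used.
recovered = garbled.encode("latin-1").decode("utf-8")


def fits_in_ascii(text: str) -> bool:
    """True if every character can be stored in a plain ASCII field."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(repr(garbled))
print(fits_in_ascii("hello"), fits_in_ascii(original))
```

Two Chinese characters turn into six meaningless ones, and an ASCII-only field can’t hold them at all.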

How can you avoid this? By always using UTF-8. In 99% of cases, using UTF-8 will fix this issue because it standardizes the encoding across server and browser. Each layer in your stack (including your HTTP server and database) should use UTF-8. Only in cases where you’re working with a lot of Asian languages might UTF-16 need to be applied.
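Why would UTF-16 ever come up? Mostly storage size. The short Python sketch below compares the number of bytes each encoding needs for the same text; `encoded_sizes` is a hypothetical helper, and the UTF-16 figure excludes the optional byte order mark.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Number of bytes needed to store the text in each encoding."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),  # little-endian, no byte order mark
    }


print(encoded_sizes("hello"))
print(encoded_sizes("中文"))
```

English text is half the size in UTF-8, while these Chinese characters take three bytes each in UTF-8 against two in UTF-16. For most mixed content such as web pages, where markup is ASCII anyway, UTF-8 remains the safer default.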

10. There Are a Lot of Stakeholders Involved in L10n

Software localization (L10n) isn’t just a job for developers. It’s not just a job for translators, or marketers, or designers, or product managers, or cultural anthropologists either. A successful localization project involves a lot of stakeholders. And they should all be able to communicate harmoniously across one solid platform.

When you have a team working together who don’t understand each other’s functions and roles, it can be very hard to coordinate a seamless software localization (L10n) project. So leave email trails and spreadsheets behind you.

A large-scale software localization project can only be managed with efficient, robust translation management software that includes awesome collaboration functions and tools to keep everyone on the same page.

11. Not Everyone is a Translator

It can be hard for programmers, product managers and non-linguists to understand the work involved in translating and why it can’t be left to machines. Just take a look at banking giant HSBC. This experienced multinational was forced to rebrand its entire global operations, after attempting to launch its U.S. campaign overseas.

The hugely successful slogan at home – “Assume Nothing” was translated on a wide scale in many countries as “Do Nothing.” Which was far from the inspiring message intended.

How could this happen? Because translations can’t be done out of context or word for word. If your translators are working without context, how will they know if what they’re translating should be a verb or a noun if it’s a single word?

In many cases, the entire source wording needs to be rewritten, especially when it comes to jokes, idioms and cultural nuances. When programmers throw split strings at translators with no context or explanations, they make the translators’ jobs dang near impossible.

12. Not Everyone is a Programmer

Programmers must understand the necessity of context (the more the better) and the need to simplify the process for non-technical team members. This may mean including screenshots, adding an extra column to your spreadsheet, or providing localization notes and code comments.

It may seem time-consuming at first, but avoiding confusion and back and forth will only make things more efficient. Glossaries and style guides aside, programmers can provide context information directly in their source files.
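As a rough idea of what that can look like, the Python sketch below keeps a translator note next to each string in the source and exports it as an extra spreadsheet column. The `SourceString` class, the keys and the notes are all invented for illustration.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class SourceString:
    key: str       # identifier used in the code
    text: str      # the English source text
    context: str   # note telling translators how the string is used


# The same English word needs two different translations.
STRINGS = [
    SourceString("menu.book", "Book", "Verb: button that reserves a room"),
    SourceString("catalog.book", "Book", "Noun: product category for printed books"),
]


def export_for_translators(strings: list[SourceString]) -> str:
    """Write the strings as CSV with a context column and an empty translation column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["key", "source_text", "context", "translation"])
    for s in strings:
        writer.writerow([s.key, s.text, s.context, ""])
    return buffer.getvalue()


print(export_for_translators(STRINGS))
```

Without the context column, a translator looking at the single word “Book” twice has no way to know that one is a verb and the other a noun.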

The Takeaway

Software localization (L10n) is so dang hard for many reasons and it can seem like an unmanageable task at times. But if you can unite the right resources with the right people, you can reduce the cost and increase the efficiency of your L10n project. Avoid the pitfalls that many product managers fall into and learn from those who have gone boldly before to make your worldwide rollout a success.