
Ruby I18n: Translate Ruby Projects with gettext & PO Files

Ruby i18n from A to Z: Learn how to translate Ruby applications, what GetText and PO files are, what their format looks like, and how to translate with fast_gettext.

Welcome to the world of Ruby internationalization (Ruby i18n) and localization (Ruby l10n). In the previous article, I explained how to translate Ruby applications with the R18n gem, which takes a somewhat different approach from the widely used i18n solution. However, there is yet another technology you may stick with when localizing a Ruby application: meet GNU GetText and PO files. GetText is a mature and battle-tested solution initially released by Sun Microsystems more than 25 years ago. It provides a set of utilities for localizing various programs and even operating systems. In this article, you will see how to translate Ruby applications with the help of the fast_gettext gem written by Michael Grosser. The gem is built for speed and supports multiple backends for storing translations (various types of files and even databases). Today we will discuss the following topics:

  • What types of files GetText supports and what their specifics are
  • Creating a sample application
  • Storing translations in PO files
  • Performing simple translations
  • Adding pluralization rules and gender information
  • Parsing and manipulating PO files
  • Using YAML files

Let's get started, shall we?

Introduction to GetText and fast_gettext

GetText is a fairly complex solution for localizing various kinds of programs. We are not going to discuss all the ins and outs of GetText in this article, but you may find the full documentation at gnu.org. In this section, we will briefly discuss the idea behind the system and the supported file formats. GetText not only provides you with the right tools to perform localization but also prescribes how the files and directories should be organized and named.

Under the hood, GetText uses two types of files to store translations: .po and .mo. PO means Portable Object, and these are the files edited by humans. They contain translations for the given strings as well as some metadata and pluralization rules. Each PO file is dedicated to a single language and should be stored in a directory named after this language, for example, en, ru, de, etc. We will mostly work with PO files in this article, though later you will learn that the fast_gettext gem also supports YAML files (which is probably a relief for a Ruby developer). MO means Machine Object, and these are binary files read by the computer. They are harder to maintain, so we are not going to use them here.

Another thing worth mentioning is that fast_gettext has a concept of a text domain. In a simple case, there is only one domain, usually named after your program. More complex projects may have multiple domains. Your PO files should be named after the text domain, so the approximate file structure would be:

  • en
    • domain1.po
    • domain2.po
  • ru
    • domain1.po
    • domain2.po
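As a rough illustration of this naming convention, the path of a PO file can be derived from the locale code and the text domain. This is a plain-Ruby sketch and not fast_gettext's own lookup code; the helper name po_path and the root directory name locale are assumptions made for the example.

```ruby
# Hypothetical helper: builds the conventional path <root>/<locale>/<domain>.po
def po_path(root, locale, domain)
  File.join(root, locale, "#{domain}.po")
end

# Print the expected location of every locale/domain combination.
%w(en ru).each do |locale|
  %w(domain1 domain2).each do |domain|
    puts po_path('locale', locale, domain)
  end
end
```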

We'll see this in action later, but for now, let's create a sample Ruby application that we are going to translate with GetText. This application is going to be very similar to the one created in the previous article.

Sample Application

So, our small project will be called Bank. It will allow instantiating new accounts with a specified balance and information about the owner. Create the following file structure:

  • bank
    • bank.rb
    • lib
      • locale
      • account.rb
      • errors.rb
      • locale_settings.rb
  • runner.rb

The bank folder is going to contain all the files for the project, whereas runner.rb will be used to actually boot the program. Here are the contents of the bank.rb file:

require_relative 'lib/locale_settings'

require_relative 'lib/errors'

require_relative 'lib/account'

module Bank

end

Nothing fancy, we are just including some files and defining an empty module. This module will be used to namespace our classes. Next, errors.rb:

module Bank

  class WithdrawError < StandardError

  end

end

This error will be raised when money can't be withdrawn from an account (for example, when there is just not enough money). Last but not least is account.rb:

module Bank

  class Account

    attr_reader :owner, :balance, :gender

    VALID_GENDER = %w(male female).freeze

    def initialize(owner:, balance: 0, gender: 'male')

      @owner = owner

      @balance = balance

      @gender = check_gender_validity_for gender

    end

    def transfer_to(another_account, amount)

      begin

        withdraw(amount)

        another_account.credit amount

      rescue WithdrawError => e

        puts e

      else

        puts "Money sent!"

      end

    end

    def credit(amount)

      @balance += amount

    end

    def withdraw(amount)

      raise(WithdrawError, 'Not enough money for withdrawal') if balance < amount

      @balance -= amount

    end

    private

    def check_gender_validity_for(gender)

      VALID_GENDER.include?(gender) ? gender : 'male'

    end

  end

end

So, we have three attributes: the owner (name or full name), the account's balance (defaults to 0), and the owner's gender (needed to properly display some informational messages, which is important for some languages). Note that the initialize method uses keyword arguments, which are available since Ruby 2.0 (required keyword arguments such as owner: need Ruby 2.1 or newer). You may of course stick to traditional positional arguments. When setting the gender, we check that it has a proper value. This is done inside the check_gender_validity_for private method, which relies on the VALID_GENDER constant and falls back to 'male' for any other value.

There are also credit and withdraw interface methods to perform money transactions. Note that we do not allow the balance attribute to be modified directly (it only has a reader), so that we can check, for example, whether there is enough money on the account. Lastly, there is a transfer_to method that enables us to transfer money between accounts. It has a begin/rescue block that prints the error if the withdrawal fails and a confirmation otherwise. Now you may flesh out the runner.rb file to see the program in action:

require_relative 'bank/bank'

john_account = Bank::Account.new owner: 'John', balance: 20, gender: 'male'

kate_account = Bank::Account.new owner: 'Kate', balance: 15, gender: 'female'

john_account.transfer_to(kate_account, 10)

This is pretty much it, our preparations are done. Now it is time to move to the next part and add support for multiple languages. I will stick with Russian and English, but you may of course choose other languages.

Integrating fast_gettext

Start off by installing a new gem on your PC:

gem install fast_gettext

It has no special requirements so the installation should succeed without any problems. Next, require the corresponding module inside bank.rb:

require 'fast_gettext'

# ...

Now we should load our translations (that will be added later) from the locale directory and set a text domain. Our program is very simple, so having one domain is enough, though fast_gettext does support multiple domains as well. Let's add some code to the locale_settings.rb file:

module Bank

  class LocaleSettings

    def initialize

      FastGettext.add_text_domain('bank', path: File.join(File.dirname(__FILE__), 'locale'), type: :po)

    end

  end

end

The name of the text domain will be bank. We are also specifying the path to our translations and setting the file type to PO. Next, provide the list of supported locales and set the text domain:

# ...

FastGettext.text_domain = 'bank'

FastGettext.available_locales = %w(ru en)

Alternatively, you may set FastGettext.default_text_domain to bank. Now let's list all the available locales and ask the user to choose one:

puts "Select locale code:"

FastGettext.available_locales.each do |locale|

  puts locale

end

change_locale_to gets.strip.downcase

change_locale_to is a private method that checks whether the chosen locale is supported or not:

# ...

private

def change_locale_to(locale)

  locale = 'en' unless FastGettext.available_locales.include?(locale)

  FastGettext.locale = locale

end

If the locale is not supported, we revert to English. Here is the full code for the locale_settings.rb file:

module Bank

  class LocaleSettings

    def initialize

      FastGettext.add_text_domain('bank', path: File.join(File.dirname(__FILE__), 'locale'), type: :po)

      FastGettext.text_domain = 'bank'

      FastGettext.available_locales = %w(ru en)

      puts "Select locale code:"

      FastGettext.available_locales.each do |locale|

        puts locale

      end

      change_locale_to gets.strip.downcase

    end

    private

    def change_locale_to(locale)

      locale = 'en' unless FastGettext.available_locales.include?(locale)

      FastGettext.locale = locale

    end

  end

end
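The fallback rule can also be isolated into a small pure function, which makes it easy to check without reading from standard input. The following is a minimal stand-alone sketch that does not use fast_gettext; the names AVAILABLE_LOCALES, DEFAULT_LOCALE and resolve_locale are assumptions introduced for this example.

```ruby
# Hypothetical stand-alone version of the fallback logic; no gem required.
AVAILABLE_LOCALES = %w(ru en).freeze
DEFAULT_LOCALE = 'en'

# Normalizes raw user input and falls back to the default locale
# when the requested code is not in the supported list.
def resolve_locale(input)
  code = input.to_s.strip.downcase
  AVAILABLE_LOCALES.include?(code) ? code : DEFAULT_LOCALE
end

puts resolve_locale(" RU\n")
puts resolve_locale('de')
puts resolve_locale(nil)
```

Calling to_s first means a nil input (which gets returns at end of input) also falls back to the default instead of raising NoMethodError.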

Now just load the settings inside the bank.rb file:

# ...

module Bank

  LocaleSettings.new

end

After translations are loaded, they are cached, so fast_gettext performs very well (at least ten times faster than I18n::Simple, according to the docs). All right, now that the user is able to select a locale, we need to prepare some translations, so let's proceed to the next section!

Creating PO Files

So, as you already know, PO means Portable Object. These files are separated into directories for different locales and named after the text domain. Our text domain is bank and the supported locales are Russian and English, so here is the file structure for the locale directory:

  • locale
    • en
      • bank.po
    • ru
      • bank.po

PO files can look somewhat strange, especially if you are used to the YAML format. You may find the specification for these files on the gnu.org website. Every PO file starts with a header entry that contains information about the file, the author, the last revision date, and pluralization rules. Let's add the header to the en/bank.po file:

# Bank application

# Copyright (C) 2017

#

#, fuzzy

msgid ""

msgstr ""

"Project-Id-Version: version 0.0.1\n"

"PO-Revision-Date: 2017-10-19 18:00+0300\n"

"Last-Translator: Ilya Bodrov\n"

"Language-Team: My Team\n"

"MIME-Version: 1.0\n"

"Content-Type: text/plain; charset=UTF-8\n"

"Content-Transfer-Encoding: 8bit\n"

"Plural-Forms: nplurals=1; plural=0;\n"

As you see, here we are specifying the version of the file, author's name, content type and encoding. Plural-Forms will be filled with a proper value later. Add the same header to the ru/bank.po file:

# Bank application

# Copyright (C) 2017

msgid ""

msgstr ""

"Project-Id-Version: version 0.0.1\n"

"PO-Revision-Date: 2017-10-19 18:00+0300\n"

"Last-Translator: Ilya Bodrov\n"

"Language-Team: My Team\n"

"MIME-Version: 1.0\n"

"Content-Type: text/plain; charset=UTF-8\n"

"Content-Transfer-Encoding: 8bit\n"

"Plural-Forms: nplurals=1; plural=0;\n"

Alright, the files are created and we may flesh them out by adding some translations.
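To see why each header field should end with \n, it helps to remember that the adjacent quoted strings of the header msgstr are concatenated into one string, with the newline marking the end of a field. The snippet below is only an illustrative sketch in plain Ruby, not how fast_gettext reads the header; the constant HEADER and the method header_fields are assumptions made for the example.

```ruby
# The header msgstr after its quoted pieces have been concatenated.
HEADER = "Project-Id-Version: version 0.0.1\n" \
         "Last-Translator: Ilya Bodrov\n" \
         "Language-Team: My Team\n" \
         "Content-Type: text/plain; charset=UTF-8\n"

# Splits the header into lines, then splits each line on the first ": " only,
# so values that contain further separators stay intact.
def header_fields(header)
  header.each_line.map { |line| line.chomp.split(': ', 2) }.to_h
end

fields = header_fields(HEADER)
puts fields['Last-Translator']
puts fields['Content-Type']
```

Without the trailing \n, two neighbouring fields would end up on the same line and could no longer be told apart.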

Performing Simple Translations

So, for starters, let's display a simple message to the user after a new account is instantiated. Add the following lines to the en/bank.po file (after the header entry):

msgid "New account instantiated!"

msgstr ""

msgid can be treated as a key for the message, whereas msgstr contains the translation. In this example I've left the translation empty—this means that the key will be displayed instead. This is not the case for the Russian language, of course. Tweak the ru/bank.po file:

msgid "New account instantiated!"

msgstr "Новый счёт создан!"

Here, as you see, I am providing a translation for the given string. Of course, if you are used to the i18n gem and the YAML format, you may write your keys in a different way, for example:

msgid "account_instantiated"

msgstr "New account instantiated!"
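The "empty msgstr means the msgid is shown" behavior described above can be modeled with a plain hash. This is a toy sketch for illustration only and not the gem's implementation; the CATALOGS constant and the lookup method are assumptions introduced here.

```ruby
# Toy catalogs keyed by locale; an empty string stands for an untranslated msgstr.
CATALOGS = {
  'en' => { 'New account instantiated!' => '' },
  'ru' => { 'New account instantiated!' => 'Новый счёт создан!' }
}.freeze

# Returns the translation, or the msgid itself when the entry
# is missing or its translation is empty.
def lookup(locale, msgid)
  translation = CATALOGS.fetch(locale, {})[msgid]
  (translation.nil? || translation.empty?) ? msgid : translation
end

puts lookup('en', 'New account instantiated!')
puts lookup('ru', 'New account instantiated!')
puts lookup('ru', 'Unknown message')
```

This also shows the trade-off between the two key styles: with natural-language msgids the fallback is still readable, whereas a symbolic key such as account_instantiated would be shown to the user as-is when a translation is missing.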

Now, in order to perform translations, let's include a new module inside the Account class:

module Bank

  class Account

    include FastGettext::Translation

    # ...

  end

end

To look up a translation by its key, use a method with a very minimalistic name _:

    def initialize(owner:, balance: 0, gender: 'male')

      @owner = owner

      @balance = balance

      @gender = check_gender_validity_for gender

      puts _('New account instantiated!')

    end

Quite simple, eh!? If for some reason you'd like to use a different locale when performing a specific translation, you may wrap it in a with_locale block:

      FastGettext.with_locale 'ru' do

        puts _('New account instantiated!')

      end

Great, we've just translated our first message! Let's perform yet another translation. Add the following lines to the en/bank.po file:

msgid "not enough money for withdrawal"

msgstr "[ERROR] This account does not have enough money to withdraw!"

As you see, this is the error that is raised when the account does not have enough money. Also, add the Russian translation:

msgid "not enough money for withdrawal"

msgstr "[ОШИБКА] На счету недостаточно средств для снятия!"

Now utilize the _ method again:

# ...

def withdraw(amount)

  raise(WithdrawError, _('not enough money for withdrawal')) if balance < amount

  @balance -= amount

end

Using Interpolation

We have seen how to perform simple translations with fast_gettext, but how do we add an extra layer of complexity and use interpolation in our translations? All in all, it is quite a simple task. Suppose we'd like to display information about the user's account, listing its owner and balance. Interpolation in PO files is performed with a construct like text %{interpolation} more text, so the names of the interpolated values should be wrapped in %{}. Tweak the en/bank.po file:

msgid "account owner info"

msgstr "Account owner: %{owner} (%{gender}). Balance: $%{balance}."

Do the same for the ru/bank.po:

msgid "account owner info"

msgstr "Владелец счёта: %{owner} (%{gender}). Текущий баланс: $%{balance}."

Interpolated values are provided in a pretty odd-looking way:

# account.rb

# ...

    def info

      _('account owner info') % {

          owner: owner,

          gender: gender,

          balance: balance

      }

    end

This % method uses the provided hash and interpolates the given values. Note that the keys inside the hash must be named after the placeholder inside the PO file. Now you may see this method in action by adding the following line to the runner.rb:

# ...

puts john_account.info
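The % call here is Ruby's own String#% and works the same way without any gem, so it can be tried in isolation. In the short sketch below, the TEMPLATE constant and the account_info method are names assumed for the example; the template text is the English msgstr from above.

```ruby
TEMPLATE = 'Account owner: %{owner} (%{gender}). Balance: $%{balance}.'

# Fills the named placeholders from a hash with matching symbol keys.
def account_info(values)
  TEMPLATE % values
end

puts account_info(owner: 'John', gender: 'male', balance: 20)

# A placeholder without a matching hash key raises KeyError.
begin
  account_info(owner: 'John')
rescue KeyError => e
  puts "Missing value: #{e.message}"
end
```

Because a missing key raises an error rather than leaving a gap, it is worth keeping placeholder names identical across all translations of a message.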

Using Gender Information

So far so good: our project is nearly translated. Now suppose we would like to display more detailed information inside the transfer_to method when the transaction succeeds. For instance, I'd like to say who transferred money to whom and what the amount was. We could stick with interpolation only, as in the previous section, but unfortunately that's not enough for Russian (and for a handful of other languages). The thing is that in Russian some words are written differently for different genders, like "перевёл" ("transferred") for a male, but "перевела" (again, "transferred" in English) for a female. Luckily, there is a way to overcome this problem in PO files by using a scope. Add the following lines to the ru/bank.po file:

msgid "male|transferred"

msgstr "%{sender} перевёл %{recipient} $%{amount}"

msgid "female|transferred"

msgstr "%{sender} перевела %{recipient} $%{amount}"

Note that I now prefix the keys with a male or female scope and provide different translations. Of course, scopes can be used in many other cases, not only to provide gender information. For English, the messages will be identical in both cases:

msgid "male|transferred"

msgstr "%{sender} sent %{recipient} $%{amount}"

msgid "female|transferred"

msgstr "%{sender} sent %{recipient} $%{amount}"

Now in order to work with the scope, use the s_ method (yeah, all those methods have some seriously short names):

    def transfer_to(another_account, amount)

      begin

        withdraw(amount)

        another_account.credit amount

      rescue WithdrawError => e

        puts e

      else

        puts s_("#{gender}|transferred") % {

            sender: owner,

            recipient: another_account.owner,

            amount: amount

        }

      end

    end

This is it!
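To make the idea of a scoped key more tangible, here is a toy model in plain Ruby. It assumes that a scoped lookup which finds no entry falls back to the text after the last | separator; treat that as an assumption about s_ to verify against the gem's documentation. SCOPED and scoped_lookup are hypothetical names used only in this sketch.

```ruby
# Toy catalog with scoped keys, mirroring the entries in ru/bank.po.
SCOPED = {
  'male|transferred'   => '%{sender} перевёл %{recipient} $%{amount}',
  'female|transferred' => '%{sender} перевела %{recipient} $%{amount}'
}.freeze

# Looks up the full scoped key; if it is absent, falls back to the
# part of the key after the last separator.
def scoped_lookup(key, separator = '|')
  SCOPED.fetch(key) { key.split(separator).last }
end

puts scoped_lookup('female|transferred') % { sender: 'Kate', recipient: 'John', amount: 5 }
puts scoped_lookup('other|transferred')
```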

Pluralization Rules

Another painful I18n topic is pluralization. Some languages (like English) have simple pluralization rules, whereas others (like Russian or Polish) have much more complex rules and therefore need more translations for the various cases. Suppose we'd like to say how many dollars are on the balance of a given account. For English, that'll be either "1 dollar" or "5 dollars". For Russian, however, we have three possible cases: "1 доллар", "2 доллара", "10 долларов". To take care of these scenarios, you need to properly set Plural-Forms in the header of each PO file (luckily, plural rules for many languages are documented online, for example in the GNU gettext manual). For English everything is quite simple:

"Plural-Forms: nplurals=2; plural=(n != 1);\n"

For Russian the formula is somewhat complex:

"Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n"

Unfortunately, when it comes to adding translations for pluralized strings, things become a bit messy. You need to provide not only msgid but also msgid_plural and, of course, a translation for each possible case. First, modify en/bank.po:

msgid "dollar"

msgid_plural "dollars"

msgstr[0] "%{amount} dollar"

msgstr[1] "%{amount} dollars"

Now ru/bank.po:

msgid "dollar"

msgid_plural "dollars"

msgstr[0] "%{amount} доллар"

msgstr[1] "%{amount} доллара"

msgstr[2] "%{amount} долларов"
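The Plural-Forms expressions map a count n to the index of the msgstr[] form to use. To check that the formulas behave as expected, they can be translated to Ruby by hand. This is a sketch for experimenting, not code from the gem; the method names plural_index_en and plural_index_ru and the RU_FORMS constant are assumptions of this example.

```ruby
# English rule: plural=(n != 1), converted to an integer index.
def plural_index_en(n)
  n != 1 ? 1 : 0
end

# Russian rule, translated term by term from the Plural-Forms header.
def plural_index_ru(n)
  if n % 10 == 1 && n % 100 != 11
    0
  elsif n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20)
    1
  else
    2
  end
end

RU_FORMS = ['доллар', 'доллара', 'долларов'].freeze

# Print the selected word form for a few representative counts.
[1, 2, 5, 11, 21, 22, 111].each do |n|
  puts "#{n} #{RU_FORMS[plural_index_ru(n)]}"
end
```

Counts such as 11 and 111 are the reason the rule needs the extra n%100 checks: they end in 1 but still take the third form.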

Now use yet another short-named method n_ while providing an interpolated value:

# account.rb

# ...

def balance_info

  n_('dollar', 'dollars', balance) % { amount: balance }

end

The first argument passed to the n_ method is the singular form, then plural form and then the count. To see this method in action, add yet another line to the runner.rb:

puts john_account.balance_info

Parsing PO Files

If you have large PO files that need to be parsed (for example, to understand how many messages are left untranslated), you may stick with a simple gem called POParser. Install it on your PC:

gem install PoParser

Next, require it, open a file and parse it:

require 'poparser'

content = File.read('example.po')

po = PoParser.parse(content)

The po variable will now contain an instance of the PoParser::Po class:

<PoParser::Po, Translated: 68.1% Untranslated: 20.4% Fuzzy: 11.5%>

You can grab all the entries from the PO file or get only the untranslated ones:

po.entries

po.untranslated

It is even possible to directly add new entries to a PO file by creating a proper hash and using the add method:

new_entry = {

              translator_comment: 'comment',

              reference: 'reference comment',

              msgid: 'untranslated',

              msgstr: 'translated string'

            }

po.add(new_entry)

After you are done editing the file, save it:

po.save_file

All in all, POParser is a pretty convenient tool and you may learn more about it by referring to the official documentation.
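To get a feel for what such a parser reports, here is a deliberately naive, stdlib-only sketch that finds untranslated entries in PO text. It is not how POParser works: it only handles simple single-line msgid/msgstr pairs and ignores comments, plural entries and multi-line strings. SAMPLE_PO, simple_entries and untranslated are names assumed for the example.

```ruby
# Hypothetical sample containing one untranslated and one translated entry.
SAMPLE_PO = <<~PO
  msgid "New account instantiated!"
  msgstr ""

  msgid "not enough money for withdrawal"
  msgstr "[ERROR] This account does not have enough money to withdraw!"
PO

# Returns [msgid, msgstr] pairs for simple single-line entries.
def simple_entries(po_text)
  po_text.scan(/^msgid "(.*)"\nmsgstr "(.*)"$/)
end

# Keeps entries with an empty msgstr, skipping the header entry (empty msgid).
def untranslated(po_text)
  simple_entries(po_text).select { |msgid, msgstr| !msgid.empty? && msgstr.empty? }
end

untranslated(SAMPLE_PO).each { |msgid, _| puts "Untranslated: #{msgid}" }
```

For real projects a dedicated parser is the safer choice, since the PO format has more cases than a single regular expression can comfortably cover.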

Working With YAML Files

As I already mentioned before, fast_gettext also supports YAML files that can be more convenient for some developers. In order to start using YAML files instead of PO, simply change the :type setting passed to the add_text_domain method:

# locale_settings.rb

module Bank

  class LocaleSettings

    def initialize

      FastGettext.add_text_domain('bank', path: File.join(File.dirname(__FILE__), 'locale'), type: :yaml)

    end

  end

end

Note that the YAML files do not need to be separated into folders. They also have a somewhat different format. For example, locale/en.yml:

en:

  pluralisation_rule: '(n != 1)'

  dollar:

    one: '%{amount} dollar'

    other: '%{amount} dollars'

locale/ru.yml:

ru:

  pluralisation_rule: '(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2)'

  dollar:

    one: '%{amount} доллар'

    other: '%{amount} доллара'

    plural2: '%{amount} долларов'

Of course, when sticking to YAML files you will probably want to name your keys in snake case, so these lines:

msgid "not enough money for withdrawal"

msgstr "[ERROR] This account does not have enough money to withdraw!"

will turn into something like:

not_enough_money_error: "[ERROR] This account does not have enough money to withdraw!"

Therefore, do not forget to tweak your calls to _, n_ and s_ methods accordingly.
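If you want to sanity-check a YAML translation file before pointing the gem at it, Ruby's standard yaml library is enough to confirm that it parses into the nested structure you expect. The sketch below only demonstrates the YAML structure and does not show how fast_gettext resolves keys internally; YAML_SOURCE and EN_CATALOG are names assumed for the example.

```ruby
require 'yaml'

# Inline stand-in for the contents of locale/en.yml.
YAML_SOURCE = <<~YML
  en:
    not_enough_money_error: "[ERROR] This account does not have enough money to withdraw!"
    dollar:
      one: '%{amount} dollar'
      other: '%{amount} dollars'
YML

# Parse the document and take the subtree for the "en" locale.
EN_CATALOG = YAML.safe_load(YAML_SOURCE).fetch('en')

puts EN_CATALOG['not_enough_money_error']
puts EN_CATALOG.dig('dollar', 'other') % { amount: 5 }
```

Note that values starting with % are quoted in the YAML source; an unquoted % at the start of a value is not valid there.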

Stick with Phrase!

Writing code to localize your application is one task, but working with translations is a totally different story. Having many translations for multiple languages may quickly overwhelm you, which in turn may confuse your users. But Phrase can make your life as a developer easier! Grab your 14-day trial now. Phrase supports many different languages, including Ruby, and formats, including YAML and PO. It allows you to easily import and export translation data and search for any missing translations, which is really convenient. On top of that, you can collaborate with translators, as it is much better to have professionally done localization for your website. If you’d like to learn more about Phrase, refer to the Phrase Localization Suite.

Conclusion

In this article, we have seen how to translate Ruby applications with the fast_gettext gem. You have learned what GetText is, what PO files are, and what their format is. We've also discussed how to perform translations with fast_gettext and how to add pluralization rules and gender information. Lastly, we have talked about POParser, which may simplify working with PO files. Note that fast_gettext has even more features. For instance, you may use a database to store your translations, and there is also a plugin for the Ruby on Rails framework. You may also find more usage examples of the gem by browsing its test cases on GitHub. As always, thank you for staying with us. Until next time!