Using Google Translate in Python Applications
Software development projects, including those in Python, often struggle with delivery as soon as an application needs to be made multilingual. For a simple application, you might get by with spreadsheets and a machine translation engine like Google Translate. For a complex application with a lot of text to translate, however, spreadsheet-based localization can turn into a nightmare: copying and pasting text back and forth takes time and will most likely slow down your development process. So, is there a better way to localize your Python app?
By creating a simple Python script, you can speed up the machine translation process from the start. Multiple Python packages offer these capabilities, but apart from Google's official API, they are neither stable nor supported by Google. Hence, if you are looking for a more efficient way to use Google Translate for machine translation in your Python project, Google's official API can be a solid choice.
Google provides two different versions of the Cloud Translation API:
- Basic (v2), a good fit for small applications, especially those with "casual" user-generated content
- Advanced (v3), optimized for customization based both on context and type of content
To keep things simple, this localization tutorial will stay focused on the Basic version. Shall we start?
Setup
First and foremost, you need to create a project via Google Cloud Console. You'll need a project number or ID when calling the Translation API. Feel free to check out the Google Cloud Translation setup guide to get more detailed information on:
- Enabling billing
- Enabling Cloud Translation API
- Setting usage quotas
- Setting up authentication
By default, you can make a POST call to the following API endpoint using any programming language:
https://translation.googleapis.com/language/translate/v2
Pass the access token as a bearer token in the Authorization HTTP header. Here is what the call looks like when using curl:
curl -X POST \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d '{"q":["Hello world", "Welcome to Phrase blog"],"target":"de"}' \
  https://translation.googleapis.com/language/translate/v2
You need to include a JSON body with the following fields:
{ "q": ["Hello world", "Welcome to Phrase blog"], "target": "de" }
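For comparison with the curl call, here is a minimal sketch of the same request built with Python's standard library. The function name build_translate_request is hypothetical and only serves this illustration. The sketch builds the request without sending it, since sending requires a valid access token and network access.

```python
import json
import urllib.request

ENDPOINT = "https://translation.googleapis.com/language/translate/v2"


def build_translate_request(texts, target, access_token):
    """Build (but do not send) a POST request for the Basic (v2) endpoint.

    Hypothetical helper for illustration; it mirrors the curl example.
    """
    # The JSON body carries the strings to translate and the target language.
    body = json.dumps({"q": texts, "target": target}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            # The access token travels as a bearer token in this header.
            "Authorization": "Bearer {}".format(access_token),
            "Content-Type": "application/json; charset=utf-8",
        },
    )


# Sending the request needs a real access token and network access:
# request = build_translate_request(["Hello world"], "de", "<access_token>")
# with urllib.request.urlopen(request) as response:
#     payload = json.load(response)
```

Handling tokens, errors, and retries by hand quickly becomes tedious, which is where a client library helps.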
Fortunately, Google provides its own Python SDK to make things easier for developers.
It is optional, but highly recommended, to create a virtual environment before you continue. Create the virtual environment as follows (replace ./myenv with a path of your choice):
python3 -m venv ./myenv
Ubuntu / macOS
If you use Ubuntu or macOS, activate it using the following command:
source ./myenv/bin/activate
Windows
Those who rely on Windows should use the following command:
.\myenv\Scripts\activate
Run the following command to install the Python SDK:
pip install google-cloud-translate==2.0.1
Once you have installed the package, you can call it normally to translate any text.
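Before the first call, it can help to check that credentials are in place. The sketch below assumes you authenticate with a service account key file referenced by the GOOGLE_APPLICATION_CREDENTIALS environment variable. That is one common setup, but not the only one Google's libraries accept, so a missing variable does not necessarily mean authentication will fail. The helper name credentials_path is hypothetical.

```python
import os


def credentials_path(environ=os.environ):
    """Return the key file path if the variable is set and the file exists.

    Assumes service-account authentication via GOOGLE_APPLICATION_CREDENTIALS;
    returns None otherwise.
    """
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if path and os.path.isfile(path):
        return path
    return None


# Example check before creating a client:
# if credentials_path() is None:
#     print("No service account key file found; check your authentication setup.")
```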
Implementation
In this section, we will implement:
- A function that calls Google Translation API and returns the corresponding translation
- Code to translate a single string
- Code to translate a list of strings
Create a new Python file called test_translation.py.
Import
Add the following import statement at the top of the file.
from google.cloud import translate_v2 as translate
Translate function
Next, create a new function called translate_text. The function accepts the following parameters:
- text, a single string or a list of strings to translate
- target_language, the target language as an ISO 639-1 language code (en, de, th, etc.)
- source_language, an optional source language as an ISO 639-1 language code; Google will auto-detect the language if it is left as None
Inside the function, instantiate a translate.Client() object as follows:
def translate_text(text, target_language, source_language=None):
    translate_client = translate.Client()
    ...
Continue by calling the client's translate method and returning the output.
def translate_text(text, target_language, source_language=None):
    # The client picks up the credentials configured during setup
    translate_client = translate.Client()
    result = translate_client.translate(
        text,
        target_language=target_language,
        source_language=source_language,
    )
    return result
The output is a single dictionary if you pass a single string, or a list of dictionaries if you pass a list of strings. Each dictionary contains the following keys:
- input, the input text as a string or a list of strings
- translatedText, the translated text
- detectedSourceLanguage, the detected source language if source_language is left as None
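Because the result shape depends on the input, calling code often has to handle both cases. The following is a minimal sketch of a hypothetical helper, translated_texts, that always returns a list of strings. It also unescapes HTML entities, on the assumption that the API may return entities such as &#39; when it treats the input as HTML. Drop that step if your results come back as plain text.

```python
import html


def translated_texts(result):
    """Return the translated strings from a translation result as a list.

    Accepts either a single result dictionary or a list of them.
    """
    items = [result] if isinstance(result, dict) else list(result)
    # Unescape entities such as &#39; (assumption: output may be HTML-escaped).
    return [html.unescape(item["translatedText"]) for item in items]
```

With this helper, the caller can loop over the return value without checking whether one string or several were translated.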
Translating a single string
Now, you can call the function normally in the same Python file as follows:
...
result = translate_text('Hello, world!', 'de')
print(result['translatedText'])  # Hallo Welt!
Translating a list of strings
In addition, you can translate a list of strings by passing the list to the same function as follows:
...
results = translate_text(['Hello, world!', 'What are you doing!'], 'de')
for result in results:
    print(result['translatedText'])
The output will be a list of dictionaries instead.
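For an application with a lot of text, a single call with every string may run into the API's per-request limits. Here is a minimal sketch of splitting the work into batches. The helper names and the default batch size of 100 are assumptions for illustration, so check your quotas for the actual limits. The client argument is expected to be a translate.Client() instance.

```python
def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_in_batches(client, texts, target_language, batch_size=100):
    """Translate a list of strings in several smaller requests.

    batch_size=100 is an assumed value, not an official limit.
    """
    translations = []
    for batch in chunked(texts, batch_size):
        # Passing a list returns a list of result dictionaries, in order.
        results = client.translate(batch, target_language=target_language)
        translations.extend(result["translatedText"] for result in results)
    return translations


# Usage, assuming credentials are configured:
# from google.cloud import translate_v2 as translate
# translated = translate_in_batches(translate.Client(), all_strings, "de")
```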
Other useful functions
Google Translation API also comes with additional functions for:
- Getting a list of supported languages
- Detecting the language of an input string
Getting a list of supported languages
For the full list of supported languages and the corresponding ISO 639-1 language codes, you can use the get_languages function:
...
languages = translate_client.get_languages()
for language in languages:
    print("{name} ({language})".format(**language))
It returns a list of dictionaries with the following fields:
- name, name of the language
- language, the ISO 639-1 language code
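One practical use of this list is validating a target language before sending any text for translation. The sketch below works on the list of dictionaries described above. The helper names language_names and require_supported are hypothetical.

```python
def language_names(languages):
    """Map language codes to language names from a get_languages() result."""
    return {entry["language"]: entry["name"] for entry in languages}


def require_supported(languages, code):
    """Return the language name for code, or raise ValueError if unsupported."""
    names = language_names(languages)
    if code not in names:
        raise ValueError("Unsupported target language: {!r}".format(code))
    return names[code]


# Usage, assuming translate_client is a translate.Client() instance:
# languages = translate_client.get_languages()
# require_supported(languages, "de")
```

Fetching the list once and reusing it avoids an extra API call for every validation.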
Detecting the input language
Furthermore, the Python SDK comes with a detect_language function that can identify the language of an input text. You can use it as follows:
...
result = translate_client.detect_language(text)
print("Confidence: {}".format(result["confidence"]))
print("Language: {}".format(result["language"]))
The output is a dictionary with the following fields:
- confidence, the confidence score of the prediction
- language, the language code in ISO 639-1 format
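Detection and translation can be combined, for example to skip text that is already in the target language. The following is a minimal sketch under stated assumptions: the helper name translate_if_needed and the confidence threshold of 0.8 are made up for illustration, and the client argument is expected to be a translate.Client() instance.

```python
def translate_if_needed(client, text, target_language, min_confidence=0.8):
    """Translate a single string unless it is already in the target language.

    min_confidence=0.8 is an assumed threshold; tune it for your content.
    """
    detection = client.detect_language(text)
    already_target = (
        detection["language"] == target_language
        and detection["confidence"] >= min_confidence
    )
    if already_target:
        # Confidently detected as the target language: return unchanged.
        return text
    result = client.translate(text, target_language=target_language)
    return result["translatedText"]
```

Note that this makes two API calls for every string that does need translation, so it only pays off when a fair share of the input is already in the target language.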
Conclusion
By now, you should be able to create a simple Python script to machine translate user-generated content with Google Translate. For more sensitive types of content, consider working with human translators on a software localization platform like Phrase. It will let you:
- Build production-ready integrations with your development workflow
- Invite as many users as you wish to collaborate on your projects
- Edit and convert localization files with more context for higher translation quality
Sign up for a free 14-day trial, and see for yourself how it can make your life easier.
Last updated on October 20, 2022.