Software localization

Unifying Ruby on Rails Environments with Docker Compose

Having a reliable build process is highly critical for us at Phrase. This is how we unify our Ruby on Rails environments with Docker Compose.
Software localization blog category featured image | Phrase

Most of our code is based on Ruby on Rails and we are big fans of Test Driven Development. We deploy multiple times a day and that is why we want to spend as little time as possible on manual QA. Our test suite contains more than 14.000 unit, functional and integration tests with a code coverage of 98%. The build process of our core application has various runtime and infrastructure dependencies including:

  • A specific ruby version
  • Operating system packages (e.g. imagemagick, libxml, libmysql, etc.)
  • Gem dependencies (~150 gems defined in our Gemfile)
  • MySQL
  • ElasticSearch
  • Redis
Managing all these dependencies for our development and CI environments can be very complicated. This is why I want to outline how we use Docker Compose to solve this problem for us.

Dockerfile

First, let’s start with the Dockerfile of our application:
FROM ruby:2.4.1

# add nodejs and yarn dependencies for the frontend

RUN curl -sL https://deb.nodesource.com/setup_6.x | bash - && \

  curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add - && \

  echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list

# install bundler in specific version

RUN gem install bundler --version "1.15.3"

# install required system packages for ruby, rubygems and webpack

RUN apt-get update && apt-get upgrade -y && \

    apt-get install --no-install-recommends -y ca-certificates nodejs yarn \

    libicu-dev imagemagick unzip qt5-default libqt5webkit5-dev \

    gstreamer1.0-plugins-base gstreamer1.0-tools gstreamer1.0-x \

    xvfb xauth openjdk-7-jre --fix-missing

RUN mkdir -p /app

WORKDIR /app

# install node dependencies (e.g. webpack)

# next two steps will be cached unless either package.json or

# yarn.lock changes

COPY package.json yarn.lock /app/

RUN yarn install

# bundle gem dependencies

# next two steps will be cached unless Gemfile or Gemfile.lock changes.

# -j $(nproc) runs bundler in parallel with the amount of CPUs processes

COPY Gemfile Gemfile.lock /app/

RUN bundle install -j $(nproc)
We base our image on the default ruby image in the exact version we also run on production and then first install all required system packages. Parts of our frontend uses webpack and so we also add the repositories for nodejs and yarn.
We only add the necessary files to install our rubygem and npm dependencies (Gemfile, Gemfile.lock, package.json and yarn.lock) to the image. Unless any of these files change, docker can cache all steps resulting in almost no overhead.
Instead of also adding the actual application code to the image we just mount the directory with the code when running the tests.

docker-compose.yml

And here is our docker-compose.yml file:
# docker-compose.yml

version: '2.1'

services:

  app: &app

    build:

      context: .

      dockerfile: Dockerfile.test

    volumes: [".:/app"] # mount current directory into the image

    # use tmpfs for tmp and log for performance and to allow

    # multiple builds in parallel. Both directories are mounted

    # into the image AFTER the working directory is mounted.

    tmpfs: ["/app/tmp", "/app/log"]

  dev: &dev

    <<: *app

    environment:

      RAILS_ENV: "development"

      DATABASE_URL: "mysql2://mysql/phraseapp?local_infile=true"

      ELASTIC_SEARCH_URL: http://elasticsearch:9200

      REDIS_URL: "redis://redis"

    depends_on:

      mysql: {"condition":"service_healthy"}

      redis: {"condition":"service_healthy"}

      elasticsearch: {"condition":"service_healthy"}

  server:

    <<: *dev

     command: ["bundle", "exec", "./build/validate-migrated.sh && rails server -b 0.0.0.0"]

     ports: ["3000:3000"]

  test: &test

    <<: *app

    environment:

      RAILS_ENV: "test"

      DATABASE_URL: "mysql2://mysql-test/phraseapp?local_infile=true"

      ELASTIC_SEARCH_URL: http://elasticsearch-test:9200

      REDIS_URL: "redis://redis-test"

      SPRING_TMP_PATH: "/app/tmp"

    # wait for all dependent services to be healthy

    depends_on:

      mysql-test: {"condition":"service_healthy"}

      redis-test: {"condition":"service_healthy"}

      elasticsearch-test: {"condition":"service_healthy"}

  # allow executing of single tests against a running spring server

  spring:

    <<: *test

    command: ["bundle", "exec", "./build/validate-migrated.sh && spring server"]

  elasticsearch: &elasticsearch

    image: elasticsearch:1.7.6

    ports: ["9200"]

    healthcheck:

      test: ["CMD", "curl", "-SsfL", "127.0.0.1:9200/_status"]

      interval: 1s

      timeout: 1s

      retries: 300

  elasticsearch-test:

    <<: *elasticsearch

    # place elasticsearch data on tmpfs for performance

    tmpfs: /usr/share/elasticsearch/data

  redis: &redis

    image: redis:2.8.23

    ports: ["6379"]

    healthcheck:

      test: ["CMD", "redis-cli", "ping"]

      interval: 1s

      timeout: 1s

      retries: 300

  redis-test:

    <<: *redis

  mysql: &mysql

    image: mysql:5.6.35

    ports: ["3306"]

    environment:

      MYSQL_ALLOW_EMPTY_PASSWORD: "true"

      MYSQL_DATABASE: "phraseapp"

    healthcheck:

      test: ["CMD", "mysql", "-u", "root", "-e", "select 1"]

      interval: 1s

      timeout: 1s

      retries: 300

  mysql-test:

    <<: *mysql

    tmpfs: /var/lib/mysql # place mysql on tmpfs for performance

All infrastructure services for our development environment and our integration tests (MySQL, ElasticSearch and Redis) have health checks configured and so the spring, test, dev and server services are only started when all of them are healthy.

As all services are placed in a dedicated docker network they can be accessed just by their name (e.g. DATABASE_URL=”mysql2://mysql/phraseapp_test?local_infile=true”).

There are dedicated services for the infrastructure components for the development and test environments. The spring and server services use a little script (validate-migrated.sh) to determine if migrations were already executed. If yes, we run possibly pending migrations rails db:migrate, otherwise we execute the more efficient rails db:setup.

# build/validate-migrated.sh

if rails db:migrate:status &> /dev/null; then

  rails db:migrate

else

  rails db:setup

fi

Running tests

We can then trigger a full test run like this:

docker-compose build test

docker-compose run test bundle exec "rails db:setup && xvfb-run rails spec"

docker-compose down
As we are dealing with a completely empty database we first need to load the database schema (rails db:setup ) before we can execute our tests (rails spec).
We use xvfb-run as a wrapper because parts of our integration tests use capybara webkit and therefore need to run inside a XServer.
Docker Compose reuses dependent services so we need to delete all of them after we are done with docker-compose down. This can cause problems when running multiple tests in parallel but we can use the --project-name flag of docker-compose with e.g. a randomly generated name to fully isolate all test runs.
To make things simpler we created a wrapper script which automatically runs tests isolated and also cleans up afterwards:
#!/bin/bash

set -e

# random name with timestamp as prefix to give a cleanup script more information

PROJECT_NAME=$(TZ=UTC date +"%Y%m%d%H%M%S")$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 8| head -n 1)

function cleanup {

  # capture the original exit code

  code=$?

  echo "cleaning up"

  # ignore errors on docker-compose down

  set +e

  docker-compose --project-name ${PROJECT_NAME} down

  # exit with original exit code

  exit $code

}

# run cleanup when the script exits

trap cleanup EXIT

docker-compose --project-name ${PROJECT_NAME} build test

docker-compose --project-name ${PROJECT_NAME} run test "$@"

Running tests as developer

Because the setup and teardown part introduces quite some latency, this approach is not really practical for test driven development. This problem can be solved with the spring service which we also configured in the docker-compose.yml.
As a developer, I can start a long running spring process with all infrastructure dependencies inside a docker container
docker-compose up --build spring
and then run
docker-compose exec spring xvfb-run spring rspec spec/models/user_spec.rb

to execute selected tests with a much lower latency. Similar to the wrapper script above we can create a shell alias to make things simpler:
alias dc-spec='docker-compose exec spring xvfb-run spring rspec'

dp-spec spec/models/user_spec.rb

Running the development server

Running the development server is as simple as executing

docker-compose up server

By exposing port 3000 of the container we can access the server running inside docker at http://localhost:3000.

Conclusion

Using Docker Compose for the build and development process of a Ruby on Rails application has many advantages. The actual dependencies for development and CI environments can be boiled down to docker and docker-compose. Tools such as rbenv or rvm are no longer needed.
This approach also avoids typical dependency issues when installing gems which need to be compiled such as nokogiri or capybara webkit. Some of our developers use OSX, others different Linux distributions and some even Windows. With Docker Compose we now no longer need to maintain separate documents for how to setup the development environment. Instead of executing manual steps this process is now also fully automated.
As the docker-compose.yml is checked into our code base we can test our application against newer versions of Ruby, MySQL, ElasticSearch or Redis by just updating the specific versions in that file. There is no need to install any of these new dependencies in either the development environment or on our CI servers. Also, this makes sure that all environments use the very same well-defined versions of all components.

Outlook

Our full test suite currently takes almost an hour to run sequentially, so our developers never run the whole suite locally. Instead, we frequently push our local changes to a remote branch on GitHub and our CI infrastructure automatically builds all these pushes. To reduce these feedback cycles it is necessary for us to run our tests in parallel and on multiple CI servers.
In one of our next articles, we will dive deeper in how we do this based on the setup presented above.