Internationalizing Bike Index
For more context on this project, see the feature announcement on Bike Index’s blog.
Unless your Rails app has been internationalized since its inception, internationalizing it minimally entails three broad efforts:
- Adding the ability to detect and set the locale for a given web request.
- If using the default Rails i18n framework, externalizing user-facing strings, moving them from views (mainly but not exclusively) to YAML.
- Translating your now-externalized strings to other languages.
For Bike Index, we researched the approaches taken by other internationalized open-source Rails projects – in particular, Discourse and GitLab. This work was useful in developing a mental model of the work to be done, although naturally we deviated from them where different needs or constraints demanded it.
Locale detection
There are a few ways to detect a user’s locale:
- from a locale query param (settable via a UI element),
- from a database value tied to the user’s account (settable via a user preferences UI),
- from the ACCEPT_LANGUAGE header set on a request (settable via the user’s browser preferences), and
- inferring a locale from the user’s geocoded location.
To minimize complexity, we implemented only the first three.
# app/controllers/application_controller.rb L81-87 (f3cb4792)
def set_locale
  if controller_namespace == "admin"
    return I18n.with_locale(I18n.default_locale) { yield }
  end

  I18n.with_locale(requested_locale) { yield }
end
# app/controllers/application_controller.rb L65-79 (f3cb4792)
def requested_locale
  return @requested_locale if defined?(@requested_locale)

  requested_locale =
    locale_from_request_params.presence ||
    current_user&.preferred_language.presence ||
    locale_from_request_header.presence

  @requested_locale =
    if I18n.available_locales.include?(requested_locale.to_s.to_sym)
      requested_locale
    else
      I18n.default_locale
    end
end
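The helpers locale_from_request_params and locale_from_request_header are referenced above but not shown. As a rough sketch of what header-based detection could look like, the plain-Ruby function below picks the first supported language from an ACCEPT_LANGUAGE value, honoring q weights. The function name, the AVAILABLE_LOCALES constant, and the choice to match on the primary language subtag only are assumptions made for illustration, not Bike Index's implementation.

```ruby
# Assumption: the app supports English and Dutch.
AVAILABLE_LOCALES = %i[en nl].freeze

# Hypothetical helper: returns the best supported locale named in an
# Accept-Language header value, or nil when none is supported.
def locale_from_accept_language(header, available: AVAILABLE_LOCALES)
  return nil if header.nil? || header.strip.empty?

  # Parse each comma-separated entry into [language tag, q weight, position].
  candidates = header.split(",").each_with_index.map do |entry, index|
    tag, *params = entry.strip.split(";")
    quality = params.map(&:strip).find { |param| param.start_with?("q=") }
    weight = quality ? quality.delete_prefix("q=").to_f : 1.0
    [tag.to_s.strip.downcase, weight, index]
  end

  # Highest weight first; ties keep the order given in the header.
  ranked = candidates.sort_by { |_tag, weight, index| [-weight, index] }

  ranked.each do |tag, weight, _index|
    next if weight <= 0 || tag.empty?

    # Match on the primary subtag only, so "nl-NL" counts as "nl".
    primary = tag.split("-").first.to_sym
    return primary if available.include?(primary)
  end

  nil
end
```

Returning nil for an unsupported or missing header lets the caller fall through to the default locale, which matches the fallback chain in requested_locale.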
Translation management
Translation management is a “buy vs. build” decision point. The central questions to engage with here are:
- Do we want developers to be the gatekeepers to updating translations? (In our case, no.)
- Do we want to accept translation contributions from users via the site UI? (In our case, a nice-to-have but infeasible for a v1.)
- Are we able to invest the resources into building our own translation management solution? (In our case, likely not.)
That left us pricing a variety of translation management services we’d seen used elsewhere and researching their feature sets, including Transifex, LingoHub, and Phrase.
All involved committing to a subscription that ranged from $19 to $180 per month, plus the cost of translation, which we estimated would total $8,000-$10,000 for an initial Dutch translation.
Some more digging surfaced Translation.io, which is lightweight, focused on Rails (and Laravel) projects, and pushed all the right buttons for us:
- It’s free for open-source projects
- It automatically integrates translations by Google Translate (imperfect but a cost-effective 80-90% solution), further reducing the costs involved
As a non-profit, we’re relatively price-sensitive and don’t want to use funds inefficiently, so the potential savings gave Translation.io a big leg up in our deliberations.
Its most significant feature-gap relative to its alternatives – automated GitHub PRs to sync translations – could be implemented with some shell script integrated into our build pipeline, so we had a clear winner.
String externalization
The key decision for this stage is what format to use for translation files, the choices being YAML (the Rails default) and GetText (broadly popular beyond the Rails ecosystem).
GetText has several advantages over the Rails default, especially for large projects – the most compelling arguably being that strings don’t need to be externalized from templates to a translation file. Instead, the source string lives in the template but is merely wrapped in a special method.
But, as is often the case in a Rails context, the defaults are collectively better suited to the needs of a moderately-scaled project like Bike Index than the alternatives, even if those alternatives are in one sense or another individually better.
There is pre-existing tooling that both mitigates the disadvantages of the Rails default i18n framework and amplifies its benefits, so we chose not to stray too far from Rails conventions in order to leverage as much open-source prior art as possible. Additionally, the YAML approach allows non-developers (marketing?) to edit source copy without diving into the source code.
String externalization is by far the most time-consuming and labor-intensive part of a translation project.
We automated as much as possible using a variety of code-gen and text wrangling tools:
- i18n-js: Lightweight generator of client-side translation file(s)
- rails-i18n: Generated translations of model attributes, etc.
- i18n-country-translations: Pre-fab translations of country names
- money: Currency localization
- haml-i18n-extractor: Externalize strings from Haml to YAML
- vim-i18n: Externalize strings from ERB to YAML
- i18n-tasks: Rake tasks for maintaining translation files (normalizing translation files, detecting missing keys, etc.)
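To make "detecting missing keys" concrete, here is a minimal plain-Ruby sketch of the idea: flatten each locale's nested YAML into dotted key paths and diff the two sets. This is not how i18n-tasks is implemented; the function names and the sample YAML are hypothetical and exist only to illustrate the check.

```ruby
require "yaml"

# Flattens a nested translations hash into dotted key paths,
# e.g. {"bikes" => {"show" => {"title" => "..."}}} => ["bikes.show.title"].
def flatten_keys(hash, prefix = nil)
  hash.flat_map do |key, value|
    path = [prefix, key].compact.join(".")
    value.is_a?(Hash) ? flatten_keys(value, path) : [path]
  end
end

# Returns key paths present in the base locale but absent from the target.
def missing_keys(base_yaml, target_yaml, base_locale:, target_locale:)
  base = (YAML.safe_load(base_yaml) || {})[base_locale] || {}
  target = (YAML.safe_load(target_yaml) || {})[target_locale] || {}
  flatten_keys(base) - flatten_keys(target)
end

# Hypothetical sample data.
EN_YAML = <<~YAML
  en:
    bikes:
      show:
        title: Bike details
        owner: Owner
YAML

NL_YAML = <<~YAML
  nl:
    bikes:
      show:
        title: Fietsgegevens
YAML

puts missing_keys(EN_YAML, NL_YAML, base_locale: "en", target_locale: "nl")
```

In practice a dedicated tool is preferable, since it also scans templates for keys that are used but never defined.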
Pretty Good Practices
Some learnings and conventions that emerged while scanning through and extracting strings from ~15,000 lines of template, controller, and React code:
Retain semantic completeness
To aid translation, it’s desirable to externalize strings in a form as close to semantically complete as possible.
As a corollary: Use coarse conditional branching in templates. Duplication of user-facing strings is desirable whenever its alternative is to break up a string into units that, in isolation, might lose their context or encode an assumption about word order that may not hold in another language.
For example, instead of
- timing = expedited? ? "soon" : "eventually"
= "You'll #{timing} receive your delivery"
prefer
- if expedited?
  = "You'll soon receive your delivery"
- else
  = "You'll eventually receive your delivery"
The former externalizes with the following structure:
soon: spoedig
youll_receive_delivery: U ontvangt %{wanneer} uw levering ontvangt
The resulting copy in Dutch is
U ontvangt spoedig uw levering ontvangt
but the translation we’d want here might be:
youll_soon_receive_delivery: U ontvangt binnenkort uw levering
Mailer translation format
Mailers can be namespaced by mailer name, email name, and email format as follows (note the .text and .html in the translation keys):
# config/locales/en.yml
geolocated_message:
  html:
    is_located_at_html: "Is located at: <strong>%{address}</strong>"
  text:
    your_bike_is_at: "Your %{bike} is at: %{address}"
-# app/views/organized_mailer/geolocated_message.html.haml
%p= t(".html.is_located_at_html", address: @organization_message.address)

-# app/views/organized_mailer/geolocated_message.text.haml
= t(".text.your_bike_is_at",
    bike: @bike.type,
    address: @organization_message.address)
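The leading dot in keys such as ".html.is_located_at_html" relies on Rails' lazy lookup, which prefixes the key with the template's path. The sketch below mimics that expansion in plain Ruby so the resulting full key is visible. The function name expand_lazy_key is hypothetical, and the sketch assumes the YAML excerpt above is nested under an organized_mailer key.

```ruby
# Hypothetical stand-in for the key expansion Rails performs for
# view-relative ("lazy") translation keys.
#
# virtual_path is the template path without format or handler extensions,
# e.g. "organized_mailer/geolocated_message".
def expand_lazy_key(virtual_path, key)
  # Keys without a leading dot are already absolute.
  return key unless key.start_with?(".")

  # Slashes become dots, and a partial's leading underscore is dropped.
  virtual_path.gsub(%r{/_?}, ".") + key
end

puts expand_lazy_key("organized_mailer/geolocated_message", ".html.is_located_at_html")
puts expand_lazy_key("organized_mailer/geolocated_message", ".text.your_bike_is_at")
```

Because the HTML and text templates share one path, the format segment has to be written into the key itself, which is why the keys carry .html and .text explicitly.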
Controllers
Controllers typically define user-visible strings on flash messages. We defined a translation helper method in a ControllerHelpers mixin. The method wraps I18n.translate and infers the scope in accordance with the convention of scoping translations by their lexical location in the code base.
Both :scope and :controller_method can be overridden using the corresponding keyword args. Note that base controllers should be passed :scope or :controller_method explicitly. See the translation method docstring for implementation details.
# app/controllers/concerns/controller_helpers.rb L145-157 (ee68edb1)
def translation(key, scope: nil, controller_method: nil, **kwargs)
  if scope.blank? && controller_method.blank?
    controller_method =
      caller_locations
        .slice(0, 2)
        .map(&:label)
        .reject { |label| label =~ /rescue in/ }
        .first
  end

  scope ||= [
    :controllers,
    controller_namespace,
    controller_name,
    controller_method.to_sym
  ]

  I18n.t(key, **kwargs, scope: scope.compact)
end
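To illustrate the scoping convention without Rails, the sketch below builds the same kind of scope array and resolves a key against a nested hash. It passes the controller method explicitly rather than inferring it from the call stack. The TRANSLATIONS hash, its keys and messages, and both function names are hypothetical.

```ruby
# Hypothetical translations, nested the way the scope convention implies:
# controllers -> (optional namespace) -> controller name -> action -> key.
TRANSLATIONS = {
  controllers: {
    bikes: { update: { bike_updated: "Bike successfully updated!" } },
    admin: { bikes: { destroy: { bike_deleted: "Bike deleted" } } }
  }
}.freeze

# Builds the lookup scope; a nil namespace is dropped by compact.
def translation_scope(controller_name:, controller_method:, controller_namespace: nil)
  [:controllers, controller_namespace, controller_name, controller_method]
    .compact
    .map(&:to_sym)
end

# Resolves a key under the given scope, returning nil when it is absent.
def lookup(key, scope)
  TRANSLATIONS.dig(*scope, key.to_sym)
end

scope = translation_scope(controller_name: "bikes", controller_method: "update")
puts lookup(:bike_updated, scope)
```

The optional namespace segment is what keeps a top-level controller and an admin controller of the same name from colliding in the translation file.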
JavaScript
Client-side translations are defined under a :javascript keyspace in en.yml.
# config/locales/en.yml
javascript:
  bikes_search:
The translation method can be invoked directly as I18n.t() in your JS and passed a complete scope:
<span className="attr-title">{I18n.t("javascript.bikes_search.registry")}</span>
Equivalently, a curried instance of I18n.t can be created locally (by convention, bound to t) with the local keyspace set as needed:
// app/javascript/packs/external_registry_search/components/ExternalRegistrySearchResult.js
const t = BikeIndex.translator("bikes_search");
// . . .
<span className="attr-title">{t("registry")}</span>;
A client-side JS translations file is generated when the prepare_translations rake task is run. See PR #1353 for implementation details.
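As a rough idea of what generating a client-side file involves, the sketch below extracts only the javascript keyspace from each locale and serializes it as JSON. It is not the prepare_translations task itself. The function name, the sample YAML, and the "Registry" value are assumptions for illustration.

```ruby
require "yaml"
require "json"

# Keeps only the given keyspace for each locale found in the YAML source.
# Locales that do not define the keyspace are omitted.
def client_side_translations(yaml_source, keyspace: "javascript")
  locales = YAML.safe_load(yaml_source) || {}

  locales.each_with_object({}) do |(locale, translations), result|
    subtree = (translations || {})[keyspace]
    result[locale] = { keyspace => subtree } if subtree
  end
end

# Hypothetical sample: one client-side key and one server-only key.
SAMPLE_YAML = <<~YAML
  en:
    javascript:
      bikes_search:
        registry: Registry
    bikes:
      show:
        title: Bike details
YAML

puts JSON.generate(client_side_translations(SAMPLE_YAML))
```

Shipping only the javascript subtree keeps server-only copy out of the client bundle and keeps the generated file small.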
Pre-Deployment Translation Syncing
When building master, we check for un-synced translations and, if any are found, stop the build and open a PR to master with the translation updates.
# .circleci/config.yml L102-104 (465b3072)
- run:
    name: Sync translations (only on master by default)
    command: bin/check_translations
# bin/check_translations L73-88 (e36adcd5)
output "${YELLOW}" "Creating Update Pull Request"

if [[ "${TRANSLATION_BRANCH}" != "master" ]]; then
  PR_URL="Related PR: ${CIRCLE_PULL_REQUEST}"
fi

git push -u origin "${BRANCH}"
hub pull-request \
  -m "[i18n] Translation update: Build ${CIRCLE_BUILD_NUM}

Merge to unblock CI job ${CIRCLE_BUILD_NUM}: ${CIRCLE_BUILD_URL}
${PR_URL}"

output "${GREEN}" "Translation update PR created."
exit 1