Teaching CakePHP to be Multilingual (part 1)

The international community is an essential part of Firefox. Whether it’s coming up with ideas, writing code, or just using the browser every day, Firefox is influenced by people around the globe. To embrace this relationship, we decided that Remora, the next version of addons.mozilla.org, should be localizable. Mike Shaver has already published the first in a series of articles on how volunteers can localize AMO3 into different languages. I’d like to build on the definitions he provided, but dig a little deeper into the code and the integration with CakePHP.

Since this is such a broad topic, I’ve divided it into three posts. The rest of this post covers our end goal and our plan of attack. The second post will cover localizing static content, and the third, dynamic content.

To add to Mike’s descriptions, when I write “static content”, I’m referring to strings on the page, like “Home”, “About”, or the “Search” button at the top of this page. These are strings that are actually in the template files. Dynamic content refers to information coming from a database, like authors, descriptions and titles. They both produce the same effect for a visitor (a localized page), but they take very different paths to get there.
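To make the distinction concrete, here is a rough sketch of the two paths. It is written in Python rather than PHP purely to keep it short, and it is not Remora's code: the catalog contents, the addon_titles table, and both function names are made up for illustration.

```python
import sqlite3

# Static content: strings that live in the template files, looked up in a
# per-locale catalog. (Hypothetical catalog, not Remora's.)
STATIC_CATALOG = {
    "fr": {"Home": "Accueil", "About": "À propos", "Search": "Rechercher"},
}


def translate_static(text, locale):
    """Return the localized template string, falling back to the original."""
    return STATIC_CATALOG.get(locale, {}).get(text, text)


def make_demo_db():
    """Build a throwaway in-memory database with a hypothetical schema."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE addon_titles (addon_id INTEGER, locale TEXT, title TEXT)"
    )
    db.executemany(
        "INSERT INTO addon_titles VALUES (?, ?, ?)",
        [(1, "en-US", "Example Add-on"), (1, "fr", "Extension d'exemple")],
    )
    return db


def translate_dynamic(db, addon_id, locale):
    """Dynamic content: the localized value is a database row, not a template string."""
    row = db.execute(
        "SELECT title FROM addon_titles WHERE addon_id = ? AND locale = ?",
        (addon_id, locale),
    ).fetchone()
    return row[0] if row else None
```

Calling translate_static("Home", "fr") reads from the catalog, while translate_dynamic(make_demo_db(), 1, "fr") has to query the database. The visitor sees a French page either way.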

When looking at localization options, we had three main goals: speed, robustness, and friendliness. The site we’re replacing gets several hundred hits per second, so our solution had to be fast, but we also wanted something tried and true that wouldn’t break down or trip over itself. The last goal, friendliness, refers to a low barrier to entry for localizers. Whatever we give a localizer has to be simple to use.

Anyone who has searched for the answer to an i18n or l10n question knows there is always more than one way to reach the goal. The now-defunct[1] CakePHP wiki used to offer two tutorials for handling i18n.

For static content, the i18n tutorial used separate template files (.thtml) for each language. These can become awkward to maintain, and they can be difficult for localizers to translate if they aren’t familiar with HTML and PHP markup. When editing large pieces of text, it can be important for a localizer to be able to add formatting. Our pages, however, consist of many short strings, mainly titles and navigation elements, so we didn’t feel the additional complexity would benefit the project.

The i18nv2 tutorial uses the PEAR::Translation2 package for dynamic content. This appeared promising at first, but we realized it couldn’t support the relationships we were planning. Translation2 uses two keys to look up a value. As a rough example, you can look up an item (key 1) in a category (key 2). We actually needed a three-key lookup, e.g. getting a specific parameter (key 1) from an item (key 2) in a category (key 3).
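The difference between the two lookups can be sketched as follows. This is Python for brevity, not Translation2's API or Remora's schema. Every key, value, and function name here is hypothetical, and the language is assumed to be selected elsewhere so that the key counts match the description above.

```python
# Two-key lookup: (category, item) -> translated string.
# This is the shape described above as "an item from a category".
TWO_KEY = {
    ("navigation", "home"): "Accueil",
    ("navigation", "about"): "À propos",
}

# Three-key lookup: (category, item, parameter) -> translated string.
# One item can now carry several independently translated parameters.
THREE_KEY = {
    ("addons", "item-42", "name"): "Exemple",
    ("addons", "item-42", "description"): "Une extension d'exemple.",
}


def lookup2(category, item):
    """Fetch a string identified by two keys."""
    return TWO_KEY[(category, item)]


def lookup3(category, item, parameter):
    """Fetch a string identified by three keys."""
    return THREE_KEY[(category, item, parameter)]
```

With only two keys, the name and description of item-42 would collide unless the item and parameter were squeezed into a single composite key, which is the kind of workaround we wanted to avoid.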

The CakePHP tutorials point in the right direction, and will work for many sites, but they didn’t quite fit with Remora. I’ll dig into our implementation for localizing static content in the next post.

As a quick reminder, all of the source code in the Remora project is viewable in our SVN Repository.

[1] At the time of original writing, the CakePHP wiki was still available online. The dead links have been removed from this post.


2 responses

  1. Jim Plush wrote:

    How are you caching the strings? The fastest method I’ve found is to use something like the APC extension, which lets you store true global variables in memory with apc_store/apc_fetch. It uses memory, but the performance is amazing. Another alternative I’ve used where extensions couldn’t be installed is SQLite, again vastly faster than XML or even native PHP array files.

    From the PEAR site, the Translation package says:

    “CacheMemory Decorator Example

    This decorator provides a memory cache layer. It does NOT persist through requests, only in the current execution of the script. You can turn off prefetch if you want small network load (but it will increase the number of queries to the database)”

    APC storage is nice because it persists across all requests.

  2. Wil Clouser wrote:

    The static strings are pulled from the binary .mo files and cached natively in Apache. (This is all automatic when using gettext.)

    The dynamic strings just come straight from the database. Cake has some basic query caching built into it that could be used, but I’m pretty sure it’s just cached to a file on disk. APC is definitely a good move here if you need to cache the translation results.