Another thing about <!ENTITY> and then some on localization
Challenge 3: Languages with multiple forms of a word
In some languages, a word can take multiple forms depending on context. Imagine that the word for tab could be written as tab, tabs, tab(x), or [prefix]-tab, with each form used depending on what the developer wants to communicate.
Here is an example:
<!ENTITY tabsOpen1 "You have %1 tab open">
<!ENTITY tabsOpen2 "You have %1 tabs open">
In English, the UI tells the end user that they have either one tab open or more than one tab open. Using Polish again, we can see that there are multiple forms depending on the number: one kartę; two, three or four karty; and zero or five or more kart. In fact, the Polish grammar rule is more complex than my explanation and I am sure I am missing some of the rules, but you get the point.
See below:
<!ENTITY tabsOpen1 "Masz %1 kartę otwartą">
<!ENTITY tabsOpen2 "Masz %1 karty otwarte">
<!ENTITY tabsOpen3 "Masz %1 kart otwartych">
See the problem? The third option can never be used, because the code only provides for one tab or [x] tabs. So Polish localizers are forced to create an artificial form like “otwórz kart: %1”. Once again, this is not a pattern of natural spoken Polish. It reads more like a representation of a database record.
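To make the mismatch concrete, here is a minimal sketch of what picking among three Polish forms could look like if the code were allowed to ask for more than "one" or "many". It is written in Python purely for illustration and is not how Firefox or the DTD mechanism works. The names `polish_plural_category`, `TABS_OPEN`, and `tabs_open_message` are hypothetical, and the rule is the commonly cited plural rule for Polish integers (one; few for numbers ending in 2 to 4 except 12 to 14; many for everything else).

```python
def polish_plural_category(n: int) -> str:
    """Return a plural category for a non-negative integer in Polish.

    Hypothetical helper; the category names follow the usual
    one / few / many convention.
    """
    if n == 1:
        return "one"
    # Numbers ending in 2, 3 or 4 take the "few" form, except 12, 13, 14.
    if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
        return "few"
    # Zero, 5 and up, and the 12-14 exceptions take the "many" form.
    return "many"


# The three strings from the entities above, keyed by category
# instead of by a fixed "1 tab" / "[x] tabs" split.
TABS_OPEN = {
    "one": "Masz {n} kartę otwartą",
    "few": "Masz {n} karty otwarte",
    "many": "Masz {n} kart otwartych",
}


def tabs_open_message(n: int) -> str:
    """Build the 'tabs open' message using the form that matches n."""
    return TABS_OPEN[polish_plural_category(n)].format(n=n)


if __name__ == "__main__":
    for count in (0, 1, 3, 5, 12, 22):
        print(tabs_open_message(count))
```

The point of the sketch is that the choice of form has to live with the language, not with the English-shaped code: an English locale would need only two entries, while Polish needs three and a different selection rule.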
Challenge 4: Localization in the broader sense
Sometimes elements of our UI, such as colors, icons, and the spacing allocated to certain words like “Firefox”, are hard-coded, which limits a localizer's ability to change them so that they make more sense or work better in their localization. Even when those elements are not hard-coded, they can still be hard to change. In those cases, a localizer can file a bug asking a developer to provide more options.
For instance, let’s say a developer uses the colors red and green to indicate success or failure when a user submits a password. These colors might not carry that meaning in certain locales. A bug is then filed and a developer works to extend the options available to that localizer so that the UI is more meaningful. But this can be laborious, and it is definitely not scalable. Moreover, the new exception forces all other localizations to translate a new entity, even though it may not have the same level of importance (if any at all) in their home language.
Other issues to think about include languages that are written right-to-left and languages that present their characters vertically rather than horizontally. The examples are numerous and we could go through all of them, but I think you get my point. Feel free to add your examples to the comments section of this post.
Next time, I’ll present a small piece of what could be the next generation of l10n. You might think of it as Localization 2.0 or L20n.



















