Archive for the 'AMO' Category

AMO Update r10238

Monday, February 11th, 2008

Yes, we are over 10,000 commits in our Subversion repository. This latest update for AMO trunk includes the following fixes, among others:

  • Update sk locale from bug 367271
  • Improving install experience for non-browser apps (bug 401272, r=clouserw)
  • adding GUID to categories RSS feed to enable feed readers to distinguish fresh items from old ones (bug 411834)
  • merging new strings from Thunderbird install experience fix (r9576, bug 401272) into all other locales
  • fixing “all versions” RSS feed, bug 392183
  • Fix bug 394590
  • Fix bug 378782
  • Total download counting in maintenance script; bug 409341; r=morgamic
  • Adding pt_PT locale from bug 391197
  • Update pt-BR locale from bug 380221
  • Update zh-CN locale from bug 407472
  • fixing data sanitization for UTF-8 characters: bug 412580, r=laura
  • Adding support for application wildcards in categories; bug 408525; r=clouserw
  • adding test for UTF-8 sanitization (bug 412580)
  • fixing pagination sanitization, bug 412580, r=fligtar
  • Checking in reviewcount column and maint script from bug 408680.
  • Firefox 3 additem notices; bug 406898; r=morgamic
  • fix bug 415085
  • Fixing sanitization of discussion dates on addons detail page (bug 414541)
  • Unflag sr-flagged add-on; bug 371214; r=fwenzel
  • minor change to bin database class; bug 409341; r=morgamic
  • Checking in review count column stuff from 408680. r=fwenzel.
  • fixing memcaching for select queries that start with whitespace (bug 416403, r=morgamic)
  • Update fr locale from bug 366239

I want to thank everyone on the AMO team for their hard work, especially localizers who have worked really hard to port AMO to their native language. 2008 is already turning out to be a great year — let’s keep it up!

Second thoughts on dynamic content

Thursday, December 20th, 2007

I was looking at one of the AMO v3.2 mockups today. There are strings like “See All Interface Tweaks Add-ons” that we’ve avoided up till now, but this isn’t the first time they’ve been proposed. The problem we’re having is that a string like that is from two different sources - static and dynamic data. “Interface Tweaks” is the name of one of our categories so it’s stored in the database, and the rest of the string is static, so it’s in a .po. The static string would look something like:

See All %s Add-ons

and the dynamic string would look like:

Interface Tweaks

In English, these combine and all is well, but if the second value affects the structure of the first in a dynamic way, we can’t support the phrase on AMO.
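To make the split concrete, here is a minimal sketch in Python. The two dictionaries are hypothetical stand-ins (one for the .po catalog, one for the category names in the database), and the category id and function name are made up for illustration; this is not the AMO code.

```python
# Hypothetical stand-ins: STATIC_PO plays the role of the .po catalog,
# CATEGORY_DB plays the role of the category names stored in the database.
STATIC_PO = {"en-US": {"See All %s Add-ons": "See All %s Add-ons"}}
CATEGORY_DB = {"en-US": {17: "Interface Tweaks"}}


def see_all_label(locale, category_id):
    """Combine the static template with the dynamic category name."""
    template = STATIC_PO[locale]["See All %s Add-ons"]
    category = CATEGORY_DB[locale][category_id]
    # The template was translated without knowing which category fills %s.
    return template % category


label = see_all_label("en-US", 17)
print(label)  # See All Interface Tweaks Add-ons
```

The weak point is the last step: the template is translated once, for every category, so a locale in which the surrounding words have to change with the category name has no way to say so.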

When I wrote the original code, I think I had two things in mind:

  • Categories would be changing more often
  • Localizers wouldn’t have direct access to SVN

Categories have changed a bit in the past (and they’re in mid change right now, actually), but other than the convenience of a near-instant change on the site, it doesn’t seem that beneficial to change them via the web. The second point is a big one though - by giving localizers direct access to SVN, they can update strings whenever they need to without our meddling and getting in the way. That’s a big time saver for everyone.

So, now I’m reconsidering the separation of some of the interface translations (add-on types, applications, and categories) from the rest of the static content. Looking at where we currently stand, it seems like it’s more of a hassle to describe the separation and what it does than it would be to just drop everything in the .po file. Plus we’d get the benefit of strings like “See All Interface Tweaks Add-ons.”

AMO: Developer Replies to Reviews

Friday, September 21st, 2007

Recently, I worked on a nice little feature for AMO: Letting developers reply to user reviews.

The idea is, when you get a review as an add-on publisher, you may find a spot or two in it that you feel like replying to. In the previous version of AMO, users started discussions by just adding another review (with a random rating), effectively rendering parts of the rating system useless. Also, the developers were not allowed to rate their own add-ons and could thus not reply to any of the questions.

In AMO version 3, we addressed this issue by having our editor team moderate reviews to ensure “good” reviews and thus useful ratings. For discussions around individual add-ons, we introduced a forum system to provide a more appropriate means of discussing support questions and similar while not diluting the review/rating system. This, however, left reviews as a one-way communication with no way for developers to address questions raised in their add-ons’ reviews.

This issue is now fixed: As of today’s AMO update, developers can now reply once to each of the published reviews of their add-ons.

[Image: AMO review reply, example]

The reason we only allow one reply is so that discussions are held in the discussion forums (which is where they should be, as mentioned above), while not forcing developers to leave possible allegations undisputed. Some of you may notice parallels to the rating system eBay uses for sellers.

Before you start replying to a bunch of reviews now: Remember, the reviews of your add-on are a place for your users to give feedback and point out good or bad things (in their opinion) about your add-on. Developer replies will also be moderated by editors, just like regular reviews. So, as always, play nice, even if you disagree with the opinion explained in the user review. If you get a bad review once, don’t fret: Rather than “yelling” at the user (which will most likely be deleted by an editor anyway), you should take it as an encouragement to make your add-on better. After all, the user considered your add-on interesting enough to download and test it and give you feedback on the points they found improvable. This is how great add-ons start!

We hope you enjoy the new feature! If you find any problems with it, feel free to drop by #amo on irc.mozilla.org or to file a bug as usual.

Tips for Localization

Thursday, July 26th, 2007

Wil posted a blog entry that is a great resource for people looking to localize their sites. For those of you who went to our OSCON presentation, this is a great follow-up to the concepts we presented in our talk.

Download Counts Halted

Saturday, June 30th, 2007

The download controller was modified on Thursday to prepare for the release of the 1.5.0.12 -> 2.0.0.4 major update. During release cycles, AMO takes abnormally high load which sometimes causes interruptions in service.

To avoid this situation we agreed to cache public download hits from the AMO install buttons. This does two things:

  • Relieve application load by allowing the hardware load balancer to cache file requests — which are ultimately redirects to releases.mozilla.org
  • Relieve the database by not having constant inserts on the download table — which causes extra load because of indexes that were put on the table to make count updates work correctly

The benefits can be seen below. First, traffic for a web node:

[Image: traffic for AMO]

Second, a CPU usage graph from the same node:

[Image: CPU usage]

You can see the positive effect on load near noon on Wed.

The status of download counts is tracked in bug 384084. The plan is to create a background maintenance script that parses AMO logs via a cron job in off-peak hours. Counts will be updated once a day.
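As a rough illustration of what such a maintenance script might do, here is a minimal Python sketch that tallies downloads from access-log lines. The log format, the `/downloads/file/<id>/` URL shape, and the sample lines are all assumptions made for the example; the real script is tracked in the bug above and may look quite different.

```python
import re
from collections import Counter

# Assumed URL shape for an install-button hit; the real AMO paths may differ.
DOWNLOAD_RE = re.compile(r'"GET /downloads/file/(\d+)/\S* HTTP/[\d.]+" (\d{3})')


def count_downloads(log_lines):
    """Return a Counter mapping file id -> number of download hits."""
    counts = Counter()
    for line in log_lines:
        match = DOWNLOAD_RE.search(line)
        # Count only successful responses and redirects to the mirrors.
        if match and match.group(2) in ("200", "302"):
            counts[int(match.group(1))] += 1
    return counts


# Invented sample lines in a common access-log style.
sample_log = [
    '10.0.0.1 - - [30/Jun/2007:03:00:01 -0700] "GET /downloads/file/1234/example.xpi HTTP/1.1" 302 0',
    '10.0.0.2 - - [30/Jun/2007:03:00:02 -0700] "GET /downloads/file/1234/example.xpi HTTP/1.1" 302 0',
    '10.0.0.3 - - [30/Jun/2007:03:00:03 -0700] "GET /downloads/file/5678/missing.xpi HTTP/1.1" 404 0',
    '10.0.0.4 - - [30/Jun/2007:03:00:04 -0700] "GET /en-US/firefox/ HTTP/1.1" 200 5120',
]
daily_counts = count_downloads(sample_log)
```

A cron job could run something like this once a day and write the totals back with one UPDATE per file, instead of one INSERT per download.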

In addition to this, we are taking the opportunity to do a couple of things. For our summer goals, we plan on improving statistics for developers by offering:

  • Actual update ping counts in bug 384086
  • An improved API for aggregating add-on statistics and integrating it into your blog or external sites

We plan on resuming download counts early next week and update pings should be available by mid July. Thanks for your patience as we make adjustments to accommodate server load, and have a great 4th of July!

Triple Play (that’s what they say in baseball, right?)

Saturday, June 2nd, 2007

Wednesday was the first Firefox release since AMO 3.0 (Remora) launched in late March. It’s expected that traffic to Mozilla websites will increase following a release, but it’s usually in the range of 1.5 times normal traffic. Thursday, traffic to our San Jose facility tripled normal traffic, breaking 600 Mb/s. (The historical graph below is averaged down and doesn’t show that high, but Justin will be giving more details from IT’s point of view soon.)


When Firefox is updated, a separate update check for each add-on installed is performed, causing AMO to get quite a bit of traffic. Thursday, however, much of the traffic AMO was seeing was not from the update check - it was from real people searching, downloading, and browsing the site.

We had over 2000 user accounts created Thursday alone, over three times a normal day. 1 out of every 17 people that saw the What’s New page after updating Firefox clicked on the “Firefox Add-ons” link. Both sessions and pageviews on addons.mozilla.org tripled on Thursday - not including update and blocklist pings. The number of add-on downloads more than tripled from 2 days before.

While all of this increased activity is great news for us, it wasn’t so great for our app cluster. We had a number of issues throughout the day, ranging from memcache hitting connection limits and refusing connections, to database server issues, to app servers dying in a domino effect. A huge thanks to IT for keeping the issue under control the whole day, especially mrz who was on call and did not get to sleep.

We were able to make a number of changes Thursday and quickly push them to production, such as directing search to the shadow/read-only database server, adding to the list of areas of the site we can disable if necessary, and adding more memcache servers. We’ll be evaluating what we can do to prepare for this if it happens again.

Tagging in SVN

Tuesday, May 1st, 2007

So we decided to use SVN because it’s cooler and more modern. It’s not so bad, especially for webapps. We like the atomic commits, WebDAV, yadda yadda. You know why it’s better than CVS, so we aren’t going to regurgitate that to you.

But when deploying we ran into some issues with SVN merge. If you always do updates in a batch and never have an “oh crap update this now” update then you’re fine. But as we all know, stuff happens. So when we have tried to merge the entire tree after merging single files manually we’ve had some interesting results — conflicts in certain places and an overall headache.

The typical way of tagging is using an svn cp and placing a snapshot of a particular trunk revision into the tags directory. This SVN tagging approach using svn merge was supposed to be analogous to our previous way of tagging prod checkouts with CVS.

However, as clouserw put it, “the recommended strategy is only useful when merging the complete trunk to a tag, and when individual revisions/files are involved, svn eats itself”. So to avoid the merge issues with SVN that we found using svn merge we have come up with a new method that uses SVN externals instead.

Both methods have weaknesses. SVN merge has the “eating itself” problem that presents a lot of conflicts and makes merging painful because of all the manual conflict resolution. When merging the entire tree, though, it is desirable because it doesn’t have the weaknesses below. Using SVN externals, you run into some interesting issues but you don’t get the SVN eating itself problem:

  • Any locally modified files not in SVN (even when ignored) will kill a prod update in the event where an external must be updated (you get stuff like “svn: Failed to add directory ’site/app/tests’: object of the same name already exists”)
  • SVN externals cannot be a single file (they must be a directory so SVN can store info in an explicit .svn directory related to that checkout)

Aside from those two issues, just remembering the trunk revision we want in prod has been working for us. We’d like to find a good middle ground that doesn’t have us taking extra steps to update production when deploying trunk changes to the live site.

So, if you guys have any input please let us know. We’re always open to suggestions.

Teaching CakePHP to be Multilingual (part 3)

Wednesday, April 18th, 2007

This is the last part of a three part series. (Part 1) (Part 2)

The basic premise of our strategy for dynamic localization was to replace actual strings in the database with ints, which were foreign keys into a `translations` table that held the actual strings. The `translations` table looks like:

+------------------+------------------+------+-----+---------+-------+
| Field            | Type             | Null | Key | Default | Extra |
+------------------+------------------+------+-----+---------+-------+
| id               | int(11) unsigned |      | PRI | 0       |       |
| locale           | varchar(10)      |      | PRI |         |       |
| localized_string | text             | YES  |     | NULL    |       |
| created          | datetime         | YES  |     | NULL    |       |
| modified         | datetime         | YES  |     | NULL    |       |
+------------------+------------------+------+-----+---------+-------+

When combined, our id and locale create the primary key which will look up a textual value in the localized_string field. This method has the advantage of utilizing foreign keys (one weakness of the PEAR::Translation2 method). The price we pay is complexity - if we want to look at any information in the database, there will be joins, and plenty of them.
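Here is a minimal sketch of that layout, using Python’s built-in sqlite3 module as a stand-in for MySQL. The table and column names follow this post (the created/modified columns are left out for brevity); the sample rows and the `addon_name` helper are invented for the example.

```python
import sqlite3

# SQLite stands in for MySQL here; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE translations (
        id               INTEGER NOT NULL,
        locale           TEXT    NOT NULL,
        localized_string TEXT,
        PRIMARY KEY (id, locale)      -- id + locale together identify a string
    );
    CREATE TABLE addons (
        id   INTEGER PRIMARY KEY,
        name INTEGER                  -- an int pointing at translations.id
    );
    INSERT INTO translations (id, locale, localized_string) VALUES
        (100, 'en-US', 'Example Add-on'),
        (100, 'de',    'Beispiel-Add-on');
    INSERT INTO addons (id, name) VALUES (9, 100);
""")


def addon_name(addon_id, locale):
    """Look up an add-on's name in one locale; None if no such string."""
    row = conn.execute(
        "SELECT t.localized_string "
        "FROM addons AS a "
        "JOIN translations AS t ON t.id = a.name AND t.locale = ? "
        "WHERE a.id = ?",
        (locale, addon_id),
    ).fetchone()
    return row[0] if row else None


german = addon_name(9, "de")   # a translation exists
french = addon_name(9, "fr")   # no row for this locale
```

Even reading a single name takes a join, and with a plain inner join a missing translation means no row comes back at all.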

Striding forward with our plan, we hit a roadblock with the way CakePHP handles associations. Specifically, CakePHP doesn’t allow us to associate models on arbitrary columns. For example, a hasMany relationship allows you to choose the “foreignKey” for one table, but the other is assumed to be the primary key. This means we can’t just set up relationships in the model and forget about them. Using CakePHP’s relationship model still seemed like the right thing to do, even if it wouldn’t be entirely automatic.

Firstly, to identify which fields in each model would be translated, we added another array to each model called $translated_fields.

Next, we had to build a relationship between whichever model we were using and the Translation model. Since we had to use custom SQL to retrieve the translations, we had to build this relationship on the fly for every object that had translations. Using app_model’s beforeFind() made this relatively easy. The code we used to build the relationship is fairly long, but it’s straightforward and well commented. The more notable part is the freakishly large and complex queries that come out of it. For example, this is an actual query that is run to retrieve information for an addon (forgive the strange line breaks - I’m trying to fit it in the WordPress column):

SELECT
    `Addon`.`id`,
    `Addon`.`guid`,
    IFNULL(`tr_name`.localized_string, `fb_name`.localized_string) AS `name`,
    `Addon`.`defaultlocale`,
    `Addon`.`addontype_id`,
    `Addon`.`status`,
    `Addon`.`icontype`,
    IFNULL(`tr_homepage`.localized_string, `fb_homepage`.localized_string) AS `homepage`,
    IFNULL(`tr_description`.localized_string, `fb_description`.localized_string) AS `description`,
    IFNULL(`tr_summary`.localized_string, `fb_summary`.localized_string) AS `summary`,
    `Addon`.`averagerating`,
    `Addon`.`weeklydownloads`,
    `Addon`.`totaldownloads`,
    IFNULL(`tr_developercomments`.localized_string, `fb_developercomments`.localized_string) AS `developercomments`,
    `Addon`.`inactive`,
    `Addon`.`trusted`,
    `Addon`.`viewsource`,
    `Addon`.`prerelease`,
    `Addon`.`adminreview`,
    `Addon`.`sitespecific`,
    `Addon`.`externalsoftware`,
    IFNULL(`tr_eula`.localized_string, `fb_eula`.localized_string) AS `eula`,
    IFNULL(`tr_privacypolicy`.localized_string, `fb_privacypolicy`.localized_string) AS `privacypolicy`,
    `Addon`.`created`,
    `Addon`.`modified`,
    IF(!ISNULL(`tr_description`.localized_string), `tr_description`.locale, `fb_description`.locale)
                AS `description_locale`,
    IF(!ISNULL(`tr_developercomments`.localized_string), `tr_developercomments`.locale, `fb_developercomments`.locale)
                AS `developercomments_locale`,
    IF(!ISNULL(`tr_eula`.localized_string), `tr_eula`.locale, `fb_eula`.locale) AS `eula_locale`,
    IF(!ISNULL(`tr_homepage`.localized_string), `tr_homepage`.locale, `fb_homepage`.locale) AS `homepage_locale`,
    IF(!ISNULL(`tr_name`.localized_string), `tr_name`.locale, `fb_name`.locale) AS `name_locale`,
    IF(!ISNULL(`tr_privacypolicy`.localized_string), `tr_privacypolicy`.locale, `fb_privacypolicy`.locale)
                AS `privacypolicy_locale`,
    IF(!ISNULL(`tr_summary`.localized_string), `tr_summary`.locale, `fb_summary`.locale) AS `summary_locale`
FROM `addons` AS `Addon`
LEFT JOIN translations AS `tr_description`
    ON ( `Addon`.`description` = `tr_description`.id AND `tr_description`.locale='en-US' )
LEFT JOIN translations AS `fb_description`
    ON (`Addon`.`description` = `fb_description`.id AND `fb_description`.locale=`Addon`.`defaultlocale`)
LEFT JOIN translations AS `tr_developercomments`
    ON (`Addon`.`developercomments` = `tr_developercomments`.id AND `tr_developercomments`.locale='en-US')
LEFT JOIN translations AS `fb_developercomments`
    ON (`Addon`.`developercomments` = `fb_developercomments`.id AND `fb_developercomments`.locale=`Addon`.`defaultlocale`)
LEFT JOIN translations AS `tr_eula`
    ON (`Addon`.`eula` = `tr_eula`.id AND `tr_eula`.locale='en-US')
LEFT JOIN translations AS `fb_eula`
    ON (`Addon`.`eula` = `fb_eula`.id AND `fb_eula`.locale=`Addon`.`defaultlocale`)
LEFT JOIN translations AS `tr_homepage`
    ON (`Addon`.`homepage` = `tr_homepage`.id AND `tr_homepage`.locale='en-US')
LEFT JOIN translations AS `fb_homepage`
    ON (`Addon`.`homepage` = `fb_homepage`.id AND `fb_homepage`.locale=`Addon`.`defaultlocale`)
LEFT JOIN translations AS `tr_name`
    ON (`Addon`.`name` = `tr_name`.id AND `tr_name`.locale='en-US')
LEFT JOIN translations AS `fb_name`
    ON (`Addon`.`name` = `fb_name`.id AND `fb_name`.locale=`Addon`.`defaultlocale`)
LEFT JOIN translations AS `tr_privacypolicy`
    ON (`Addon`.`privacypolicy` = `tr_privacypolicy`.id AND `tr_privacypolicy`.locale='en-US')
LEFT JOIN translations AS `fb_privacypolicy`
    ON (`Addon`.`privacypolicy` = `fb_privacypolicy`.id AND `fb_privacypolicy`.locale=`Addon`.`defaultlocale`)
LEFT JOIN translations AS `tr_summary`
    ON (`Addon`.`summary` = `tr_summary`.id AND `tr_summary`.locale='en-US')
LEFT JOIN translations AS `fb_summary`
    ON (`Addon`.`summary` = `fb_summary`.id AND `fb_summary`.locale=`Addon`.`defaultlocale`)
LEFT JOIN `addontypes` AS `Addontype`
    ON `Addon`.`addontype_id` = `Addontype`.`id`
WHERE (`Addon`.`id` = 9)
    AND (`Addon`.`inactive` = 0)
    AND (`Addon`.`addontype_id` IN ('1', '2') )
LIMIT 1

That query went through several revisions, all of which had their pros and cons. Despite its intimidating length, the structure has a purpose. A few notes:

  • You can see which columns have translations in the SELECT pretty easily by looking for the IFNULL() function
  • The left outer joins allow us to get nulls back where the string doesn’t exist. Our initial query used inner joins, and when a string didn’t exist in the database, we’d get nothing back.
  • The union merges two nearly identical queries - one for the localized content, and one for the English fall back content.
  • The outer “SELECT AS” statement exists so the whole query plays nicely with cake. CakePHP’s dbo implementation assumes the model name will be the index into the array where it will store data. If that name isn’t set, it will automatically pluralize the name, which eventually causes a fatal error.
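The repeating tr_/fb_ pattern in the query is mechanical enough to generate from a list of translated fields. The following Python sketch shows the idea; it is not the beforeFind() code from our repository, the function name and sample data are invented, and SQLite stands in for MySQL (so the locale column uses CASE instead of IF(!ISNULL(...))).

```python
import sqlite3


def build_translated_query(table, alias, plain_fields, translated_fields):
    """Build a SELECT with one tr_ (requested) and one fb_ (fallback) join per field.

    Field names are interpolated into the SQL, so they must come from the
    model definition, never from user input.
    """
    select = [f"{alias}.{field}" for field in plain_fields]
    joins = []
    for field in translated_fields:
        tr, fb = f"tr_{field}", f"fb_{field}"
        select.append(
            f"IFNULL({tr}.localized_string, {fb}.localized_string) AS {field}")
        select.append(
            f"CASE WHEN {tr}.localized_string IS NOT NULL "
            f"THEN {tr}.locale ELSE {fb}.locale END AS {field}_locale")
        joins.append(
            f"LEFT JOIN translations AS {tr} "
            f"ON {alias}.{field} = {tr}.id AND {tr}.locale = :locale")
        joins.append(
            f"LEFT JOIN translations AS {fb} "
            f"ON {alias}.{field} = {fb}.id AND {fb}.locale = {alias}.defaultlocale")
    return (f"SELECT {', '.join(select)} FROM {table} AS {alias} "
            f"{' '.join(joins)} WHERE {alias}.id = :id")


# Invented sample data: the name is translated to German, the summary is not.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE translations (id INTEGER, locale TEXT, localized_string TEXT,
                               PRIMARY KEY (id, locale));
    CREATE TABLE addons (id INTEGER PRIMARY KEY, name INTEGER,
                         summary INTEGER, defaultlocale TEXT);
    INSERT INTO translations VALUES (100, 'en-US', 'Example Add-on');
    INSERT INTO translations VALUES (100, 'de', 'Beispiel-Add-on');
    INSERT INTO translations VALUES (101, 'en-US', 'An example summary');
    INSERT INTO addons VALUES (9, 100, 101, 'en-US');
""")

query = build_translated_query("addons", "Addon", ["id"], ["name", "summary"])
row = conn.execute(query, {"locale": "de", "id": 9}).fetchone()
```

With the sample data, the name comes back in German and the summary falls back to the add-on’s default locale, each with its locale alongside so the view can tell which language it got.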

In our previous revisions of the query, we had to use a complementary afterFind() in the Translation model to organize the strings into a structure that was more usable by the views. It automatically checked whether a string in the language was null, and if so, replaced it with the English version. The query above makes this unnecessary, as that workload is passed off to MySQL. With either method, the end result is an array filled with dynamically translated strings, and their locales (in case we had to fall back, the view needs to know the string is in a different language).

Our method has several shortcomings, some specific to our code, and others that are a problem with most (all?) localization efforts. We’ll probably address these in the future, but in the meantime, some trouble spots are:

  • Recursion. At this point when we do a query, beforeFind() only runs on the first model. Someone already filed CakePHP’s Ticket 1183 on the issue. We’ve had to do a couple of extra queries on some complex pages in order to get all the strings we need.
  • We currently only support one fallback language. It would be nice to have multiple language fallback. For example, from zh-TW -> zh-CN -> en-US. This would be easiest to implement by just looking for another language (when appropriate) with each query, but I’d like to come up with a more efficient way (in terms of query length and complexity).
  • Declensions, capitalization differences, etc. - You’ll find these in any localization project, but it’s worth mentioning.
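For the multiple-fallback idea, one option (only a sketch, and not something we have implemented) is to fetch every candidate locale in a single query and pick the best match in application code, which keeps the SQL the same length no matter how long the chain is. The helper name and sample rows below are invented, and SQLite again stands in for MySQL.

```python
import sqlite3


def pick_translation(conn, string_id, fallback_chain):
    """Return (string, locale) for the first locale in the chain that has one."""
    placeholders = ", ".join("?" for _ in fallback_chain)
    rows = conn.execute(
        "SELECT locale, localized_string FROM translations "
        f"WHERE id = ? AND locale IN ({placeholders}) "
        "AND localized_string IS NOT NULL",
        [string_id, *fallback_chain],
    ).fetchall()
    by_locale = dict(rows)
    # Walk the chain in priority order; the query itself does not order rows.
    for locale in fallback_chain:
        if locale in by_locale:
            return by_locale[locale], locale
    return None, None


# Invented sample data: no zh-TW string, so the chain should land on zh-CN.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE translations (id INTEGER, locale TEXT, localized_string TEXT,
                               PRIMARY KEY (id, locale));
    INSERT INTO translations VALUES (100, 'zh-CN', 'Example (zh-CN)');
    INSERT INTO translations VALUES (100, 'en-US', 'Example (en-US)');
""")

result = pick_translation(conn, 100, ["zh-TW", "zh-CN", "en-US"])
```

The trade-off is that the fallback logic moves back out of the database, which is the opposite of what the big query above does, so it would only pay off where the chain is long.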

Since it’s been a while since the previous posts, a quick summary is probably in order. At this point, we’ve successfully combined localization for static and dynamic content, using systems that are fast and robust. Things are more complex because of it, but that’s par for the course when it comes to localization. What we’ve done here is an exciting step forward for enabling people to share their Firefox extensions with people around the world.

Looking back over these 3 posts, I see most of the posts have been high level ideas, and less about the actual integration into the CakePHP code. If there is sufficient interest, I can make another post describing:

  • How to detect/handle the language in the URL
  • How to initialize/remap languages
  • More on integrating the dynamic parts of the localization strategy into CakePHP (probably focusing on the view).
  • Anything else you’re curious about

If you’re less interested in reading my ramblings, and more interested in code, let me point out the SVN repository one last time.

AMO Updates

Monday, April 9th, 2007

Since the site launched a couple weeks ago, the AMO crew has been working diligently to polish off some of the rough edges and fix the bugs that have been reported. We’ve pushed an update to the site tonight that:

Thanks to everyone who reported or commented on a bug, or who got on IRC to let us know something needed to be looked at.

It’s Aliiiiive!

Monday, March 26th, 2007

The new addons.mozilla.org (AMO) is up and running through its first peak load time without major issues. We will continue to monitor site performance today to verify that our performance adjustments indeed solved our load issues.

What did we do?

  • Moved public files back to releases.mozilla.org, so those mirrors can do what they are good at — delivering files.
  • Fixed image paths so they do not vary with locale. Since images are not localized, they do not need varying URLs. Giving each image a single URL improves cache hit rates.
  • Adjusted caching rules. We had configured the load balancer to not cache all content when a user has logged in. Unfortunately that meant not caching images and other high-traffic services. We adjusted caching rules to force all images and service URLs to be cached regardless of whether or not a user is logged in.

Below are the load and traffic graphs as of 10:40 AM PST.

[Image: mrapp01 load]

The load is much more linear, and at a lower level with our adjustments in place.

[Image: mrapp01 traffic]

We have an increase in traffic, but this is expected as we have many more images and locales to support. The traffic spikes from last Friday were much higher than this, however, due to releases.mozilla.org traffic and lack of caching.

Overall I can’t say how happy I am to see some better results. Thanks to everyone who worked over the weekend with us to help troubleshoot. The battle isn’t over yet — we’re still looking for ways to tweak performance — but we can definitely see the light at the end of the tunnel.