For about 48 hours the AMO API was thrashing because of the popularity (and hunger!) of the new add-ons manager in Firefox 3 Beta 3. The dust has settled and the servers are humming happily along, so now is a good time to blog about what happened and how we’ll handle future releases successfully.
Stop. Take a deep breath. Alright, here we go.
The API is now functional, with most major bugs ironed out, but we got a rude awakening this week when we found out exactly how much traffic the improved Add-ons Manager can generate. It’s a nice problem to have, and we’re happy it’s been well received.
Wednesday, around peak time, the API started clobbering our databases:

Shortly after we entered our peak traffic window, we had to turn off the API to keep the normal AMO working. Diagnosis found that:
Wednesday, IT and Webdev spent quite a bit of time getting the API back up. Starting with the three points above, we:
However, Thursday didn’t go any better for the cluster. This time the slaves started to melt near peak time, forcing us to temporarily disable the API once again. The main issue was that we were under-utilizing memcache. Cache headers were fine, the slave was being utilized, and the app nodes were fine; there were just too many damn queries flying at our database servers!

So on Thursday we kept looking into what was going on. We tried to figure out why our cache hit rate was so low (60% instead of 90%). Digging through AMO, we found that CACHE_PAGES_FOR, which sets the expiry time on memcache records when calling Memcache::set(), was set to 60 seconds. We increased this to 7200 seconds (two hours) to cache database traffic far more aggressively, and then we all headed off for Valentine’s dinner.
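To make the effect of that expiry time concrete, here is a minimal sketch in Python. It is not the AMO code (which is PHP on top of memcache); the TTLCache class, the simulate helper, the "addon:1" key, and the one-request-every-10-seconds pattern are all assumptions made up for illustration. It counts how often a cached record has to be refetched from the database over two hours with a 60-second expiry versus a 7200-second one.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for memcache (hypothetical, for illustration only).

    Each entry is stored with an absolute expiry time, much like passing an
    expire value when setting a memcache record.
    """

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # key -> (value, expires_at)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: no database work
        value = compute()  # cache miss: fall through to the database
        self.store[key] = (value, now + self.ttl)
        return value


def simulate(ttl_seconds, duration=7200, interval=10):
    """Request one key every `interval` seconds for `duration` seconds.

    Returns how many of those requests reached the (fake) database.
    """
    fake_now = [0]
    db_queries = [0]
    cache = TTLCache(ttl_seconds, clock=lambda: fake_now[0])

    def query_db():
        db_queries[0] += 1
        return "add-on details"

    for t in range(0, duration, interval):
        fake_now[0] = t
        cache.get_or_compute("addon:1", query_db)  # hypothetical cache key
    return db_queries[0]


if __name__ == "__main__":
    print("TTL 60s:  ", simulate(60), "database queries")
    print("TTL 7200s:", simulate(7200), "database queries")
```

Under these assumptions the 60-second expiry sends 120 of the 720 requests to the database, while the 7200-second expiry sends only 1. Real traffic is spread over many keys, so the actual gain depends on how requests are distributed, and a longer expiry also means stale data can be served for longer.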
The next day, Memcache was our valentine.


The combination of our efforts worked:
So these growing pains will help us move forward. Here is our plan of attack for scaling this beast for the Firefox 3 onslaught:
Once again it was a great team effort to get things running smoothly. Thanks to IT for helping us troubleshoot this. We’ll continue to build on this experience to ensure better reliability in future releases.
Looking back at the last three days, the Firefox 3 Beta 3 release was a success in more ways than one. It showed everyone what the web can do, but it also helped us wrap our heads around the API and how much traffic it generates. All of this will make for a better Firefox 3.0 release.
very interesting post, thanks Mike!
you guys are doing an amazing job on AMO, and the new API with FF 3 integration is terrific.
-alex
Alex Sirota on February 16th, 2008 at 4:07 am
Great post. We use cakephp as well. We are also planning on splitting our reads and writes between databases. How did you accomplish this with cakephp?
Do you have any recommendations?
Thanks
Krishnan on February 23rd, 2008 at 12:44 pm
I’ve had some problems with the addons as you had talked about above. I’m glad that you guys have worked diligently to settle the issues. I’m excited for the new release and have signed up to break the record. Hope we do.
Jacksonville Website Design on May 31st, 2008 at 12:57 pm
- Jared B. http://jacksonvillewebsitedesign.com/
Jacksonville, FL