[root@ip-ns01 ~]# mtr www.mozilla.com --report
ip-ns01.phx.mozilla.org  Snt: 10   Loss%  Last   Avg  Best  Wrst StDev
10.8.75.1                           0.0%   0.4   0.4   0.4   0.4   0.0
v500.core1.phx.mozilla.net          0.0%   1.1   1.3   1.0   3.1   0.6
xe-1-1-0.border1.phx.mozilla.net    0.0%   0.7   0.7   0.7   0.8   0.0
64.124.201.177                      0.0%   1.1   1.1   1.1   1.1   0.0
ge-0-3-0.mpr3.lax9.us.above.net     0.0%   9.4  13.0   9.4  44.3  11.0
xe-0-1-0.er1.lax9.us.above.net      0.0%   9.5  13.7   9.4  51.4  13.3
xe-0-1-0.mpr1.lax12.us.above.net    0.0%  94.3  21.3   9.3  94.3  27.8
xe2-3.cr01.lax01.mzima.net          0.0%  10.0  14.0  10.0  23.0   4.6
xe1-0.cr01.lax02.mzima.net          0.0%  16.9  15.7  10.2  22.8   4.8
te1-3.cr02.sjc02.us.mzima.net       0.0%  18.1  22.5  18.1  30.0   4.9
ge1-mozilla.cust.sjc02.mzima.net    0.0%  18.4  18.6  18.4  19.1   0.2
v8.core2.sj.mozilla.com             0.0%  18.4  19.7  18.2  30.8   3.9
mozcom.acelb.sj.mozilla.com         0.0%  18.5  18.6  18.4  19.5   0.3
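A report like this is easy to post-process. What follows is a minimal Python sketch, assuming the seven-column row layout shown above (host, Loss%, Last, Avg, Best, Wrst, StDev). `Hop` and `parse_report` are illustrative names, not part of mtr, and newer mtr versions print extra columns that this does not handle. The sample rows are copied from the report.

```python
from typing import List, NamedTuple


class Hop(NamedTuple):
    host: str
    loss_pct: float
    last: float
    avg: float
    best: float
    worst: float
    stdev: float


def parse_report(text: str) -> List[Hop]:
    """Parse rows of the form 'host loss% last avg best wrst stdev'."""
    hops = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the prompt, the header and anything that is not a hop row.
        if len(fields) != 7 or not fields[1].endswith("%"):
            continue
        host, loss, *timings = fields
        hops.append(Hop(host, float(loss.rstrip("%")), *map(float, timings)))
    return hops


# Three rows taken from the report: first hop, the jittery hop, and the destination.
SAMPLE = """\
ip-ns01.phx.mozilla.org Snt: 10 Loss% Last Avg Best Wrst StDev
10.8.75.1 0.0% 0.4 0.4 0.4 0.4 0.0
xe-0-1-0.mpr1.lax12.us.above.net 0.0% 94.3 21.3 9.3 94.3 27.8
mozcom.acelb.sj.mozilla.com 0.0% 18.5 18.6 18.4 19.5 0.3
"""

hops = parse_report(SAMPLE)
destination = hops[-1]                       # last row is the target host
jitteriest = max(hops, key=lambda h: h.stdev)  # hop with the largest StDev
```

Here `destination.avg` is the end-to-end average round trip, and `jitteriest` points at the hop whose timings vary the most.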
Phoenix to San Jose, in 18ms
08-Feb-10
iWPhone, an iPhone optimized blog
12-Jan-10
oremj installed the iWPhone plugin for Mozilla’s WordPress-MU instance and it’s awesome.
Turns any blog into an iPhone optimized version of itself. Total readability. Nice.
Mozilla’s new Phoenix data center.
04-Jan-10
Short story:
We’re building out a data center presence in Phoenix, Arizona!
This quarter we will be building out an initial six rack deployment (~80 servers) at i/o Data Center’s Phoenix ONE.
This will give Mozilla another top tier data center on the same scale as our current primary location in San Jose, California. By the end of March we will have a number of our most popular websites & services focused on delivering Firefox running out of both San Jose and Phoenix. This will let us grow in crazy ways and lessen the likelihood of site-wide failures causing complete outages.
Long story:
Sometime towards the end of last summer I started asking if the time was right for Mozilla to build out a second full data center. We have two satellites (Amsterdam & Beijing) but both of those rely on the San Jose, CA data center.
I started by asking a couple questions:
- Are we at the scale where downtime is unacceptable?
- Are there certain websites/services that should never go offline?
- What does the 3-5 year plan look like? How do we scale to 2000 servers? 5000?
Basically, is it time to build out another data center?
I also looked at the last three years of growth.
Since 2006, we’ve tripled the amount of data center floor space (and tripled our IT/Ops team), grown our user base 8.75 times and now push 18x the bandwidth.
Sure, in comparison to other sites, this growth is small. It’s no Facebook. But it’s still a significant amount of infrastructure that supports 350m users and the world’s most popular web browser. We’re at about one engineer to 43.7m users, or one to 100 servers.
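To put those multiples in perspective, here is a back-of-the-envelope Python sketch. It assumes the growth figures span exactly three years and that the 350m users and the per-engineer ratios are exact, which they almost certainly are not. The function name is made up for illustration.

```python
def annual_growth_rate(multiple: float, years: float) -> float:
    """Compound annual growth rate implied by growing `multiple`-fold over `years`."""
    return multiple ** (1.0 / years) - 1.0


YEARS = 3  # assumption: "since 2006" treated as three full years

floor_space = annual_growth_rate(3.0, YEARS)   # floor space tripled
user_base = annual_growth_rate(8.75, YEARS)    # user base grew 8.75x
bandwidth = annual_growth_rate(18.0, YEARS)    # bandwidth grew 18x

# Head count implied by "one engineer to 43.7m users" at 350m users.
implied_engineers = 350 / 43.7
# Server count implied by "one to 100 servers" for that many engineers.
implied_servers = round(implied_engineers) * 100
```

Under those assumptions that works out to roughly 44%, 106% and 162% growth per year, and about 8 engineers looking after something like 800 servers.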
At the beginning of the quarter we made the strategic decision to open a second full production data center to act as a DR, or fail over, site to San Jose (it’ll be more than just DR – we’re planning on running production sites out of both locations).
For all intents and purposes this will be our first remote full production data center and I wanted to concentrate on somewhere that was a day trip away from the Bay Area. We’re using the same design philosophy we’ve used at the others:
We’re essentially lazy and never want to physically go to the data center. Instead we have built out a lot of remote management capabilities to make this work. In fact, half of the IT group doesn’t even live in California.
I spent the last couple of months researching and touring data center space and, in the process, three things became apparent as the most important in choosing a data center:
- Connectivity
- Connectivity
- Power (and the ability to cool it)
Without #1 (or even #2), game over man. Carrier neutral data centers are king, Ethernet hand-offs rule and the question is less about how much space I need than it is about how much power I need.
i/o Data Center met all three of these requirements and is an easy “day trip” away. As I mentioned earlier, by the end of March we will have a number of our most popular websites & services focused on delivering Firefox running out of both San Jose and Phoenix. I hope to update on our progress throughout the quarter.
- mz
Geolocation & Zeus ZXTM 6.0
22-Oct-09
Mozilla’s North American store has been down in maintenance mode (you can read about the why), but the International Store has not.
The old store used to redirect non-North American users to the International Store, but unfortunately that redirect wasn’t carried over when we took the store offline.
I was inspired Sunday night to put together a ZXTM TrafficScript rule based off the code sample on Zeus’s knowledgehub (and because of bug 521914).
And then on Wednesday Zeus finally released ZXTM 6.0 with geolocation support!
What was 40 some lines of code is now like 5.
Before:
$ipaddr = request.getRemoteIP();
# Integer representation of $ipaddr >> 1
string.regexmatch( $ipaddr, "(\\d+)\\.(\\d+)\\.(\\d+)\\.(\\d+)" );
$ip = ((($1*256+$2)*256+$3)*128+$4/2);
$arr = resource.get( "geoip.dat" );
# initialize indices
$i = 0; $j = string.len( $arr )/6-1;
# $arr[$i] <= $ip < $arr[$j]
# iteratively halve the distance between $i and $j until they are adjacent
while( $j-$i > 1 ) {
   # midpoint between $i and $j
   $k = ($i+$j)/2;
   # compare $ip with $arr[$k]
   if( string.bytesToInt( string.subString( $arr, $k*6, $k*6+3 ) ) > $ip ) {
      $j = $k;
   } else {
      $i = $k;
   }
}
# Now, $arr[$i] <= $ip < $arr[$j] and $j == $i+1
# Look up the 2-character country code (returns '??' if unknown)
$ccode = string.subString( $arr, $i*6+4, $i*6+5 );
# Redirect everyone outside the US and Canada to the International Store
if ( ($ccode != "US") && ($ccode != "CA") ) {
   http.redirect("https://intlstore.mozilla.org/");
}
After:
$ipaddr = request.getRemoteIP();
$country = geo.getCountryCode( $ipaddr );
if ( ($country != "US") && ($country != "CA") ) {
   http.redirect("https://intlstore.mozilla.org/");
}
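For anyone who wants to poke at the lookup outside of ZXTM, here is a minimal Python sketch of what the "Before" rule is meant to do: a binary search over fixed-size records, followed by the US/CA check. The layout of `geoip.dat` isn't described in this post, so the sketch assumes 6-byte records (a 4-byte big-endian key holding the IPv4 address shifted right one bit, then a 2-byte country code), sorted by key and ending in a sentinel record. The helper names and the sample ranges are invented for illustration.

```python
import struct

RECORD_SIZE = 6  # assumed layout: 4-byte big-endian key + 2-byte country code


def ip_key(ip: str) -> int:
    """IPv4 address as an integer shifted right one bit, as in the rule above."""
    a, b, c, d = (int(part) for part in ip.split("."))
    return ((a * 256 + b) * 256 + c) * 128 + d // 2


def build_table(ranges) -> bytes:
    """Pack (range_start_ip, country_code) pairs, sorted by start, plus a sentinel."""
    records = [struct.pack(">I", ip_key(ip)) + cc.encode("ascii") for ip, cc in ranges]
    records.append(struct.pack(">I", 0xFFFFFFFF) + b"??")  # sentinel upper bound
    return b"".join(records)


def lookup_country(table: bytes, ip: str) -> str:
    key = ip_key(ip)
    i, j = 0, len(table) // RECORD_SIZE - 1
    # Invariant: key(record i) <= key < key(record j)
    while j - i > 1:
        k = (i + j) // 2
        (start,) = struct.unpack_from(">I", table, k * RECORD_SIZE)
        if start > key:
            j = k
        else:
            i = k
    offset = i * RECORD_SIZE + 4
    return table[offset:offset + 2].decode("ascii")


def should_redirect(country: str) -> bool:
    """True for visitors who belong on the International Store."""
    return country not in ("US", "CA")


# Made-up ranges purely for illustration; real GeoIP data looks nothing like this.
TABLE = build_table([
    ("0.0.0.0", "??"),
    ("10.0.0.0", "US"),
    ("20.0.0.0", "CA"),
    ("30.0.0.0", "DE"),
])
```

With this toy table, `should_redirect(lookup_country(TABLE, "30.1.2.3"))` is true while an address in the "US" or "CA" range is left alone, which is all the five-line ZXTM 6.0 version does with `geo.getCountryCode`.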
(See Part V.)
I have a confession. We secretly did something last night (we only barely announced it to Metrics).
No, we didn’t secretly replace the fine coffee at some four-star restaurant but pretty close.
Hot off our 2 second gain in average page load times for addons.mozilla.org, we shaved another 2 seconds off by duplicating The Amsterdam Reboot platform in Singapore (as a proof-of-concept). Don’t take my word for it – take Gomez’s word:
A little background
For what seems like ages I’ve been trying to figure out how to best serve Asia-Pacific users. It’s a tough case to make because I didn’t have a method to easily measure how much bandwidth I’d need or how it would change page load times or user perceptions.
But I’m a network guy and I’ve had this feeling that we need something in this region. We certainly have a growing population in the area – 5 of the top 9 countries where we’ve seen > 20% growth in the last five months are in Asia. I’ve only been lacking operational data.
Over the summer I’ve been playing with and testing Voxel’s Silverlining Technology Preview, their global cloud computing platform. On October 1, it’s going to move out of Preview and their cloud platform will be generally available in all their POPs, including Singapore.
Seemed like a good way to get data…
In a matter of hours I had spun up two Zeus ZXTM cloud servers and one GLB cloud server with Voxel in Singapore. I waited till our normal Thursday night window before turning it live.
A couple of takeaway notes on this:
- We did it right. We built a proxy/caching platform as part of The Amsterdam Reboot that can easily be replicated anywhere and instantly provide real quantifiable performance benefits.
- Clouds make perfect sense for doing proof-of-concepts.
- IT can move really fast. We had these three servers ready to go Wednesday morning in a couple of hours.
- Mozilla’s webdev crew does an awesome job writing extremely cacheable webapps. I’m seeing a 91% cache hit rate (350,000 objects, 1.5GB).
- If this is a sustainable location, pushing user-focused sites like support.mozilla.com to Singapore is next on my list.
- I’d love to run this concept in other geographies like South America. Who does clouds down there?
What I most like about this platform is that it’ll allow us to strategically get content closer to users where it most makes sense.
Right now this is just a proof-of-concept. It lets me experiment and get real metrics. I’m very interested in hearing from people who actually live in the area – does this make you happier?
One more thing
I was lucky to have two providers who stepped in and provided resources to let me run this POC. Both deserve a special thanks.
2 seconds.
18-Sep-09
2 seconds.
That’s the amount of time we shaved off average page load times for addons.mozilla.org after last night’s work. The Amsterdam Reboot in effect!
It’s been nearly ten months since we served production traffic at these levels out of Amsterdam.
The Amsterdam Reboot
24-Aug-09
Three years ago this coming December I went to Amsterdam and installed our first non-US data center location.
I remember coming back and being up late at night (fighting jetlag) setting up the Netscaler load balancers. By early January we had a CVS mirror up and running and a week later had staged www.mozilla.com and www.mozilla.org in Amsterdam. By the middle of February we had shifted European production traffic over to Amsterdam.
By May of 2007 (oh and here) we started serving addons.mozilla.org out of Amsterdam too.
Since then, the San Jose data center has grown from seven racks to twenty-four and nearly 500 servers (for the sake of this post I’m counting the 150 Mac Minis as “servers”).
Unfortunately, Amsterdam hasn’t seen the same sort of server growth and in the past half year or so we’ve had to pull sites back from Amsterdam and serve them from San Jose only. When the load balancers there could no longer handle the SSL traffic in January we stopped serving addons.mozilla.org out of Amsterdam too.
The Reboot
On September 2, we’ll reboot Amsterdam. Much like The Six Million Dollar Man, “… we can rebuild [it]. We have the technology… We can make [it] better than [it] was before. Better, stronger, faster.”
This morning we shipped a half-loaded HP c7000 BladeSystem chassis out to Amsterdam. Next week both Arzhel and Derek will be in Amsterdam to deploy new servers and turn down some of the old legacy hardware.
One of the issues with the current Amsterdam deployment is that we can only really serve static sites – sites that don’t have a database behind them. It’s a bit more complicated to replicate databases to remote locations and keep them in sync with San Jose.
Based on our success with the Zeus ZXTM platform, we’re deploying a Nehalem based ZXTM cluster. Much like we originally did with addons.mozilla.org, we’ll have a platform where we can proxy/cache any Mozilla website and serve it out of Amsterdam.
We’ll also be deploying a new global load balancing system that Arzhel’s spent time staging and getting ready for production.
The Plan
During tomorrow night’s downtime window we’ll start the process of temporarily shutting down Amsterdam. We’ll pull back all websites to San Jose and stop serving web content out of Amsterdam for about two weeks.
We’re doing this in advance of the actual deployment to make sure we don’t hit any surprises.
What won’t be affected?
There are two VMware ESX servers in Amsterdam that we won’t be touching. A quick list of hosts that won’t be affected:
- dn-vcs01
- gravel
- geodns02
- l10n-01
- rhino01
- sea-qm-centos5-01
- sea-qm-win2k3-01
- konigsberg
I’m excited about this. I’ve said before I’m still a network guy at heart and I really like getting content closer to consumers. I’m hoping this Reboot becomes an easily deployable platform for other parts of the globe.
My Twitter Experiment, @mozdashboard.
27-Jun-09
I was recently inspired by morgamic’s Org Chart coding exercise and that got me thinking about an article I read some months ago about Zeus’ ZXTM triggers.
After a couple hours this week re-learning Perl and learning SOAP I have something that I think is cool - @mozdashboard!
I experimented a bit earlier this week before Zandr gave me an idea of what I was really looking for with this. This version keeps state using Config::IniFiles (yeah, I said it and it was surely an easy way out) and tweets when it detects new highs.
zxtmtwitter.pl tracks the following sites:
- addons.mozilla.org
- versioncheck.addons.mozilla.org
- fxfeeds.mozilla.com
It’ll tweet if it detects a new bandwidth high (either inbound or outbound) or a new simultaneous current connection count.
The program’s not without some faults – I’m not accounting for any counter wraps and I’m not happy with some of the hackery to poll each ZXTM node (I can’t find a way to poll one and get the cluster aggregates).
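The state-keeping part is simple enough to sketch. The real zxtmtwitter.pl is Perl, using Config::IniFiles and SOAP; the following is a rough Python stand-in using the standard `configparser` module, with hypothetical function and key names. It does not poll ZXTM or post to Twitter. It only shows the "remember the previous high in an .ini file, report when it is beaten" idea, plus one common way to handle the counter wraps mentioned above, assuming 32-bit counters.

```python
import configparser


def counter_delta(previous: int, current: int, bits: int = 32) -> int:
    """Difference between two readings of a counter that wraps at 2**bits."""
    if current >= previous:
        return current - previous
    return current + (1 << bits) - previous


def record_if_new_high(state_path: str, site: str, metric: str, value: float) -> bool:
    """Persist `value` and return True if it beats the stored high for site/metric."""
    state = configparser.ConfigParser()
    state.read(state_path)  # a missing file is silently treated as empty state
    if not state.has_section(site):
        state.add_section(site)
    previous = state.getfloat(site, metric, fallback=0.0)
    if value <= previous:
        return False
    state.set(site, metric, str(value))
    with open(state_path, "w") as fh:
        state.write(fh)
    return True
```

A caller would run something like `record_if_new_high("state.ini", "addons.mozilla.org", "bandwidth_out", mbps)` after each poll and only tweet when it returns true.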
Code and .ini below the fold.
Knock on wood.
12-Jun-09
10:27 < mrz> are we -sure- we're in the middle of a 3.0.11 release?
10:27 < reed> yes, #mirrors is lively
10:27 < mrz> i don't see anything melting down
IT/Ops, now supporting more timezones!
01-Jun-09
Although Mozilla’s IT/Ops Team supports a worldwide user base, we’re all located in North America. That’s increasingly become a challenge as we have more employees, l10n folk and users living outside of North America.
So it’s with a lot of excitement that I get to electronically introduce Shyam, who joins IT/Ops today. Shyam lives and breathes in Singapore and is our first non-US based IT/Ops member. He will make supporting our worldwide users easier and especially help support Europe/Asia closer to real time.
If you happen to live nearby, hopefully you’ll get a chance to meet him. For all those remote, I’m sure you’ll find him online at all the usual IT/Ops hangouts.





