<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Justin's Blog &#187; Mozilla</title>
	<atom:link href="http://blog.mozilla.com/justin/category/mozilla/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.mozilla.com/justin</link>
	<description>Mozilla engineering operations...in brief</description>
	<lastBuildDate>Fri, 25 Jul 2008 03:15:40 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Build storage issues &#8211; resolved!</title>
		<link>http://blog.mozilla.com/justin/2008/06/16/build-storage-issues-resolved/</link>
		<comments>http://blog.mozilla.com/justin/2008/06/16/build-storage-issues-resolved/#comments</comments>
		<pubDate>Mon, 16 Jun 2008 13:52:44 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Colo]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[Mozilla]]></category>

		<guid isPermaLink="false">http://blog.mozilla.com/justin/?p=21</guid>
		<description><![CDATA[This is a very technical and detailed debrief.  For those who want the short version &#8211; it&#8217;s fixed    Other people, read on.
&#8211;
As many of you already know &#8211; we had some pretty serious issues over the past weeks with the storage system that supports the build/unit test environment.  We have [...]]]></description>
			<content:encoded><![CDATA[<p>This is a very technical and detailed debrief.  For those who want the short version &#8211; it&#8217;s fixed <img src='http://blog.mozilla.com/justin/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />   Other people, read on.</p>
<p>&#8211;</p>
<p>As many of you already know &#8211; we had some pretty serious issues over the past weeks with the storage system that supports the build/unit test environment.  We have resolved the issues and wanted to give everyone a rundown of the issues that we found, what we have done to resolve them, and what open tasks are left.</p>
<p>The issue manifested itself in a few ways.  We saw slow transfers, SCSI aborts, reservation failures and VM guest-level corruption.  This started as a very rare occurrence and over time became more and more frequent, to the point that we could not keep even a small number of I/O-intensive VMs up for 1 hour and had trouble getting them off.  We started troubleshooting the issue a few weeks ago, and finally came to a total resolution early this week.  Here is a summary of the issues, how they came to be and how we resolved them:</p>
<p>*  http://now.netapp.com/NOW/cgi-bin/bol?Type=Detail&amp;Display=226424 (A filer may exhibit poor performance due to WAFL holding on to too many network buffers and not releasing them in a timely fashion.)<br />
To fix this, we had to upgrade to 7.2.4 &#8211; that has been completed.</p>
<p>*  NetApp LUNs were created with the wrong LUN type.<br />
This was caused by an error in the LUN creation workflow that left the LUN type at the default value (Solaris).  3 out of 4 LUNs were of type Solaris, so blocks were not written efficiently to disk (the 4k VMware blocks were written offset from the true disk geometry).  Reading from these LUNs caused many read-aheads and at times overwhelmed the filer due to the inefficient on-disk layout.  To remedy this, we migrated the data off, re-created all of the LUNs and migrated the data back.</p>
<p>* NetApp igroups were set to the wrong type.<br />
Initially NetApp advised that the Linux igroup type (the igroup is what maps the LUN to various hosts) was OK for use with VMware.  This was incorrect, causing improper SCSI reservations and iSCSI timeouts.  NetApp is updating their internal documentation to reflect this change.</p>
<p>* Network setup issues<br />
The initial setup advice from NetApp was to set up the network in a specific configuration (one link to each upstream switch with a virtual interface bonding them).  After further investigation, I found this is *not* the best practice and was in fact causing issues with dead HBA paths.  To correct this temporarily, we disabled one of the links, leaving single uplinks (still with redundant heads).</p>
<p>All of these issues created major performance degradation and block level access/corruption problems.  They have all been resolved at this point.  We still need to adjust the network interfaces to be more redundant.  </p>
<p>Special thanks to the release engineering team, who have been *incredibly* patient with us as we worked through this.  I know how frustrating it was, and they kept a smile (well, kind of) through the situation &#8211; that really helped us keep pushing forward to a solution.  Thanks also to mrz for the amazing amount of work he put into this&#8230;very dedicated to finding a solution no matter what time it was.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.com/justin/2008/06/16/build-storage-issues-resolved/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Network outage report &#8211; 3/18/08, 8:01pm PDT &#8211; 9:25 pm PDT</title>
		<link>http://blog.mozilla.com/justin/2008/03/20/network-outage-report-31808-801pm-pdt-925-pm-pdt/</link>
		<comments>http://blog.mozilla.com/justin/2008/03/20/network-outage-report-31808-801pm-pdt-925-pm-pdt/#comments</comments>
		<pubDate>Thu, 20 Mar 2008 23:39:56 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Colo]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[Mozilla]]></category>

		<guid isPermaLink="false">http://blog.mozilla.com/justin/2008/03/20/network-outage-report-31808-801pm-pdt-925-pm-pdt/</guid>
		<description><![CDATA[We had a network outage at our San Jose datacenter tonight from 8:01 pm PDT until 9:25 pm PDT on March 18.  From initial investigation, it appears that one of the switches in a blade server chassis had a software issue, causing a network-wide broadcast storm.  The overall effect was that the switching fabric [...]]]></description>
			<content:encoded><![CDATA[<p>We had a network outage at our San Jose datacenter tonight from 8:01 pm PDT until 9:25 pm PDT on March 18.  From initial investigation, it appears that one of the switches in a blade server chassis had a software issue, causing a network-wide broadcast storm.  The overall effect was that the switching fabric for our San Jose datacenter was unusable.</p>
<p>To mitigate this issue going forward, we have made two changes.</p>
<ul>
<li>We modified the port-channels connecting the core switches to downstream switches to better handle a port-channel member failure.</li>
<li>We also further tuned broadcast storm protection on every switch port to limit the amount of broadcast &amp; multicast traffic any one device is allowed to send.</li>
</ul>
<p>Furthermore, we have a priority case open with the vendor to determine the cause of the issue, as we did capture debug logs.  This was in no way related to the scheduled downtime we were in; it just happened to coincide.  We apologize for any inconvenience this may have caused.  We&#8217;ll continue to follow up with the vendor to make sure this issue does not happen again.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.com/justin/2008/03/20/network-outage-report-31808-801pm-pdt-925-pm-pdt/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Call out for Mirrors</title>
		<link>http://blog.mozilla.com/justin/2008/02/19/call-out-for-mirrors/</link>
		<comments>http://blog.mozilla.com/justin/2008/02/19/call-out-for-mirrors/#comments</comments>
		<pubDate>Wed, 20 Feb 2008 05:27:48 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[Mozilla]]></category>

		<guid isPermaLink="false">http://blog.mozilla.com/justin/2008/02/19/call-out-for-mirrors/</guid>
		<description><![CDATA[One of Mozilla&#8217;s biggest assets is our mirror network. It allows us to update over 100 million users in under 48 hours with security updates, host and push extensions, and much more &#8211; all with donated server space and bandwidth, giving us the ability to focus our efforts on supporting the development community and making all [...]]]></description>
			<content:encoded><![CDATA[<p>One of Mozilla&#8217;s biggest assets is our mirror network. It allows us to update over 100 million users in under 48 hours with security updates, host and push extensions, and much more &#8211; all with donated server space and bandwidth, giving us the ability to focus our efforts on supporting the development community and making all the Mozilla products as reliable, secure and feature-rich as possible.</p>
<p>We&#8217;d like to build up our mirror network to be even stronger! I am making a call to the community to help us find other mirror sources. Already Paul Vixie from the <a href="http://www.isc.org">Internet Systems Consortium</a> has stepped up and donated 3 Gb/s of peak mirror capacity (!). Details on what is required can be found here: <a href="http://www.mozilla.org/mirroring.html">http://www.mozilla.org/mirroring.html</a>. While we are always happy to take any mirror donation, we are specifically looking for mirrors that can handle in excess of 100 Mb/s during peak traffic times. Please contact me directly if you have any ideas of people/organizations/companies that might be willing to donate either bandwidth or mirror space.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.com/justin/2008/02/19/call-out-for-mirrors/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bugzilla Improvements</title>
		<link>http://blog.mozilla.com/justin/2007/12/05/bugzilla-improvments/</link>
		<comments>http://blog.mozilla.com/justin/2007/12/05/bugzilla-improvments/#comments</comments>
		<pubDate>Thu, 06 Dec 2007 04:13:51 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Bugzilla]]></category>
		<category><![CDATA[Mozilla]]></category>

		<guid isPermaLink="false">http://blog.mozilla.com/justin/2007/12/05/bugzilla-improvments/</guid>
		<description><![CDATA[Bugzilla basically runs Mozilla &#8211; it&#8217;s core to almost everything we do from tracking core Firefox bugs to tracking Marketing events to operating as our IT ticket queue.  Quite simply, we wouldn&#8217;t be who we are today without it.  With all its greatness, there are quite a few things that don&#8217;t quite fit [...]]]></description>
			<content:encoded><![CDATA[<p>Bugzilla basically runs Mozilla &#8211; it&#8217;s core to almost everything we do from tracking core Firefox bugs to tracking Marketing events to operating as our IT ticket queue.  Quite simply, we wouldn&#8217;t be who we are today without it.  With all its greatness, there are quite a few things that don&#8217;t quite fit the workflow that is Mozilla, and other bugs that are simply annoying.</p>
<p>So, Schrep asked me to kick off a project to address some of the issues we have with Bugzilla and really invest some time and effort to improve Bugzilla for Mozilla, and the rest of the community.  I&#8217;ve started by rounding up an initial set of improvements after talking to some of the heavy users within Mozilla, asking for their top complaints and suggestions to improve efficiency in using Bugzilla.  <a href="http://wiki.mozilla.org/Bugzilla_Fixup">Here</a> is what I have come up with.</p>
<p>They gave me plenty of things to work on, but I wanted to open it up to others.  I&#8217;ve added a section to the bottom of the wiki asking for suggestions &#8211; please keep your edits there.  If you want to vote up another&#8217;s suggestion, just add a +1 to their line.  I&#8217;ll take the top suggestions/defects and add them into the schedule.  Keep in mind we won&#8217;t be able to do everything and are limited in capacity, but we are throwing some full-time weight behind this to help get it moving.</p>
<p>All our changes are planned to first be applied to BMO, then ported to Bugzilla trunk, so all the code will show up in upstream versions of Bugzilla.  We hope to make a difference and move Bugzilla forward in ease of use, performance and innovation.</p>
<p>On a side note, I am looking for community Bugzilla members to help &#8211; if you are a Bugzilla developer or know someone who would be willing to help, we&#8217;ll take all the help we can get!  Contact me at justin at mozilla dot com.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.com/justin/2007/12/05/bugzilla-improvments/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>China, similar to you and me</title>
		<link>http://blog.mozilla.com/justin/2007/08/28/china-similar-to-you-and-me/</link>
		<comments>http://blog.mozilla.com/justin/2007/08/28/china-similar-to-you-and-me/#comments</comments>
		<pubDate>Tue, 28 Aug 2007 12:09:57 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Mozilla]]></category>

		<guid isPermaLink="false">http://blog.mozilla.com/justin/2007/08/28/china-similar-to-you-and-me/</guid>
		<description><![CDATA[I&#8217;m about midway through my first trip to China (in Beijing) &#8211; first time to the Far East for that matter, and I have to say it&#8217;s a pretty interesting place.  I&#8217;ve been all over Europe and North America, but what has really struck me is how Beijing is similar to many other [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m about midway through my first trip to China (in Beijing) &#8211; first time to the Far East for that matter, and I have to say it&#8217;s a pretty interesting place.  I&#8217;ve been all over Europe and North America, but what has really struck me is how Beijing is similar to many other major international cities I&#8217;ve been to.  Sure it&#8217;s got its unique attractions, food, people and activities &#8211; but it isn&#8217;t so different that I can&#8217;t function or don&#8217;t know how to fit in &#8211; in fact quite the opposite.</p>
<p>Now let me preface this by saying I am in the outer section of the city in a tech park, and haven&#8217;t had time to go into the heart of the city (which I hope to do).  But on the 12 hour (!) plane ride over, I had this notion that coming to China would be extremely exotic with very different ways of doing things.</p>
<p>Sure, the Internet access is not the best (e.g. the Great Firewall, international congestion, etc.), food can be&#8230;adventurous (chicken neck, frog, snail, turtle, donkey, and others were all on the menu at tonight&#8217;s restaurant), the weather &amp; pollution aren&#8217;t the best, politics aren&#8217;t in line with what I&#8217;d vote for, but all in all &#8211; it&#8217;s just a city, and a great one at that.  People eat and hang out a lot, get work done in similar fashions and live their lives.  </p>
<p>I think the differences in how people work, live, and interact in different cultures are incredibly interesting &#8211; which is why I think I am enjoying my time here so much.  The trip has really highlighted that while there are a lot of differences in the way we choose to live, we often forget just how similar we all are <img src='http://blog.mozilla.com/justin/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>More technical (read: nerdy) posts later on the Great Firewall, Internet access, colos, and more.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.com/justin/2007/08/28/china-similar-to-you-and-me/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Blog.mozilla.com &#8211; blogs for the people</title>
		<link>http://blog.mozilla.com/justin/2006/10/11/blog.mozilla.com-blogs-for-the-people/</link>
		<comments>http://blog.mozilla.com/justin/2006/10/11/blog.mozilla.com-blogs-for-the-people/#comments</comments>
		<pubDate>Thu, 12 Oct 2006 04:26:39 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Mozilla]]></category>

		<guid isPermaLink="false">http://blog.mozilla.com/justin/2006/10/11/blog.mozilla.com-blogs-for-the-people/</guid>
		<description><![CDATA[We (IT) put up a new blog server to help consolidate and provide a supported and maintained blog server.  Expect to see a lot more blogs showing up from MozCorp people here&#8230;should be fun.
]]></description>
			<content:encoded><![CDATA[<p>We (IT) put up a new blog server to help consolidate and provide a supported and maintained blog server.  Expect to see a lot more blogs showing up from MozCorp people here&#8230;should be fun.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.com/justin/2006/10/11/blog.mozilla.com-blogs-for-the-people/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
