<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Socorro&#8217;s File System Storage</title>
	<atom:link href="http://blog.mozilla.com/webdev/2008/10/10/socorros-file-system-storage/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.mozilla.com/webdev/2008/10/10/socorros-file-system-storage/</link>
	<description>Everybody Likes Ninjas</description>
	<lastBuildDate>Sat, 13 Mar 2010 02:29:21 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Online Data Storage</title>
		<link>http://blog.mozilla.com/webdev/2008/10/10/socorros-file-system-storage/comment-page-1/#comment-193246</link>
		<dc:creator>Online Data Storage</dc:creator>
		<pubDate>Thu, 26 Mar 2009 23:06:29 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.com/webdev/?p=83#comment-193246</guid>
		<description>Yeah that double-indexing could come back to haunt users :/
-jack</description>
		<content:encoded><![CDATA[<p>Yeah that double-indexing could come back to haunt users :/<br />
-jack</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: K Lars Lohn</title>
		<link>http://blog.mozilla.com/webdev/2008/10/10/socorros-file-system-storage/comment-page-1/#comment-158675</link>
		<dc:creator>K Lars Lohn</dc:creator>
		<pubDate>Mon, 13 Oct 2008 16:58:01 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.com/webdev/?p=83#comment-158675</guid>
		<description>Of course we considered several mechanisms, but settled on this one because it is a compromise in complexity.  This brief blog post doesn&#039;t tell the whole story.  We have two use cases for this structure with slightly different needs.  This implementation covers them both efficiently.  I really wanted to avoid having two different code bases for the two use cases.

The complicating factor in having a single index is in the first use case outlined just after the diagram.  We&#039;re using the date key as what the relational database world calls a partial index.  Walking that index shows us only new entries that we had not previously seen.  Had we implemented this using a single index, we would be forced to have some way to mark an entry as new.  Scanning for new entries would then degrade into a search requiring us to walk the whole index, considering each entry.  We&#039;ve already seen that this doesn&#039;t scale.  Granted, there are some tricks that we can do to make such a search more efficient, but that takes the code base further away from our needs in our second use case.

After much consideration, I found this algorithm to be an acceptable compromise.  I, of course, reserve the right to totally change my mind after we&#039;ve seen this code perform in production...</description>
		<content:encoded><![CDATA[<p>Of course we considered several mechanisms, but settled on this one because it is a compromise in complexity.  This brief blog post doesn&#8217;t tell the whole story.  We have two use cases for this structure with slightly different needs.  This implementation covers them both efficiently.  I really wanted to avoid having two different code bases for the two use cases.</p>
<p>The complicating factor in having a single index is in the first use case outlined just after the diagram.  We&#8217;re using the date key as what the relational database world calls a partial index.  Walking that index shows us only new entries that we had not previously seen.  Had we implemented this using a single index, we would be forced to have some way to mark an entry as new.  Scanning for new entries would then degrade into a search requiring us to walk the whole index, considering each entry.  We&#8217;ve already seen that this doesn&#8217;t scale.  Granted, there are some tricks that we can do to make such a search more efficient, but that takes the code base further away from our needs in our second use case.</p>
<p>After much consideration, I found this algorithm to be an acceptable compromise.  I, of course, reserve the right to totally change my mind after we&#8217;ve seen this code perform in production&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jmdesp</title>
		<link>http://blog.mozilla.com/webdev/2008/10/10/socorros-file-system-storage/comment-page-1/#comment-158162</link>
		<dc:creator>jmdesp</dc:creator>
		<pubDate>Sun, 12 Oct 2008 09:55:51 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.com/webdev/?p=83#comment-158162</guid>
		<description>Hmm, in that situation I would reconsider whether there really is no way to get rid of the double indexing :-) However smart you are, it makes things much more complex.

I think if the initial part of the UID were time-based, you could have only one index. Random UIDs generate an automatically balanced tree, but you have enough crashes, all around the world, that any 5-minute period is pretty well balanced too.</description>
		<content:encoded><![CDATA[<p>Hmm, in that situation I would reconsider whether there really is no way to get rid of the double indexing <img src='http://blog.mozilla.com/webdev/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  However smart you are, it makes things much more complex.</p>
<p>I think if the initial part of the UID were time-based, you could have only one index. Random UIDs generate an automatically balanced tree, but you have enough crashes, all around the world, that any 5-minute period is pretty well balanced too.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Fred Wenzel</title>
		<link>http://blog.mozilla.com/webdev/2008/10/10/socorros-file-system-storage/comment-page-1/#comment-157785</link>
		<dc:creator>Fred Wenzel</dc:creator>
		<pubDate>Sat, 11 Oct 2008 05:41:41 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.com/webdev/?p=83#comment-157785</guid>
		<description>I said it before, but anyway: very nicely engineered. I share your opinion: this should become pretty darn fast. It should also be possible to divide up this tree across mount points if necessary (by the way, another advantage of using symlinks: with hardlinks you&#039;d be in trouble there).</description>
		<content:encoded><![CDATA[<p>I said it before, but anyway: very nicely engineered. I share your opinion: this should become pretty darn fast. It should also be possible to divide up this tree across mount points if necessary (by the way, another advantage of using symlinks: with hardlinks you&#8217;d be in trouble there).</p>
]]></content:encoded>
	</item>
</channel>
</rss>
