Posted in Uncategorized on February 4th, 2011 7 Comments »
We recently had a situation where we needed to copy a lot of HBase data while migrating from our old datacenter to our new one. The old cluster was running Cloudera’s CDH2 with HBase 0.20.6 and the new one is running CDH3b3. Usually I would use Hadoop’s distcp utility for such a job. As it [...]
Posted in Uncategorized on December 30th, 2010 10 Comments »
A few months ago, at Hadoop World 2010, the metrics team gave a talk on Flume + Hive integration and how we plan to integrate it with other projects. As we were nearing the production date, the BuildBot/TinderBox team came to us with an interesting, albeit pragmatic, requirement. “Flume + Hive really solves our needs, but we would ideally [...]
Posted in Uncategorized on August 15th, 2010 4 Comments »
Exponential growth, one of the few problems every organization loves, is usually alleviated by scaling out using clustered computing (Hadoop), CDN, EC2 and a myriad of other solutions. While a lot of cycles are spent in making sure each scaled-out machine contains the requisite libraries, latest code deployments, matching configs, and the whole nine yards, very [...]
Posted in Uncategorized on May 19th, 2010 No Comments »
Pentaho announced this morning that they were going to be adding some features to Pentaho Data Integration (Kettle) and to their BI suite to make it easy for people to use Kettle to retrieve, manipulate, and store data in Hadoop, and to integrate Hadoop communication into the reporting and analysis layer. They posted a nice [...]
Posted in Uncategorized on May 18th, 2010 38 Comments »
We are marching along in our integration of HBase with the Socorro Crash Stats project, but I wanted to take a minute away from that to talk about a separate project the Metrics team has also been involved with. Mozilla Labs Test Pilot is a project to experiment with and analyze data from real-world Firefox [...]
Posted in Uncategorized on August 10th, 2009 8 Comments »
I was presented with the challenge of answering the question: how many Firefox users have one or more add-ons installed? Currently, addons.mozilla.org (AMO) has statistics on add-on download counts, but the question of actual add-on usage has remained unanswered. The Add-ons manager inside of Firefox will check each add-on for [...]