Dehydra Updates

May 11th, 2009

How well are you packing your structs?
Arpad asked that question with an awesome Dehydra script and came up with an interesting list.
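
For readers who have not run into struct padding, here is a minimal C++ sketch of what such a script goes looking for. It is not Arpad's Dehydra script (that runs inside the compiler); the struct names and the padding_bytes helper are made up for illustration, and the exact sizes depend on the compiler and ABI.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical example types, not taken from Mozilla code.
// Members in a careless order: the compiler usually inserts padding after
// each char so that the wider member that follows is properly aligned.
struct Loose {
    char          flag;
    std::uint64_t big;
    char          tag;
    std::uint32_t count;
};

// The same members, ordered from widest to narrowest.
struct Tight {
    std::uint64_t big;
    std::uint32_t count;
    char          flag;
    char          tag;
};

// Sum of the member sizes, identical for both structs.
constexpr std::size_t kMemberBytes =
    sizeof(std::uint64_t) + sizeof(std::uint32_t) + 2 * sizeof(char);

// Bytes lost to padding: total size minus the bytes the members need.
template <typename T>
constexpr std::size_t padding_bytes() {
    return sizeof(T) - kMemberBytes;
}

// Checked at compile time on whatever platform this is built for.
static_assert(sizeof(Tight) <= sizeof(Loose),
              "the reordered struct should not be larger");
```

On a typical x86-64 Linux ABI, Loose occupies 24 bytes and Tight 16, so a third of Loose is padding. Other platforms may give different numbers, which is why the sketch only asserts the ordering.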

GCC 4.3.3 is supported
GCC 4.3.3 broke C++ compatibility in the headers used by Dehydra (4.3.0 through 4.3.2 worked).
Zach pointed out that passing -fpermissive to g++ solves the problem. Sorry to all the people who had issues building the hydras with GCC 4.3.3; that’s fixed now.

As I mentioned before, we are skipping GCC 4.4 support in Dehydra and aiming for supporting unpatched GCC 4.5. I wish that the small GCC patches were as quick to land as that big one I landed a couple of weeks ago :( .

GCC Rant + Progress

April 30th, 2009

I feel strange working on GCC-specific stuff and then discussing it on planet mozilla as mozilla work. However, without GCC, Dehydra and Treehydra would not be half as awesome, or even feasible. The power of open source is that it allows us to leverage the entire open source ecosystem to achieve specific goals. When open source projects combine their efforts, not even the biggest software companies can compete, as cross-project goals would be incredibly expensive and unpleasant otherwise.

Occasionally, it is very frustrating to see people treat open source software as immutable and independent black boxes. In my personal experience, the browser and the compiler are viewed as finished products and therefore it is OK to bitch and complain about them. That’s frustrating because the same users could be channeling that energy in a more positive way by reporting bugs, contributing code/documentation, etc.

Sometimes these rants result in rather comical conclusions: Ingo’s rant is priceless. My perspective on this:

  • what have Linux kernel devs done to help GCC help them?
  • <flame>Sparse is a dead end. Writing compiler code in C is silly, writing analysis code in C is sillier (and frustrating and limiting). Taking a crappy parser and bolting a crappy compiler backend onto it will result in a bigger pile of crap :) Given how smart kernel devs are, they sure like wasting their time on crappy solutions in crappy languages.</flame>
  • Wouldn’t it be cool if instead of complaining these talented people wrote a GCC plugin to do what they want?

GCC Plugin Progress

I finally landed the massively boring and annoying GTY patch. I can barely believe that the patch went in so smoothly without excess complaining from GCC devs. From GCC’s perspective it’s merely a cosmetic cleanup that affects a large number of headers. For us it enables Treehydra to be generated via Dehydra with little manual effort. It basically makes Treehydra possible without patching GCC. I have another 3-4 patches that need to land before trunk GCC can run the hydras out of the box. Those are mainly localized bugfixes and cleanups so I fully expect them to go in and for GCC 4.5 to rock my world.
Once GCC 4.5 ships, analyzing code will be a trivial matter of apt-getting (or equivalent) the hydras and specifying the analysis flags on the GCC command line!

I wrote Fennecmark to automate some of the tasks that I did manually while doing performance debugging.

I tried to capture some of the “perceived performance” in numbers. My goal is to focus on user-visible areas of performance. Ideally it will enable us to track performance better to ensure that key features do not regress in performance and enable us to compare Fennec speed on various platforms. I need to spend some quality time with QA people to figure out how to achieve that.

Currently Fennecmark loads a slow-to-load webpage, zooms around it and then pans from the top to the bottom. This measures: responsiveness during pageload, zoom speed and panning lag.

See code at http://hg.mozilla.org/users/tglek_mozilla.com/fennecmark.

JSD Instrumentation

Spidermonkey provides an API that allows one to get a notification on every method entry/exit. I was able to do most of my Fennec performance analysis via a component in bug 470116. My stopwatch component times the execution of every JS function call and spits out a log that has been very useful in figuring out what is taking up time in Fennec chrome.

Porkstain

I am itching to write a tool that can instrument large portions of Mozilla code such that it can be profiled across C++/JS boundaries and without any external tool support. I am guessing this would be most useful on platforms with crappy sampling tools, but it would be cool if it made finding slow codepaths easier in general. If you know any lightweight instrumentation techniques, please share.

I wrote a little prototype to insert stopwatch stuff into code deemed interesting by oprofile (stuff in the bug above). The code patching part works well, but it’s a big runtime hit and outputs too much data.
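
As an illustration of the kind of stopwatch code such a tool could insert, here is a minimal C++ sketch. It is not the prototype from the bug: ScopedStopwatch, StatsTable, STOPWATCH_HERE and SlowPath are hypothetical names, and the sketch assumes single-threaded use. It keeps a running total per function instead of logging every call, which is one possible way to cut down on the volume of output.

```cpp
#include <chrono>
#include <map>
#include <string>

// Accumulated time and call count for one instrumented function.
struct ScopeStats {
    std::chrono::steady_clock::duration total{};
    unsigned long calls = 0;
};

// Global table of statistics, keyed by function name (not thread-safe).
inline std::map<std::string, ScopeStats>& StatsTable() {
    static std::map<std::string, ScopeStats> table;
    return table;
}

// RAII stopwatch: starts timing on construction and adds the elapsed time
// to the table when the enclosing scope exits, however it exits.
class ScopedStopwatch {
public:
    explicit ScopedStopwatch(const char* name)
        : name_(name), start_(std::chrono::steady_clock::now()) {}

    ~ScopedStopwatch() {
        ScopeStats& stats = StatsTable()[name_];
        stats.total += std::chrono::steady_clock::now() - start_;
        ++stats.calls;
    }

    ScopedStopwatch(const ScopedStopwatch&) = delete;
    ScopedStopwatch& operator=(const ScopedStopwatch&) = delete;

private:
    const char* name_;
    std::chrono::steady_clock::time_point start_;
};

// The single line a rewriting tool would insert at the top of a function.
#define STOPWATCH_HERE() ScopedStopwatch stopwatch_guard_(__func__)

// Hypothetical function deemed "interesting" by a profiler.
int SlowPath(int n) {
    STOPWATCH_HERE();
    int sum = 0;
    for (int i = 0; i < n; ++i) sum += i % 7;
    return sum;
}
```

After a run, walking StatsTable() gives one line per instrumented function rather than one per call. The per-call cost of reading the clock twice remains, so this does nothing about the runtime hit.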

#static on irc.mozilla.org is now the correct irc channel for anything to do with static analysis.

Codecon
On Sunday, I will be presenting on Pork at codecon. I have been meaning to attend codecon since the days when P2P was considered cool, but was not able to make it until this year. It is a historical milestone for me. Codecon was how people at Mozilla first heard of Elsa, which is now the foundation of all our refactoring tools (it is also inherited baggage I get to maintain).

Piglet

I adopted Piglet, Dave Mandelin’s de-oinkification project, and imported it into hg. It feels really good to finally be able to do a make -j without disturbing people nearby with surprise explosions of foul language. I plan to move all relevant static analysis tools into piglet. After that I shall finally merge a dozen or so elsa repositories and end up with Pork consisting of elsa/ + piglet/.

Pork*

Chris Jones is quietly working on making Pork orders of magnitude more useful to average developers. It’s exciting stuff and I’ll let him announce it when he’s ready. Between his work and David Humphrey’s DXR, I think we are finally going to make it easier to hack on Mozilla for a much wider audience than before.

nsresult analysis

March 31st, 2009

After I wrote prcheck, I was surprised by the errors it found. I expected to find lots of cases of prbool variables having integers assigned into them. Indeed there were some of those, but the most frequent offenders were things like

NS_ENSURE_SUCCESS(rv,rv);

in methods with a PRBool return value. In this case (and many similar return values within macros) the function will likely do the opposite of what was intended if there is an error condition. Here is a less hypothetical example in bugzilla.
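
To spell out why this goes wrong, here is a self-contained C++ sketch. The demo_ and DEMO_ names are simplified stand-ins for the Mozilla types and macro (the real definitions live in the XPCOM and NSPR headers and differ in detail), and LookUp and the IsAvailable functions are hypothetical.

```cpp
#include <cstdint>

// Simplified stand-ins for nsresult, PRBool and friends.
typedef std::uint32_t demo_nsresult;
typedef int           demo_PRBool;

const demo_nsresult DEMO_OK            = 0;
const demo_nsresult DEMO_ERROR_FAILURE = 0x80004005u;  // high bit marks failure
const demo_PRBool   DEMO_TRUE          = 1;
const demo_PRBool   DEMO_FALSE         = 0;

#define DEMO_FAILED(rv) (((rv) & 0x80000000u) != 0)

// Stand-in for NS_ENSURE_SUCCESS: return `ret` early if `rv` is a failure.
#define DEMO_ENSURE_SUCCESS(rv, ret) \
    do { if (DEMO_FAILED(rv)) return (ret); } while (0)

// Hypothetical helper that can fail.
demo_nsresult LookUp(bool succeed) {
    return succeed ? DEMO_OK : DEMO_ERROR_FAILURE;
}

// Buggy: on failure the error code leaks out through a boolean return type.
// Every failure code is non-zero, so the caller sees "true".
demo_PRBool IsAvailableBuggy(bool succeed) {
    demo_nsresult rv = LookUp(succeed);
    DEMO_ENSURE_SUCCESS(rv, rv);
    return DEMO_TRUE;
}

// Fixed: failure is mapped explicitly to false.
demo_PRBool IsAvailableFixed(bool succeed) {
    demo_nsresult rv = LookUp(succeed);
    DEMO_ENSURE_SUCCESS(rv, DEMO_FALSE);
    return DEMO_TRUE;
}
```

A caller writing `if (IsAvailableBuggy(false))` takes the "available" branch precisely when the lookup failed, while IsAvailableFixed(false) yields DEMO_FALSE.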

So I’m thinking that instead of porting the prbool analysis to Treehydra (such that it’d be based on a less buggy backend and can be integrated into the build) it might be more interesting to ensure that nsresults do not mix with other integer types. That would catch all of the worst prbool offenders and possibly other nsresult misfortunes.

Has anyone run into bugs like this that do not involve prbools?

I suppose a general solution would be to define a lattice of typedefs with rules specifying which typedefs can be assigned to each other. This would make GCC distinguish certain typedefs as discrete and incompatible types. Thoughts?
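
The idea above is for the compiler to enforce such rules over existing typedefs. As a rough library-level approximation of the same idea, here is a C++ sketch in which each "typedef" becomes a distinct wrapper type and a small rule table says which assignments are allowed. All names (Distinct, CanAssign, the tags) are invented for illustration and nothing here is an existing GCC or Mozilla facility.

```cpp
#include <cstdint>
#include <type_traits>

// Tags naming the "typedefs" that should be kept apart.
struct ResultTag {};
struct BoolTag {};
struct IntTag {};

// Rule table: may a value tagged From be assigned to one tagged To?
// By default only identical tags are compatible.
template <typename From, typename To>
struct CanAssign : std::is_same<From, To> {};

// One edge of the lattice: a Bool may widen to an Int, but not the reverse.
template <>
struct CanAssign<BoolTag, IntTag> : std::true_type {};

// Thin wrapper that turns an integer representation into a distinct type.
template <typename Tag, typename Rep>
class Distinct {
public:
    explicit constexpr Distinct(Rep v) : value_(v) {}

    // Implicit conversion from another tag, enabled only by the rule table.
    template <typename OtherTag, typename OtherRep,
              typename = typename std::enable_if<
                  CanAssign<OtherTag, Tag>::value &&
                  !std::is_same<OtherTag, Tag>::value>::type>
    constexpr Distinct(const Distinct<OtherTag, OtherRep>& other)
        : value_(static_cast<Rep>(other.get())) {}

    constexpr Rep get() const { return value_; }

private:
    Rep value_;
};

typedef Distinct<ResultTag, std::uint32_t> Result;
typedef Distinct<BoolTag, int>             Bool;
typedef Distinct<IntTag, long>             Int;

// The rules are checked at compile time.
static_assert(std::is_convertible<Bool, Int>::value, "Bool widens to Int");
static_assert(!std::is_convertible<Int, Bool>::value, "Int does not narrow");
static_assert(!std::is_convertible<Result, Bool>::value, "Result is not a Bool");
static_assert(!std::is_convertible<int, Bool>::value, "raw ints need an explicit cast");
```

With these types, returning a Result from a function declared to return Bool fails to compile. The drawback compared to a compiler-side check is that existing code would have to be rewritten to use the wrappers.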

A couple of months ago Stuart casually asked me to investigate Fennec performance for moving about a page, zooming and loading pages in general. Beta 1 contains the result of that:

  • There is little to no hardware graphics acceleration on mobile ARM devices. That combined with low memory bandwidth results in painfully slow screen updates (10x slower than crappy gfx on the desktop?). The painting engine now works hard to skip redundant draws of the page.
  • During loading pages or zooming Fennec now only draws the minimum required. In my testing complicated pages load 2-5 times faster. Zooming is now 5 times faster.
  • There is less DOM querying now. Things like checking an element’s size can cause pages to reflow, resulting in a less responsive UI.

Other performance highlights:

  • Fennec now features a redesigned firstrun page which not only looks better, but also contributed to a 0.5 second startup speedup. Overall, Beta 1 should start up almost a second quicker than the previous release.
  • The JavaScript JIT is now on by default providing a noticeable performance boost throughout Fennec.

For more info on Fennec Beta 1 and where to get it see Stuart’s blog.

Quickfix Model of Development

February 19th, 2009

I love new programming toys and this week is a good one for those. Ever since I laid the foundations for Dehydra I’ve been dreaming of a world where I can quickly look up a piece of code (say something that someone complains about on IRC), fix it, get it reviewed and pushed in the most efficient manner possible. Seems that the pieces are finally falling into place.

  1. I want quick semantically aware code lookup via DXR. And guess what, there is progress in that direction.
  2. I want DXR to provide a link to edit the code. Bespin looks like the most promising candidate for editing.
    As an aside, using canvas to do text editing is badass. I salute devs who are crazy enough to prove their point by reimplementing something (hopefully better) from scratch using an approach that hasn’t been tried before.
  3. I want my changes to be saved as a diff into bugzilla. I want that to be two way so I can edit existing patches and save them as new bugzilla attachments.
  4. From there I’d like a commit feature in bugzilla so the patch would go through the try-then-push cycle that Jesse described.
  5. Having all this in place would make it trivial to integrate random features such as crash stack trace navigation or Pork automagic refactoring.

Now I’m sure that most of us would still run Emacs and other desktop editors for longer development tasks. But just imagine being bored with a computer at a webcafe, with a boring friend, etc. and having the ability to jump into the development process as easily as logging into webmail.

Security with Dehydra

February 16th, 2009

When I wrote the initial prototype of Dehydra I pondered how long it would take before it’s adopted by security guys. Unfortunately, until now take-up has been non-existent. Grep and Perl still seem to rule in that community even though the plain text approach restricts the range of possible security scans.

Normally I would be tempted to rant on how grep is convenient yet limiting. However Ben Kurtz discovered Dehydra for security scans and did a great job explaining the issues involved. Thanks to Georgi for linking me to Ben’s post.

LWN published an article about a tool that does refactoring of C code. Guess what, it’s yet another tool on top of a crappy C-parser that will never grok C well or even hope to support C++. To my great disappointment the author was not aware of my work on Pork. Clearly I have failed in letting people know that complex C and C++ can be refactored with (somewhat raw, but powerful) open source tools.

In addition to Dehydra (which is even mentioned in the first comment, yay!), I also maintain Pork – a fork of oink that is well suited to large-scale refactoring of real-world C/C++ code.

So far Pork has been used for “minor” things like renaming classes and functions, rotating outparameters and correcting prbool bugs. Additionally, Pork proved itself in an experiment which involved rewriting almost every function (i.e. generating a 3+MB patch) in Mozilla to use garbage collection instead of reference-counting.

So to summarize:

  • Refactoring C is hard, but C++ is much harder
  • For refactoring C++ there is no better toolchain to start with than Pork
  • Pork shares no code with Dehydra.
  • Pork is built on the Elsa parser which makes it well-suited for rewriting large amounts of code. Dehydra isn’t suitable for rewriting code due to GCC providing a very lossy AST and incomplete location information.
  • Pork is not as convenient for analysis needs as Dehydra

For any questions regarding Pork feel free to post on the mailing list or ping me on IRC.

Language Wars

I find it depressing that the comments to the LWN article ended up being about language wars rather than the refactoring topic. Pork is written in C++ which is much more widely known than OCaml. However, I seriously doubt it’s easier for anyone to hack on advanced compiler frontend pieces in a language as ill-suited for the task as C++.

GCC Plugins are a Go!

January 27th, 2009

The nice folks at FSF allowed GCC to have plugins. In a couple of GCC releases (4.5 if we are lucky), Dehydra will work with distribution GCCs. Of course the API is yet to be decided on, but we have been coordinating with authors of other GCC plugin efforts to ensure that the final API meets reasonable needs.

In the future enabling static analysis checks will involve little more than specifying --with-static-checking in your Mozilla build!

JSHydra

The other breakthrough news is that Joshua Cranmer has been working on hooking up a *hydra style API to the Spidermonkey parser. This resulted in JSHydra. The ability to look into JavaScript has been sorely missing from our stack, so this is extremely exciting.