State of Static Analysis At Mozilla
January 21st, 2010
Mozilla has static analyses built into the build system that can be turned on with the --with-static-checking= flag. The analyses live in the xpcom/analyses directory. The testcases (aka documentation) are in xpcom/tests/static-checker. Analyses are implemented in either Dehydra or Treehydra and run within a patched GCC 4.3.
The currently landed checks are:
- final.js: Java-like “final” keyword for C++
- flow.js: Ensure code in a function flows through a particular label
- must-override.js: Force derived classes to override certain methods
- override.js: Ensure methods exist in base class
- outparams.js: Ensure outparameters and return error codes are in sync
- stack.js: Mark classes as stack-only
A whole lot more analyses in various states of completion can be tracked in the static analysis bug.
Asynchronous discussion happens on the mailing list. The #static IRC channel is the place for interactive discussion.
Nearterm Plans For Plugins
GCC 4.5 has an official plugin framework enabled by default. I will try to switch to GCC 4.5 as soon as it is out. Currently 4.5 is still changing too often for me to bother fixing Treehydra (Dehydra usually works). As soon as 4.5 is out I will revise the installation instructions to use distribution GCC and JavaScript packages to avoid the current mess (draft can be found here). Sometime after that I’ll switch Mozilla static analysis to GCC 4.5 and drop 4.3 support.
Hopefully, this will make it easier for other open source projects to adopt the hydras.
Plans for Analyses
I’m a big believer in application-specific static analyses, but I would like to see some heavy-duty open source analyzers built on top of GCC.
Some of the not-so-Mozilla-specific analyses should be bundled together to make them easy to try out on other projects.
Hopefully 2010 will be the year that open source static analysis catches on.
LCA2010
I posted my slides from yesterday.
Chromium vs Minefield: Cold startup performance comparison
January 19th, 2010
Hunting Down Mythical “Slowness”
I recently met a developer who used Chromium instead of Firefox. Chromium’s superior startup speed was his reason for using it. This got me excited because said developer was running Linux, so it was relatively easy to measure cold startup and get a complete IO breakdown.
It turned out that Firefox took roughly 23 seconds to start. After much cursing about how I’ve never seen Firefox start up this slowly, I eventually gave up on figuring out what was slowing his startup and instead we measured Chromium startup. It also turned out to be roughly 23 seconds. The super-slow hard drive made everything slow. Chromium’s superior startup was a myth in this case.
Measuring Startup
As a result of investigating the startup myth above, my kiwi coworkers encouraged me to post a comparison of Chrome/Firefox startup. I am at linuxconf at the moment so I did the comparison on my laptop.
Laptop configuration:
- Intel(R) Core(TM)2 Duo CPU L9400 running at 800MHz to amplify any performance differences.
- HITACHI HTS722020K9SA00 harddrive for the user profile and browser binaries
- OCZ Vertex 30GB SSD for system libraries/configuration.
- Fedora 12, Minefield 20100119 tarball, chromium-4.0.285.0-0.1.20091230svn35370.fc12.i686
- sudo sync && sudo sysctl -w vm.drop_caches=3 && sudo sysctl -w vm.drop_caches=0 to clear the disk cache in between runs
What am I testing? I am measuring the time between invoking the browser and the execution of a JavaScript snippet embedded within a basic webpage (ie Vlad’s approach, with a slightly modified startup.html). The above sysctl command clears the disk caches, which creates a situation similar to turning on the computer before any of the browser libraries have been loaded from disk into memory. This is a blackbox approach to measuring how long it takes to get from clicking on the browser icon to an interactive browser.
Firefox commandline: firefox -profile /mnt/startup/profile/firefox -no-remote file://`pwd`/startup.html#`python -c 'import time; print int(time.time() * 1000);'`
Chromium commandline: chromium-browser --user-data-dir=/mnt/startup/profile/chrome file://`pwd`/startup.html#`python -c 'import time; print int(time.time() * 1000);'`
Both of these tests are done with an empty profile that was populated and has settled after running the browser a few times.
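The timestamp-in-the-fragment trick used by both command lines can be sketched in Python. This is only an illustration: the helper names are made up, and the assumption that startup.html subtracts the fragment value from its own clock is inferred from the description above, not taken from the actual page.

```python
import time
from pathlib import Path


def build_startup_url(page, launch_ms=None):
    """Return a file:// URL whose fragment carries the launch time in ms."""
    if launch_ms is None:
        launch_ms = int(time.time() * 1000)
    return f"{Path(page).resolve().as_uri()}#{launch_ms}"


def elapsed_ms(url, now_ms):
    """What the page is assumed to report: its clock minus the fragment."""
    return now_ms - int(url.rsplit("#", 1)[1])


# Fixed timestamps so the arithmetic is easy to follow.
url = build_startup_url("startup.html", launch_ms=1263859200000)
print(url)
print(elapsed_ms(url, now_ms=1263859203155))
```

Because the launch time is stamped before the browser process even exists, the number covers everything from process creation to script execution.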
Results
The following numbers are milliseconds reported by the startup.html above.
Running Chromium five times: 4685, 4168, 4222, 4197, 4232
Running Minefield five times: 3155, 3273, 3352, 3311, 3322
I picked Minefield because that’s the browser that I run and the codebase that I focus on. The linux Chromium channel seems to be the closest parallel to Minefield. I did not test on Windows because it is a bit of a nightmare to measure cold startup there.
Conclusion
On my system Minefield is around 30% faster than Chromium at starting up with an empty profile (the difference is amplified by running the CPU at 800MHz). For comparison of Minefield against older Firefox versions, see Dietrich’s post.
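The arithmetic behind that figure can be checked with a few lines of Python over the ten numbers reported above. The choice of mean and median as summaries is mine, as a sanity check on the "around 30%" claim; the first Chromium run is noticeably slower than the rest, so the median is shown as well.

```python
from statistics import mean, median

chromium_ms = [4685, 4168, 4222, 4197, 4232]
minefield_ms = [3155, 3273, 3352, 3311, 3322]

# How much longer Chromium takes, relative to Minefield.
extra_by_mean = mean(chromium_ms) / mean(minefield_ms) - 1
extra_by_median = median(chromium_ms) / median(minefield_ms) - 1

# How much less time Minefield takes, relative to Chromium.
saved_by_mean = 1 - mean(minefield_ms) / mean(chromium_ms)

print(f"Chromium takes {extra_by_mean:.1%} longer (mean)")
print(f"Chromium takes {extra_by_median:.1%} longer (median)")
print(f"Minefield takes {saved_by_mean:.1%} less time (mean)")
```

With only five runs per browser these are rough figures, and which percentage you quote depends on which browser is used as the baseline.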
I suspect that there is a relatively small difference between the two browsers because we are running into the fundamental limitations of loading large applications into memory (my rant).
Some developers manually grope around in the dark
January 4th, 2010
The cool thing about static analysis is that you can ask painful-for-humans questions about your codebase AND have them answered.
Here are two that got answered by Ehren:
Where do function bodies continue after return statements (ie obviously dead/broken code)? Bug 535646.
How many functions in Mozilla could/should be marked static? Bug 536427.
Awesome!
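To give a flavour of the first question, here is a minimal sketch that asks it of Python source using the standard ast module. It is not Ehren's Treehydra analysis, which works on GCC's representation of C++; the function name and sample are made up for illustration.

```python
import ast

SAMPLE = """\
def f(x):
    if x:
        return 1
        x += 1
    return 0
    print('never')
"""


def unreachable_lines(source):
    """Line numbers of statements that directly follow a return or raise."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Statement lists live in these fields on ast nodes.
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            if not isinstance(block, list):
                continue
            for prev, stmt in zip(block, block[1:]):
                if isinstance(prev, (ast.Return, ast.Raise)):
                    hits.append(stmt.lineno)
    return sorted(hits)


print(unreachable_lines(SAMPLE))
```

The check is purely syntactic: it only flags the first statement after a return or raise in the same block, which is the "obviously dead" case.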
Windows 7 Startup Exploration
January 4th, 2010
I did some digging to figure out if one can set up cold-startup testing in Windows 7 without nasty hacks. My conclusion is: sorta-kinda.
The Good – Most of the Ingredients Are Present
I haven’t actively used Windows since pre-XP days. It looks like it has come a long way since then: there is now a decent interactive shell, all kinds of settings/services can be controlled from the commandline and there is even sudo-like functionality.
PowerShell takes inspiration from the Korn shell and throws in .NET, which allows for much nicer “shell programming” than the dominant bash shell.
mountvol is a terrible equivalent to mount in linux – but it exists, so I’m happy.
NTFS junctions are frustrating equivalents to links in a unix filesystem.
The Bad
The essential ability to completely flush filesystem caches isn’t there. This isn’t quite as embarrassing as it seems as Mac OS X’s purge command does not flush the page cache (resulting in mmapped files not purged from cache), so technically OS X has the same limitation and only Linux gets it right.
The Ugly Workaround
After much brainstorming we figured out that we can clear all relevant caches on Mac OS X by putting files that we care about on a separate partition and mounting/unmounting it for every measurement.
Ridiculously, Windows is “smarter” than that and appears to cache stuff per-drive, such that mounting/unmounting a partition has no effect on the cache. The best workaround I could come up with involves putting the said partition onto a USB disk and unplugging it between unmount/mount testing cycles.
Windows 7 Startup Recipe
1) Set up junctions for the 2 profile directories to point to the USB partition, unzip firefox onto that partition.
2)
$old = (get-location)
$mountpoint = $env:userprofile + "\cold"
# magic name given by running mountvol
$drive = "\\?\Volume{885d5bc3-e918-11de-a4e5-002268e3077c}\"
# Based on http://poshcode.org/696 + fiddling with UAC settings to avoid prompts
sudo mountvol $mountpoint $drive
# Mountvol doesn't seem to block until drive is mounted
sleep 1
#mountvol
cd $mountpoint\firefox
echo (pwd)
# The following command shows PowerShell awesomeness
# based on Vlad's approach
./firefox.exe -no-remote "file://$(pwd)\startup.html#$([Int64](([DateTime]::utcnow - (new-object DateTime 1970,1,1)).ticks/10000))"
cd $old
# I haven't yet figured out how to wait on firefox.exe to finish
sleep 10
sudo mountvol $mountpoint /d
3) Unplug USB drive
JavaScript DOM
December 17th, 2009
I filed bug 533874 to expose the JavaScript AST to JS, but it turns out I need to explain why it’s important to expose the JavaScript AST. So let’s start from the beginning.
My name is Taras and I like to wrestle useful information out of tools that do not think to offer it. I firmly believe that writing/maintaining code is more difficult than it should be. I think the current organization of compilers is partially responsible for it.
Related Prior Work
Some groups at Mozilla have realized that inspection and refactoring of code are tasks that will often consume as many people as can be thrown at them, so there has to be a better way. That better way is through tools: get computers to do things that are difficult for humans. We do that a lot for other kinds of tasks, but not so much for computers. For example, when was the last time you googled something using telnet? Yet our tools for presenting and analyzing code are about as sophisticated as telnet. For some reason it is easier to find some random piece of information on the web than it is to find what code implements a function that is being called from my code.
As the original author, I think Dehydra and Treehydra on top of the GCC plugin API give us a pretty reasonable way to extract useful information out of our C++ code. I feel lost and confused without Dave’s DXR, the JS team routinely breaks (and fixes) invariants in the JS engine, and we’ve been able to wipe out certain classes of bugs (ie code patterns that cause them) mozilla-wide. I’m looking forward to reducing the footprint of Mozilla by deleting more dead code.
JavaScript Needs
My efforts so far have focused on C++ because it is a finicky language and developers need all the help they can get with it. I think we need to do the same thing for JavaScript. I think there is a lot of useful information in JavaScript code that is buried and hard to get to. Joshua Cranmer broke new ground in building JSHydra on top of SpiderMonkey. This way one can parse JavaScript in the exact same way as SpiderMonkey sees it (something that can only be approximated with other approaches). I think we should offer the ability to analyze JS code within SpiderMonkey. Unfortunately, Joshua is a busy student so he doesn’t have the time to take JSHydra to the next level.
How Would JavaScript Developers Benefit?
Currently we have jshydra deployed on AMO to scan javascript for potential issues to make the reviewer’s job more productive.
Dave Humphrey is working away on integrating JSHydra into DXR so we can have semantic navigation of JavaScript: what components are implemented in JavaScript, who calls certain methods (including JavaScript callsites), etc. Gandalf has a cool idea where he needs to be able to extract translatable strings out of JetPacks. I believe JetPack people in general would like to be able to analyze JS code more.
I think it’s clear that the common trend in all these tasks is to turn an opaque text blob into something that can be easily navigated programmatically. Wouldn’t it suck to not be able to walk the DOM for HTML? Why is it ok to accept that handicap for the essence of all programs?
How Should This Be Solved?
I don’t know for sure, but my friend Jim Blandy came up with a good suggestion. We already have eval/uneval in SpiderMonkey. It would be really cool to extend uneval to produce an iterable data structure instead of a text blob. So I filed bug 533874 on this, and this is my explanation of why I think exposing ASTs in JavaScript would be a very empowering feature.
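For a feel of what a walkable AST buys you, Python already ships one in its standard ast module. The sketch below is an analogy only: it says nothing about how the SpiderMonkey API would look, and the call_sites helper is a made-up name. It answers the "who calls what" question from the previous section for a tiny Python snippet.

```python
import ast

SOURCE = """\
def greet(name):
    print(format_name(name))

greet("world")
"""


def call_sites(source):
    """Return (callee name, line number) for every simple call, in source order."""
    found = []
    for node in ast.walk(ast.parse(source)):
        # Only plain-name calls; method calls like a.b() are skipped.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            found.append((node.lineno, node.col_offset, node.func.id))
    return [(name, line) for line, _col, name in sorted(found)]


for name, line in call_sites(SOURCE):
    print(f"line {line}: call to {name}")
```

The same dozen lines over a text blob would need a hand-written parser, which is the handicap described above.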
Dehydra Testsuite Passes on GCC 4.5
November 20th, 2009
I spent a couple of days fixing the remaining test-suite failures on GCC 4.5 trunk for Dehydra. Since the last time I looked into this, GCC went from crashing all over the place to only crashing if I did something bad. It was nice to discover that as a result of switching to 4.5 Dehydra users will get saner .isExplicit behavior and more precise location info.
Treehydra will take more work due to me misunderstanding GTY annotations.
By the way, I am really grateful for all of the people who contributed GCC 4.5 fixes so far. You guys have been a big help in getting Dehydra testsuite to 100% on 4.5. Looks like I will meet my goals to finish De+Treehydra by the end of the year in time for GCC 4.5 release and my “Introducing Dehydra to the Developer World”-type talk at LinuxConf.au.nz 2010.
Startup
I reduced my focus on startup speed at the moment to catch up on Dehydra. I plan to work on reducing xpconnect overhead during startup next, ie more of this bug.
FSOSS & Dehydra Update
November 6th, 2009
Last week I was in Canada to present at FSOSS with David Humphrey on awesome Mozilla Tools: Dehydra, DXR, Pork, etc. I think we managed to convey the message regarding what a sad affair current developer tools are.
General-Purpose Dehydra Scripts
Dehydra grew out of Mozilla’s constant need to figure out what is going on in the source code. As a result most of our scripts are very Mozilla API-specific. This makes it harder for people outside of Mozilla to learn Dehydra. There is no library of Dehydra code that one can just plug in to start analyzing their codebase. Instead one has to sit down, figure out what Dehydra is capable of and then see if any of the problems facing the developer can be solved this way. If anyone wants to contribute such a library, let me know.
In the meantime, more general-purpose analyses are surfacing.
Shadowed Members
My favourite script so far is the member-shadowing checker. I ran into a member-shadowing warning that is unique to Sun’s C++ compiler. It was triggered by some code that I had just landed on the tree. I fixed the warning, but within a few days a coworker ran into a bug caused by that member shadowing (due to having an unlucky revision of the code). The following example shows how simple it was to implement the warning in GCC/Dehydra.
See bug 522776 for the complete story on adding the member shadowing check to Mozilla.
Printf
Another general purpose analysis was done outside of Mozilla by Philip Taylor for his game. His script checks wide printf format strings (which are overlooked by gcc).
Independently, Benjamin wrote a printf checker for Mozilla printf-like code, see bug 493996.
Custom Sections in Object Files
We have long speculated about how nice it would be if Dehydra could emit info into object files that could then be yanked out of the resulting binary (by, say, valgrind). Bug 523435 will soon make that a reality.
Studying Library IO – SystemTap Style
October 23rd, 2009
In my last blog post I expressed frustration with slowness induced by library IO. Then I went on a mission to measure it. I have been wanting to do this for a while, but I figured that only DTrace can get this info without recompiling my kernel. So I tried to build Mozilla under Slowlaris (but the linker got up to 3GB and then sat there swapping, ensuring that the nickname is justified). Then I fired up DTrace on the mini, but ran screaming because it seemed like the fbt DTrace provider refused to let me dereference structs (later Joel told me that I’m supposed to copy data explicitly like here).
But while googling for an fbt workaround, I stumbled upon a DTrace/SystemTap comparison wiki. SystemTap? The DTrace knockoff I have been hearing about? It works? This was a lightbulb moment where I realized that Linux was about to provide me with more information than I thought was possible.
So here is the data I got out of it:
Rant on Library IO
October 20th, 2009
So I’ve been trying to figure out how to optimize disk IO during startup. I looked into IO caused by libraries and it turns out that apps with big libraries are screwed. Here is how I came to this conclusion:
Gnomer’s research on startup pointed out that dumb readahead leads to wins in terms of file IO. So I wrote some code and sure enough, reading in libxul at the top of our main() function does indeed result in a significant measurable speed-up on both Linux and OSX.
From the gnome page I found a link to some diskstat stuff. There lay a presentation with graphs that appear to show that OpenOffice has a much better cold IO pattern than Firefox. Given that there are some strong similarities between our application layouts I went digging to see if OpenOffice does something funny. And oh boy, it does do funny page reordering on Windows and “slightly-smarter-than-dumb-readahead-style library prefetch” on Linux…
So here is an innocent question: Why is page-reordering not done as a PGO step? I mean shouldn’t you fire up your app, feed some info back to the linker and be done with it? Another question: Why can’t we mark certain files as “keep this whole file in ram if someone asks for part of it to be paged in”?
So is the only way to fast application startup via static linking? It sure is easy to
posix_fadvise(open(argv[0], O_RDONLY), 0, 0, POSIX_FADV_WILLNEED);
Are these hacks still the state of the art in making apps with large libraries start up fast?
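For anyone who wants to experiment with the hint without touching C, here is a minimal Python sketch of the same call. It assumes a platform where os.posix_fadvise exists (Linux does, macOS and Windows do not), and hint_readahead is a made-up helper name, not the code I landed.

```python
import os
import sys


def hint_readahead(path):
    """Ask the kernel to start pulling the whole file into the page cache.

    Returns True if the hint was issued, False where posix_fadvise is
    unavailable.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0 and len=0 mean "from the start to the end of the file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
    finally:
        os.close(fd)
    return True


# Illustrative target: the running interpreter binary stands in for argv[0].
print(hint_readahead(sys.executable))
```

The call is only advice: the kernel may start reading ahead asynchronously, and nothing here waits for or verifies that the pages actually arrived.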
Update: Found some mentions of GNU Rope unfinishedware and a relatively recent blog post
Restless Bug Fixing
October 8th, 2009
I spent the past couple weeks analyzing and improving fastload performance. I’ve long been suspicious of fastload, but only finally got around to investigating it in detail. I think there is some fundamentally ironic rule in software that if you put the word “fast” in the name of a component, it is bound to eventually become a performance bottleneck.
Almost a decade has passed since the conception of this code, so it was time to update the code’s assumptions to reflect the capabilities of modern OSes. I landed the fix today. It results in startup performance gains of 1-20% on the various platforms I tested, making this the most exciting perf bug I’ve worked on.
Plans
Now that I’ve had my fill of almost a year’s worth of startup performance analysis, for the remainder of the year I plan to refocus on static analysis. My main goals are decent C support in Dehydra (not to mention the ever elusive GCC 4.5 compatibility) and facilitating a production-quality DXR.
I’m hoping that we’ll end up with cool ways of dealing with the painful/slow boilerplate (bugs 520626, 516085 and 517370)