Rant on Library IO
October 20th, 2009
So I’ve been trying to figure out how to optimize disk IO at startup. I looked into the IO caused by libraries, and it turns out that apps with big libraries are screwed. Here is how I came to this conclusion:
Gnomer’s research on startup pointed out that dumb readahead leads to wins in terms of file IO. So I wrote some code and, sure enough, reading in libxul at the top of our main() function does indeed result in a significant, measurable speed-up on both Linux and OSX.
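For illustration, here is a minimal sketch of what such dumb readahead could look like in C. It is not the code from the experiment above; the function name warm_file and the 64 KiB buffer size are assumptions made for the example. It simply reads the file from start to finish and throws the data away, so that the kernel has the pages cached before the dynamic linker starts faulting them in one at a time.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read the whole file at `path` sequentially and discard the data, so the
 * kernel pulls it into the page cache in one linear pass.
 * Returns the number of bytes read, or -1 on error.
 * (warm_file is a hypothetical helper name, not an existing API.) */
long long warm_file(const char *path)
{
    static char buf[1 << 16];   /* 64 KiB scratch buffer, contents ignored */
    long long total = 0;
    ssize_t n;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Sequential reads give the disk a friendly access pattern. */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;

    close(fd);
    return n < 0 ? -1 : total;
}
```

Calling something like warm_file("libxul.so") as the first statement of main() would be the experiment described above; the path is a placeholder and would have to be the real location of the library.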
From the Gnome page I found a link to some diskstat stuff. There lay a presentation with graphs that appear to show that OpenOffice has a much better cold IO pattern than Firefox. Given that there are some strong similarities between our application layouts, I went digging to see if OpenOffice does something funny. And oh boy, it does do funny page reordering on Windows and “slightly-smarter-than-dumb-readahead-style library prefetch” on Linux…
So here is an innocent question: Why is page-reordering not done as a PGO step? I mean shouldn’t you fire up your app, feed some info back to the linker and be done with it? Another question: Why can’t we mark certain files as “keep this whole file in ram if someone asks for part of it to be paged in”?
So is static linking the only way to get fast application startup? It sure is easy to
posix_fadvise(open(argv[0], O_RDONLY), 0, 0, POSIX_FADV_WILLNEED);
Are these hacks still the state of the art in making apps with large libraries start up fast?
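Spelled out with error handling, the posix_fadvise hack could look like the sketch below. The helper name prefetch_file is made up for the example. Note that posix_fadvise takes four arguments (descriptor, offset, length, advice), that a length of 0 means "through the end of the file", and that it returns an error number directly instead of setting errno. The call is only a hint, so the kernel is free to ignore it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose posix_fadvise in strict modes */
#endif
#include <fcntl.h>
#include <unistd.h>

/* Hint to the kernel that the whole file at `path` will be needed soon.
 * Returns 0 on success, -1 if the file cannot be opened, or the positive
 * error number reported by posix_fadvise.
 * (prefetch_file is a hypothetical helper name, not an existing API.) */
int prefetch_file(const char *path)
{
    int rc;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* offset 0 with length 0 covers the file from start to end */
    rc = posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);

    /* The hint applies to the cached file pages, not to the descriptor,
     * so the descriptor can be closed right away. */
    close(fd);
    return rc;
}
```

With static linking, prefetch_file(argv[0]) at the top of main() covers the whole application; with shared libraries, each large library would need its own call.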
Update: Found some mentions of GNU Rope unfinishedware and a relatively recent blog post
October 20th, 2009 at 9:51 pm
someone did attempt reordering at some point: http://mxr.mozilla.org/mozilla-central/source/tools/reorder/
October 20th, 2009 at 9:57 pm
Hilarious, you found a moz version of G[nu]rope!
October 21st, 2009 at 12:38 am
Funny that I had a look at the grope paper yesterday, but could not find the source anywhere. The mxr link may be useful, I’m sure you can make it work!
October 21st, 2009 at 3:01 am
Not convinced the mxr link is of any use; this thing is awfully old and has not been touched since check-in:
http://bonsai.mozilla.org/cvslog.cgi?file=mozilla/tools/reorder/garope.cpp&rev=HEAD&mark=1.1
1.1 waterson%netscape.com 2001-11-30 First checked in.
October 21st, 2009 at 6:12 am
Visual C++ PGO does do block reordering:
http://msdn.microsoft.com/en-us/library/aa289170%28VS.71%29.aspx#profileguidedoptimization_topic4
but I’m not sure if it’s exactly what you’re asking for here.
Note that we’re not building with PGO on Linux or OS X right now. Last time we tried it, it wasn’t a big win, plus it always seems to hit GCC bugs.
October 21st, 2009 at 11:36 pm
There is a lot to be said for static linking.
A lot of the things (limited disk space, limited RAM) that made shared libraries so attractive are no longer an issue.
Static linking (with link-time optimisation) allows for nice compact code that cold-loads very quickly.