Memory Dragon Slain
03.23.09 - 10:51pm
The pre-alpha build we released in February got a lot of criticism (and rightfully so) for what became known as the “checkerboard” issue. Mark Finkle has already blogged about the problem and the solution. Tonight I pushed the fix to mozilla-central. Here are some details.
It all comes down to memory.
Simply put, we were running out of memory. The frustrating part, though, was that we weren’t running out of real memory. We were being constrained by an artificial limit of 32 MB (including the binary) that Windows CE places on each process. That is why otherwise extremely capable phones like the HTC Touch Pro, with 288 MB of RAM, were showing a beautiful checkerboard pattern. I’m still not entirely sure why this didn’t show up on our development devices before the release. I suspect it has something to do with the fact that we’re using unlocked HTC Touch Pros for development, while most of the testers seeing the problem were using handsets from Sprint, Verizon or AT&T.
Digging further, people saw a checkerboard pattern but an otherwise usable UI in part because of the way Fennec works. In Fennec, the user actually interacts with an HTML canvas element. We render the page off screen and then paint the image onto the canvas. In this particular case, it seems as though we were successfully rendering pages but failing to allocate enough memory to do the final paint onto Fennec’s canvas. We were handling this low memory “gracefully” and continuing to run.
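To make that failure mode concrete, here is a minimal sketch in Python. It is not Fennec’s code: MemoryBudget, show_page and the 4-bytes-per-pixel figure are all hypothetical names and assumptions chosen for illustration. It only shows how a page can render successfully off screen and still end up as a checkerboard when the second allocation, the one for the final paint, does not fit under a hard cap.

```python
BYTES_PER_PIXEL = 4  # assumption: 32-bit RGBA surfaces


class MemoryBudget:
    """Hypothetical stand-in for a hard per-process memory cap."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def allocate(self, size):
        """Return a buffer, or None when the cap would be exceeded."""
        if self.used_bytes + size > self.limit_bytes:
            return None
        self.used_bytes += size
        return bytearray(size)


def show_page(budget, width, height):
    """Render off screen, then paint to the canvas; report what the user sees."""
    size = width * height * BYTES_PER_PIXEL

    rendered = budget.allocate(size)  # off-screen render target
    if rendered is None:
        return "checkerboard"

    canvas_copy = budget.allocate(size)  # buffer for the final paint
    if canvas_copy is None:
        # The page was rendered, but it cannot be painted onto the canvas.
        # The UI keeps running and the placeholder pattern stays visible.
        return "checkerboard"

    canvas_copy[:] = rendered
    return "page"
```

With a budget of exactly two surfaces, show_page returns "page". Take one byte away and the render still succeeds, but the final paint fails and the result is "checkerboard".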
jemalloc to the rescue
At first Doug Turner took a crack at cooling down this memory hot spot by improving how we allocate memory during paints. Essentially, he allocated a single buffer and reused it every time he needed a surface to write to. This didn’t actually decrease the amount of memory we were using, but it significantly reduced the amount of memory “thrashing” (repeatedly allocating and freeing chunks of memory).
With that change we were able to run for much longer without running into memory problems, but it became clear that there are plenty of other places in our code where we thrash.
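The single-buffer idea can be sketched roughly as follows. This is an illustration in Python, not Doug’s actual patch: ReusableSurface, paint_frames and the allocation counter are hypothetical, and real surface code would deal with pixel formats and strides that are ignored here.

```python
class ReusableSurface:
    """Hypothetical helper that keeps one buffer alive across paints."""

    def __init__(self):
        self._buffer = bytearray()
        self.allocations = 0  # how many times a new buffer had to be created

    def acquire(self, width, height, bytes_per_pixel=4):
        """Return a writable view that is exactly large enough for this paint."""
        needed = width * height * bytes_per_pixel
        if len(self._buffer) < needed:
            # Allocate only when the existing buffer is too small.
            self._buffer = bytearray(needed)
            self.allocations += 1
        # Slicing a memoryview does not copy the underlying bytes.
        return memoryview(self._buffer)[:needed]


def paint_frames(surface, frames):
    """Paint a sequence of (width, height) frames using the shared buffer."""
    for width, height in frames:
        pixels = surface.acquire(width, height)
        pixels[0] = 0xFF  # stand-in for real drawing
```

Painting a hundred 480x640 frames through one ReusableSurface leaves its allocations counter at 1, where a naive version would allocate and free a buffer per frame. As in the post, peak memory use stays the same; only the allocate/free churn goes away.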
The solution was to use jemalloc, originally written by Jason Evans. Stuart and Vlad used jemalloc to greatly improve the memory performance of Firefox over a year ago, so it was already in the Mozilla bag of tricks.
Success!
Fearful of counting my chickens before they hatched, I didn’t want to blog about this until it landed, and tonight it did. With jemalloc enabled on our Windows CE build of Fennec, I am able to start up without any worries of seeing a checkerboard.

…and I can browse to the two most memory intensive websites that I encounter on a regular basis (both of these have crashed my iPhone on occasion), the first being Planet Mozilla

…and the second being the Firefox tinderbox waterfall

I can even look at both at the same time!

With these bugs closed, we’re down to 5 blockers for our next release on Windows Mobile. Stay tuned.

Excellent stuff Brad. Glad to be rid of that damn checkerboard!
Been running rock solid on my HTC Touch Diamond with your jemalloc stuff for a while.
Also my device is locked, it’s a Rogers phone. That may lend some data to your hypothesis on locked phones having the issue.
Great to see this finally land! No more patch-queues (Stuart will be happy).
Great job on this Brad. Not an easy problem and you killed it.
Hopefully you have some sort of “tab paging” mechanism so that when too many tabs to fit in RAM are loaded, users don’t start getting random checkers. Slow or an error is better than crashing or (seemingly) randomly not working.
I wonder if this fix will bring the poor memory performance from the big brother – Firefox.
@Fred we are doing paging. We are also working on memory pressure handlers which will help the situation. One thing to note though is that on Windows CE there isn’t much difference between “RAM” and “disk”.
@Dawid huh? If Firefox 3 performs poorly, what browser performs well? See http://ejohn.org/blog/firefox-3-memory-use/ and http://arstechnica.com/open-source/news/2008/03/firefox-3-goes-on-a-diet-eats-less-memory-than-ie-and-opera.ars
just amazing !
Just downloaded the nightly (24.03.09) and it still checkerboards on my O2 branded Touch Pro.
At least it loads up now, the original Feb release checkerboarded before it got as far as loading the interface.
This version displays a black screen, swiping right displays a tab list on the left and a checkerboard on the right.
http://joone4u.blogspot.com/2009/03/fennec-10-alpha-for-windows-mobile-on.html
It’s awesome, Fennec is working on SAMSUNG i780.
I can see web pages instead of checkerboards
Thanks~
Same as David. Have tried several nightlies and just tried (01.04.09) with same results. Black screen with checkerboards when you swipe right/left. I can get to a setup screen with Add-ons, Extensions, Themes, etc…
Phone information
Sprint HTC Touch Pro
Manufacture date 15.12.08
Software Ver. 1.03.651.4
Firmware Ver. 1.03.15F
Hardware Ver. 0002
I tried installing to main memory and storage card.
Thanks for all the hard work! I look forward to changing over to Fennec.
I tried it on my Sony-Ericsson X1 Xperia (unlocked).
It started but got about the same results as David and Bill.
What I didn’t get was the touchscreen keypad working (it doesn’t work on Opera either but it works on Internet Explorer).
So I’m waiting in the queue for the next try and hope it will be more all-round working.
Keep up the good work – it feels better than Opera and IE already.
Downloaded the latest nightly, managed to get it to load a page, although it then crashed…
It takes forever to load, but loading pages is relatively quick, compared to the Opera on there, but I think Fennec still has a way to go in user friendliness and usability…
Keep up the good work.
@David What page did you load? QA testing has found some crashes related to specific sites that have plugins. I’d like to get as exhaustive a list of those sites as possible so we can fix that bug.
As always, bug reports (bugzilla.mozilla.org) are much appreciated, but leaving another comment here works too.
Great one folks, an artificial limit of 32 MB is just too little.