
MemShrink progress report, week 23

The only significant MemShrink-related change that landed this week was that David Anderson removed TraceMonkey, the tracing JIT compiler.  In fact, TraceMonkey was disabled a while ago, so the effects on code size and memory consumption of its removal have been felt since then.  But it feels more real now that the source code is gone (all 67,000 lines of it!), so I figure it’s worth mentioning.  (BTW, many thanks to Ryan VanderMeulen who has been going through Bugzilla, closing many old TraceMonkey-related bugs that are no longer relevant.)

People have asked why TraceMonkey isn’t needed any more.  In my opinion, tracing compilation can be a good strategy for certain kinds of code, such as very tight, non-branchy loops.  But it tends to do badly on other kinds of code.  Before JaegerMonkey, JS code in Firefox ran in one of two modes: interpreted (super slow), or trace-compiled (usually fast).  This kind of bimodal performance is bad, because you lose more when slow than you gain when fast.  Also, because tracing was the only way to make code fast, huge amounts of effort were put into tracing code that shouldn’t really be traced, which made TraceMonkey really complicated.
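To illustrate the distinction (these functions are made-up examples, not code from SpiderMonkey): a tight, non-branchy loop repeats the same operations on every iteration, so a single recorded trace covers nearly all the work, while data-dependent branching inside a loop produces many distinct paths that force the tracer to keep recording side traces or bail out to the interpreter.

```javascript
// Trace-friendly: one hot, non-branchy loop. Every iteration follows the
// same path, so one compiled trace handles the whole workload.
function sumSquares(n) {
  let total = 0;
  for (let i = 0; i < n; i++) {
    total += i * i;
  }
  return total;
}

// Trace-unfriendly: the branch taken depends on the data, so the loop body
// has several distinct paths, and a tracing JIT tends to fall back to the
// slow interpreter -- the bimodal performance described above.
function classify(values) {
  let odd = 0, even = 0, negative = 0;
  for (const v of values) {
    if (v < 0) negative++;
    else if (v % 2 === 0) even++;
    else odd++;
  }
  return { odd, even, negative };
}
```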

Once JaegerMonkey was added, the performance was still bimodal, but in a better way:  method-compiled (fairly fast) or trace-compiled (usually fast).  But the heuristics to switch between the two modes were quite hairy.  Then type inference was added to JaegerMonkey, which made it faster on average than JaegerMonkey+TraceMonkey.  Combine that with the fact that TraceMonkey was actively getting in the way of various additional JaegerMonkey and type inference improvements, and it was clear it was time for TraceMonkey to go.

It might sound like there’s been a lot of wasted effort with all these different JITs.  There’s some truth to that.  But JavaScript is a difficult language to compile well, and people have only been writing JITs for it for a few years, which isn’t long when it comes to compilers.  Each new JIT has taught the JS team about ideas that do and don’t work, and those lessons have been incorporated into the next, better JIT.  That’s why IonMonkey is now being developed — because JaegerMonkey with type inference still has a number of shortcomings that can’t be remedied incrementally.

In fact, it’s possible that IonMonkey might end up one day with a trace compiler for really hot, tight loops.  If it does, this trace compiler would be much simpler than TraceMonkey because it would only target code that trace-compiles easily;  trace compilation would be the icing on the cake, not the whole cake.

Enough about JITs.  Time for this week’s MemShrink bug counts.

  • P1: 31 (-0/+2)
  • P2: 132 (-3/+8)
  • P3: 60 (-0/+2)
  • Unprioritized: 4 (-0/+4)

Not a great deal of movement there.  The quietness is at least partly explained by the fact that Thanksgiving is happening in the US this week.  Next week will probably be quieter than usual for the same reason.

12 replies on “MemShrink progress report, week 23”

Thanks for the explanation about TraceMonkey. Will the lessons that have been learned be usable by other interpreted languages? Or do they also have to go through this painful process?

The “don’t start with a tracing JIT” lesson probably applies widely. Other lessons are probably more language-specific.

I posted this in error to one of the older posts, but it should probably go here to get noticed:

I’m really glad for your effort.

But I have recently observed that although Firefox’s main process is a bit lighter in terms of memory (after a few days it slowly takes more and more, however), there are definitely problems with the plug-in container processes. I can have 600 MB of firefox.exe, but how does that help me when I have another 500 MB of plugin-container.exe after a few days? It’s Flash’s plug-in container that does this. I don’t know if this is primarily a problem with Firefox or with Flash, but ultimately it is of course Firefox’s problem. And this will happen, I think, to anyone who has e.g. a GMail tab open all the time (pinned to the tab bar, for example), because there is always at least that one tab using Flash, so the plug-in container won’t die the way it does in the case of Java’s container when you leave a page where a Java applet was run.

The growing memory is, I believe, largely related to fragmentation in the JavaScript heap — hopefully Nicholas will correct me if I’m wrong here. If it is largely JS fragmentation, then things should be much better in FF10. We have a team actively working on this problem now, which should have even bigger results within a few months.

As to the size of plugin-container.exe: that is all Flash’s fault. There is literally nothing we can do to fix this other than to kill off Flash usage on the web. Thankfully, that is actually starting to happen now that HTML5 is taking off. Note: this is largely due to Mozilla’s constant push towards HTML5 over the last 5 years. So, in a sense, we are working on this memory problem too.

Flash will die off eventually, but I suspect that’s still quite a few release cycles away.

In the meantime, what about more brute-force stopgap approaches? For instance, could the plugin-container be monitored for memory usage, and if it seems to be leaking too badly, throw up a prompt asking the user if they want to restart Flash? (Trickier, but more slick, would be to identify when Flash was idle and restart it automatically.)

Even though it currently manifests as Flash leaking, the problem the browser has is that any plugin can start leaking memory, and there’s nothing anyone except the plugin author can do about it. I wouldn’t mind a tool or two to deal with such plugins, even if they were clubs instead of scalpels.

Would it be possible to use multiple instances of the plugin container process so as to isolate and limit the lifetime of the leaked memory?

One could create one process per tab (so the memory could be collected when the tab closes). Another approach would be to create a new process each time a time threshold (e.g. per hour) or memory threshold is crossed and then terminate the old processes once they no longer have any users. Either of these approaches would transparently limit the accumulation of leaked memory for users with many tabs as long as they weren’t all open for long periods of time.
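The rotation idea above can be reduced to a small decision rule. This is a hypothetical sketch, not anything from the Firefox codebase; the names (`RestartPolicy`, `should_restart`) and the thresholds are invented for illustration, and a real implementation would need a way to measure a process’s resident memory.

```python
class RestartPolicy:
    """Decide when a plugin process should be rotated out, based on the
    memory and age thresholds suggested above (values are illustrative)."""

    def __init__(self, max_rss_bytes, max_age_seconds):
        self.max_rss_bytes = max_rss_bytes
        self.max_age_seconds = max_age_seconds

    def should_restart(self, rss_bytes, age_seconds):
        # Rotate once either threshold is crossed; leaked memory is then
        # reclaimed when the old process exits after its last user goes away.
        return rss_bytes > self.max_rss_bytes or age_seconds > self.max_age_seconds


# Example: restart a container that exceeds 500 MB or runs longer than an hour.
policy = RestartPolicy(max_rss_bytes=500 * 1024 * 1024, max_age_seconds=3600)
```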

Would it be a good idea for the MemShrink team to meet with the Snappy team every two to three weeks?

The two efforts are, after all, partly related.

Some people go to both meetings. Beyond that, normal interaction (Bugzilla, IRC, etc) is enough, I think.

I would like to point out that just because TraceMonkey did not work out is not proof that tracing doesn’t work. It works just fine for PyPy and LuaJIT on *all* kinds of code, not just certain specific code. LuaJIT is probably the most advanced dynamic VM implementation out there (albeit for a simple language), so one can fairly say tracing works just fine.

Cheers,
fijal

LuaJIT has an extremely fast interpreter, and then tracing is done on top of that, and it works well. SpiderMonkey has a slow interpreter, and tracing used to be done on top of that, and it didn’t work well, because you get the bad bimodal performance (either really fast or really slow). This is what I meant when I said tracing can be the icing on the cake, not the whole cake.

I’ve also heard that Lua is a significantly simpler language than JS which makes it easier to JIT, but I don’t know that first hand.

@fijal: Lua is a simple language but it also completely outclasses JavaScript in terms of features. If language choice for the browser were based simply on merit, JS would be thrown out in place of Lua without a moment’s thought.

Although LuaJIT is cool, if you throw a bunch of nested closures with loops at it, it also starts degrading quite fast. That said, we can praise it as one man’s very good job of making something cool with the means available to him.

JavaScript, in contrast, is very flexible, and users quickly mangle code with it. If someday JavaScript is made simpler by means of other languages compiling into a subset of it, it could also enjoy major speedups.
