Conservative stack scanning 101
August 9th, 2007
Maybe you’re curious about how the MMgc garbage collector works. Here’s my rambling attempt to explain it on IRC this morning (edited for clarity and layout).
(Note: The MMgc documentation is another good place to start, if you already know all about the C++ stack and heap.)
peter: this "scanning the C stack" is a mystery to me peter: I had no idea C was reflective like this jorendorff: Oh, it's notpeter: ok good because I never heard of such things! jorendorff: Happy to explain.
jorendorff: Or try.
jorendorff: But you need to understand a little about how C works in practice. jorendorff: What do you know about "the stack"? peter: not much I guess peter: I know about regular C dynamic memory use jorendorff: Well, when your program runs, and main() gets called, jorendorff: say you've got some local variables in there. jorendorff: C has to store them somewhere. What happens is jorendorff: ...C has some memory, maybe 64K bytes, set aside jorendorff: ...for local variables. peter: yep peter: set aside by the OS, correct? jorendorff: eh, not exactly jorendorff: C asks the OS for 64K bytes, the OS says sure, here you go, peter: right peter: ok jorendorff: ...and C uses those for the stack. jorendorff: But the OS doesn't know what C is doing with them. peter: ok
Here when I talk about “C” doing things, I really mean “the C language runtime”. This is code that the C linker automatically links into every C/C++ program. It’s called libc on Unix. On Windows, they call it the CRT.
jorendorff: So. C remembers how much of the stack is in use. jorendorff: (It has a machine register pointing to "the top of the stack") jorendorff: (...on intel x86, this is ESP) peter: C's malloc asks for memory from this 64kb and manages it peter: making sure things that need to be in even alignment, are peter: and so on jorendorff: Well, the stack is separate from all malloc'd memory. peter: ahh ok jorendorff: Local variables don't go in malloc'd memory. peter: gotcha jorendorff: C actually sets up both the stack and the malloc() heap jorendorff: before main even runs. peter: ok peter: and the stack is the automanaged part for locals jorendorff: exactly peter: thats the part everyone likesjorendorff: right. jorendorff: OK. So when main() calls another function, jorendorff: ...say it calls printf() jorendorff: C does all this automatically jorendorff: ...*before* any code in printf() actually executes: jorendorff: (1) stores a "return address" jorendorff: so it knows where to pick up again when printf() eventually returns jorendorff: (2) sets aside some stack space for printf's local variables. jorendorff: I guess (2) actually happens in some code within printf jorendorff: ...that is generated automatically by the compiler. jorendorff: Then printf() runs. peter: ok jorendorff: When printf() returns, all those local variables go away. jorendorff: My point is-- jorendorff: can you see what the data structure for this is like? jorendorff: "The stack" is just a contiguous chunk of memory. jorendorff: main()'s locals are at the bottom. peter: sure jorendorff: If main() calls another function, those locals are adjacent jorendorff: and so on peter: makes sense jorendorff: it's actually a "stack" in the data structures sense. jorendorff: ok, what this means is jorendorff: all local variables, both for the current function jorendorff: and all its callers, jorendorff: are in this contiguous chunk of memory. jorendorff: For each platform (meaning, OS + CPU + compiler) jorendorff: ...there's a way to find out where that chunk of memory is. jorendorff: Tamarin does that, jorendorff: then it scans all that memory during GC jorendorff: to see where the locals point to. peter: there is some C header can be included that allows access to all this? jorendorff: peter: not standard C, no peter: no but some header jorendorff: yeah, look in the tamarin source
![]()
This magic lives in a macro, MMGC_GET_STACK_EXTENTS, defined in the header MMgc/GC.h. As you can see, there’s a separate implementation for each platform.
At any given moment, some locals might be in CPU registers and not on the stack. To cope with this, the macro uses a few lines of assembly code to dump the contents of all the registers onto the stack. That way MMgc can just scan the stack and it’ll see all local variables.
peter: I have been thinking there would be ways to fool this stack scan jorendorff: peter: Yes, there would be.jorendorff: Can you give me an example, though? peter: if a local pointer points to dynamically allocated bit of memory peter: that points to other dynamically allocated bits peter: that should not be garbage collected peter: and there have been a bunch of type casts along the way jorendorff: Oh, type casts don't affect MMgc at all. jorendorff: MMgc doesn't look at C/C++ code. jorendorff: It's looking at raw memory. peter: ahh right peter: just inference by bit patterns? jorendorff: yep peter: and it follows from local pointers to malloc()ed memory, peter: and follows the pointers in that memory to other memory, and so on? jorendorff: yes, in principle -- but jorendorff: MMgc has its own allocator jorendorff: that you must use instead of malloc(). peter: ahh peter: ok jorendorff: Since it knows all the internal data structures jorendorff: used by that allocator, jorendorff: when it follows pointers jorendorff: it knows the size of every region jorendorff: and can scan each reachable region for further pointers.
In addition, MMgc does enough bookkeeping that it knows exactly which parts of memory are GC-managed. So it can’t be fooled into following a pointer into bad memory (such as NULL or a pointer to a page of memory that has been returned to the operating system, either of which would immediately crash the process). The same bookkeeping information helps MMgc avoid mistakenly writing to non-GC-managed memory. (This matters because it does write to GC-managed memory, just like any other mark-and-sweep garbage collector.)
There are other ways to fool MMgc. Simply using malloc() is enough —MMgc never scans memory allocated using malloc(). It’s up to the programmer to avoid fooling MMgc. This is not such a heavy burden in practice.
peter: fascinating stuff peter: who's the genius that thought up all this stuff? jorendorff: Some Lisp hacker in prehistoryjorendorff:
jorendorff: Lisp was the first language with GC jorendorff: you can look up garbage collection online jorendorff: It's a whole field of computer science. jorendorff: It *is* fascinating. peter: do you know the proper comp sci terminology peter: for this stack-scan style of GC? peter: or is it just part of the collective knowledge? jorendorff: Um, jorendorff: What MMgc does is called "conservative GC" jorendorff: which means (a) it doesn't really know the types of variables, jorendorff: so it scans everything whether it's really a pointer variable or not peter: ok I've seen that written in a few places recently jorendorff: and usually (b) it treats all the likely places as "roots" jorendorff: including the stack peter: The root of the issue is I had no idea C could analyze it's own stack jorendorff: heh jorendorff: C can do anything
![]()
Indeed it can—which is why, despite their many deep, fundamental problems, C and C++ remain the weapons of choice for operating system and VM designers.
Stage 0 home stretch
August 8th, 2007
For the past 4 weeks or so, Ed Lee and I have been incrementally nudging the actionmonkey repository toward a state where we can switch over from SpiderMonkey’s existing memory management routines to Tamarin’s memory management library, MMgc.
Today some key patches for ActionMonkey Stage 0 were posted on bug 391211. These patches delete the old JSGCArena scheme and start allocating everything—JavaScript strings, functions, objects, and more—from MMgc.
This change will lead to several simplifications. For example, SpiderMonkey currently saves a reference to every new object as it’s created. This reference keeps the object alive in case GC happens while it’s new, before it’s reachable from other data. The reference lasts until the next object is created. So as long as you know SpiderMonkey internals reeeally well, and you’re dead certain no new objects are being created, you can sometimes avoid explicitly rooting an object. Hmmm, maybe this sounds a little crazy and brittle… anyway, these “newborn roots” will be unnecessary with MMgc, because MMgc scans the C stack. Even if your new object isn’t reachable from anywhere else yet, there will be a pointer to it from the stack or registers, so it won’t be unexpectedly collected. Similarly, throughout SpiderMonkey, JSTempValueRooters are used to pin objects referenced by local variables in C code. These temporary roots are generally short-lived, whereas GC happens rarely; so adding and removing the temporary root is just useless bookkeeping most of the time. Thanks to MMgc’s stack scanning, these TempValueRooters can be deleted altogether. And third, we’re planning to delete some bits called GCThingFlags that currently cost us 4 to 8 bytes per allocation. I’m curious to see how these simplifications will affect performance.
These cleanup tasks are an easy way to get involved in ActionMonkey, by the way. Leave a comment if you’d like to lend a hand.