Dehydra as a GCC plugin

January 8th, 2008

Thanks to the 2-fold increase in manpower working on pork, we finally have an opportunity to work on the nice-to-have things.

Progress

Recently I have been working on a GCC plugin to do Mozilla-specific analyses with GCC.

Unfortunately, I didn’t notice that GCC had a plugin branch, so I reinvented the wheel there. Fortunately, that part was rather easy, and it turned out that the plugin branch isn’t very useful to work with, since it is in SVN, doesn’t link GCC with -rdynamic, and doesn’t install the hooks I need in the C++ frontend. Overall, the plugin shim is relatively trivial and it should be pretty easy to merge with other similar efforts.

My first and only plugin is a C reimplementation of Dehydra. GCC sources are currently fairly hostile to C++, so I elected not to make my head spin by mixing in C++ in addition to C and JavaScript. I think the C Dehydra has reached the hello-world stage; to take it for a spin, see the wiki page.

GCC Thoughts

Integrating with GCC is pretty awesome. So far I regret not jumping in earlier. I was reluctant to do so because everyone I’ve talked to (other than Tom Tromey) claimed that GCC is ridiculously complicated and impossible to do stuff with. In fact, people in academia are so scared of GCC that they tend to go with commercial frontends that have ridiculous licensing terms and make it impossible to release their work to the general public.

GCC internals are pretty crazy: everything is done with macros and the AST is dynamically typed, so it’s fairly painful to figure out seemingly simple things like “what AST nodes does this AST node contain”. Additionally, GCC loves rewriting AST nodes in place as the compilation progresses, which sucks when one wants to analyze the AST while it looks as close as possible to the source. The GCC parser also sucks to work with, as it is implemented as a C code hodge-podge (a technical term which applies to much of the code in GCC). Luckily, I am mainly concerned with poking at data that’s already in GCC.
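To make the “dynamically typed AST behind macros” complaint concrete, here is a toy sketch in C. It is not GCC’s actual tree definition: the node codes, the struct layout and the macro names (NODE_CODE, NODE_ARITY, INT_VALUE, NODE_OPERAND) are all invented for illustration. The only point it tries to show is that every node has the same C type, so the compiler cannot tell you which children a node has. You find out at run time from a tag, and a wrong access trips an assertion instead of a compile error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a dynamically typed AST.
   These are NOT GCC's real definitions. */
enum node_code { INTEGER_NODE, PLUS_NODE, NEGATE_NODE };

struct node {
    enum node_code code;       /* run-time tag: the only type information */
    long value;                /* meaningful only for INTEGER_NODE */
    struct node *operands[2];  /* meaningful only for operator nodes */
};

/* Number of children for each code, indexed by enum node_code. */
static const int node_arity[] = { 0, 2, 1 };

/* Checked accessors: every node is a struct node *, so misuse is
   caught by an assertion at run time rather than by the type checker. */
#define NODE_CODE(n)       ((n)->code)
#define NODE_ARITY(n)      (node_arity[NODE_CODE(n)])
#define INT_VALUE(n)       (assert(NODE_CODE(n) == INTEGER_NODE), (n)->value)
#define NODE_OPERAND(n, i) \
    (assert((i) >= 0 && (i) < NODE_ARITY(n)), (n)->operands[i])

/* "What nodes does this node contain?" has to be answered by
   consulting the tag; here we count n and everything below it. */
int count_nodes(const struct node *n)
{
    int total = 1;
    for (int i = 0; i < NODE_ARITY(n); i++)
        total += count_nodes(NODE_OPERAND(n, i));
    return total;
}

/* Evaluate the expression tree by switching on the run-time tag. */
long evaluate(const struct node *n)
{
    switch (NODE_CODE(n)) {
    case INTEGER_NODE:
        return INT_VALUE(n);
    case PLUS_NODE:
        return evaluate(NODE_OPERAND(n, 0)) + evaluate(NODE_OPERAND(n, 1));
    case NEGATE_NODE:
        return -evaluate(NODE_OPERAND(n, 0));
    }
    return 0; /* not reached for valid codes */
}
```

Building the tree for 2 + (-3) out of four such nodes and passing the root to count_nodes and evaluate walks it purely by tag. Calling INT_VALUE on the PLUS_NODE would compile without complaint and only fail when the assertion runs, which is the kind of surprise the paragraph above is grumbling about.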

The upside is that GCC is a well-tested production compiler that most source compiles with. Integrating with GCC means that the AST is correct (Elsa is a frontend, so there is no way of knowing if its AST has mistakes in it). Integration also means that the user doesn’t have to worry about making preprocessed files or maintaining obsolete versions of GCC or old GCC headers. Unlike Elsa, GCC already has useful features like typedef tracking and doesn’t implement location tracking with a stupid programming trick. Additionally, I hope to reuse computations from middle-end GCC passes to build my control flow graph, do value numbering and other useful but tricky-to-implement stuff.

GCC isn’t scary at all; it’s just another way of implementing a compiler. Some people choose to have more pain in life by reinventing ML in C++ instead of using ML for compiler writing; others get their pain dosage from working on a C compiler originally generated from LISP sources.

Lastly, I’d like to thank the patient GCC hackers in #gcc, without whom I wouldn’t have stood a chance of figuring out how to get this far.

2 Responses to “Dehydra as a GCC plugin”

  1. Myk Melez Says:

    At foss.in last month I attended a fascinating presentation on gcc internals. It might be a good way to get a general sense of the architecture:

    http://foss.in/2007/register/speakers/talkdetailspub.php?talkid=409

    The slides are available here:

    http://foss.in/2007/register/slides/GCC_Internals__A_Conceptual_View_into_GCC_409.pdf

  2. Daniel Says:

    Hi Tara,

    I’m going to respectfully disagree about the utility of the gcc code base. I think it is very hard to add to or even build. It is poorly maintained and has poor documentation. Building a cross compiler is a nightmare, so improvements are very slow. My experience has been that obscene compile times lead to unhelpful failure messages, and even hardcore programmers are left scratching their heads because of the messy undocumented gcc internals.

    Anyhow, I’m just saying. gcc is a good binary and a bad dream for compiler researchers. Stay away from it if you can, suffer endlessly if you can’t.
