Meanwhile in a parallel universe
August 14th, 2008
Someone else is developing their own app-specific rewrite tools. In this case, app-specific refers to automating the porting of code from gtk2 to gtk3. The approach is similar in that patches are produced, but it doesn’t look like a patch-aggregating tool has been written yet. Instead of the elsa/mcpp magic sauce, clang is being used, so this is limited to C at the moment.
KDE folks are behind in the automated code rewrite arms race; perhaps the trolls should try some pork to accelerate the KDE3->4 transition.
All kidding aside, it is awesome to see that the less-manual-labour-through-compiler-assisted-refactoring approach is gaining mindshare.
This week in the Static Analysis Corner
August 13th, 2008
New Static Analysis Toys
I have been catching up on my backlog of little bugs. Here are some of the most notable ones.
Benjamin has been pushing the limits of what Dehydra can do for his DXR prototype, which resulted in a couple of cool new features, one of which breaks backwards compatibility. Sorry about that, it is for the greater good.
Dehydra now processes more declarations.
Dehydra uses JavaScript prototypes to distinguish between types and declarations.
Treehydra is now built by default when building with a plugin-enabled compiler.
Treehydra now exposes the C++ frontend’s verbose and as-close-as-gcc-gets-to-written-code syntax tree via process_cp_pre_genericize. Access to the early C++ AST should make it easier to automatically translate a certain class of C++ functions into JavaScript.
Coming soon: buildbot setup for Dehydra along with autobuilt debian packages.
Also, Benjamin’s GSoC student, Bo Yang, has been doing some awesome work making our static analysis toolchain work on mingw. In my mind, Bo sealed his awesomeness not only by getting Mozilla to build under mingw yet again, but also by fixing a couple of exciting compiler bugs on Win32.
Path to 1.0
For more information on these and other developments see the Dehydra 1.0 tracking bug.
I am not yet sure what the next release of Dehydra will be. My giant GTY patch to GCC is still awaiting review in a GCC developer’s inbox. Depending on whether that gets accepted, I’ll either continue releasing Dehydra 0.9.x with the current GCC patchset or delay a 1.0 release to work on getting the GCC plugin API reviewed and more or less finalized.
Plans for Near Future
I think I figured out the missing pieces needed to make outparamdel’s deCOMtamination patches acceptable, so I will work on that next. I’ll be continuing to clean up pork to be more developer-friendly. After the recent unhappiness involving bisecting 10 separate repositories at once, I’ve decided to merge pork into one giant repository, and if someone just wants a couple of smaller pieces, those should be broken up at the package management level.
Additionally, I would like to start landing the SpiderMonkey analyses soon.
Oink testsuite within pork passes
August 11th, 2008
I have never enjoyed the theory behind software engineering. It seems particularly depressing as it can be summarized as: “What can we learn from past software development experience in order to not repeat old mistakes such that we can come up with newer and shinier mistakes?”.
For that reason I haven’t been able to stick to any particular software development doctrine (paired, test-driven, OO, SOA, etc) and have instead taken shortcuts to whatever is practical at the time.
One unfortunate result of such neglect is that the oink test suite ended up not being utilized. I tried it a couple of times while starting out with oink and it failed in many cumbersome ways. However, as pork evolved out of oink and I learned more about the “architecture” behind it, I fixed a couple of the issues that were causing funny make errors.
However, one giant bug remained. It turned out other people were able to run the original oink testsuite, but not the equivalent one in pork. Fearing that I somehow screwed up Elsa, I spent way too long investigating the failure only to learn it wasn’t my fault. Pork users: rejoice, the testsuite should run as expected now.
PS. I may not be a SENG believer, but I do think that open source + good version control + testsuites result in better software.
Summit
August 4th, 2008
The past week rocked. I was especially impressed with the localizers. The guys who bear through translating an entire browser with associated websites to expose their country to an awesome browsing experience are simply electrifying.
It was great to see the South American guys again and to meet hordes of Europeans.
The most exciting outcome of the summit in my neck of the static analysis woods is that DXR (a semantically aware successor to MXR) will be rewritten from scratch in Python. Another reason to rejoice is that a tracing spidermonkey should make Treehydra ridiculously fast without much (or any) effort on my part.
Pull pork with care
July 25th, 2008
I just committed the giant change to bring down elsa’s namespace pollution to reasonable levels. Elsa code is now using std::foo style, or using namespace std. As I mentioned before, Elsa’s string is now sm::string; a summary of how to perform similar renames is here. The good news is that Pork will soon work out of the box with a modern toolchain.
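To make the namespace point concrete, here is a minimal sketch of why moving the class into its own namespace matters. The sm::string below is a hypothetical stand-in with a made-up interface (the real class is much larger), and total_length and check_total_length are names invented for this illustration. With a global class named string, the unqualified name string would become ambiguous as soon as a file said using namespace std; with sm::string the two types coexist.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for Elsa's own string class. Only the namespace
// placement is the point here; the members are invented for this sketch.
namespace sm {
class string {
public:
    explicit string(const std::string& s) : text_(s) {}
    std::size_t length() const { return text_.size(); }
private:
    std::string text_;  // storage detail chosen for the sketch only
};
}  // namespace sm

// Harmless now: an unqualified `string` at global scope means std::string,
// because the Elsa-style class is only reachable as sm::string.
using namespace std;

// Both string types can appear side by side without ambiguity.
std::size_t total_length(const sm::string& a, const string& b) {
    return a.length() + b.length();
}

// Small self-check; call it from main() to exercise the sketch.
void check_total_length() {
    assert(total_length(sm::string("pork"), string("elsa")) == 8);
}
```

If the class were declared as a global class string instead, the declaration of total_length would fail to compile, since lookup of string would find both the global class and std::string.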
For the handful of porkers out there, you need to hg pull & hg up all of the pork repositories. This has been a case study in why splitting up a codebase into a billion repositories is a bad idea:
a) Lovely, I have to do many commits instead of one
b) To top it off, now my users will curse my name while updating whichever pork repository interests them most.
I feel like I’m going to throw up if I see any more C++ code diffs in the next 10 minutes.
In contrast, while rewriting things on the Mozilla scale is a lot less feasible manually, it is very rewarding to automate. Gotta love big C++ codebases.
Dogfooding pork & OSCON
July 22nd, 2008
I wrote a class renamer and used it to fix my pork pet peeve #1: a class named string that isn’t std::string. This has been a low priority goal for as long as I’ve been using Elsa. It’s pretty cool to apply a tool to fix itself.
The renamer is 3x simpler than the next simplest tool. I plan to extend it to also rename class members. Renaming is the most trivial use-case for rewriting code, so I plan to post a tutorial on using the renamer in the near future.
OSCON
If you are at OSCON, you do not want to miss our static analysis session on Wednesday.
Pork, MCPP, Oink and Elsa…What’s going on?
July 18th, 2008
It seems that there is some confusion as to what pork is and how it’s related to oink and elsa. So here is my view of it.
Pork is my set of tools that use Elsa to rewrite source code (mainly Mozilla code). We use Pork solely for rewriting, as it is not as well suited to convenient, hardcore analysis as the GCC-based tools are.
MCPP is the secret sauce C preprocessor that makes C++ rewriting with Elsa possible by annotating preprocessed files with information to undo the lexical braindamage resulting from macro expansion.
Elsa is an awesome C++ parser. Awesome in that it can preserve more information regarding parsed code than any other C/C++ parser and it is easy to extend.
We maintain our own version of Elsa within pork.
I think our version of Elsa is the most up to date and most compatible with newer C++ features and headers used by newer GCC releases. We encourage other projects with C++ parsing/rewriting needs to collaborate with us. We will be parsing code with Elsa for a few years to come and it’s a lot of work for a single entity to maintain a C++ parser. I think elsa is a much better backend to build refactoring support onto than any other C++ parsing project out there right now.
The Messy Details
Now let’s move on to the more confusing parts: oink, oink-stack, and the oink mailing list.
oink consists of some static analysis tools and was meant to be a central place where all of the Elsa and Elsa-related development was supposed to happen. When people refer to oink, they usually mean the oink-stack, which is a subversion meta repository that pulls in a dozen subrepositories (smbase, elkhound, elsa, oink (where static analysis tools live), etc).
So when I started working on refactoring tools, I was told that I should aim to have my tools added to oink. There were some legal hassles to work out in the meantime, so I cloned the oink-stack and developed my tools with minimal changes to oink-stack. This included various elsa extensions, bugfixes, etc.
However, the little momentum that oink had has fizzled out due to various personality conflicts and various academics losing interest. The code has been bitrotting for as long as I’ve been working at Mozilla.
So the end result of oink is that we have pork which is a superset of oink. I’m not even sure if I mention the name pork anywhere in the sources. So pork at the moment means “Taras’ continuation and extension of oink”. I am using the oink mailing list for any discussion on changes to Elsa/etc in hopes that at least some of the genius lurkers there will regain their interest in elsa.
Where do We Go From Here?
Onward! Due to the original author’s vision of what C++ is and the state of C++ at the time Elsa was conceived, current pork code causes people to have many WTF moments (followed by banging head against keyboard) when they first start using it.
The short version of my plan is:
- Allow one to do “using namespace std” when using elsa
- Restructure pork repositories such that there are only 3 of them rather than 11 (elsa, elkhound, pork)
- Get rid of the oink repository (those tools do not work for us)
- Make pork consist of just my tools (with a sane build system) rather than be mixed into unmaintained oink stuff
- Make pork compile with new compilers (GCC 4.3 and recent MSVC++)
- Keep track of this in a bug
- Clean up various misc things
Some of you might ask “But Taras, why now, why not just keep doing what you’ve been doing?”. I was doing what I was doing because I had an overwhelming goal of devising a way to automate static analysis and refactoring of Mozilla on my shoulders and I wasn’t convinced that it was feasible. I had to learn to split my time between tool development and actually using the tools. Naturally, I cut corners on tool development.
Since then, slowly but surely, various awesome hackers have started doing rewrites and analyses themselves, freeing me up to focus more on development. To make matters sweeter, various hackers have started submitting bug reports, fixes and ports to my tools. This gives me more time to focus on the big picture.
Finally, I believe that automation of the sort we are doing at Mozilla is something that has been missing from open source development practices and it will catch on once people realize what they’ve been missing. Reducing those WTF moments will help people think positively.
Static Analysis and Refactoring Tooling Updates
July 9th, 2008
Hydras
I am close to landing a flow check. Turns out, it is super-easy to introduce new analyses into Mozilla thanks to the very nice build system hooks set up by bsmedberg.
Since coming back from the GCC summit I have forward-ported our GCC patches to GCC trunk. The FSF legal paperwork came through today, so I posted the first and biggest patch to GCC for review.
I am not sure if I mentioned this before, but the C port of Dehydra is somewhat operational. It doesn’t yet have access to function bodies, but type traversal should work. Unfortunately, the C frontend has fewer features (pretty printing sucks, locations are even less reliable, etc) and thus is less awesome to work with than the C++ frontend.
jst was awesome enough to list some interfaces that need some outparamdelling. The list is here (in the content/ section). This led me to spend some time making outparamdel’s output prettier. There are still some improvements to be made, and I will be making them in the near future. However, if someone is interested in seeing refactoring of this kind land in the near future, they could easily complete outparamdel’s work with some clever scripting and a bit of manual labour. Sure beats doing the entire thing manually. From outparamdel’s perspective, the last 10% appears to be slightly painful and might take some time.
Here is a patch that takes about 30 seconds to produce.
Another exciting aspect of this is that a certain emacs wizard has confirmed that it would be possible to feed emacs such a patch file and have it correct indentation for the affected areas only.
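For anyone who has not seen an outparam rewrite, here is a hand-written sketch of the general shape of the change. It assumes the usual pattern of a getter that hands its result back through a pointer argument and returns a status code that is always success. Node, ContainerBefore, ContainerAfter and the helper functions are made-up names, not Mozilla code, and this is not actual outparamdel output.

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration only.
struct Node { std::string name; };

class ContainerBefore {
public:
    explicit ContainerBefore(Node* first) : first_(first) {}
    // Before: the result travels through an out-parameter and the return
    // value is a status code that never reports anything but success.
    int GetFirst(Node** result) const {
        assert(result != nullptr);
        *result = first_;
        return 0;  // always "OK" in this sketch
    }
private:
    Node* first_;
};

class ContainerAfter {
public:
    explicit ContainerAfter(Node* first) : first_(first) {}
    // After: the out-parameter is gone and the value is returned directly.
    Node* GetFirst() const { return first_; }
private:
    Node* first_;
};

// Call sites shrink accordingly.
std::string first_name_before(const ContainerBefore& c) {
    Node* n = nullptr;
    if (c.GetFirst(&n) != 0 || n == nullptr) return "";
    return n->name;
}

std::string first_name_after(const ContainerAfter& c) {
    Node* n = c.GetFirst();
    return n ? n->name : "";
}

// Small self-check; call it from main() to exercise the sketch.
void check_outparam_sketch() {
    Node root{"root"};
    assert(first_name_before(ContainerBefore(&root)) == "root");
    assert(first_name_after(ContainerAfter(&root)) == "root");
}
```

The interesting part is that every caller has to change along with the declaration, which is why doing this by hand across a large codebase is so tedious.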
I am also very excited that a certain volunteer came forward and decided to start improving some of the stomach-turning areas of Pork. Hopefully in the near future we’ll modernize the C++ a little bit and a user’s first reaction won’t be: “What the hell, why can’t I do ‘using namespace std;’”.
To this end I have filed a bug to write a renamer tool so we can dogfood renaming of unfortunately named pieces of code.
OSCON
The plan is to have some sort of a minisession on our static analysis efforts at Mozilla. So if you are attending OSCON and are interested in doing exciting things to depressingly large amounts of code, drop me a line.
Where is the sanity in the C++ std library?
July 8th, 2008
Dear lazyweb,
Please explain to me why the following code works the way it does. Having read the docs for stringstream::str() and stringstream::str(string), its behavior still does not make sense to me.
#include <sstream>
#include <iostream>
using namespace std;
int main() {
  stringstream ss("foo");
  cout << ss.str() << endl;  // prints "foo"
  ss << "bar";
  cout << ss.str() << endl;  // prints "bar", not "foobar"
  ss << "more";
  cout << ss.str() << endl;  // prints "barmore"
}
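One plausible reading of this behaviour, offered as a sketch rather than a definitive answer: the constructor copies its argument into the buffer but, in the default open mode, leaves the write position at the start, so the first << overwrites the initial contents instead of appending to them. Passing std::ios::ate asks for the write position to start at the end. The helper names below are made up for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Default open mode: the buffer starts out holding `initial`, but writing
// begins at offset 0, so `extra` overwrites the front of the buffer.
std::string write_from_start(const std::string& initial, const std::string& extra) {
    std::stringstream ss(initial);
    ss << extra;
    return ss.str();
}

// With std::ios::ate the initial write position is the end of the buffer,
// so `extra` is appended after `initial`.
std::string write_at_end(const std::string& initial, const std::string& extra) {
    std::stringstream ss(initial, std::ios::in | std::ios::out | std::ios::ate);
    ss << extra;
    return ss.str();
}

// Small self-check; call it from main() to exercise the sketch.
void check_stringstream_modes() {
    assert(write_from_start("foo", "bar") == "bar");        // overwritten
    assert(write_from_start("foobar", "xy") == "xyobar");   // only the front changes
    assert(write_at_end("foo", "bar") == "foobar");         // appended
}
```

The second assertion is the telling one: a shorter write leaves the tail of the original contents in place, which is what you would expect from overwriting rather than replacing.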
Pork 0.9 in the wild
June 30th, 2008
Those who would like to play with Pork, but are allergic to pulling sources from version control can now download an actual pork release. Now someone needs to hook this into a GUI to provide easy Eclipse-style refactoring for C++.