Main menu:

Site search

Categories

Archive

nothrow/NS_OK-always

Yay, my first Mozilla blog post! Brief background: I’m working on static analysis-related Mozilla projects with tglek. In fact, I seem to have already become a maintainer of Pork-Oink and Dehydra Classic.

Anyway, right now I’m trying to finish up a static analysis that will list all the nsresult-returning methods that always return NS_OK (i.e., “never” “fail”) and will also say for each nsresult-yielding call site in Firefox whether that call site always returns NS_OK. The initial purpose was to help determine the feasibility of switching Firefox to use C++ exceptions instead of nsresults. If a method returns NS_OK, the switch is easy, because no exception handling is needed at all. This work also overlaps with the outparamdel project, which will convert methods that return a value in an out parameter and always return NS_OK  into methods that return a value directly. As a final application, I hope to be able to get a reasonably accurate list for each method of which error codes it can return.

Here’s an outline of how it works. I started by solving a simplified version of the problem, which is to get all the NS_OK-always methods that don’t call other methods. This is a lot easier than the general problem because each procedure can be analyzed in isolation. To solve this simple version, I used a flow-insensitive analysis, which is just a fancy way of saying the analysis doesn’t care about control flow paths or the order in which statements are executed, which is in turn just a fancy way of saying I used what most people think of as a type checker. The type system applies to nsresults only, so a type is just a set of nsresult values. Based on that type system, I wrote a simple type inference, and the type inferred for the return value of the method tells me what nsresults can be returned.

The next step is extending the analysis to cover the case where method foo() returns the result of calling method bar(). The way to do this, of course, is to first analyze bar(), then assign the return type of bar() to the call inside foo(), then analyze foo() normally. But this won’t actually work until 2 problems are solved.

First, many methods in Firefox are virtual, so when you see a call to Base::foo, the method being called could be Derived::foo, Derived2::foo, etc. So there might be more than one target method, and it’s not obvious what the targets are. The solution is to start by building a call graph for Firefox, which just means a mapping from each call site to the set of possible target methods. There are many ways of building call graphs, but I started with the simplest, which is just to look at the inheritance hiearchy. (More powerful analyses analyze the code around the call site to try to figure out exactly which derived classes can actually show up there. But that’s too hard for now.) So, as a byproduct of the NS_OK analysis, I also have data that can be used to build a call graph for Firefox, which might be useful for cross-reference documentation. (I didn’t bother building the call graph yet because for this analysis all I need is a database that I can query to get call graph information for nsresult methods only.) It might be nice to have a web app that gives call-graph cross-references and results of other program analyses we devise.

The second problem is that methods can be mutually recursive: foo() might return the result of bar(), which can return the result of foo().  Clearly, my suggestion above of analyzing one of the methods first won’t actually work in this case, and this case is common in Firefox, so I have to do something else. The something else I have in mind is to express the problem as a subset-based constraint system, which I can explain best by an example. Let’s say we have foo() and bar() as above, and foo() can directly return NS_OK and NS_ERROR1, and bar() can directly return NS_OK and NS_ERROR2. The key thing about foo() returning the value of bar is that whatever the set of results foo() can return actually is, it has to contain the set of results that bar() can return. We can express everything we know as a constraint system:

foo_results contains { NS_OK, NS_ERROR1 }

foo_results contains bar_results

bar_results contains { NS_OK, NS_ERROR2 }

bar_results contains foo_results

The answer to our original question, what values can these methods return, is the least solution of this constraint system. (Least means with the smallest sets in the answer.) We can read off the answer here by hand, but for Firefox, the system will be much bigger and so we need an automatic solver.  The basic solver algorithm is very elegant:

  initialize solution to empty set for every return-value set
  repeat
    for each constraint C:
      if C is not satisfied in the current solution:
        add elements from contained set to the other to satisfy C
  until all constraints are satisfied.

The proof that this algorithm is correct is also elegant: (a) Clearly, if it terminates, all constraints are satisfied. (b) If we ever reach the point where every return-value set contains every possible error value, all constraints are satisfied, because they’re all “contains” constraints, so the algorithm terminates. (c) Each time around the loop, we add at least one element to some answer set, so can go at most a finite number of steps before reaching the all-value solution, where the algorithm stops.

It took me all of 15 minutes to code this algorithm in Python, and it works (after I fixed a bug with having the termination condition be the opposite of what it should be). But will it work on all of Firefox? If there are N nsresult methods (N~=30000, by the way), M constraints, and K possible error values, then we can go around the loop up to NK times and the inner loop NMK times. Each time through there we have to do a binary set operation on sets of size K, which we’ll pretend is O(K). Thus, the algorithm is O(NMK^2). That might be a big number for Firefox.

I know a few tricks for speeding up the basic solver, but my goal here is to analyze and refactor Firefox, not write awesome constraint solvers, so I’ll try the simple stuff first and use the easiest thing I can get away with.  In particular, there’s an existing solver called Banshee. Or maybe the Python program will be fast enough after all.

The long-term goals here are to remove nsresults from Firefox and replace them with C++ exceptions. (See also outparamdel.)  (And of course we’ll have to test exceptions in various ways to demonstrate that they actually are a good idea. But I’ve been told several times exceptions would make life easier for Firefox hackers.) There’s a lot more to be done to reach that goal, which will hopefully be the subject of future blog posts.

Comments

Comment from Mr WordPress
Time: January 15, 2008, 9:00 am

Hi, this is a comment.
To delete a comment, just log in, and view the posts’ comments, there you will have the option to edit or delete them.

Pingback from Taras’ Blog » Blog Archives » GCC + SpiderMonkey = GCC Dehydra
Time: January 17, 2008, 4:16 pm

[...] tuned for more exciting developments regarding regaining control over source code here and on Dave Mandelin’s blog. Posted in dehydra | Trackback | del.icio.us | Top Of [...]

Comment from Joshua Cranmer
Time: January 17, 2008, 5:33 pm

Considering that I’m looking at writing a zealous static analyzer for Java involving keeping tight bounds on the values of inputs, I’ve thought about how best to deal with such recursive loops. This is the scheme I’ve come up with:

1. Initialize the parameters to the empty set for input values and the default value for output values for each function.
2. Create a processing queue for all functions.
3. While the queue is not empty, for the next function:
3a. Update the possible values for the function based off of the values we see.
3b. If the output values have changed, then we add to the queue (if not already there) all functions which rely on the current function.

(For analysis, N = function #, M = function calls/function, K = output values/function)
Steps 3a and 3b should be O(M) for each function (assuming that we have an O(1) function f is in the queue); steps 1 and 2 can be combined into one O(N) operation. I count that the maximum number of times a function can be pushed into the queue is O(N*K), so that the total runtime comes to be O(N*K*M), an improvement by one power of K, which I expect is less than 10 anyways.

Pingback from GCC + SpiderMonkey = GCC Dehydra · Get Latest Mozilla Firefox Browsers
Time: January 17, 2008, 6:58 pm

[...] tuned for more exciting developments regarding regaining control over source code here and on Dave Mandelin’s blog. addthis_url = ‘http%3A%2F%2Fgetfirefoxbrowsers.com%2F2008%2F01%2Fgcc-spidermonkey-gcc-dehydra’; [...]

Pingback from GCC + SpiderMonkey = GCC Dehydra · Get Latest Mozilla Firefox Browsers
Time: January 23, 2008, 5:06 am

[...] tuned for more exciting developments regarding regaining control over source code here and on Dave Mandelin’s blog. addthis_url = ‘http%3A%2F%2Fgetfirefoxbrowsers.com%2F2008%2F01%2Fgcc-spidermonkey-gcc-dehydra-2′; [...]

Comment from travesti
Time: November 13, 2009, 11:52 am

THANKS

Comment from escort bayan
Time: December 17, 2009, 7:12 pm

thanks admin

escort bayan

escort bayan

Comment from Escort Bayan
Time: January 19, 2010, 5:16 pm

Thanks admin

[url=http://www.escortbayanlariz.biz/]Escort Bayan[/url]

Comment from Escort Bayan
Time: January 19, 2010, 5:17 pm

THANKS

Write a comment