Monthly Archives: March 2009

Valgrind + Mac OS X update (March 17, 2009)

Another month has passed since I last wrote about my work on the Mac OS X port of Valgrind.  In that time 126 commits have been made to the DARWIN branch (and a similar number to the trunk).  I’ve done a lot of them, but Julian Seward has found some time to work on the DARWIN branch and so has been doing some as well.

Here are the current (as of r9455) values of the metrics I have been using as a means of tracking progress.

  • The number of regression test failures on Linux was: 484 tests, 4 stderr failures, 1 stdout failures, 0 post failures (which I’ll abbreviate as 484/4/1/0). It’s now 484/0/1/0.  I.e. the number of failures went from 5 to 1, and that one failure occurs on my machine even on the trunk (it’s a bad test).  In other words, the branch works on Linux as well as the trunk.  Now that this metric is the same on the branch as the trunk, I won’t bother tracking it in the future.
  • The number of regression test failures on Mac was 402/213/52/0.  It’s now 422/172/41/0. I.e. the number of failures went from 265 to 213.  Also, 20 extra tests are being run — a broken CPU feature-detection program meant that a number of tests that should have been running were not, and this has been fixed.  Once again, this is the most important metric, and it’s improving steadily, but there’s still a long way to go.  One encouraging thing here is that 121 of these failures (more than half) involve the tools Helgrind, DRD and exp-Ptrcheck, which are three of the less-used tools in the Valgrind distribution, and which are all completely broken on the branch, and which I haven’t really looked at yet precisely because they are less-used.  The other 92 failures involve Memcheck and Nulgrind (the “no-instrumentation” tool, failures for which indicate problems with the testing of Valgrind’s core).  A lot of these are problems with non-portable tests, rather than the Darwin port’s functionality.  Furthermore, the tools Cachegrind, Callgrind, and Massif pass all of their tests.
  • The size of the diff between the trunk and the branch was 41,895 lines (1.5MB).  It’s now 38,248 (1.3MB). But note, once again, that this is not a very useful metric.  I just scanned through the diff and there’s not a great deal of differences in the diff than can be merged before we reach the point of the big branch-to-trunk merge.

Functionality improvements are as follows.

  • Basic signals are now supported, thanks to Julian.  This accounted for a lot of the new test passes.  This also means that debug builds of Firefox run successfully!
  • Some extra system calls are handled.
  • 64-bit builds are working.  To configure Valgrind for them, pass to ./configure the option –build=amd64-darwin.  64-bit Valgrind is quite slow, it does some very large mmaps at startup which take several seconds.  This will need to be fixed.  This also hasn’t been tested as much as the 32-bit version, and passes fewer tests.

I’m taking three weeks of vacation starting on Thursday, so progress on Valgrind+Darwin will be minimal over the next month.  But I will be visiting Mountain View early next week (Monday, March 23 and Tuesday, March 24) so I’ll be able to actually meet some of the people I work with!  I may also give a talk about Valgrind, depending on whether it can be scheduled.  Any suggestions for things to talk about are welcome.

atol() considered harmful

Here’s what sounds like a simple question: in C, what’s the best way to convert a string representing an integer to an integer?

The good way

One way is to use the standard library function called strtol(). There are a number of similar functions, such as strtoll() (convert to a long long) and strtod (convert to a double), but they all work much the same so I’ll focus on strtol().

Here’s the prototype for strtol() from the man page on my Linux box:

#include <stdlib.h>
long int strtol(const char *nptr, char **endptr, int base);

The first argument is the string to convert.  The third argument is the base;  if you give it zero the base used will be 10 unless the string begins with “0x” (hexadecimal) or ’0′ (octal).  Any whitespace at the front is skipped over, and a ‘+’ or ‘-’ just before the digits is allowed.  The return value is the long integer value that the string was converted into.

The second argument is a call-by-reference return value.  If it’s non-NULL, it gets filled in with a pointer to the first non-converted char in the string.  If the string was entirely numbers, such as “123″, endptr will point to the terminating NUL char.  If the string had no valid integral prefix, e.g. “one two three”, endptr will point to the start of the string.  If the string was “123xyz”, i.e. contains numbers followed by non-numbers, endptr will point to the ‘x’ char in the string.

There are two good things about endptr.  First, it lets you do error-detection.  For example, if you are expecting a number without any extra chars at the end, endptr must point to a NUL char:

char* endptr;
long int x = strtol(s, &endptr, 0);
if (*endptr) { /* error case */ }

Second, it allows more flexible parsing.  For example, if you need to parse a string consisting of three comma-separated integers (e.g. “1, 2, 3″) you can do this:

int i1 = strtol(s,        &endptr, 0);  if (*endptr != ',')  goto bad;
int i2 = strtol(endptr+1, &endptr, 0);  if (*endptr != ',')  goto bad;
int i3 = strtol(endptr+1, &endptr, 0);  if (*endptr != '\0') goto bad;
...
bad: /* error case */

(Nb: This example allows whitespace before each number but not before each comma.)

Finally, strtol() returns 0 and sets errno to EINVAL if the conversion could not be performed because the given base was invalid.  Also, it clamps the value and sets errno to ERANGE if an overflow or underflow occurrs (as could be the case in the above code examples).

The bad way

Another way is to use the standard library function called atol().  (Again, it’s one of a family of similar functions, such as atoi() and atof().)  atol() is equivalent to this call to strtol():

strtol(str, (char **)NULL, 10);

Seems like a nice simplification, especially if you know you want base 10 numbers, right?  The problem is that there is no scope for error-detection.  atol(“123xyz”) will return 123.  atol(“John Smith”) will return 0.  Even better, atol() doesn’t have to set errno in any case!  (And this means it’s actually not quite equivalent to the above strtol() call).

It’s quite amazing, really;  I’m not aware of any other C standard library functions that actually make error-detection impossible.  Furthermore, the documentation involving atol() doesn’t make this clear.  Compare this to the dangerous function gets():  on my Linux box, the man page has a BUGS section that says “Never use gets().”  The man page on my Mac is a little less forthright;  in the SECURITY CONSIDERATIONS section it says “The gets() function cannot be used securely.”

The POSIX specification is a little more forthright about atol():

The atol() function is subsumed by strtol() but is retained because it is used extensively in existing code. If the number is not known to be in range, strtol() should be used because atol() is not required to perform any error checking.

The only use I can think of for atol() is in the case where you’ve already scanned the string for some reason and know it contains only digits.

A few of the options in the current release of Valgrind (3.4.0) that take numbers currently accept non-numbers because they are implemented using atoll().  (Most of them use strtoll(), however, and the discrepancy has been removed in the trunk.)  For example, if you pass the option –log-fd=foo it will interpret the “foo” as 0.  Lovely.

Valgrind on iPhone

With Valgrind now working reasonably well on Darwin, it’s possible to run Valgrind on an iPhone. Well, not directly on an iPhone, because the Darwin port doesn’t work on ARM, but you can run Valgrind on x86 binaries built against the iPhone Simulator SDK.

However, it’s a little tricky, because you can’t run Valgrind from the command line in this environment.  Landon Fuller explains here the small hack that is required to get around this limitation.