Cold Ts

Alice ran the first cold Ts on her staging machines today. I ran it locally against today’s nightlies and produced these numbers:

Shiretoko (3.5)   10266.25 ms
Namoroka  (3.6)   10017.55 ms
Minefield (trunk) 11023.65 ms

If we had had cold Ts when we landed whatever caused Minefield’s 10% regression over Namoroka, we would have caught it.

Cold Ts is Ts run under simulated cold startup. Right now we’re able to simulate cold startup on OS X only, so the test is not yet switched on for Windows and Linux. The procedure on Linux, echo 3 > /proc/sys/vm/drop_caches, requires root privileges, which either opens a security hole on the Tinderboxen or forces a sudo prompt. Windows is worse: nothing we know of works well, or at all, short of a reboot, and the Talos infrastructure makes rebooting impractical. Any help here would be greatly appreciated.

Comments (12)

  1. Jesse wrote:

    If it’s hard to do as part of Talos, why not do it separately from Talos?

    Friday, September 4, 2009 at 10:26 pm #
  2. adw wrote:

    Jesse, that’s a good point that no one has brought up. The Firefox team has a goal of reducing startup by a certain (large) percentage, and under that goal we will locally run the test and any others just to get occasional or one-off data. No one has talked about running a rolling, shadow Talos, though, something that’s tied into the repos, which is ultimately what we’re after. I wonder whether setting that up would be harder than modifying Talos in the first place.

    Friday, September 4, 2009 at 11:49 pm #
  3. Prasino wrote:

    Can having a large history influence cold start time significantly? If yes, would it make sense measuring it with a realistic profile too rather than just a clean one?

    Saturday, September 5, 2009 at 12:02 am #
  4. adw wrote:

    Prasino, Alice and David have recently been working on dirty profile Ts (bug 414660 among others), and in fact it’s been on the Tinderboxen since the middle of last month. If you check them and search for strings like “ts_places”, you’ll see that large history and bookmarks databases don’t make quite the warm startup impact you might think. (They really impact shutdown, though. There is probably a fair amount of low-hanging fruit there.) Cold startup is a different story and one we have not measured yet. I think it shouldn’t be too hard to hook up the dirty profile Ts to cold Ts. Personally I’ve been trying to get a baseline idea of where the problems are and what can be improved without the profile overhead; even with a fresh profile cold startup sucks. (There’s more to profiles that impacts startup than history by the way — session restore and add-ons for example.)

    Saturday, September 5, 2009 at 12:37 am #
  5. Anonymous wrote:

    Allowing sudo for a single command (“sudo /usr/local/sbin/simulate-cold-startup”) shouldn’t introduce a security hole; it would only mean that the user account which runs Talos could decrease system performance. Seems well worth doing on a test system.

    Saturday, September 5, 2009 at 12:56 am #
  6. Robert wrote:

    I’m sure the prompt with sudo can somehow be worked around. Windows is a problem though (like so often)…

    In any case, could you also test Gran Paradiso (3.0) as a comparison so we have another baseline? For hysterical reasons, even a comparison against FF2 might be interesting – I hope we actually did improve somewhere ;-)

    Saturday, September 5, 2009 at 7:19 am #
  7. ant wrote:

    Seems easy enough – just have a script or something allocate all free RAM plus a few MB extra. That should flush the cache out of memory at least.

    Saturday, September 5, 2009 at 10:06 am #
  8. Anonymous wrote:

    About the uncaught trunk regression: isn’t it possible to test old builds too, and determine the regression date after the fact?

    Saturday, September 5, 2009 at 4:01 pm #
  9. adw wrote:

    Thanks for the comments everyone.

    @Anon #1: Is there a way to suppress the password prompt without compromising security? Could you point me to some documentation?

    @Robert: Sure, thanks for the idea. I’ll get to it next week.

    @ant: A kind soul has already written a program that does just that. (It’s listed on the notes page I linked above.) It does work, but the problem is that it grinds your system to a halt for ten minutes as it does its business. The more RAM, the longer it takes. Ts is the average of 20 runs, so at ten minutes per flush that’s three hours and twenty minutes just to complete one Ts. Maybe we’ll have to use it, though.

    @Anon #2: Sure, I meant that we would have caught it as we landed it.

    Saturday, September 5, 2009 at 4:46 pm #
  10. bws42 wrote:

    I’ll reply for anon. If you look at the man page for the sudoers config file it will show you the line you can add to allow the talos account to execute a single command without a password prompt. The line would look something like this:

    talos talos_machine = NOPASSWD: /path/to/script/to/clear/cache

    where the script would be owned by root to prevent modification. I used a similar setup when I needed a web interface to reload firewall rules after editing.

    Sunday, September 6, 2009 at 8:20 am #
  11. Zack wrote:

    Or you could write a short C program that does the equivalent of “echo 3 > /proc/sys/vm/drop_caches” and install it setuid to root, executable only by a group that includes the talos user (but, ideally, no one else).

    Monday, September 7, 2009 at 12:26 pm #
  12. Zack wrote:

    Like so:

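    #include <fcntl.h>
    #include <unistd.h>

    int
    main(void)
    {
        /* Writing "3" asks the kernel to drop the page cache, dentries and inodes. */
        int fd = open("/proc/sys/vm/drop_caches", O_WRONLY);
        if (fd == -1) return 1;
        if (write(fd, "3\n", 2) != 2) { close(fd); return 1; }
        close(fd);
        return 0;
    }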

    If you go this way I strongly suggest using dietlibc (http://www.fefe.de/dietlibc/) as glibc’s support for static linking is incredibly half-assed these days. dietlibc will produce a binary that you can easily disassemble and inspect for safety.

    Monday, September 7, 2009 at 12:45 pm #