Anatomy of an SDK update

October 2nd, 2009

Over the course of the past week or so I’ve been working on rolling out the Windows 7 SDK to our build machines. Doing so presented two challenges: getting the SDK to deploy silently and properly, and updating the appropriate build configurations to use it. Neither of these may sound very challenging, and indeed, they didn’t to me either, but because of a combination of factors this ended up becoming a week-long ordeal. In this post I will attempt to untangle everything that happened.

Let’s start with the actual SDK installation. Unlike most other reasonable packages, the Windows 7 SDK is not distributed as an MSI package, but rather as a collection of MSIs wrapped in an EXE. Unfortunately, this EXE doesn’t let you do a customized, silent install, which is precisely what we need. Vainly, I thought I could figure out the proper order and magic options to install the enclosed MSIs myself. Needless to say, this failed. To work around this I fell back on an AutoIt script that clicks through the interactive installer for me. Getting that working took some fuss, but not too much difficulty.

Now, the fun part (of deployment). We use a piece of software called OPSI to schedule and perform software installations across our farm of 80 or so Windows VMs. OPSI runs very early in the Windows start-up process and executes as the SYSTEM user. Well, it turns out that the Windows 7 SDK must be installed by a full user, not the SYSTEM account. This restriction seems unnecessary, as we’ve deployed other SDKs through OPSI in the past without issue. After trying to fake it out by setting various environment variables, I turned to the OPSI forums for some help. (As an aside, the OPSI developers have been fantastic in their support of our installation; many thanks to them.) It turns out that I’m not the first person to hit problems like this, and they pointed me to a template for a script that works around the issue. The solution ends up being:

  1. Copy installation files to the slave
  2. Create a new user in the Administrators group, and set that user to log in automatically at next boot
  3. Reboot, and run the package installation at login
  4. Restore the original automatic login, reboot
  5. Clean up (delete installation files, remove the created user)
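Because the steps above span two reboots, whatever drives them has to remember how far it got. Here is a minimal Python sketch of that bookkeeping. To be clear, this is not the OPSI template script: the phase names, the JSON state file, and the `next_step` function are all made up for illustration, and the real work of each phase (creating the user, running the installer, and so on) is left out entirely.

```python
import json
from pathlib import Path

# Hypothetical phase names mirroring the five steps above.
PHASES = [
    "copy_files",
    "create_temp_admin",   # also sets the temporary user to log in automatically
    "install_as_user",     # runs at login of the temporary user
    "restore_autologin",
    "cleanup",
]

# Phases that must be followed by a reboot before the next one can run.
REBOOT_AFTER = {"create_temp_admin", "restore_autologin"}


def next_step(state_file: Path):
    """Return (phase, reboot_needed) for this run, or None when all phases are done.

    Progress is recorded in state_file so that the sequence survives reboots.
    """
    done = json.loads(state_file.read_text()) if state_file.exists() else []
    remaining = [phase for phase in PHASES if phase not in done]
    if not remaining:
        return None
    phase = remaining[0]
    state_file.write_text(json.dumps(done + [phase]))
    return phase, phase in REBOOT_AFTER
```

Each run of the deployment script would call `next_step`, do the work for the phase it gets back, and reboot if asked to. Once it returns `None` there is nothing left to do.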

This is obviously quite hacky, but it gets the job done.

So! With that in hand (and in repo) we set the SDK to deploy over the course of Wednesday night and Thursday morning. Overall, this went smoothly. For a reason I haven’t yet figured out, some of the slaves needed some kicking to do the installation properly.

Remember how I said part 2 of this was updating the build configurations? I had planned to do this on Friday, and even posted a patch in preparation. Well, it turns out that MozillaBuild likes to be smart and find the most recent SDK and compiler for you. This completely slipped my mind while I was doing the deployment, and as a result, all builds from Thursday (yesterday) morning to Friday (today) morning, including those on mozilla-1.9.1, were done with the Windows 7 SDK. This went unnoticed for most of Thursday, until I was doing a final test of my build configuration patch.
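To illustrate why merely installing a newer SDK changes what gets built, here is a small Python sketch of "pick the newest unless told otherwise". This is not MozillaBuild’s actual detection code; the `version_key` and `pick_sdk` functions and the `pinned` parameter are hypothetical, and the version strings are the ones listed at the end of this post.

```python
import re


def version_key(version: str):
    """Turn a version like '6.0a' into ((6, 0), 'a') so it compares numerically."""
    match = re.match(r"^(\d+(?:\.\d+)*)([A-Za-z]*)$", version)
    if not match:
        raise ValueError(f"unrecognized SDK version: {version!r}")
    numbers = tuple(int(part) for part in match.group(1).split("."))
    return numbers, match.group(2).lower()


def pick_sdk(installed, pinned=None):
    """Return the pinned SDK if one is given, otherwise the newest installed one."""
    if pinned is not None:
        if pinned not in installed:
            raise LookupError(f"pinned SDK {pinned!r} is not installed")
        return pinned
    return max(installed, key=version_key)


# Before the rollout only 6.0a is installed, so it is the one that gets picked.
print(pick_sdk(["6.0a"]))
# After the rollout the newest SDK wins unless the build pins one explicitly.
print(pick_sdk(["6.0a", "7.0"]))
print(pick_sdk(["6.0a", "7.0"], pinned="6.0a"))
```

With auto-detection like this, any branch that doesn’t pin its SDK switches over the moment the new one lands on the machine.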

Here’s where the fun starts for this part. After discovering I’d accidentally changed the SDK for everything, I went into a bit of a panic and rapidly started testing some fixes in our staging environment. During the course of this I discovered that things were worse than I thought. Most builds were using the Windows 7 SDK, but not the “unit test” ones. So we weren’t even using the same SDK for all the builds for a given branch! Sorting all of that out was complicated by the number of path styles (c:/ vs. c:\ vs. /c/) I had to iterate through before I found the magic combination. In the end, I discovered a few things:

  • If you’re specifying LIB/INCLUDE/SDKDIR in a mozconfig, you must use Windows-style paths
  • If you’re specifying PATH in a mozconfig, you CANNOT use Windows-style paths – you must use MSYS style
  • You can’t test for these things properly without clobbering
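For anyone juggling the two path styles, here is a rough Python sketch of converting between them. The helper names `to_msys` and `to_windows` are my own, the SDK path is just an example location, and the sketch only handles plain drive-letter paths (no UNC paths, no PATH-list separators).

```python
import re


def to_msys(path: str) -> str:
    """Convert a Windows-style path (c:\\foo or c:/foo) to MSYS style (/c/foo)."""
    match = re.match(r"^([A-Za-z]):[\\/]*(.*)$", path)
    if not match:
        # No drive letter: just normalize the slashes.
        return path.replace("\\", "/")
    drive = match.group(1).lower()
    rest = match.group(2).replace("\\", "/")
    return f"/{drive}/{rest}" if rest else f"/{drive}"


def to_windows(path: str) -> str:
    """Convert an MSYS-style path (/c/foo) to Windows style (c:\\foo)."""
    match = re.match(r"^/([A-Za-z])(?:/(.*))?$", path)
    if not match:
        # Not a /<drive>/... path: just normalize the slashes.
        return path.replace("/", "\\")
    drive = match.group(1)
    rest = (match.group(2) or "").replace("/", "\\")
    return f"{drive}:\\" + rest


# Example location only; substitute wherever the SDK actually lives.
sdk_bin = r"C:\Program Files\Microsoft SDKs\Windows\v7.0\bin"
print(to_msys(sdk_bin))  # the form to use for PATH in a mozconfig
print(to_windows("/c/Program Files/Microsoft SDKs/Windows/v7.0/Lib"))  # the form for LIB/INCLUDE/SDKDIR
```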

As I write this, the first set of builds that all use the correct SDK are finishing up, and this deployment from hell appears to be nearly over. I want to express a special thanks to the OPSI developers, who were very helpful, and to Nick Thomas and Chris AtLee, for their patience with my countless iterations of build configuration patches. As a final note, let me state explicitly which SDK is being used where:

  • Windows Vista SDK (6.0a): mozilla-1.9.1 builds
  • Windows 7 SDK (7.0): mozilla-central, mozilla-1.9.2, TraceMonkey, Electrolysis, and Places builds

WinCE and WinMO builds are unaffected by this deployment.

MozillaBuild wiki page

December 27th, 2007

Yes, MozillaBuild finally has a home that isn’t in the form of blog posts or bugs. The page is a bit barren right now, but it *does* contain links to release notes, which some people have asked for.

You can watch this page for future news and updates about MozillaBuild.

UPDATE: A link to the wiki page, behold: http://wiki.mozilla.org/MozillaBuild (thanks reed)

MozillaBuild 1.2rc2

December 21st, 2007

Thanks to everyone who helped test MozillaBuild 1.2rc1. A bug with cvs was found that makes cvs checkouts, checkins, and maybe other things fail in weird ways. RC1 was the first MozillaBuild to use cvs 1.11.22; for rc2 we’ve switched back to just plain 1.11. We’ve also added a couple of small enhancements. Notably, the font issue with ClearType/widescreens has been fixed. For a more complete list of what’s in rc2, check out the dependent bugs in bug 406085.

I should note that you must run start-msys*.bat “as administrator” on Vista.

Hopefully this next part will be unnecessary, but if you *do* find any bugs please file them in mozilla.org:MozillaBuild.

MozillaBuild 1.2rc1

December 19th, 2007

While working on release automation it was discovered that the ssh version that ships with MozillaBuild 1.1 is archaic. This has bitten us in a few places (try server and stage migration, for example). I managed to upgrade SSH on MozillaBuild 1.1 without too much trouble and ended up putting together a MozillaBuild patch for it.

Ted asked me to set up a MozillaBuild build environment. “How hard could that be?”, I thought to myself. I ended up spending 3 days getting a VM/build environment set up (someone should really make some sort of installer that sets up an MSYS build environment for you…MSYSBuild, maybe? ;-). Ted and I both spent additional time fighting with DLL rebasing. (I have to admit though, I don’t understand DLL rebasing at all, even after it was explained to me.)

Ted eventually found a solution for this, and now, there is a MozillaBuild 1.2rc1. Anybody who can help us test will be greatly rewarded (with a new MozillaBuild sometime soon ;)!