Archive for May, 2007

Partitioning Fun in PostgreSQL

Tuesday, May 15th, 2007

Last week I learned a few things (the hard way) about PostgreSQL (pgsql) partitioning:

  • You really have to read the “caveats” part of the manual (scroll down, very bottom)
  • Server config matters (SET constraint_exclusion = on;) if you want to avoid unnecessarily checking out-of-bounds partitions
  • Child tables don’t inherit permissions, so owning the parent table won’t automagically grant you access to child tables — which in this case prevented me from setting up my explicit indexes.
  • Foreign key constraints on a parent table are not inherited. This is actually a bug in pgsql. They plan on changing “CREATE TABLE LIKE” in pgsql 8.3 but fixing this for INHERIT statements is a bit different since it means inheriting for the lifetime of the parent->child relationship (instead of one-time-only during creation). For more, read chizu’s thread on the issue — he actually submitted a patch that fixed things for table creation. :)

Long story short, partitioning is tricky business. For more, see bsmedberg’s blog, where he talks about query performance in queries spanning multiple partitions and his battle with the postgresql query planner.

From the very beginning, even before you start messing with the query planner, I think it’s important to understand the pitfalls of inheritance in pgsql and how it will affect your use of partitions.

So just be extra careful when you read the manual and make sure you’re aware of these issues when you plan your schema. If you’re not, it’ll come back to bite you.

Tagging in SVN

Tuesday, May 1st, 2007

So we decided to use SVN because it’s cooler and more modern. It’s not so bad, especially for webapps. We like the atomic commits, webdav, yadda yadda. You know why it’s better than CVS we aren’t going to regurgitate that to you.

But when deploying we ran into some issues with SVN merge. If you always do updates in a batch and never have an “oh crap update this now” update then you’re fine. But as we all know, stuff happens. So when we have tried to merge the entire tree after merging single files manually we’ve had some interesting results — conflicts in certain places and an overall headache.

The typical way of tagging is using an svn cp and placing a snapshot of a particular trunk revision into the tags directory. This SVN tagging approach using svn merge was supposed to be analogous to our previous way of tagging prod checkouts with CVS.

However, as clouserw put it, “the recommended strategy is only useful when merging the complete trunk to a tag, and when individual revisions/files are involved, svn eats itself”. So to avoid the merge issues with SVN that we found using svn merge we have come up with a new method that uses SVN externals instead.

Both methods have weaknesses. SVN merge has the “eating itself” problem that presents a lot of conflicts and makes merging painful because of all the manual conflict resolution. When merging the entire tree, though, it is desirable because it doesn’t have the weaknesses below. Using SVN externals, you run into some interesting issues but you don’t get the SVN eating itself problem:

  • Any locally modified files not in SVN (even when ignored) will kill a prod update in the event where an external must be updated (you get stuff like “svn: Failed to add directory ’site/app/tests’: object of the same name already exists”)
  • SVN externals cannot be a single file (they must be a directory so SVN can store info in an explicit .svn directory related to that checkout)

Aside from those two issues, just remembering the trunk revision we want in prod has been working for us. We’d like to find a good middle ground that doesn’t have us taking extra steps to update production when deploying trunk changes to the lives site.

So, if you guys have any input please let us know. We’re always open to suggestions.