Archive for February, 2008

MySQL Proxy: so promising, but not quite there?

February 25th, 2008 | Category: MySQL

I will admit in advance, this post is based on only a few days of working with MySQL Proxy.  I’ve emailed the author a couple of days ago, but not heard back yet, and I’m stumped as to why this isn’t working as advertised.  Theoretically it supports read/write splitting of queries to different backends, but when put to the test it seems that there are problems with connection pooling.  I.e., the second it tries to reuse a connection, that connection terminates and the client gets an error.

The concept of a MySQL Proxy is quite interesting, however, especially from the perspective of being in the IT team and having the quickest way to solving problems sometimes not involve fixing the app.  Let’s talk about a particular issue, the one thing I am presently working on solving: put a Bugzilla instance in Europe to provide a much better experience for people who need to use Bugzilla and are thousands of miles from the main instance in California.

The first step was to setup a mini-cluster in our datacenter in Amsterdam.  Matthew did most of this work, and then handed the project off to me after the machines were generally setup.  I took over and did some work to get the machines into monitoring and our configuration management system.  The basic setup is that we have a database slave running MySQL and replicating from the main database in the US.  A few webservers are then setup with this database as the Bugzilla “shadow” database, i.e., a read-only slave.

The problem is that Bugzilla doesn’t really use the shadow role very often.  Most of the queries, even reads, are sent over to the master database in the US.  This causes really problematic latency issues as it sends a dozen queries to build a page.  If each one takes ~170ms one-way to pass data, then we easily see page load times of 5-10 seconds or more depending on how many queries are done to generate the page.  This makes the site nigh unusable.

The typical method of fixing this involves changing the application to support read-only slaves.  While that’s certainly an option and one that should definitely be done, for various reasons it’s not the most attractive option in this case.  Enter MySQL Proxy.  With the promise of read/write splitting, it seems like it would be a great drop-in solution for solving this problem.

Theoretically, MySQL Proxy is setup on the database slave in the Netherlands.  It’s configured to point to the US master as the writable database and the local slave as the read-only database.  The application (Bugzilla-EU in this case) is then configured to use the Proxy address as the database.  In theory, with this done, we then will solve the latency problem by sending read queries locally and write queries across the pond.  Hurrah!

The problem comes down to this: the software doesn’t work (version 0.6.1) in this use case.  I spent some time trying to debug it, going so far as to step through the core C code to try to determine how it does what it does.  I hit a roadblock of time to invest, unfortunately, and decided that I couldn’t justify spending another day monkeying with the bits to find the problem.  I’ve posted on the MySQL Support Forum for MySQL Proxy and hope to get a response at some point, as this would be great software to use for this problem.

Well, fingers crossed!

4 comments

Release parties and database load.

February 08th, 2008 | Category: MySQL

Dealing with a setup where most things are based around huge spikes is interesting.  It’s different from my previous work at Six Apart where generally we had daily peaks and troughs but there was rarely a case where we could say “this is going to cause an order of magnitude increase in traffic overnight.”

Here at Mozilla, we do.  Firefox 2.0.0.12 was released to the world yesterday around 3PM, and over the next 18 hours the traffic started ramping up.  At this point, the database that does most of the queries for addons.mozilla.org (known as AMO) are pushing 3,000 database queries per second.  That’s a heckuva lot of data to be pushing around, and the database is definitely hitting some load limits.

At this point, the database is CPU limited in the processing of queries and there’s not much that can be done on the database itself.  The queries are all hitting indexes, the tables are not too huge, everything is cached in memory, etc.  There are some minor tweaks that can be done to the table cache (it was at 450 and we were slamming into it), but otherwise it’s just going about as fast as it can go.  The query cache is taking 80% of the traffic, too, which is nice.

Since the traffic is so read heavy, adding a second database would be a real win in this situation.  We would get nearly double the performance by having a second database and having the application randomly select one (with or without weighting).  Another approach would be to use a caching system like memcached to store the query results for some period of time.  There are advantages to both approaches.  Generally the first is easier to implement as the application already understands talking to databases, however the second will usually buy you more performance for less money.  Tradeoffs.

Still a lot of work to do, and today is busy.  I’m trying to get in the habit of updating, and missing a week would be most unfortunate, so you get this rather detail light post.  Enjoy.

No comments