Recently, we ran into a problem with our web content sync setup.
Old setup:
- Host all web bits on two admin servers
- Pull via rsync from the admin01 server to the Apache servers on a staggered 5-minute cron
That setup worked fairly well for us for quite a long time, but it doesn’t scale. With about 18 to 20 Apache servers pulling at 5-minute intervals, the admin server was constantly pegged by all the rsync processes scanning the file system.
We needed something that kept state but was still simple. Revision control has state, but would its operations be quick enough to be useful? It turns out Git was pretty well suited for the task.
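Staggering can be done in several ways. As a minimal sketch, each server could derive a stable delay from its own hostname, so the pulls spread across the 5-minute window without any per-host configuration. The function name stagger_offset and the 300-second default are assumptions for illustration, not part of the setup described here.

```shell
#!/bin/bash
# stagger_offset NAME [WINDOW]: print a stable delay in seconds, 0 <= delay < WINDOW.
# Hypothetical helper; the 300-second default matches a 5-minute cron interval.
stagger_offset() {
    local name=$1 window=${2:-300} sum
    # cksum prints "<crc> <byte count>"; keep only the CRC.
    sum=$(printf '%s' "$name" | cksum | cut -d' ' -f1)
    echo $(( sum % window ))
}

# Example use at the top of a pull script:
#   sleep "$(stagger_offset "$(hostname)")"
```

Staggering spreads the load out but does not reduce it: every server still triggers its own scan on the admin server.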
New setup:
- Host all bits on admin servers
- Commit bits on admin01 into Git on a 5-minute cron
- Pull commits via Git to the Apache servers
This new setup scales very well, because we only need one file system scan every 5 minutes instead of 20+. The Git fetches are very fast, since each one transfers only the objects that changed since the previous fetch.
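To make the difference concrete, here is a back-of-the-envelope calculation using the numbers above (20 servers, 5-minute interval). The function names are only for illustration.

```shell
#!/bin/bash
# File system scans per hour on the admin server.

# rsync: every Apache server triggers its own scan on every pull.
rsync_scans_per_hour() {   # args: SERVERS INTERVAL_MINUTES
    echo $(( $1 * 60 / $2 ))
}

# Git: only the commit cron on admin01 scans the tree, however many servers fetch.
git_scans_per_hour() {     # args: INTERVAL_MINUTES
    echo $(( 60 / $1 ))
}

echo "rsync: $(rsync_scans_per_hour 20 5) scans/hour"   # prints "rsync: 240 scans/hour"
echo "git:   $(git_scans_per_hour 5) scans/hour"        # prints "git:   12 scans/hour"
```

Adding an Apache server raises the rsync figure by 12 scans per hour and leaves the Git figure unchanged.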
Initial installation:
admin01:/etc/xinetd.d/git-daemon:
service git
{
    disable = no
    type = UNLISTED
    port = 9418
    socket_type = stream
    wait = no
    user = nobody
    server = /usr/bin/git-daemon
    server_args = --inetd --export-all --verbose /www
    log_on_failure += USERID
}
admin01: import /www:
cd /www
git init
git add .
git commit -m 'Initial Import'
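Before cloning onto the Apache servers, it can be worth confirming that the daemon is actually serving the new repository. A minimal sketch using git ls-remote, which lists the remote's refs without cloning anything; the helper names repo_url and check_remote are hypothetical.

```shell
#!/bin/bash
# repo_url HOST PATH: build the git:// URL that git-daemon serves (hypothetical helper).
repo_url() {
    printf 'git://%s%s\n' "$1" "$2"
}

# check_remote URL: succeed if the remote answers with its list of refs.
check_remote() {
    git ls-remote "$1" >/dev/null 2>&1
}

# Example use from an Apache server:
#   check_remote "$(repo_url admin01 /www)" && echo "daemon is serving /www"
```

If the check fails, the xinetd log and the firewall rules for port 9418 are reasonable first places to look.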
Apache servers initial setup (clones into /www, which must not exist yet or must be empty):
git clone git://admin01/www /www
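Commit cron on admin01 (add any new files and delete any removed files from the repo):
#!/bin/bash
# Commit any changes under /www; the PID lock file prevents overlapping runs.
lockfile="/tmp/git.lock"
if [ -f "$lockfile" ]; then
    # kill -0 sends no signal; it only tests whether the recorded PID is alive.
    if kill -0 "$(cat "$lockfile")" 2>/dev/null; then
        echo "$0 appears to be already running."
        exit 1
    fi
fi
echo $$ > "$lockfile"
# Remove the lock on any exit, including a failed cd or commit.
trap 'rm -f "$lockfile"' EXIT
cd /www || exit 1
# "add ." stages new files; "commit -a" also records modified and deleted ones.
/usr/bin/git add .
/usr/bin/git commit -a -m "AUTO COMMIT: $(date +%FT%T)"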
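The lock check in the script above relies on kill -0, which sends no signal and only reports whether the PID can be signalled. As a sketch, the same check can be factored into a function that also tolerates an empty or corrupted lock file; lock_held is a hypothetical name, not something the original script defines.

```shell
#!/bin/bash
# lock_held FILE: succeed only if FILE records the PID of a process that is
# still running and can be signalled by the current user.
lock_held() {
    local file=$1 pid
    [ -f "$file" ] || return 1
    pid=$(cat "$file")
    # Reject empty or non-numeric content so kill never sees a bad argument.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    kill -0 "$pid" 2>/dev/null
}

# Example use in the commit cron:
#   if lock_held /tmp/git.lock; then echo "already running"; exit 1; fi
```

A stale lock left behind by a crashed run is treated as not held, so the next cron run proceeds instead of blocking forever.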
Fetch cron on the Apache servers (fetches origin, then resets /www to origin):
#!/bin/bash
# Abort if /www is missing so the hard reset can never run in another directory.
cd /www || exit 1
/usr/bin/git fetch && /usr/bin/git reset --hard origin
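When debugging, it can help to know whether a given fetch actually changed anything. The following sketch wraps the same fetch-and-reset in a function that reports the old and new commit IDs; sync_www and its message format are assumptions, not part of the original cron job.

```shell
#!/bin/bash
# sync_www [DIR]: fetch origin, hard-reset DIR to it, and report whether HEAD moved.
# The body runs in a subshell, so the caller's working directory is untouched.
sync_www() (
    cd "${1:-/www}" || exit 1
    before=$(git rev-parse HEAD) || exit 1
    git fetch --quiet || exit 1
    git reset --hard --quiet origin || exit 1
    after=$(git rev-parse HEAD)
    if [ "$before" != "$after" ]; then
        echo "updated $before -> $after"
    fi
)

# Example use from cron, appending to a hypothetical log file:
#   sync_www /www >> /var/log/www-sync.log
```

Because the reset is a hard one, any local edits made directly on an Apache server are discarded on the next run; changes belong on admin01.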