Keeping a directory in sync with SVN

I keep my CFEngine policy (and some other similar things) in a Subversion repository.  The progression from unit test to integration test to production is handled by using tags.  Basically, the integration test policy is the trunk, unit tests are done by branching the trunk, and promotion to production is done by tagging a revision of the trunk with a release name (monthly_YYYY_MM.POINT).  But this discussion doesn’t need to be just about that approach; my solution should work for pretty much anyone who needs a directory to match a portion of a Subversion repository.
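To make the tagging scheme concrete, here is a minimal sketch of how a release name and the matching “svn copy” command could be built.  The function names, the zero-padded month, the integer POINT, and the standard trunk/tags layout under the repository URL are all assumptions for illustration, not a description of my actual tooling.

```python
def release_tag(year, month, point):
    """Format a release name following the monthly_YYYY_MM.POINT pattern."""
    # Assumption: the month is zero-padded and POINT is a plain integer.
    return "monthly_{:04d}_{:02d}.{}".format(year, month, point)


def promote_command(repo_url, revision, tag):
    """Build the svn command that tags one trunk revision as a release.

    Assumption: the repository uses the conventional trunk/ and tags/ layout.
    """
    return [
        "svn", "copy",
        "{}/trunk@{}".format(repo_url, revision),   # peg the copy to one revision
        "{}/tags/{}".format(repo_url, tag),
        "-m", "Promote r{} to production as {}".format(revision, tag),
    ]
```

The command is returned as a list so it can be handed straight to subprocess.run without any shell quoting.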

In order for this to integrate with CFEngine, I need to have a filesystem available that the policy masters can read from and distribute to clients.  I used to just check the repo out and have a scheduled job do an “svn update” every 5 minutes.  But I hate polling; event-driven solutions are almost always better.  In this case, that 5-minute polling not only checks in with the Subversion server way more often than changes actually happen, it also adds another delay of up to 5 minutes to policy propagation.  I want as close to real-time as I can get.  There’s also the problem of .svn directories occasionally getting corrupted, which requires another full checkout.

So, I could use one of the WebDAV filesystems out there.  The first problem with those is performance, as I’ve got over 110K files involved in the different policy branches and stuff.  Even assuming the performance problem goes away, the other problem is that I use some SVN externals to move directories around; the policy isn’t laid out on the clients just like it’s laid out in Subversion.  The DAVfs implementations I could find don’t really deal well with externals.

Thus, I need to roll my own solution.  The idea is basically to have something in the post-commit and post-revprop-change hooks which will update the filesystem to match Subversion whenever anything changes in the repo.  The second part is to have something watching the filesystem for unauthorized changes (i.e., anything not done through the hook script) and revert any such changes.
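As a rough sketch of those two parts (not the code I’ll end up publishing), the hook side can boil down to building an “svn update” command from the hook’s arguments, and the watcher side to comparing a snapshot of file digests against a known-good one.  Subversion passes the repository path and the revision number as the first two arguments to post-commit; everything else here, including the function names and the choice of SHA-256 digests for detecting changes, is an assumption for illustration.

```python
import hashlib
import os


def update_command(working_copy, revision):
    """Build the svn command that brings working_copy to the given revision."""
    return ["svn", "update", "--non-interactive", "-r", str(revision), working_copy]


def handle_post_commit(argv, working_copy):
    """Turn post-commit hook arguments into an update command.

    argv follows the hook convention: argv[1] is the repository path and
    argv[2] is the revision that was just committed.
    """
    revision = int(argv[2])
    return update_command(working_copy, revision)


def snapshot(root):
    """Map each file path (relative to root) to a SHA-256 digest of its contents."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Subversion's own bookkeeping directories.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def unauthorized_changes(expected, actual):
    """Return the sorted paths that were added, removed, or modified."""
    modified = {p for p in expected.keys() & actual.keys() if expected[p] != actual[p]}
    added_or_removed = expected.keys() ^ actual.keys()
    return sorted(modified | added_or_removed)
```

A real hook would pass the returned command to subprocess.run and then store a fresh snapshot; the watcher would periodically take a new snapshot and revert whatever unauthorized_changes reports.  Hashing every file is itself a form of polling, so a filesystem notification mechanism would probably be a better trigger for the comparison.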

I’m teaching myself Python, so both solutions will be implemented in Python.  I’ll host the code on GitHub, since that seems to be what all the cool kids are doing today.  And there will eventually be a couple more blog posts here about how it works.  The links should show up in the comments below as trackbacks, or they’ll be in the subversion + python tag sections here.