16.10.08. Leaps and Pains (or - changing development/deployment and scm tools to more closely realize the component architecture dream)

A year or more ago, I was really struggling with zc.buildout, a Python-based tool for building out "repeatable" deployments. Buildout makes setuptools actually usable, particularly for development and deployment of web apps, although there are many other uses.

Buildout keeps everything local, allowing one app to use version 3.4.2 of one package while another app can use 3.5.2. But more than just being an 'egg' / Python package manager, it can do other tasks as well - local builds of tools (from libxml to MySQL and more), again allowing one app to build and use MySQL 5.0.x and another app to use 5.1.x; or just allowing an app to be installed onto a new box and get everything it needs, from web server to RDBMS to Memcached and beyond. We don't use all of these features (yet), but it's a nice dream.

Already it's very nice to be able to make a git clone of a customer app, run buildout, and then start it up. Buildout will put setuptools to work to ensure that proper versions of dependent components are installed (and, quite nicely, it's very easy to share both a download cache and a collection of 'installed eggs' - multiple versions living side by side, with individual buildouts picking the one they desire).
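To make the "individual buildouts picking the one they desire" part a little more concrete, here is a minimal sketch of what a pinned buildout.cfg can look like, wrapped in a few lines of Python that read the pins back out. The option names (parts, versions, eggs-directory, download-cache) and the zc.recipe.egg recipe are real Buildout features. The paths, the 'customerapp' egg name and the version numbers are placeholders invented for illustration, not our actual configuration.

```python
import configparser
import textwrap

# Hypothetical buildout.cfg: paths, egg names and version numbers are placeholders.
BUILDOUT_CFG = textwrap.dedent("""\
    [buildout]
    parts = app
    versions = versions
    eggs-directory = /home/dev/shared/eggs
    download-cache = /home/dev/shared/download-cache

    [versions]
    SQLAlchemy = 0.4.6
    zc.table = 0.6

    [app]
    recipe = zc.recipe.egg
    eggs =
        customerapp
        SQLAlchemy
        zc.table
    """)


def pinned_versions(cfg_text):
    """Return the {package: version} pins that the [buildout] section points at."""
    # Interpolation is disabled so the standard parser leaves values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep package names case-sensitive
    parser.read_string(cfg_text)
    # 'versions = versions' names the section that holds the pins.
    section_name = parser["buildout"]["versions"]
    return dict(parser[section_name])


if __name__ == "__main__":
    for package, version in sorted(pinned_versions(BUILDOUT_CFG).items()):
        print(package, version)
```

Two customer buildouts can point eggs-directory at the same shared folder and still pin different versions in their own [versions] sections. The Python here only illustrates the file's structure using the standard library parser. Buildout itself does the real work of resolving and installing the eggs.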

But it was not easy to get to this golden land. Prior to using Buildout, we'd check our code out of our CVS repository. Our customer apps were just another Python package, nothing special (not an application, and - more importantly - not packaged up in 'distutils' style). As we started to make more and more reusable parts, we had to do a lot of checkouts; and so I wrote a tool to help automate this checkout process. It would also check out other third party code from public Subversion repositories; all because it was easier to check out a particular tag of 'SQLAlchemy' or 'zc.table' than to try to install them into a classic-style Zope 3 'instance home'.

But it was getting harder and harder to keep up with other packages. We couldn't follow dependencies in this way, for one thing; and it required some deep knowledge of some public SVN repository layouts in order to get particular revision numbers or tags.

'Buildout' promised to change all of that, and offer us the chance to use real, honest-to-goodness distributed Python packages/eggs. But getting there was so very hard when there are deadlines beating you down.

I took a lot of my frustration out on both Setuptools (which is so goddamn woefully incomplete) and Buildout. But the fault was really in ourselves... at least, in a way. As mentioned above, it was easier to just check out 'mypackage' into $INSTANCE_HOME/lib/python/mypackage than to figure out the install options for distutils/setuptools. As such, NONE of our code was in the Python 'distutils' style. We put some new packages into that style, but would still just check out a sub-path explicitly with CVS, just like we were doing with public SVN code.

A big part of what made this so difficult was that we had hung on to CVS for, perhaps, too long. Doing massive file and directory restructuring with CVS is too painful to contemplate. But moving to Subversion never seemed worth the effort, and so we held on to CVS, even though I knew I'd have to restructure the code someday.

Fortunately, Git arrived. Well, it had been there for a while; but it was maturing and quite fascinating, and it offered us a chance to leapfrog over SVN and into proper source code management. Git is an amazing tool (perhaps made more so by our having been chained to CVS for so long), and it gave me the opportunity to really restructure our code, including ripping apart single top-level packages into multiple namespaced packages (i.e. instead of 'example' being the root node with 'core' and 'kickass' subpackages, I could split that into 'example.core' and 'example.kickass' as separate packages and Git repositories while keeping full histories).
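For anyone who hasn't seen a namespaced package on disk, here is a rough sketch of the layout for something like 'example.core', written as a small Python script that generates the skeleton. It assumes the pkg_resources-style namespace declaration and the namespace_packages argument that setuptools supported at the time (newer setuptools releases deprecate them). The 'src' layout, the version number and the write_skeleton helper are illustrative guesses, not a record of our actual repositories.

```python
import os
import tempfile

# Contents of src/<namespace>/__init__.py: the setuptools-era namespace declaration.
NAMESPACE_INIT = "__import__('pkg_resources').declare_namespace(__name__)\n"

# Template for a distutils/setuptools style setup.py (values are placeholders).
SETUP_PY = '''\
from setuptools import setup, find_packages

setup(
    name="{dist}",
    version="0.1",
    package_dir={{"": "src"}},
    packages=find_packages("src"),
    namespace_packages=["{namespace}"],
    install_requires=["setuptools"],
    zip_safe=False,
)
'''


def write_skeleton(root, namespace, subpackage):
    """Create <root>/<namespace>.<subpackage>/ with setup.py and a src/ tree."""
    dist = "%s.%s" % (namespace, subpackage)
    dist_dir = os.path.join(root, dist)
    namespace_dir = os.path.join(dist_dir, "src", namespace)
    package_dir = os.path.join(namespace_dir, subpackage)
    os.makedirs(package_dir)

    with open(os.path.join(dist_dir, "setup.py"), "w") as f:
        f.write(SETUP_PY.format(dist=dist, namespace=namespace))
    # The namespace level only declares itself; real code lives one level down.
    with open(os.path.join(namespace_dir, "__init__.py"), "w") as f:
        f.write(NAMESPACE_INIT)
    with open(os.path.join(package_dir, "__init__.py"), "w") as f:
        f.write("")
    return dist_dir


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    for sub in ("core", "kickass"):
        print(write_skeleton(root, "example", sub))
```

Each generated directory would then become its own Git repository and its own distribution, with both of them contributing to the shared 'example' namespace.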

For a while, I used Git with its cvsimport and cvsexportcommit tools to clean up some of our wayward branches in CVS, while starting to play with Buildout. I was still struggling to get a Zope 3 site up and running using our frameworks. And here... well, the fault was partly in ourselves for having to go through fire to get our code into acceptable 'distutils' style packages, which made learning Buildout all the harder. But the available documentation (comprehensive, but in long doctest style documents) for some of the Zope 3 related recipes was very difficult to follow. Hell - just knowing which recipes to use was difficult!

But after many months of frustrated half-attempts, often beaten down by other pressures, I opened a few different tabs for different core Buildout recipes in my browser and furiously fought through them all... And boom! Got something working!

Unfortunately it was one of those processes where by the time I got out of the tunnel, I had no idea how exactly I had made it through. One of my big complaints as I was struggling was the lack of additional information, stories of struggle and triumph, etc. And there I was - unable to share much myself! I can't even remember when I was able to break through. It's been quite a few months. Just a couple of weeks ago we deployed our last major old customer on this new setup; and we can't imagine working any other way now.

'Git' and 'Buildout' have both been incredibly empowering. The hardest part, for us, was that we couldn't easily make the move in small steps. Once we started having proper distutils style packages in Git, they couldn't be cloned into an instance home as a basic Python package (i.e. we couldn't do the equivalent of cvs checkout -d mypackage Packages/mypackage/src/mypackage and get just that subdirectory). And we couldn't easily make distributions of our core packages and use them in a classic Zope 3 style instance home (I did come up with a solution that used virtualenv to mix and match the two worlds, but I don't think it was ever put to use in production).

So it was a long and hard road, but the payoffs were nearly immediate: we could start using more community components (and there are some terrific components/packages available for Zope 3); we could more easily use other Python packages as well (no need to have some custom trick to install ezPyCrypto, or be surprised when we deploy onto a new server and realize that we forgot some common packages). Moving customers to new server boxes was much easier, particularly for the smaller customers. And we can update customer apps to new versions with greater confidence than before when we might just try to 'cvs up' from a high location and hope everything updated OK (and who knows what versions would actually come out the other end). Now a customer deployment is a single Git package - everything else is supplied as fully packaged distributions. It's now very hard to 'break the build' as all of the components that are NOT specific to that customer have to come from a software release, which requires a very explicit action.
