Thursday, 13 May 2010

Programming Tools: Versioning (and magic)

The programming tools series of articles (which, I know, I promised I had already finished) relies heavily on the version control system at its heart.

Today I want to talk about a new feature of versioning: Magic Files.

Imagine I check my source code (or a GIF) into the version control system. I want the facility to say to the VCS "now give me the compiled code" (or "now give me a JPEG version of the GIF"). The VCS - not me - should know how to generate the file I'm asking for, because there should be rules which tell the VCS how to produce it.

Now, because we can require that any such process is side-effect free, once the VCS has generated the file it can keep a copy of it, at least until one of the files used to generate it changes. And there is nothing to say that the VCS has to wait for someone to ask for the file before it generates it - if it sees free processor time, the VCS can spawn a compilation.
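To make that concrete, here is a minimal Python sketch of what such generation rules and the cache might look like. The RULES table, the checkout function and the convert/cc commands are all illustrative assumptions, not part of any real VCS.

    import hashlib, subprocess
    from pathlib import Path

    # Hypothetical rule table: each derived-file pattern maps to the inputs it
    # depends on and a side-effect-free command that produces it.
    RULES = {
        "*.jpeg": {
            "inputs": lambda target: [target.with_suffix(".gif")],
            "command": lambda inputs, target: ["convert", str(inputs[0]), str(target)],
        },
        "*.o": {
            "inputs": lambda target: [target.with_suffix(".c")],
            "command": lambda inputs, target: ["cc", "-c", str(inputs[0]), "-o", str(target)],
        },
    }

    CACHE = {}  # (target, digest of its inputs) -> cached bytes of the generated file

    def checkout(target: Path) -> bytes:
        """Return the requested file, generating it from its inputs if necessary."""
        for pattern, rule in RULES.items():
            if target.match(pattern):
                inputs = rule["inputs"](target)
                # Checking out the inputs recursively means they too are generated
                # on demand if they are themselves derived files.
                digest = hashlib.sha1(b"".join(checkout(p) for p in inputs)).hexdigest()
                key = (str(target), digest)
                if key not in CACHE:                      # inputs changed, or never built
                    subprocess.run(rule["command"](inputs, target), check=True)
                    CACHE[key] = target.read_bytes()      # keep a copy until the inputs change
                return CACHE[key]
        return target.read_bytes()                        # an ordinary, checked-in file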

We can take this further: why not just check out your test results file? That would run every test and tell you how it did. And running an individual test [by checking out that test's individual results log] would pull in only the files it needed to run - which means only the files needed to perform that test would be compiled - giving the optimal code-compile-test loop.
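Continuing the sketch above, a test results log is just another derived file, and its inputs can themselves be derived files, so checking out one log compiles and runs only what that one test needs. The rule and file names here are hypothetical.

    # Hypothetical: the log for tests/<name>.log depends on the test binary bin/<name>,
    # which is itself derived (a rule linking bin/<name> from its object files is
    # assumed elsewhere).
    RULES["tests/*.log"] = {
        "inputs": lambda target: [Path("bin") / target.stem],
        "command": lambda inputs, target: [str(inputs[0]), "--log", str(target)],
    }

    log = checkout(Path("tests/parser_roundtrip.log"))  # compiles, links and runs just this test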

In short, I'm talking about merging the version control system and make.

It's a different world - one which probably requires far more storage (and a few extra VCS features, such as a 'regenerate' option for a file which has for some reason gone awry) - but one which feels to me inherently more usable, and which gives us a better record of exactly what has and hasn't been run.

Monday, 19 April 2010

Programming Tools: The Build Farm

In my first essay on programming tools I suggested that you take some code, press a button and it builds.

What I didn't talk about was how it builds.

Firstly, it doesn't build on your local machine (it could, if you installed the right tools, but that's an unnecessary addition: even when you're working locally, you just go back through your version control system to the online system to build...). Instead of local builds we have a build farm - a set of one or more machines, each with all the tools for building for one or more particular environments. When you hit build, your code is shunted off to the first available machine that can build for the platform you require. The build is performed, and the results are returned.
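As a rough illustration of the "shunt it to the first available machine" step, here is a dispatcher sketch. The machine names, platform labels and the submit stub are invented for the example.

    import queue
    from dataclasses import dataclass

    @dataclass
    class Builder:
        host: str
        platforms: set      # environments this machine has toolchains installed for
        busy: bool = False

    # Invented examples of farm machines and the platforms they can build for.
    BUILDERS = [
        Builder("farm-01", {"linux-x86_64", "linux-arm64"}),
        Builder("farm-02", {"win32", "linux-x86_64"}),
    ]

    PENDING = queue.Queue()     # requests waiting for a suitable machine to free up

    def submit(host: str, revision: str, platform: str):
        # Stub for the real call to the build daemon on `host`.
        print(f"building revision {revision} for {platform} on {host}")

    def dispatch(revision: str, platform: str):
        """Shunt a build request to the first idle machine that can build for the
        requested platform, or queue it if they are all busy."""
        for b in BUILDERS:
            if platform in b.platforms and not b.busy:
                b.busy = True
                return submit(b.host, revision, platform)
        PENDING.put((revision, platform))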

Actually, it's even better than that, because this is always happening behind the scenes as you type. Whenever the farm gets a chance, a new build of your code is scheduled... so usually, when you press the build button, the results are instantly available.

And by results I mean build errors and warnings. I mean binaries. I mean all the outputs one would usually see from a build. Of course, you probably won't care about the binaries, because for most of your development time you'll be using a test farm to deal with those. The errors are the important thing - and they'll be checked in at the appropriate point in the version control system so that they are always instantly available.

In order to make such a system work, each machine in the build farm will need a working build environment installed (it's probably best to do this with a VM image, so that making clones is easy). Each will also need a daemon process to control the build. It's probably best to think of this daemon as being the build tool - the equivalent of make.

The build tool will advertise itself on the network (using Zeroconf or Rendezvous or whatever), along with the systems it is capable of building for. Any machine with the tool that is put on the network will automatically be available to build.
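As a sketch of that advertisement step: a real implementation would register a Zeroconf/Rendezvous service record, but this stand-in just broadcasts a JSON payload over UDP. The port numbers, field names and platform list are all assumptions.

    import json, socket, time

    # What this daemon can build for; in practice this would be derived from the
    # toolchains actually installed on the machine, not hard-coded.
    CAPABILITIES = {"service": "build-daemon", "port": 9000,
                    "platforms": ["linux-x86_64", "win32"]}

    def advertise(interval: float = 5.0):
        """Stand-in for a Zeroconf/Rendezvous announcement: periodically broadcast
        this daemon's capabilities so dispatchers on the LAN can discover it."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        payload = json.dumps(CAPABILITIES).encode()
        while True:
            sock.sendto(payload, ("<broadcast>", 9999))
            time.sleep(interval)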

I've spent a lot of time thinking about what the ideal replacement for make would look like, but the long and the short of it is that the characteristics of the build tool won't matter, so long as it only needs to rebuild those files which have changed... it can rely heavily on the VCS here; after all, it will know from the VCS exactly which files differ from the last build.
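For instance, with git standing in purely as an example VCS, working out what actually needs rebuilding could be as simple as this sketch; the built_from map, recording which sources each target was built from, is assumed to be recorded by the previous build.

    import subprocess

    def changed_since(last_built_rev: str) -> set:
        """Ask the VCS which files differ from the last revision that was built
        (git is used here purely as an example VCS)."""
        out = subprocess.run(["git", "diff", "--name-only", last_built_rev, "HEAD"],
                             capture_output=True, text=True, check=True)
        return set(out.stdout.split())

    def targets_to_rebuild(changed: set, built_from: dict) -> set:
        """built_from maps each target to the source files it was built from;
        only targets whose inputs actually changed need rebuilding."""
        return {target for target, sources in built_from.items()
                if changed & set(sources)}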

I do think, however, that a directory structure is probably not a desirable thing to have. Rather, it should be possible to put files into groups, either explicitly or by smart groups (just as you have smart folders in iTunes). A file can be in many groups. When a build tool needs an arbitrary structure, it should be possible to generate it on the fly by moving (groups of) files to the correct locations. This gets over the issues with - say - Microsoft's driver build environment, which only allows two levels of directory structure, and it helps where the directory structures need to be different on different platforms.

It also helps with version control: at the moment, version control systems have to try to emulate every file system their files may be stored on, so that permissions and so forth can be replicated correctly, and moving files around is a pain. With this system, everything can fall into build system metadata, and we can provide our own security features (such as restricting changes in different code areas to different groups of programmers) if we need them. Of course, the build system metadata is itself stored in a file in version control - so if a file is merged or its permissions need to be changed, those changes are tracked like everything else, transparently.
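Here is a rough sketch of how group-based metadata and on-the-fly layout generation might look; the group names, patterns and the materialise helper are illustrative only.

    import fnmatch, shutil
    from pathlib import Path

    # Build-system metadata, itself kept in a file under version control:
    # explicit groups, plus 'smart' groups defined by a pattern. A file may
    # belong to any number of groups.
    GROUPS = {
        "driver-core": ["ioqueue.c", "ioqueue.h", "power.c"],
        "public-headers": ["ioqueue.h"],
    }
    SMART_GROUPS = {
        "all-sources": "*.c",
    }

    def members(group: str, all_files: list) -> list:
        if group in GROUPS:
            return GROUPS[group]
        return fnmatch.filter(all_files, SMART_GROUPS[group])

    def materialise(layout: dict, all_files: list, root: Path):
        """Generate a directory structure on the fly, copying each group's files
        to wherever a particular toolchain insists on finding them."""
        for subdir, group in layout.items():
            dest = root / subdir
            dest.mkdir(parents=True, exist_ok=True)
            for f in members(group, all_files):
                shutil.copy(f, dest / Path(f).name)

    # e.g. for a toolchain that demands a shallow, fixed layout:
    # materialise({"src": "driver-core", "inc": "public-headers"}, files, Path("build"))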

The end result of this complexity is simplicity. Once a build has been set up, it will 'just work' for everybody. If you suddenly need to build for another platform, it's just a matter of selecting the new platform to see if it works. You can test every platform quickly and easily, without hunting for appropriate machines and solving 101 build problems.

And because a build daemon is externally controllable, it is no effort at all to use it (and maybe a particular 'gold' machine) to schedule nightly builds, or rolling test builds of the code integration branch.

Programmers' lives are simpler. Builds are easier. Testing is easier. And everyone's life can be made faster and easier still just by throwing new hardware at the farm. It's a win for everybody.