Ken's Musings about Sourceless Apache Distributions (a rant)

Last updated: Monday, 28 March 2005 14:08 -0500

The Apache Software Foundation is distributing binary packages that don't include the sources.



Open Source -- without the source

Although the Apache licence permits [re]distribution without the inclusion of the source, I was very surprised to find that the sources are no longer included in the 'binary' packages available from the Apache site itself. This is new as of Apache 1.3.22.

In ye golden dayes of olde, no matter what package you downloaded from the Apache HTTP Server project's distribution area, you were assured of getting the sources. You might or might not get a pre-built binary, but you got the source. Always.

The next step (as I recall) along this particular evolutionary branch was the provision of Windows installation packages in two flavours: regular (with source) and extra-crispy (without). The rationale was that the vast majority of Windows users of Apache didn't have the (very expensive) development tools, and were thus being penalised with a couple of megabytes of stuff they neither wanted nor needed.

This latest mutation of the Apache distribution model completely segregates the source from binary packages. You want a binary? Fine, download it. Want the source too? That's a completely separate download.

Trend or evolutionary dead end?

It seems to me there is only one more ramp possible on this particular road: the discontinuation of source packages altogether. And that is essentially impossible because of the charter of the Apache Software Foundation and the people who comprise it. But I find it worrisome that it has even gotten this far.

As I said in the beginning, this latest change caught me by surprise. I only noticed it because I was looking at the distribution area whilst documenting something. I will freely admit that I was several months behind on my reading of the developers' list, and caught up by reading about four thousand messages in a fortnight -- so I might easily have missed the discussion that led to this change.

The reasons I've been given for separating the source packages from the binary ones include:

Saves on space and bandwidth, both at Apache.Org and the mirror sites
    True enough; at roughly 2MB per package under <URL:http://www.apache.org/dist/binaries/>, that's roughly 200MB of largely-duplicated content. That tree currently contains about 370MB total, which is a not-insignificant amount for dozens of mirrors to copy. (The total HTTP project distribution tree is about 565MB, so about 65% of that is binary packages.)

Past Apache binary packages haven't included the source
    Maybe true, but I haven't found one yet..

So there's actually a reasonable justification for changing the distribution model: size.
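
For anyone who wants to check the sums, here is a minimal sketch of the arithmetic behind those figures. The sizes are the approximate ones quoted above; the variable names are mine, and reading "2MB per package" as the source carried inside each binary package is my assumption. The implied package count is only an inference from those two round numbers, not something I counted.

```python
# Approximate figures quoted in the text, in megabytes.
BINARIES_TREE_MB = 370       # size of the dist/binaries/ tree
TOTAL_TREE_MB = 565          # size of the whole HTTP project distribution tree
DUPLICATED_SOURCE_MB = 200   # largely-duplicated content in the binaries tree
SOURCE_PER_PACKAGE_MB = 2    # assumption: source carried inside each binary package


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of the whole distribution tree taken up by binary packages.
binaries_share = percent(BINARIES_TREE_MB, TOTAL_TREE_MB)

# Number of packages implied by the two round figures (an inference only).
implied_packages = DUPLICATED_SOURCE_MB / SOURCE_PER_PACKAGE_MB

# Share of the binaries tree that is duplicated content.
duplicated_share = percent(DUPLICATED_SOURCE_MB, BINARIES_TREE_MB)

print(f"binaries are {binaries_share:.0f}% of the distribution tree")
print(f"about {implied_packages:.0f} packages implied")
print(f"duplicated content is {duplicated_share:.0f}% of the binaries tree")
```

On these numbers, a bit more than half of the binaries tree is the same source copied over and over, which is what makes the size argument hard to wave away.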

What's the Right Answer?

At the time I'm writing this, it looks like there was a misunderstanding that led to this change in our distribution model. However, looking at the sheer size of the package totals makes it fairly evident that taking another look at this may be appropriate.

So.. Should all Apache packages include the full source, as they have in the past? Or should the source be a single (well, two, one for Windows and one for other platforms) package, and all of the binary packages be stripped down to just the results of a build? Should a binary package omit the server source itself, but include the bits you need to build modules (e.g., the .h and library files)? Or not even that? Stay tuned!


coar