The docker.io Debian package is back to life

Last week, a new version of docker.io, the Docker package provided by Debian, was uploaded to Debian Unstable. Quickly afterward, the package moved to Debian Testing. And this is good news for Debian users, as before that the package was more or less abandoned in "unstable", and the future was uncertain.

The most striking fact about this change: it's the first time in two years that docker.io has migrated to "testing". Another interesting fact is that, version-wise, the package is moving from 1.13.1 from early 2017 to version 18.03 from March 2018: that's a one-year leap forward.

Let me give you a very rough summary of how things came to be. I personally started to work on that early in 2018. I joined the Debian Go Packaging Team and I started to work on the many, many Docker dependencies that needed to be updated in order to update the Docker package itself. I could get some of this work uploaded to Debian, but ultimately I was a bit stuck on how to solve the circular dependencies that plague the Docker package. This is where another Debian Developer, Dmitry Smirnov, jumped in. We discussed the current status and issues, and then he basically did all the job, from updating the package to tackling all the long-time opened bugs.

This is for the short story, let me know give you some more details.

The Docker package in Debian

To better understand why this update of the docker.io package is such a good news, let's have quick look at the current Debian offer:

rmadison -u debian docker.io

If you're running Debian 8 Jessie, you can install Docker 1.6.2, through backports. This version was released on May 14, 2015. That's 3 years old, but Debian Jessie is fairly old as well.

If you're running Debian 9 Stretch (ie. Debian stable), then you have no install candidate. No-thing. The current Debian doesn't provide any package for Docker. That's a bit sad.

What's even more sad is that for quite a while, looking into Debian unstable didn't look promising either. There used to be a package there, but it had bugs that prevented it to migrate to Debian testing. This package was stuck at the version 1.13.1, released on Feb 8, 2017. Looking at the git history, there was not much happening.

As for the reason for this sad state of things, I can only guess. Packaging Docker is a tedious work, mainly due to a very big dependency tree. After handling all these dependencies, there are other issues to tackle, some related to Go packaging itself, and others due to Docker release process and development workflow. In the end, it's quite difficult to find the right approach to package Docker, and it's easy to make mistakes that cost hours of works. I did this kind of mistakes. More than once.

So packaging Docker is not for the faint of heart, and maybe it's too much of a burden for one developer alone. There was a docker-maint mailing list that suggests an attempt to coordinate the effort, however this list was already dead by the time I found it. It looks like the people involved walked away.

Another explanation for the disinterest in the Docker package could be that Docker itself already provides a Debian package on docker.com. One can always fall back to this solution, so why bothering with the extra-work of doing a Debian package proper?

That's what the next part is about!

Docker.io vs Docker-ce

You have two options to install Docker on Debian: you can get the package from docker.com (this package is named docker-ce), or you can get it from the Debian repositories (this package is named docker.io). You can rebuild both of these packages from source: for docker-ce you can fetch the source code with git (it includes the packaging files), and for docker.io you can just get the source package with apt, like for every other Debian package.

So what's the difference between these two packages?

No suspense, straight answer: what differs is the build process, and mostly, the way dependencies are handled.

Docker is written in Go, and Golang comes with some tooling that allows applications to keep a local copy of their dependencies in their source tree. In Go-talk, this is called vendoring. Docker makes heavy use of that (like many other Go applications), which means that the code is more or less self-contained. You can build Docker without having to solve external dependencies, as everything needed is already in-tree.

That's how the docker-ce package provided by Docker is built, and that's what makes the packaging files for this package trivial. You can look at these files at https://github.com/docker/docker-ce/tree/master/components/packaging/deb. So everything is in-tree, there's almost no external build dependency, and hence it's real easy for Docker to provide a new package for docker-ce every month.

On the other hand, the docker.io package provided by Debian takes a completely different approach: Docker is built against the libraries that are packaged in Debian, instead of using the local copies that are present in the Docker source tree. So if Docker is using libABC version 1.0, then it has a build dependency on libABC. You can have a look at the current build dependencies at https://salsa.debian.org/docker-team/docker/blob/master/debian/control.

There are more than 100 dependencies there, and that's one reason why the Debian package is quite time-consuming to maintain. To give you a rough estimation, in order to get the current "stable" release of Docker to Debian "unstable", it took up to 40 uploads of related packages to stabilize the dependency tree.

It's quite an effort. And once again, why bother? For this part I'll quote Dmitry as he puts it better than me:

Debian cares about reusable libraries, and packaging them individually allows to build software from tested components, as Golang runs no tests for vendored libraries. It is a mind blowing argument given that perhaps there is more code in "vendor" than in the source tree.

Private vendoring have all disadvantages of static linking, making it impossible to provide meaningful security support. On top of that, it is easy to lose control of vendored tree; it is difficult to track changes in vendored dependencies and there is no incentive to upgrade vendored components.

That's about it, whether it matters is up to you and your use-case. But it's definitely something you should know about if you want to make an informed decision on which package you're about to install and use.

To finish with this article, I'd like to give more details on the packaging of docker.io, and what was done to get this new version in Debian.

Under the hood of the docker.io package

Let's have a brief overview of the difficulties we had to tackle while packaging this new version of Docker.

The most outstanding one is circular dependencies. It's especially present in the top-level dependencies of Docker: docker/swarmkit, docker/libnetwork, containerd... All of these are Docker build dependencies, and all of these depend on Docker to build. Good luck with that ;)

To solve this issue, the new docker.io package leverages MUT (Multiple Upstream Tarball) to have these different components downloaded and built all at once, instead of being packaged separately. In this particular case it definitely makes sense, as we're really talking about different parts of Docker. Even if they live in different git repositories, these components are not standalone libraries, and there's absolutely no good reason to package them separately.

Another issue with Docker is "micro-packaging", ie. wasting time packaging small git repositories that, in the end, are only used by one application (Docker in our case). This issue is quite interesting, really. Let me try to explain.

Golang makes it extremely easy to split a codebase among several git repositories. It's so easy that some projects (Docker in our case) do it extensively, as part of their daily workflow. And in the end, at a first glance you can't really say if a dependency of Docker is really a standalone project (that would require a proper packaging), or only just a part of Docker codebase, that happens to live in a different git repository. In this second case, there's really no reason to package it independently of Docker.

As a packager, if you're not a bit careful, you can easily fall in this trap, and start packaging every single dependency without thinking: that's "micro-packaging". It's bad in the sense that it increases the maintenance cost on the long-run, and doesn't bring any benefit. As I said before, docker.io has currently 100+ dependencies, and probably a few of them fall in this category.

While working on this new version of docker.io, we decided to stop packaging such dependencies. The guideline is that if a dependency has no semantic versioning, and no consumer other than Docker, then it's not a library, it's just a part of Docker codebase.

Even though some tools like dh-make-golang make it very easy to package simple Go packages, it doesn't mean that everything should be packaged. Understanding that, and taking a bit of time to think before packaging, is the key to successful Go packaging!

Last words

I could go on for a while on the technical details, there's a lot to say, but let's not bore you to death, so that's it. I hope by now you understand that:

  1. There's now an up-to-date docker.io package in Debian.
  2. docker.io and docker-ce both give you a Docker binary, but through a very different build process.
  3. Maintaining the docker.io package is not an easy task.

If you care about having a Docker package in Debian, feel free to try it out, and feel free to join the maintenance effort!

Let's finish with a few credits. I've been working on that topic, albeit sparingly, for the last 4 months, thanks to the support of Collabora. As for Dmitry Smirnov, the work he did on the docker.io package represents a three weeks, full-time effort, which was sponsored by Libre Solutions Pty Ltd.

I'd like to thank the Debian Go Packaging Team for their support, and also the reviewers of this article, namely Dmitry Smirnov and Héctor Orón Martínez.

Last but not least, I will attend DebConf18 in Taiwan, where I will give a speak on this topic. There's also a BoF on Go Packaging planned.

See you there!