Docker Detour

Docker keeps fascinating me, purely as a use case. From the image-hosting perspective, there are a couple of things missing at its current stage of development. The biggest and most obvious one is a shared, distributed, and deduplicated store for both image manifests (image metadata) and layer content (the data).

Because both are immutable and sha256-protected, the related complexity is about 3 orders of magnitude lower than it would be for anything less specialized.

Distributing the content-hashed and stacked stuff like this:


and then sharing it across multiple Docker hosts is a walk in the park, especially since image manifests and the write-once-read-many (WORM) layers they reference are organized in Merkle-type structures.
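To make the deduplication point concrete, here is a minimal sketch of a content-addressed store, assuming an in-memory dictionary stands in for the distributed backend. The names `ContentStore` and `put_image` are hypothetical and for illustration only; they are not part of Docker or of any registry API.

```python
import hashlib
import json


class ContentStore:
    """Toy content-addressed store: blobs are keyed by their sha256 digest."""

    def __init__(self):
        self.blobs = {}  # digest -> bytes (stand-in for a distributed backend)

    def put(self, data: bytes) -> str:
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        # Identical content maps to the same key, so a repeated put is a no-op.
        self.blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        data = self.blobs[digest]
        # Immutable content can be re-verified on every read.
        if "sha256:" + hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("corrupted blob " + digest)
        return data


def put_image(store: ContentStore, layers) -> str:
    """Store the layers, then a manifest that lists their digests.

    The manifest is itself hashed, which gives a small Merkle-type structure:
    the manifest digest covers the layer digests, which cover the layer bytes.
    """
    layer_digests = [store.put(layer) for layer in layers]
    manifest = json.dumps({"layers": layer_digests}, sort_keys=True).encode()
    return store.put(manifest)


store = ContentStore()
base = b"base OS layer"
image_a = put_image(store, [base, b"app A layer"])
image_b = put_image(store, [base, b"app B layer"])

# Two images, two manifests, but the shared base layer is stored only once.
print(len(store.blobs))
```

Since nothing is ever overwritten, any host that holds a manifest digest can fetch and verify the rest without coordinating with anyone else.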

The other, maybe not-so-obvious “miss” relates to the Docker registry, which is an upload/download type of object archive (emphasis on archive). Since ‘docker commit’ is, effectively, an object Put, the ability to commit-and-share seems like a natural extension: it would provide a distributed, live, versioned namespace for distributed apps.
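As a rough illustration of commit-as-Put, the sketch below keeps a version history per name on top of immutable, hashed objects. `SharedRegistry`, `commit` and `resolve` are made-up names, and the in-memory dictionaries are an assumption standing in for a shared backend; this is not how the Docker registry works today.

```python
import hashlib


class SharedRegistry:
    """Toy versioned namespace: each commit is a Put of immutable content."""

    def __init__(self):
        self.objects = {}   # digest -> immutable content
        self.versions = {}  # name -> list of digests, oldest first

    def commit(self, name: str, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        self.objects.setdefault(digest, content)
        history = self.versions.setdefault(name, [])
        # Record a new version only if the content actually changed.
        if not history or history[-1] != digest:
            history.append(digest)
        return digest

    def resolve(self, name: str, version: int = -1) -> bytes:
        """Return the content of a given version (latest by default)."""
        return self.objects[self.versions[name][version]]


registry = SharedRegistry()
v1 = registry.commit("myapp", b"image state 1")
v2 = registry.commit("myapp", b"image state 2")

# Any host sharing the registry sees the latest commit, and can still go back.
latest = registry.resolve("myapp")
first = registry.resolve("myapp", 0)
```

The point of the design is that the name is the only mutable thing; everything it points to stays write-once.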

And then there is performance: both container startup and, separately, runtime performance. If done right, centralized distributed hosting may be awesome in many of the usual respects, including scalability and ease of management. But the other side of this “coin” is running traffic to and from remote servers.

There are ways to drastically reduce the almost inevitable latency. There are also first projects/solutions in that space; see for instance one recent FAST’16 paper where the backend is the proprietary Tintri VMstore™. It’s a well-written, comprehensive and detailed paper, with a lot of good diligence, research and development that obviously went into it. However, as far as optimizing container startup performance goes, the presented bitmap-based block-level solution is not the best, and not only because bitmaps do not scale. But the broader point is that progress is being made.

Two tiers

And so the question is, what would be an ideal solution for this image-hosting use case? First off, we need to admit (and accept) the fact that a) local POSIX and b) a distributed namespace with a scalable WORM/CoW backend usually don’t go together.

Legacy kernel-mountable POSIX-compliant distributed filesystems, like NFS and Lustre, were never designed to handle this crystal clear “call” for two distinctly different tiers implementing distinctly different functions. Never designed in part because the NFS and Lustre (LNET) transports are file (and file/directory permission) based, which does not perform.

And so for now and until, I guess, the vacuum is filled, Docker users will have to keep using the same loopback “crutch” that was used in the FAST’16 paper I mentioned above: mounting block volumes backed by remote files, which in turn may or may not be distributed.