Understanding Docker Images and the Layered Filesystem

Andrei Ciorba · November 26, 2020 · #docker



A Docker image is just a minimal set of files on your disk. It includes code, libraries, and dependencies: everything required to run one very specific application.


[Video Transcript]

Besides namespaces and cgroups, we have one more construct, which is not really a requirement for containers, but Docker uses it for image and container management: the layered filesystem. Before speaking about it, we should briefly mention what exactly an image is from Docker's perspective. A Docker image is just a minimal set of files on your disk. That's it. It includes code, libraries, dependencies, everything required to run one very specific application.

Remember, containers are focused on one single app. So everything that you package inside an image, and everything you get to run inside a container, is there just to provide the bare minimum of resources, libraries, and dependencies for that specific app. Of course, this also includes the app itself. Right. And this is why Docker is sometimes considered a technology for packaging applications, because everything is neatly packaged inside these images. Now, Docker images have layers, just like onions and ogres: each layer usually adds more content to the image and builds upon the previous one.
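You can see these layers for yourself with the Docker CLI. A quick sketch (the `alpine:3.18` image is just an example; any image works):

```shell
# Pull a small image and list its layers; each line of "docker history"
# is one read-only layer, most created by a single Dockerfile instruction
docker pull alpine:3.18
docker history alpine:3.18

# The layer digests that make up the image's filesystem
docker inspect --format '{{json .RootFS.Layers}}' alpine:3.18
```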

All these layers in an image are read-only, in order to preserve the integrity of the base image.

So what do you do with these images?

Well, you use them as templates to start running containers based on them. There's no limit to the number of containers that can start from a single base image. So what actually happens every time you run a container based on an image is that the image filesystem is mounted read-only inside the container, and a new writable layer gets created on top of it for each and every container.
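A minimal sketch of this, using the official `nginx:alpine` image as an example:

```shell
# The same image, stored once on disk, backs any number of containers;
# each container gets only its own thin writable layer on top
docker run -d --name web1 nginx:alpine
docker run -d --name web2 nginx:alpine
docker run -d --name web3 nginx:alpine

# All three run from the single shared set of read-only image layers
docker ps --filter name=web

# Clean up
docker rm -f web1 web2 web3
```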

This is actually the only layer that can be changed by the container, every time files need to be created, changed, or deleted, because everything else is read-only. And in order for this to actually work, and to be sufficiently transparent to the user, this is where CoW, the copy-on-write mechanism, comes into play. If a container needs to change a file from the read-only image layers, for example a default configuration file, then the container first needs to make a copy of that file to its writable layer, and then change it in place.
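You can observe copy-on-write in action with `docker diff` (the file modified here is just an illustrative choice):

```shell
# Modify a file that comes from a read-only image layer; copy-on-write
# first copies it up into the container's writable layer
docker run --name cowdemo ubuntu sh -c 'echo "# tweaked" >> /etc/bash.bashrc'

# "docker diff" lists what changed in the writable layer:
# C = changed (copied up), A = added, D = deleted
docker diff cowdemo

docker rm cowdemo
```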

This also means that the writable layer lives only as long as the container exists. The moment you delete or recreate that container, the contents of that writable layer are lost. So if you have multiple containers started from the same base image, one writable layer per container is created, but the image itself is only stored once on your disk, and it is shared by all the container instances that use it. Finally, where do we get these images from?
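A quick sketch showing the writable layer disappearing with its container (file path chosen just for illustration):

```shell
# Create a file inside a container's writable layer...
docker run --name demo ubuntu sh -c 'echo hello > /data.txt'

# ...delete the container, and the writable layer (and the file) is gone
docker rm demo

# A fresh container from the same image starts from the pristine image layers
docker run --rm ubuntu sh -c 'test -f /data.txt || echo "file is gone"'
```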

And the good news is that it's extremely easy to find Dockerized applications on the largest public and free image registry out there, which is called Docker Hub. Now, in some aspects it's extremely similar to GitHub, if you've ever used it. And it's not the only source out there. There are registries like this from all the major cloud providers, like Amazon, Google, Oracle, Microsoft, and so on, and you can also host your own registry in your own data center.

Just to have a look, this is Docker Hub right here, accessible at hub.docker.com. The first page here actually shows my repositories. But if I go under the Explore section, the first one in the uppermost menu, we have access to almost four million Docker images right here.

Right. If you scroll just a bit through these images here, you're probably going to recognize at least a few of them, like Postgres (PostgreSQL), the Ubuntu Linux distribution, Redis, Node.js, MySQL, and so on. Right. Golang, Java, BusyBox, and a lot of the tools that you're probably already used to leveraging for software development and software distribution, for compiling code and so on.

All right, and without going into too much detail right here, if you just look under one single image, like Ubuntu, you'll see that it's actually a collection of images, available for a multitude of architectures. Right. x86, ARM, IBM Z, PowerPC, and so on and so forth. So that's another interesting piece of information here: the fact that containers can actually run on a multitude of architectures out there, from the Raspberry Pi to the public clouds.

Right. You'll also find some tags in here, which are actually just a way to differentiate between different versions of the same parent image. So, for example, as everybody probably knows, Ubuntu has a versioning scheme like 20.04 and so on. These tags here actually point to a specific image, so that you actually get the specific distribution version that you want to use.

You can also use as a tag names like Xenial or Bionic, the codenames given to these Ubuntu releases. We'll go back to Docker Hub in just a moment.
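A quick sketch of pulling by tag (for the record, `focal` is Ubuntu's codename for the 20.04 release):

```shell
# A tag pins a specific version of an image on Docker Hub
docker pull ubuntu:20.04

# Release codenames also work as tags; "focal" points at the same 20.04 image
docker pull ubuntu:focal

# Omitting the tag is shorthand for :latest
docker pull ubuntu
```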

So containers, as I mentioned before, are designed to be destroyed and recreated at any time. From what we just saw, deleting a container removes all the contents of its writable layer, so all the files that our container has changed or created are lost.

How would you be able to run a container that serves a web page or a database?

Right. What would be the point if you lost all that important data every time you recreate or update your container? So for things like user data, databases, and many other types of content, we need a persistent type of entity in Docker, with a lifecycle that is completely independent of that of your containers. And that is a Docker volume: a storage entity that can be mounted into your containers and provide them with persistent storage. A volume can be as simple as a path on your Docker host, or it can point to a network location.

You can even mount memory-backed volumes that are not stored on the disk, which is very useful for passing sensitive information into containers: configuration files that include admin passwords, certificates, stuff like that.
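A minimal sketch of both ideas (the volume name `appdata` and paths are just examples):

```shell
# Named volumes have a lifecycle independent of any container
docker volume create appdata

# Write into the volume from one container...
docker run --rm -v appdata:/data ubuntu sh -c 'echo persisted > /data/file.txt'

# ...and read it back from a brand-new container: the data survived
docker run --rm -v appdata:/data ubuntu cat /data/file.txt

# A memory-backed tmpfs mount never touches the disk
docker run --rm --tmpfs /secrets ubuntu df -h /secrets

docker volume rm appdata
```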

So there's a lot of flexibility here as well.

If you want to learn more about Docker, register now for one of our courses!
