Compare Docker outside of Docker (DooD) with Docker in Docker (DinD)
In a microservice architecture, there are times when we would like to execute a command on another service that runs as a Docker container. In such cases, it is useful to mount the host's Docker into the container and delegate the command to the host, which then executes it on the desired container. To be precise, here we are discussing Docker outside of Docker (DooD) and not Docker in Docker (DinD). In DooD the container talks to the host's Docker daemon, whereas in DinD a separate Docker daemon runs inside the container.
Example use case:
This is only a sample use case. The core concept is applicable for any docker to docker communication that happens in a typical microservice architecture.
Let’s say we have Apache Airflow running as a Docker container. We also have Postgres running as a container. Apache Airflow needs:
- postgres database to be created
- database to be initialized with schema
- admin user to be created
Sample docker swarm setup
# docker-compose.yml
# Only shows the airflow webserver part as an example
# Here we mount Docker from the host into the container
airflow_webserver:
  image: yourcustom_image:tag
  depends_on:
    - postgres
  environment:
    - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql://username:pass@postgres:5432/airflow
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - ${DOCKER_BIN_PATH}:/usr/bin/docker
  ports:
    - 8080
  command: ["webserver", "-p", "8080"]
  networks:
    - mynet
  deploy:
    restart_policy:
      condition: none
- Notice that we mount docker.sock and the docker binary from the host.
- docker.sock is the Unix socket that the Docker daemon listens on in the host system. A process inside the container can use it to talk to the host Docker daemon. In simple terms, a socket is a communication endpoint: a TCP socket is identified by IP + port, while a Unix socket like this one is identified by a file path.
- We don’t need to install the Docker client in the Airflow image, because we mount the docker binary from the host system. I believe this may not work if the host system is Windows; I haven’t tried it.
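To make the role of docker.sock concrete, here is a minimal sketch that talks to the daemon directly over the mounted socket instead of going through the docker binary. It assumes curl (with --unix-socket support) is available in the container, which the original setup does not require. The function name docker_socket_ping is hypothetical, and /_ping is the Docker Engine API health-check endpoint.

```shell
# Ask the Docker daemon for a health check over its Unix socket.
# Returns non-zero if the path is not a socket or the daemon
# cannot be reached (for example, due to missing permissions).
docker_socket_ping() {
  local sock="${1:-/var/run/docker.sock}"
  if [ ! -S "$sock" ]; then
    echo "not a Unix socket: $sock" >&2
    return 1
  fi
  # --unix-socket sends the HTTP request through the socket file;
  # the host part of the URL is only a placeholder.
  curl --silent --fail --unix-socket "$sock" http://localhost/_ping
}
```

Calling docker_socket_ping inside the container is a quick way to tell whether a later docker command is likely to fail because of the socket itself rather than because of the command.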
Container user & group vs Host user & group
An important detail when mounting the host's Docker into a container is the user and group that the container process runs as, compared with the owner and group of the docker.sock file. The system checks whether that user has permission to access docker.sock.
If the container runs as user root, it may work without any customization of user, group, or permissions. As a good practice, many Docker containers don’t run as root and use another user instead.
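One way to see what the system is comparing is to print both sides: the IDs of the current process and the owner of the socket. The sketch below is only a diagnostic aid, and the function name check_socket_access is hypothetical. It tries GNU stat first and falls back to the BSD/macOS form.

```shell
# Print the IDs of the current process and the owner of the socket,
# then report (via exit status) whether the socket is readable and writable.
check_socket_access() {
  local sock="${1:-/var/run/docker.sock}"
  local owner
  # GNU stat (Linux) uses -c; BSD/macOS stat uses -f
  owner="$(stat -c '%u:%g' "$sock" 2>/dev/null \
    || stat -f '%u:%g' "$sock" 2>/dev/null \
    || echo unknown)"
  echo "process: uid=$(id -u) gids=$(id -G)"
  echo "socket:  uid:gid=$owner"
  # Talking to the daemon needs both read and write on the socket
  [ -r "$sock" ] && [ -w "$sock" ]
}
```

Run inside the container as the non-root user, for example check_socket_access /var/run/docker.sock. If the socket's GID is not among the listed gids and "others" have no access, a permission denied error is to be expected.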
Custom Airflow Dockerfile
FROM apache/airflow:1.10.10-python3.7
ADD ./entrypoint.sh /entrypoint
ADD ./initialize.sh /initialize.sh
USER root
# Even on doing this, there is no guarantee that permission will be granted. Please grant permission in host.
RUN groupadd docker && usermod -aG docker airflow && \
    usermod -aG root airflow
USER airflow
I have not added the entrypoint script here, since it is not the core concern. All the entrypoint does is execute any entrypoint logic specific to that container and then call initialize.sh.
The content of initialize.sh is shown below.
Permission denied issue on /var/run/docker.sock
# In host system, give read,write permission to others
sudo chmod u=rwx,g=rw,o=rw /var/run/docker.sock
- Generally, if we mount a file from the host into a container, permissions are determined by comparing UID (user ID) and GID (group ID). If the container runs as root, it usually has UID=0, which matches root on the host.
- When the container runs as a different user, it is hard to make the UID and GID match the host. The simplest way is to explicitly give access to others on the host.
- In the above Dockerfile, by running usermod -aG docker airflow, we also add docker as a secondary group of the airflow user. In general, on Linux systems docker.sock is owned by the docker user group. But there is no guarantee that the GID of docker inside the container and the GID of docker on the host will match.
- By running usermod -aG root airflow, we add the user airflow to the root user group. On macOS, /var/run/docker.sock is owned by the root user group.
- If our container, like the Airflow one, has both a root user and an airflow user, we need to handle the permissions explicitly.
- More on this and different ways to handle it
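As one possible alternative to opening the socket to others, the group inside the container can be aligned with the socket's GID when the container starts. The sketch below is not part of the original setup. It only prints the commands it would run, so they can be reviewed before being executed as root. The function names and the group name dockerhost are hypothetical.

```shell
# Print the numeric GID that owns a file (GNU stat first, then BSD/macOS).
file_gid() {
  stat -c '%g' "$1" 2>/dev/null || stat -f '%g' "$1" 2>/dev/null
}

# Print (do not run) the commands that would give USER a supplementary
# group whose GID matches the GID of the mounted socket.
docker_group_fix_cmds() {
  local sock="$1" user="$2" gid name
  gid="$(file_gid "$sock")" || return 1
  # Reuse a group that already has this GID inside the container, if any
  name="$(getent group "$gid" 2>/dev/null | cut -d: -f1)"
  if [ -z "$name" ]; then
    name="dockerhost"   # hypothetical group name
    echo "groupadd -g $gid $name"
  fi
  echo "usermod -aG $name $user"
}
```

For example, docker_group_fix_cmds /var/run/docker.sock airflow could be called from the entrypoint while it still runs as root, before switching to the airflow user. Because the GID is read at runtime, the same image would work on hosts where the docker group has different GIDs.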
Database initialization script
#!/usr/bin/env bash
# This script is idempotent. It can be run multiple times and
# will not cause any effect if state is
# equal to desired state.
#
# This script helps to create airflow db, initialize it,
# create an admin user
#
# create db (ignore the error if it already exists)
docker exec "$(docker ps | grep 'postgres' | awk '{print $1}')" createdb -U postgresAdminUser -E UTF8 --template=template0 airflow || true
# init db
airflow initdb
# create a default user. We can change password from UI
airflow create_user -r Admin -u admin \
    -p defaultAdminPassHere \
    -e testadmin@yourcompany.com \
    -f Admin \
    -l Org
- The above docker command executes using the host Docker daemon and finds the running container whose docker ps row contains postgres (grep matches the whole row, so the image name counts as well as the container name). If multiple containers match, we need to modify this grep command.
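When several containers may match, one option is to let docker ps do the filtering and to fail loudly unless exactly one container is found. The sketch below uses the --filter and --format options of docker ps. The helper name pick_single_container is hypothetical.

```shell
# Read container IDs (one per line) on stdin and print the ID only if
# there is exactly one. Otherwise print an error and return non-zero.
pick_single_container() {
  local ids count
  ids="$(cat)"
  # Count non-empty lines; "|| true" because grep exits 1 on zero matches
  count="$(printf '%s' "$ids" | grep -c '[^[:space:]]' || true)"
  if [ "$count" -ne 1 ]; then
    echo "expected exactly 1 matching container, found $count" >&2
    return 1
  fi
  printf '%s\n' "$ids" | grep '[^[:space:]]'
}

# Usage (needs a reachable Docker daemon, so it is left commented out):
# cid="$(docker ps --filter 'name=postgres' --format '{{.ID}}' | pick_single_container)" \
#   && docker exec "$cid" createdb -U postgresAdminUser -E UTF8 --template=template0 airflow
```

Unlike grep on the whole row, the name filter only looks at container names, and the helper makes the ambiguous case an explicit error instead of passing several IDs to docker exec.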
Additional Tips
- Don’t use interactive mode (-it) on commands executed this way. It will result in the error: The input device is not a TTY
- If a shell command’s errors are acceptable (for example, database creation fails because the database already exists) and the use case is to silently ignore them, then use <shell command> || true
More on this here
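A small variant of the || true tip, for cases where the failure should be tolerated but not completely hidden: the sketch below runs a command, always returns success, and writes a note to stderr when the command failed. The function name allow_failure is hypothetical.

```shell
# Run a command and tolerate its failure, but log the exit code to stderr
# instead of discarding it silently. Always returns 0.
allow_failure() {
  "$@" || echo "ignored failure (exit $?): $*" >&2
}

# Usage: allow_failure createdb -U postgresAdminUser airflow
```

This keeps scripts that use set -e running, just like || true, while leaving a trace in the container logs when something other than the expected "already exists" case goes wrong.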
Summary
- Bind mounting Docker from the host into running containers is not new, and many of us use it most of the time.
- When it errors out, it is at times hard to find the cause and fix it. Sometimes teams tend to take shortcuts to avoid these errors.
- By solving this problem, we can handle needs like the initial setup of services. Steps such as creating a database or registering a service with another service can be moved to the entrypoint of the concerned services.
- This provides a clear separation of responsibility, and each service will be self-sufficient.
- I have seen use cases where such initialization and base setup are moved into a separate service. This causes business logic to be scattered across multiple services. During refactoring, teams tend to miss making the equivalent changes in all services.