Docker containers as virtual machines for build environments

This article describes a way to use Docker containers as lightweight virtual machines suitable for building software. Containerizing build environments makes it possible to build with different Linux distributions and to isolate the build environment from the host system, with only a negligible performance penalty.

Containers vs. VMs

Docker containers are not real virtual machines; they are virtualized only at the OS level. Compared to building with full virtual machines (using a tool like Vagrant), this leads to some interesting advantages and disadvantages:

  • A Docker container runs directly on the same Linux kernel and CPU as the host system. Running a different Linux distribution in the container therefore requires that distribution to be compatible with the host system's kernel and CPU. Unlike VMs, Docker containers can't be used to build natively for a different operating system or CPU architecture.
  • A container isn't configured with a fixed amount of memory and CPU cores from the start like a VM is; by default it simply takes resources as needed, like any other program. This saves a lot of memory compared to running a real VM.
  • Files can be shared between the container and the host system through highly efficient bind mounts or volumes. With a real VM, files can't be shared as efficiently, because some kind of network file system or file sync mechanism must be used.
  • Docker containers tend to start and exit faster than VMs boot and shut down.
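
The shared-kernel point is easy to check by hand. The following is a minimal sketch, assuming a native Linux host with Docker installed and access to the ubuntu:20.04 image; the two function names are made up for this illustration.

```shell
# Hypothetical helpers (names invented for this example).

# Kernel release as seen by the host.
host_kernel() {
    uname -r
}

# Kernel release as seen from inside a throwaway container.
# --rm deletes the container again as soon as uname exits.
container_kernel() {
    sudo docker run --rm "${1:-ubuntu:20.04}" uname -r
}
```

On a native Linux host, host_kernel and container_kernel should print the same release string, because the container has no kernel of its own.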

Dockerfiles, images and containers

When using Docker, it's important to understand a few basic terms:

Docker images are templates used to create containers. An image can't be run; a container can.

A container is in many ways comparable to a virtual machine instance. It's created from an image with some configuration values and has a life of its own. It has files in a private file system. It can be stopped and started multiple times. It can be attached to with a terminal and operated like a separate system.

A Dockerfile is a recipe for defining a new Docker image. A Dockerfile typically specifies what to add to an existing Docker image in order to get the new image.
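
To make the image/container relationship concrete, here is a small sketch that lists every container created from a given image. The function name and the DOCKER override variable are assumptions made for this example; the ps options used (-a, --filter ancestor=..., --format) are regular Docker CLI options.

```shell
# DOCKER is a hypothetical override variable (not a Docker feature). It lets
# the helper run without sudo or against a different client binary.
DOCKER=${DOCKER:-sudo docker}

# Print the names of all containers, running or stopped, that were created
# from the given image.
containers_from_image() {
    $DOCKER ps -a --filter "ancestor=$1" --format '{{.Names}}'
}
```

Calling containers_from_image ubuntu:20.04 prints one name per container, which illustrates that a single image can back any number of containers.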

Why this method isn't using a Dockerfile

Since our goal here is to emulate the experience of a virtual machine that is manually managed and operated, as opposed to a fully automatic build solution, there is no good reason to define a custom image with a Dockerfile. Basing our container directly on one of the generic Linux distribution images, such as ubuntu:20.04, is perfectly adequate.

Most of the container customization and configuration must be scripted anyway.

A note about users in containers

Docker is usually run as root on the host system in order to accomplish the OS-level virtualization, and the user inside the container starts out as root (UID 0, GID 0) as well. This is not as bad as it sounds, because the container can only access mounted volumes and its own files.

However, it's not very convenient to have all the generated build artifacts owned by root. That's why, in this method, a user named builder is created inside the container with the same UID and GID as the user that ran the script on the host system. Files created by builder in the container will thus appear to be owned by the operating user on the host system.
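
Whether the UID and GID mapping worked can be verified on the host by looking at the numeric owner of a build artifact. The helpers below are a sketch with invented names, and they assume GNU stat (the -c option), as found on typical Linux hosts.

```shell
# Print "UID:GID" of the owner of a file (GNU stat syntax).
owner_ids() {
    stat -c '%u:%g' "$1"
}

# Succeed if the file is owned by the user running this shell.
owned_by_me() {
    [ "$(stat -c '%u' "$1")" = "$(id -u)" ]
}
```

Running owned_by_me on a file that builder created in the mounted directory should succeed, while a file created by root inside the container would report 0:0 from owner_ids.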

Enough talk!

Let's build Python 3.9.5 inside a Docker container running ubuntu:20.04 as a demonstration.

First, install Docker on the host system:

  • sudo apt install docker.io

Change directory to an empty folder and create a couple of scripts:

Contents of container_make:

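#!/bin/bash

# Name of the container to create
CONT_NAME=build_test

# Absolute path of the current directory, and the invoking user's UID and GID
THIS_DIR=$(readlink -f "./")
USER_UID=$(id -u)
USER_GID=$(id -g)
IMAGE=ubuntu:20.04

echo "This will instantiate a Docker container named: $CONT_NAME for building with: $IMAGE

A user named: builder will be made in the container that mirrors the current user's UID: $USER_UID and GID: $USER_GID

Current dir will be mounted as /build inside the container
"

read -r -p "Press [Enter] key to proceed"

# Bind mount: current directory on the host -> /build in the container
MOUNT1="$THIS_DIR:/build"
# Pass the UID and GID on to the bootstrap script as environment variables
ENV_OPTS="-e USER_UID=$USER_UID -e USER_GID=$USER_GID"

# ENV_OPTS is left unquoted on purpose so that it splits into separate options
sudo docker run -it --init -v "$MOUNT1" $ENV_OPTS --name "$CONT_NAME" --entrypoint "/build/bootstrap" "$IMAGE"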

A couple of things to note here:

  • The script looks up the current user's UID and GID and passes them along as environment variables to the container, so the bootstrap script can set up the builder user.
  • The current dir is mounted as /build in the container. Extra mounts may be added by adding more -v options.
  • The --entrypoint option instructs Docker to run the given script each time the container starts. Attaching to an already running container connects to that running script instead of starting it again.
  • The -it and --init options are necessary to make the container's terminal act reasonably close to a regular Linux terminal.
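
As an illustration of the extra-mounts point, the sketch below assembles the same docker run command line for any number of mounts and prints it, one argument per line, without running anything. The function name is invented for this example, and any mount beyond the /build one is hypothetical.

```shell
# Print the docker run command line for a build container, one argument per line.
# Usage: print_run_args NAME IMAGE [MOUNT...]   (MOUNT is host_dir:container_dir)
print_run_args() (
    name=$1 image=$2
    shift 2
    # The remaining arguments are mounts; turn each one into a "-v MOUNT" pair.
    n=$#
    while [ "$n" -gt 0 ]; do
        set -- "$@" -v "$1"
        shift
        n=$((n - 1))
    done
    # Wrap the mount options in the same options that container_make uses.
    set -- docker run -it --init "$@" \
        -e "USER_UID=$(id -u)" -e "USER_GID=$(id -g)" \
        --name "$name" --entrypoint /build/bootstrap "$image"
    printf '%s\n' "$@"
)
```

For example, print_run_args build_test ubuntu:20.04 "$PWD:/build" /srv/cache:/cache shows the command with two -v options, where /srv/cache:/cache is a made-up second mount.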

In the same folder, create a script named bootstrap:

#!/bin/bash

# The existence of the builder user marks a container that is already bootstrapped
if id "builder" &> /dev/null; then
    echo "Bootstrap was already done"
else
    echo ""
    echo "Bootstrapping build container"
    echo "-----------------------------"
    echo ""
    echo "Adding user: builder with UID: $USER_UID, GID: $USER_GID"

    # Mirror the host user's GID and UID so files in /build get the right owner
    groupadd -g "$USER_GID" builder
    adduser --disabled-password --gecos "" --uid "$USER_UID" --gid "$USER_GID" builder

    echo ""
    echo "Installing basic packages and setting up locale"

    apt update
    apt upgrade -y
    apt install -y locales dialog command-not-found python3 software-properties-common wget
    locale-gen en_US.UTF-8
fi

# Use a UTF-8 locale in the interactive shell
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8

# Drop into an interactive shell in the mounted build directory
cd /build
bash

The bootstrap script starts by testing if the user "builder" has already been created to determine if bootstrapping has already been done.
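
One assumption baked into the bootstrap script is that the host user's GID is not already taken by a group in the image, since groupadd refuses to reuse an existing GID by default. A possible check is sketched below; the function name is invented, and the optional second argument exists only so the function can be tried against a file other than /etc/group.

```shell
# Print the name of the group that owns a GID, or nothing if the GID is free.
# $1: numeric GID, $2: group file to search (defaults to /etc/group)
group_name_for_gid() {
    awk -F: -v gid="$1" '$3 == gid { print $1; exit }' "${2:-/etc/group}"
}
```

Inside the container, an empty result from group_name_for_gid "$USER_GID" means the groupadd call can go ahead as written. A non-empty result means the GID already belongs to an existing group.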

Generating and setting up the locale environment is necessary to make the container's terminal support international characters and to make programs such as Python use UTF-8 by default when opening files. Otherwise ASCII may be used as the default, leading to all sorts of issues.
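
To see which locale a shell would hand to programs for character handling, the usual precedence is LC_ALL, then LC_CTYPE, then LANG. The sketch below applies that precedence and checks whether the result names a UTF-8 locale; both function names are invented for this example.

```shell
# Print the locale name that governs character handling, or "C" if none is set.
effective_ctype_locale() {
    echo "${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}"
}

# Succeed if a locale name such as en_US.UTF-8 or de_DE.utf8 selects UTF-8.
is_utf8_locale() {
    case "$1" in
        *.[Uu][Tt][Ff]-8*|*.[Uu][Tt][Ff]8*) return 0 ;;
        *) return 1 ;;
    esac
}
```

In a shell started by bootstrap, is_utf8_locale "$(effective_ctype_locale)" should succeed, because LC_ALL is exported as en_US.UTF-8. This only inspects the variables; it does not confirm that the locale was actually generated.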

In the same folder, make one more script named: container_attach

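#!/bin/bash

# Must match the container name used in container_make
CONT_NAME=build_test

echo "Starting (if necessary) and attaching to container: $CONT_NAME"

# Starting an already running container does no harm
sudo docker container start "$CONT_NAME"
sudo docker container attach "$CONT_NAME"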

This script is handy for restarting or reconnecting to the container after stopping or detaching from it.
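
If it's unclear whether the container is running, stopped, or was never created, its state can be queried first. This is a sketch only: the function name and the DOCKER override variable are assumptions for this example, while docker container inspect with a --format template is a regular Docker CLI call.

```shell
# DOCKER is a hypothetical override variable (not a Docker feature).
DOCKER=${DOCKER:-sudo docker}

# Print "running", "stopped" or "missing" for the named container.
# Note: any inspect failure (for example, no daemon) is reported as "missing".
container_state() {
    running=$($DOCKER container inspect --format '{{.State.Running}}' "$1" 2>/dev/null) || {
        echo missing
        return 0
    }
    if [ "$running" = "true" ]; then
        echo running
    else
        echo stopped
    fi
}
```

A result of missing from container_state build_test suggests running ./container_make first; the other two states can be handled by ./container_attach.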

Make the scripts executable:

  • chmod +x container_make container_attach bootstrap

Building Python in a container

With these 3 scripts ready, let's make the container:

  • ./container_make

Once the container is up and the bootstrap script has finished, the terminal shows a shell inside the container, running as root. Install the dependencies for building:

  • apt install make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev

Change to the intended build user:

  • su builder

Download and build Python:

  • wget https://www.python.org/ftp/python/3.9.5/Python-3.9.5.tgz
  • tar xf Python-3.9.5.tgz
  • cd Python-3.9.5
  • ./configure
  • make -j 4

That should work out of the box. Notice that the build artifacts are already available in the mounted directory on the host system.
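
The job count of 4 is a fixed guess. A small sketch for picking it from the machine instead is shown below; the function name is invented, and it falls back to 4 when neither nproc nor getconf can report a CPU count.

```shell
# Print the number of parallel make jobs to use: the number of available
# CPUs, or 4 if that cannot be determined.
build_jobs() {
    n=$(nproc 2>/dev/null) || n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=4
    echo "$n"
}
```

With this in place, the build step becomes make -j "$(build_jobs)". After a successful build, running ./python --version in the source directory should report Python 3.9.5.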

To exit the container, which will stop it, run: exit or press Ctrl-d

To detach from the container and keep it running, press Ctrl-p followed by Ctrl-q

To start again or reattach, run: ./container_attach

Final remarks

The container will persist until it's deleted:

  • sudo docker container rm build_test
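
Docker refuses to remove a container that is still running, so a detached container has to be stopped first. The sketch below combines both steps; the function name and the DOCKER override variable are assumptions made for this example.

```shell
# DOCKER is a hypothetical override variable (not a Docker feature).
DOCKER=${DOCKER:-sudo docker}

# Stop the named container if it is running, then delete it.
remove_container() {
    $DOCKER container stop "$1" > /dev/null && $DOCKER container rm "$1"
}
```

Removing the container with remove_container build_test discards everything installed inside it, but the files under /build stay on the host because they live in the mounted directory.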

To list all containers, running or stopped:

  • sudo docker ps -a
