The CI build setup initially relied on a shared Docker daemon, briefly transitioned through ephemeral Docker-in-Docker (DinD), and ultimately switched to using the host nodes’ containerd and buildkitd sockets.

The Single Beefy Node: A Shared Approach

Initially, a large Kubernetes node with its own Docker instance was utilized. All CI jobs sent their docker build commands to this single daemon, allowing easy reuse of image layers and inline caches. However, disk I/O quickly became a bottleneck when multiple jobs ran concurrently.

  • Pros:
    • Shared image layers and fast local caching.
  • Cons:
    • Resource contention when many jobs run simultaneously.
    • Limited scalability due to disk constraints.
    • Inability to scale down to zero during idle periods, causing unnecessary costs.
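
In that setup, every job pointed its Docker client at the same daemon. Exactly how jobs reached it is not detailed above; one common pattern, shown here as a minimal sketch with a placeholder image name, is to mount the node’s Docker socket into the job and build against it:

# Hypothetical wiring: the node's Docker socket is mounted into the job pod,
# so every build hits the same daemon and reuses its layer cache.
export DOCKER_HOST=unix:///var/run/docker.sock
docker build --cache-from "${IMAGE}:latest" -t "${IMAGE}:${CI_COMMIT_SHORT_SHA}" .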

Docker-in-Docker: A New Perspective

Following GitLab’s recommendations, the Docker-in-Docker model was adopted. Each job pod included a sidecar container (a GitLab CI “service”) running its own Docker daemon. This setup provided isolation and allowed the Kubernetes cluster to scale up or down more freely.

  • Advantages:
    1. Isolation: Each job ran in its own environment, minimizing cross-job impact.
    2. Scalability: The Kubernetes cluster could scale up or down depending on workload, a capability lacking in the initial setup.
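
For illustration, the job-side wiring under that model looks roughly like this; the hostname, port, and TLS paths are the usual values from GitLab’s documentation rather than anything specific to this setup:

# The dind sidecar is reachable via its service alias "docker"; TLS certificates
# are shared between the job container and the sidecar through a common volume.
export DOCKER_HOST=tcp://docker:2376
export DOCKER_TLS_CERTDIR=/certs
export DOCKER_CERT_PATH=/certs/client
export DOCKER_TLS_VERIFY=1
docker build -t "${IMAGE}:${CI_COMMIT_SHORT_SHA}" .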

Despite these improvements, persistent inline caching was still missing. Once a pod completed, its Docker daemon (and cached layers) would disappear, requiring repeated pulls of large image inline caches from an external registry.

Mitigating Overhead with a Docker Registry Proxy

To address the issue of re-pulling large inline caches, a local Docker registry proxy was introduced. This alleviated some of the overhead by reducing network fetch times and costs, but did not enable persistent caching across builds on the same node.
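
The proxy itself is not described in detail; a minimal sketch of a Docker Hub pull-through cache using the standard registry image (name and port are placeholders) looks like this, with the DinD daemons then started with --registry-mirror pointing at it:

# Pull-through cache for Docker Hub; cached blobs are served locally on port 5000.
docker run -d --name registry-mirror -p 5000:5000 \
  -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
  registry:2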

Embracing Host Containerd & BuildKit

The decision was made to let CI pods communicate directly with the host nodes’ containerd and buildkitd sockets. Although this requires more privileged access, it proved beneficial in a dedicated build cluster with no production workloads running.

  • Key Benefits:
    1. Reduced Overhead: No need to spin up DinD sidecars. Jobs share the host node’s container runtime for image builds.
    2. Persistent Caching: Produced image layers remain on the node, avoiding repeated remote pulls and speeding up subsequent builds.

Implementing Changes

On the Host Node

BuildKit was installed and enabled:

#!/bin/bash
set -euo pipefail

BUILDKIT_VERSION="0.16.0"

# Download the BuildKit release and unpack buildkitd and buildctl into /usr/local/bin.
curl -fsSL "https://github.com/moby/buildkit/releases/download/v${BUILDKIT_VERSION}/buildkit-v${BUILDKIT_VERSION}.linux-arm64.tar.gz" \
  | sudo tar -xz -C /usr/local/

# Run buildkitd against the host's containerd instead of a separate OCI worker,
# so built layers land in the same content store the node already uses.
sudo tee /etc/systemd/system/buildkitd.service > /dev/null <<EOF
[Unit]
Description=BuildKit Daemon
After=network.target

[Service]
ExecStart=/usr/local/bin/buildkitd --oci-worker=false --containerd-worker=true --containerd-worker-addr=/var/run/containerd/containerd.sock
Restart=always

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable buildkitd --now
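
A quick way to confirm the daemon is up and backed by the containerd worker (a standard buildctl command, shown here as a sanity check):

# Should list a single containerd worker with the node's platforms.
sudo buildctl --addr unix:///run/buildkit/buildkitd.sock debug workers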

In GitLab Runner Configuration

An example snippet to mount containerd and buildkitd sockets:

[[runners]]
  [runners.kubernetes]
  # ... other settings ...

    [[runners.kubernetes.volumes.host_path]]
      name = "containerd"
      mount_path = "/run/containerd/containerd.sock"
      host_path = "/var/run/containerd/containerd.sock"

    [[runners.kubernetes.volumes.host_path]]
      name = "buildkitd"
      mount_path = "/run/buildkit/buildkitd.sock"
      host_path = "/var/run/buildkit/buildkitd.sock"

For jobs that create additional containers, read-only mounts for /var/lib/containerd and /opt/cni/bin can also be used:

[[runners]]
  [runners.kubernetes]
  # ... other settings ...

    [[runners.kubernetes.volumes.host_path]]
      name = "containerdlayers"
      host_path = "/var/lib/containerd"
      mount_path = "/var/lib/containerd"
      read_only = true

    [[runners.kubernetes.volumes.host_path]]
      name = "cnibridge"
      host_path = "/opt/cni/bin/"
      mount_path = "/opt/cni/bin/"
      read_only = true
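
With those extra mounts, a job can also run throwaway containers against the host runtime; a small sketch (image and command are arbitrary):

# nerdctl picks up the host's CNI plugins from /opt/cni/bin (its default CNI path)
# and reuses layers already present under /var/lib/containerd instead of pulling them again.
nerdctl --address /run/containerd/containerd.sock run --rm alpine:latest echo "runs on the host runtime"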

With containerd.sock mounted, the Docker CLI was replaced with nerdctl:

FROM alpine:latest

ARG NERDCTL_VERSION="2.0.3"
# "arm" expands to the linux-arm64 release artifact below; use ARCH=amd for x86_64 nodes.
ARG ARCH=arm

# Base tooling used by the CI job scripts.
RUN apk add --upgrade --no-cache \
    ca-certificates bash curl jq yq

# nerdctl: Docker-compatible CLI that talks to the host's containerd socket.
RUN curl -fsSL https://github.com/containerd/nerdctl/releases/download/v${NERDCTL_VERSION}/nerdctl-${NERDCTL_VERSION}-linux-${ARCH}64.tar.gz \
    | tar -xz -C /usr/local/bin

# buildctl: BuildKit client, copied from the upstream BuildKit image.
COPY --from=moby/buildkit /usr/bin/buildctl /usr/local/bin/

In GitLab Build Job

In CI job scripts, docker build was replaced with:

nerdctl build \
  --output type=image,\"name=${TAGS_LIST}\",push=true \
  --cache-to type=${CACHE_TYPE},ref=${TAGS_LIST%%,*}-cache \
  --cache-from type=${CACHE_TYPE},ref=${TAGS_LIST%%,*}-cache \
  ${DOCKER_ARGS} \
  .
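
The surrounding variables are defined earlier in the job; the values below are placeholders to show their shape, not the project’s actual tags (the cache ref is derived from the first tag in the list):

TAGS_LIST="registry.example.com/app:${CI_COMMIT_SHORT_SHA},registry.example.com/app:latest"
CACHE_TYPE="registry"                      # cache is pushed to <first tag>-cache
DOCKER_ARGS="--build-arg SOME_ARG=value"   # any extra per-project build flags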

Outcome

  • Registry Proxy:
    A local registry proxy is still maintained, which speeds up pulls and reduces network costs by avoiding unnecessary requests to external registries (a configuration sketch follows this list).

  • Containerd & BuildKit:
    Connecting directly to the host’s container runtime retains cached layers between jobs, shortens build times, and reduces maintenance. The trade-off is somewhat less isolation, but for a dedicated build cluster, this is acceptable.
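
On the containerd side, pointing pulls at the proxy is typically done with a per-registry hosts.toml on each build node. A minimal sketch, assuming containerd’s registry config_path is set to /etc/containerd/certs.d and the proxy is reachable at registry-proxy.internal:5000 (both are placeholders, not values from the original setup):

# Mirror Docker Hub pulls through the local proxy; falls back to the upstream server.
sudo mkdir -p /etc/containerd/certs.d/docker.io
sudo tee /etc/containerd/certs.d/docker.io/hosts.toml > /dev/null <<EOF
server = "https://registry-1.docker.io"

[host."http://registry-proxy.internal:5000"]
  capabilities = ["pull", "resolve"]
EOF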

Future Optimizations

  • Git LFS Caching:
    Plans are in place to cache Git LFS blobs similarly to image layers. This will reduce pull times for large repositories that frequently change.

  • Dependency Caching:
    By utilizing GitLab’s built-in caching, packages and libraries will be stored during image builds, making them easily accessible for subsequent jobs.