Docker layer caching: faster agent image builds
Your agent rebuild takes 8 minutes every time you touch one line
You change a Python file in your agent. You run docker build. 8 minutes later it finishes. You made a one-line change. 8 minutes. You do this 15 times a day. That is 2 hours of your life per day watching pip install. Worse, your CI rebuilds from scratch on every PR and every merge, so every change pays the same tax.
The problem is not Docker. The problem is that your Dockerfile invalidates every cache layer the moment you change any file in your repo. The fix is ordering. Put slow-changing things (dependencies, base images) before fast-changing things (your code). Docker does the rest. Done right, a one-line code change rebuilds in under a minute.
This post is the layer ordering rule, the COPY discipline that makes it work, the cache mount pattern for pip and uv, and the 4 traps that silently invalidate caches on every build.
Why is your Docker build slow when nothing changed?
Because Docker caches build layers by their inputs. When you change any input, that layer and every layer after it rebuilds. If the first layer that copies your code is followed by the layer that installs dependencies, then every code change re-installs dependencies. That is the whole problem.
The naive Dockerfile looks like this:
# filename: Dockerfile.naive
# description: The Dockerfile you should not ship. COPY before pip install
# invalidates the dependency layer on every code change.
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "main.py"]
Every time anything in your repo changes, Docker re-runs the COPY layer, which means the pip install layer also has to re-run because its input (the files in /app) changed. Your 7-minute dependency install happens on every one-line fix.
The fix is to separate the things Docker uses to cache. Copy the dependency manifest first, install from it, and only then copy the code.
graph TD
A[Base image: python 3.12] --> B[System packages]
B --> C[Copy requirements.txt]
C --> D[pip install from requirements]
D --> E[Copy app code]
E --> F[Set CMD]
Change[One-line code change] --> E
style D fill:#dcfce7,stroke:#15803d
style E fill:#fef3c7,stroke:#b45309
style Change fill:#dbeafe,stroke:#1e40af
A change to app code only invalidates layer E and later. The expensive D layer stays cached.
How do you order layers for maximum cache reuse?
Order from slowest-changing to fastest-changing. The rule of thumb: what changes least goes first, what changes most goes last.
A typical agent image has 6 layers in this order:
- Base image (`FROM python:3.12-slim`). Rarely changes.
- System packages (`apt-get install`). Changes only when you add a new C library dependency.
- Python dependency manifest (`COPY requirements.txt` or `pyproject.toml`). Changes when you add or update a package.
- Dependency installation (`RUN pip install`). Changes only when the manifest changes.
- Application code (`COPY . .`). Changes on every development iteration.
- Entrypoint (`CMD`). Changes rarely.
# filename: Dockerfile
# description: Cached layer ordering for a FastAPI agent image.
# Dependencies install only when the manifest changes.
FROM python:3.12-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential curl \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
Read the order carefully. COPY requirements.txt alone comes before pip install. COPY . . comes after. A code change invalidates only the last 2 layers. A requirements change invalidates pip install and everything after. A base image change invalidates everything, but that happens rarely.
What is a cache mount and why does it beat --no-cache-dir?
A cache mount in a Dockerfile (RUN --mount=type=cache) gives pip a persistent cache directory that survives across builds. Without it, pip's downloaded wheels live only inside the layer, and the common --no-cache-dir flag discards them entirely, so whenever the dependency layer is invalidated, pip re-downloads every package from PyPI.
With a cache mount, pip downloads each wheel once, keeps it in /root/.cache/pip across builds, and reuses it on subsequent installs. The first install of a package is slow; every subsequent install is fast.
# syntax=docker/dockerfile:1.6
# filename: Dockerfile.cached
# description: Dependency install with a BuildKit cache mount.
# pip downloads wheels once and reuses them across builds.
FROM python:3.12-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
The # syntax=docker/dockerfile:1.6 directive selects a Dockerfile frontend that supports cache mounts, and it must be the very first line of the file: if any comment or instruction comes before it, Docker treats it as an ordinary comment and it does nothing. Build without BuildKit and the --mount flag is not understood at all. This is a silent failure I have shipped.
The combined effect of layer ordering plus cache mounts: a rebuild on a code change takes 20 to 40 seconds instead of 8 minutes. A rebuild on a new dependency takes 1 to 2 minutes instead of 8.
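The intro promised the same pattern for uv. A minimal sketch, assuming a `pyproject.toml` plus `uv.lock` project layout (the paths and the `app.main:app` module are placeholders from the examples above, not a prescribed layout); uv keeps its download cache in /root/.cache/uv, so the cache mount target changes accordingly:

```dockerfile
# syntax=docker/dockerfile:1.6
FROM python:3.12-slim
WORKDIR /app

# Manifest first, exactly as with pip: dependency layers cache
# independently of application code.
COPY pyproject.toml uv.lock ./

# Install dependencies only (not the project itself) so this layer is
# reused until the lockfile changes. uv's cache lives in /root/.cache/uv.
RUN --mount=type=cache,target=/root/.cache/uv \
    pip install uv && uv sync --frozen --no-install-project

# Now copy the code and install the project into the environment.
COPY . .
RUN --mount=type=cache,target=/root/.cache/uv uv sync --frozen

CMD ["uv", "run", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

The --no-install-project flag is what keeps the dependency layer independent of your code: it installs everything from the lockfile except your own package, which is installed in the cheap second sync.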
What are the 4 traps that quietly invalidate caches?
1. `COPY . .` as the first COPY in the file. Every file change invalidates dependency installation. Fix: copy the manifest first, install, then copy the code. This is the single biggest mistake.
2. `RUN apt-get update && apt-get install ...` without a trailing `rm -rf /var/lib/apt/lists/*` in the same RUN. Not a cache issue but an image size issue, and most beginner Dockerfiles forget it. The apt lists add 40MB of cached metadata to the final image.
3. `ADD` instead of `COPY` for local files. `ADD` has implicit behaviors (URL downloads, tar extraction) that make cache invalidation unpredictable. Use `COPY` unless you need `ADD`'s specific features.
4. A timestamp or build arg embedded in an early layer. If you declare `ARG BUILD_TIME` and reference it in layer 2, every build rebuilds every subsequent layer because the arg changes every time. Put build args only in the last layers, or use them to tag the final image rather than baking them into intermediate layers.
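The fourth trap is easiest to see side by side. A sketch with a hypothetical `GIT_SHA` build arg (my example, not from the list above): consumed before the install, it poisons every later layer; declared after the expensive layers, it only touches the cheap ones.

```dockerfile
# BAD: GIT_SHA changes on every build, and because it is consumed
# before pip install, the install layer rebuilds every time.
#   ARG GIT_SHA
#   ENV GIT_SHA=${GIT_SHA}
#   RUN pip install -r requirements.txt

# BETTER: expensive layers first, volatile arg last.
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# The arg now only invalidates the final, near-instant metadata layers.
ARG GIT_SHA=dev
ENV GIT_SHA=${GIT_SHA}
CMD ["python", "main.py"]
```

Note that BuildKit only invalidates layers where the arg's value is actually used, so an unused declaration is harmless; the damage starts the moment an early RUN or ENV consumes it.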
How does multi-stage help agent images?
Multi-stage builds let you use a fat build image (gcc, dev headers, compile tools) to produce artifacts and then copy only the runtime artifacts into a slim final image. For Python agents this is mostly about producing a smaller runtime image, not about speeding up builds, but it does help caching by isolating slow build steps in a stage that rarely changes.
# syntax=docker/dockerfile:1.6
# filename: Dockerfile.multistage
# description: Multi-stage build for an agent image. Builder stage has
# gcc and dev headers; runtime stage has only what's needed to run.
FROM python:3.12-slim AS builder
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /build
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
pip install --prefix=/install -r requirements.txt
FROM python:3.12-slim AS runtime
COPY --from=builder /install /usr/local
WORKDIR /app
RUN adduser --disabled-password --no-create-home agent
COPY . .
USER agent
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
The runtime image does not include gcc, build-essential, or pip's wheel cache. It is 150MB instead of 400MB. It pulls faster, boots faster, and has a smaller attack surface.
For the broader security picture including the non-root user pattern shown here, see the Docker Non-Root User for Agentic AI Security post. For the full production FastAPI stack, see FastAPI and Uvicorn for Production Agentic AI Systems.
How do you cache across CI runs?
Docker's local cache only helps if the build reuses the same daemon state. CI systems spin up fresh runners for every build, so the local cache is empty. Fix by pushing the cache to a registry.
2 mechanisms to cache across CI runs:
1. `--cache-from` and `--cache-to` with a registry. BuildKit supports exporting the full build cache to a tagged image (`myapp:cache`) and pulling it back on the next build. The first build is slow; subsequent builds hit the cache.
2. GitHub Actions cache. If you use `docker/build-push-action`, setting `cache-from: type=gha` and `cache-to: type=gha,mode=max` uses the GitHub-hosted cache backend, which is fast and free for most repos.
Either mechanism turns a 5-minute fresh-runner build into a 40-second cached build after the first run. Combined with good layer ordering, it makes CI fast enough to run on every push without slowing anyone down.
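As a sketch, the GitHub Actions variant might look like this (action versions and the `myapp` tag are placeholders; check the docker/build-push-action docs for current options):

```yaml
name: build
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # buildx is required for the gha cache backend.
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v6
        with:
          context: .
          push: false
          tags: myapp:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max
```

The registry variant is the same idea from the CLI: `docker buildx build --cache-to=type=registry,ref=myapp:cache,mode=max --cache-from=type=registry,ref=myapp:cache .`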
What to do Monday morning
- Open your Dockerfile. If `COPY . .` comes before `RUN pip install`, that is your single biggest problem. Reorder: copy the manifest, install, then copy the code.
- Add the `# syntax=docker/dockerfile:1.6` directive as the first line of the file and a cache mount on the pip install line. Enable BuildKit (`export DOCKER_BUILDKIT=1`) if you are not already using it.
- Check that you are using `COPY`, not `ADD`, for every local file. Remove any `ADD` statements whose only purpose is copying a local file.
- Split the build into a builder stage and a runtime stage if your image is over 300MB. The gcc-free runtime image is smaller and more secure.
- Wire up `cache-to` and `cache-from` in CI. Every subsequent build reuses the cache from the previous one, and the feedback loop speeds up immediately.
The headline: Docker layer caching is not magic. It is a layer ordering decision and a cache mount flag. 15 minutes to fix. 8 minutes saved per build. Every day. Forever.
Frequently asked questions
Why is my Docker build so slow even when I only change one file?
Because your Dockerfile copies the whole repo before installing dependencies. Any file change invalidates the copy layer and every layer after it, including the expensive pip install. Fix by copying only the dependency manifest first, installing dependencies, and then copying the rest of the code. That change alone usually cuts build time by 80 to 90 percent.
What is a BuildKit cache mount?
A cache mount (RUN --mount=type=cache) gives a build step a persistent cache directory that survives across builds. For pip, the cache lives in /root/.cache/pip and stores downloaded wheels. After the first build, every subsequent install is significantly faster because pip does not re-download from PyPI. You need BuildKit enabled and the # syntax=docker/dockerfile:1.6 directive at the top of the file.
Should I use multi-stage builds for Python agent images?
Yes if the final image is over 300MB or if you care about attack surface. A builder stage with gcc and build tools produces the Python wheels; a runtime stage copies only the installed packages into a slim base. The runtime image is smaller, boots faster, and has no build tools an attacker could exploit. For small agents with pure-Python dependencies, the savings are smaller but still meaningful.
What is the difference between COPY and ADD in Dockerfiles?
COPY copies local files only. ADD can download URLs and extract tar archives automatically, which makes cache invalidation harder to predict and is usually not what you want. Use COPY for local files every time. Use ADD only when you specifically need its extra behaviors, which is rare in modern Dockerfiles.
How do I cache Docker builds across CI runs?
Use --cache-from and --cache-to with either a registry backend (push the cache as a tagged image) or a CI-native backend like GitHub Actions cache. The first CI build is slow because the cache is cold; every subsequent build pulls the cache and skips the expensive layers. Combined with good layer ordering, this turns 5-minute CI builds into 40-second ones.
Key takeaways
- Docker invalidates a layer and every layer after it when the layer's input changes. Order your Dockerfile so slow-changing things come first.
- Copy `requirements.txt` alone, install dependencies, then copy the rest of the code. This is the single biggest speedup and costs nothing.
- Enable BuildKit and add a cache mount on the pip install line. pip downloads wheels once and reuses them across builds.
- Split builder and runtime stages to keep the final image small and secure. The runtime image should not include gcc or build tools.
- Wire up `cache-to` and `cache-from` in CI so every runner starts with a warm cache from the previous build.
- To see these patterns wired into a full production agent stack with security, auth, and observability, walk through the Build Your Own Coding Agent course, or start with the AI Agents Fundamentals primer.
For the official Docker BuildKit documentation covering cache mounts, multi-stage builds, and cache import/export, see the Docker BuildKit docs. Every optimization in this post is documented in detail there.