Setting up the agent stack on a new laptop takes half a day

A new engineer joins your team. Their first day looks like: install Postgres, install Redis, clone the agent repo, set up a Python venv, install dependencies, set 15 environment variables, migrate the database, install Langfuse for traces, figure out why the agent can't reach Postgres on localhost, give up, ping you on Slack. You spend an hour walking them through the setup. Then next week someone else joins and you do it again.

The fix is Docker Compose. One file describes every service the agent needs, one command starts them all, one command stops them all. Every engineer gets the same stack in under 90 seconds. Production runs the same image that docker compose up built locally.

This post is the Docker Compose pattern for the full AI agent stack: which services belong in the Compose file, the network rules that avoid "cannot reach Postgres" errors, volume persistence for data that should survive restarts, and the 3 gotchas that catch first-time users.

Why is manual setup the wrong default for agent stacks?

Because an agent stack has 4-6 dependencies, each with its own version, config, and port. Manual setup has 3 specific failure modes:

  1. Version drift. Engineer A has Postgres 15, engineer B has Postgres 16, production has Postgres 14. Bugs that only reproduce on one machine.
  2. Port conflicts. Postgres wants 5432, Langfuse 3000, Redis 6379; if anything else on the laptop already uses one of these ports, setup fails.
  3. Tedium. Onboarding is a full afternoon of yak-shaving. Engineers stop wanting to help new hires.

Docker Compose fixes all 3 by shipping a single declarative file that describes the whole stack, pins every version, and isolates networking to a private bridge.

graph LR
    Compose[docker compose up] --> Agent[agent service]
    Compose --> DB[(postgres)]
    Compose --> Cache[(redis)]
    Compose --> Trace[langfuse]

    Agent --> DB
    Agent --> Cache
    Agent --> Trace

    subgraph private_network
        Agent
        DB
        Cache
        Trace
    end

    style Compose fill:#dcfce7,stroke:#15803d

What goes in the Compose file?

Every service your agent needs at runtime, nothing more. Typical agent stack: your app, Postgres, Redis, maybe Langfuse for tracing and Grafana for metrics.

# filename: docker-compose.yml
# description: Full AI agent stack for local development.
# Run: docker compose up

services:
  agent:
    build:
      context: .
      dockerfile: Dockerfile
      target: runtime
    ports:
      - "8000:8000"
    environment:
      APP_ENV: development
      DATABASE_URL: postgresql+asyncpg://agent:devpass@postgres:5432/agent
      REDIS_URL: redis://redis:6379
      LANGFUSE_HOST: http://langfuse:3000
      OPENAI_API_KEY: ${OPENAI_API_KEY:-}
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    restart: unless-stopped

  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: agent
      POSTGRES_PASSWORD: devpass
      POSTGRES_DB: agent
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "agent"]
      interval: 5s
      timeout: 3s
      retries: 5

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5

  langfuse:
    image: langfuse/langfuse:2
    ports:
      - "3000:3000"
    environment:
      # Langfuse needs its own database; the postgres service only creates
      # the "agent" db, so create this one once after first startup:
      #   docker compose exec postgres createdb -U agent langfuse
      DATABASE_URL: postgresql://agent:devpass@postgres:5432/langfuse
      NEXTAUTH_SECRET: dev-secret-change-me
      SALT: dev-salt
      NEXTAUTH_URL: http://localhost:3000
    depends_on:
      postgres:
        condition: service_healthy

volumes:
  postgres_data:
  redis_data:

5 decisions in this file do real work. depends_on with condition: service_healthy makes the agent wait for Postgres and Redis to be ready, not just started. The healthcheck blocks give Compose something concrete to test readiness against (pg_isready, redis-cli ping). Named volumes persist data across docker compose down, so databases do not reset on every restart. OPENAI_API_KEY: ${OPENAI_API_KEY:-} reads from the host env so the secret never lands in the file. And the agent service uses service names (postgres, redis) as hostnames because Compose sets up internal DNS on a private network.
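On the application side, the agent only needs to read the variables that the environment block injects. A minimal sketch using only the standard library (AgentConfig is an illustrative name; a real service might use Pydantic Settings instead):

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    """Runtime settings read from the environment Compose injects."""

    database_url: str
    redis_url: str
    langfuse_host: str
    openai_api_key: str

    @classmethod
    def from_env(cls) -> "AgentConfig":
        env = os.environ
        return cls(
            database_url=env["DATABASE_URL"],  # required: fail fast if missing
            redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
            langfuse_host=env.get("LANGFUSE_HOST", ""),
            # Empty default mirrors ${OPENAI_API_KEY:-} in the Compose file.
            openai_api_key=env.get("OPENAI_API_KEY", ""),
        )
```

Because every value comes from the environment, the same image runs unchanged under Compose, in CI, and in production; only the injected variables differ.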

For the Dockerfile that backs the agent service, see the Dockerizing AI systems layered approach post.

Why use service names instead of localhost?

Because each service runs in its own container with its own network namespace. Inside the agent container, localhost means the agent itself, not the Postgres container. Compose sets up a private network where every service is reachable by its name (the key in the YAML services: map).

The common bug: an engineer runs docker compose up, execs into the agent container (docker compose exec agent sh), and tries psql -h localhost. It fails. The fix is psql -h postgres. Inside a Compose network, service names are hostnames.
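A quick way to sanity-check which host a driver will actually dial, given the DATABASE_URL from the Compose file above (db_host is an illustrative helper, not a library function):

```python
from urllib.parse import urlsplit


def db_host(database_url: str) -> str:
    """Return the hostname a DB driver will connect to."""
    return urlsplit(database_url).hostname or "localhost"


# Inside the Compose network, the host is the service name, not localhost.
print(db_host("postgresql+asyncpg://agent:devpass@postgres:5432/agent"))  # prints: postgres
```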

How do you handle data persistence?

Named volumes. Anything written to the volume persists across docker compose down / up cycles. The Postgres volume keeps your database between sessions; the Redis volume keeps the cache.

volumes:
  postgres_data:
  redis_data:

To wipe everything (useful when you want a clean slate):

docker compose down -v   # -v removes volumes

Without -v, volumes survive across stack restarts. With -v, you get a fresh database on the next up.

What are the 3 gotchas first-timers hit?

  1. Using localhost inside a service. See above. Always use the service name as the hostname.

  2. Relying on bare depends_on. depends_on without condition: service_healthy only waits for the container to start, not for the service inside it to be ready. Postgres takes 2-3 seconds to start accepting connections. Without the healthcheck condition, the agent boots before Postgres is listening, fails its first DB connection, and crashes.

  3. Putting secrets directly in the YAML. The Compose file goes into git. Secrets do not. Use ${VAR} substitution to read from the host environment, and keep real values in a .env file that is gitignored.
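Even with condition: service_healthy, a defensive startup loop in the agent costs little and also covers restarts of a sibling service. A minimal sketch of app-level retry (connect_with_retry and flaky_connect are illustrative names; swap in your real driver's connect call):

```python
import time


def connect_with_retry(connect, attempts=5, delay=0.5):
    """Call connect() until it succeeds or attempts run out."""
    last_exc = None
    for _ in range(attempts):
        try:
            return connect()
        except ConnectionError as exc:
            last_exc = exc
            time.sleep(delay)
    raise last_exc


# Simulate a database that rejects the first two connection attempts.
state = {"calls": 0}


def flaky_connect():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("postgres not ready yet")
    return "connection"


print(connect_with_retry(flaky_connect, attempts=5, delay=0.01))  # prints: connection
```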

For the full Pydantic Settings pattern that reads secrets from env vars, see the Environment variable parsing for Python AI services post.

How do you override for development vs CI?

Use docker-compose.override.yml for local dev and explicit -f flags for CI.

# filename: docker-compose.override.yml
# description: Local dev overrides, gitignored.
services:
  agent:
    volumes:
      - ./app:/app/app  # hot reload
    command: uvicorn app.main:app --reload --host 0.0.0.0

Compose automatically merges docker-compose.yml + docker-compose.override.yml when you run docker compose up. In CI, run docker compose -f docker-compose.yml -f docker-compose.ci.yml up to pick a different overlay. This keeps the base file clean and environments explicit.
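A CI overlay can stay tiny. A hypothetical docker-compose.ci.yml (the filename and the pytest command are assumptions; substitute your own test entrypoint):

```yaml
# filename: docker-compose.ci.yml
# description: CI overlay, applied with explicit -f flags.
services:
  agent:
    environment:
      APP_ENV: ci
    command: pytest -q
```

Because command is a scalar, the overlay replaces it outright, while environment entries are merged key by key with the base file.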

What to do Monday morning

  1. Write a docker-compose.yml that lists every dependency your agent needs: your service, Postgres, Redis, any observability tool.
  2. Add healthchecks to every service that the agent depends on, and use depends_on with condition: service_healthy so the agent waits for real readiness.
  3. Pull secrets from the host env using ${VAR} syntax. Never hardcode them in the YAML.
  4. Run docker compose up on a fresh laptop. If setup takes more than 2 minutes once images are pulled, something is wrong with your config.
  5. Add docker-compose.override.yml to .gitignore and document in the README which overrides live there (hot reload, dev-only volumes).

The headline: one YAML file replaces an afternoon of onboarding. Every engineer runs the same command, gets the same stack, moves on. On an agent project, docker compose up is the single biggest developer-experience win.

Frequently asked questions

Why use Docker Compose instead of running services manually?

Because manual setup takes hours per engineer and produces inconsistent versions, port conflicts, and bugs that only reproduce on one machine. Compose ships a single YAML that pins every version, isolates networking, and starts the whole stack in under 90 seconds. New engineers go from git clone to running agent in one command.

How do I connect the agent container to Postgres in Compose?

Use the service name as the hostname. In the agent's DATABASE_URL, set the host to postgres (the key in the services: map), not localhost. Compose automatically creates a private network where every service is reachable by its name. localhost inside the agent container refers to the agent itself, not to Postgres.

Why does my agent fail on the first connection to Postgres?

Because depends_on without condition: service_healthy only waits for the Postgres container to start, not for the service inside it to be ready. Postgres takes 2-3 seconds to accept connections after it starts. Add condition: service_healthy to the depends_on entry and a healthcheck block to the Postgres service; the agent then waits for real readiness.

How do I keep data between docker compose restarts?

Use named volumes in the volumes: section and mount them into the services that need persistence (Postgres, Redis). A docker compose down preserves volumes by default; use docker compose down -v to wipe everything when you want a clean slate.

Can I use Docker Compose in production?

You can, but it is not the best fit for multi-host production deployments. Compose is great for local dev, CI, and single-host staging. For multi-host production, use Kubernetes, Nomad, or a managed container service (ECS, Cloud Run) that provides scheduling, rolling updates, and horizontal autoscaling. The Dockerfile you built for Compose ports directly to those platforms.

Key takeaways

  1. Manual setup of an agent stack takes hours per engineer and produces inconsistent environments. Docker Compose fixes both in one YAML file.
  2. Use service names (postgres, redis) as hostnames inside containers. localhost means the container itself, not a sibling service.
  3. Add healthchecks to services the agent depends on, and use depends_on with condition: service_healthy so the agent waits for real readiness, not just container start.
  4. Persist data with named volumes. A docker compose down preserves them; down -v wipes everything for a clean slate.
  5. Read secrets from the host env via ${VAR} substitution. Never hardcode them in the YAML.
  6. To see Docker Compose wired into a full production agent workflow with CI and observability, walk through the Build your own coding agent course, or start with the AI Agents Fundamentals primer.

For the full Docker Compose documentation covering profiles, extends, and advanced networking, see the Docker Compose docs. The healthcheck and depends_on conditions are documented under "Startup order".
