The most common way agent teams leak API keys

Your agent has one .env file for every environment. It holds your real OpenAI key, your real Postgres URL, and DEBUG=True because you were testing yesterday. No .env.development for local work, no separate production config, just one file copied around. You commit a feature, push to staging, and the deploy works. A week later you notice your OpenAI usage doubled overnight. Someone on the team ran a load test against the local agent and pointed it at the production database by accident, because the local config and the production config were the same file.

This is the boring failure mode that costs real money. It is not exotic. It is the default outcome when you treat configuration as a single file you copy around.

Splitting your environment files is the cheapest, most underrated production upgrade for an agentic AI system. It takes 30 minutes and prevents an entire class of incidents. This post covers how I do it, why load precedence matters, and the exact files you should put in your repo on Monday.

Why does one .env file stop working as soon as you have users?

A single .env file is fine when there is only one of you and one machine. The moment you add staging, a second engineer, or a CI runner, the model breaks in 3 predictable ways.

graph TD
    Dev[Developer Laptop] -->|reads| ENV1[.env - prod keys, debug on]
    CI[CI Runner] -->|reads| ENV2[.env - test keys, debug on]
    Stage[Staging Container] -->|reads| ENV3[.env - prod keys, debug on]
    Prod[Production Container] -->|reads| ENV4[.env - prod keys, debug off]

    ENV1 -.->|same filename, different content| Drift[Config Drift]
    ENV2 -.-> Drift
    ENV3 -.-> Drift
    ENV4 -.-> Drift

    Drift --> Bug1[Wrong DB hit]
    Drift --> Bug2[Leaked keys in logs]
    Drift --> Bug3[Debug routes exposed]

    style Drift fill:#fee2e2,stroke:#b91c1c

The 3 failures are always the same. First, debug mode leaks into production because nobody remembered to flip a flag. Second, a developer points the local agent at the production vector store while testing a tool, and writes garbage embeddings into it. Third, the CI runner uses real LLM credits because the test config inherited from .env.

None of these are mistakes by careless engineers. They are guaranteed outcomes of a config model where one filename means 4 different things depending on who runs it.

What are .env.development and .env, really?

.env.development and .env are 2 separate files that hold environment variables for 2 different runtime modes. Your loader picks which one to use based on a single signal, usually APP_ENV or NODE_ENV. Production reads .env. Development reads .env.development. They never overwrite each other.

The naming convention comes from the 12-Factor App methodology, which insists that config must live outside code and must vary per environment. The files are just an ergonomic way to express that locally. In production they are usually replaced entirely by injected environment variables from your secret manager, but the pattern is the same: per-environment, never shared.

I run 4 files in most agent projects:

| File | Loaded when | What it contains | Committed to git? |
| --- | --- | --- | --- |
| .env.development | local dev | Test keys, local Postgres URL, DEBUG=True | No |
| .env.test | CI / pytest | Mock keys, in-memory DB, fake LLM provider | No |
| .env.staging | staging container | Real keys for staging, separate vector store | No |
| .env | production container | Production keys, DEBUG=False | No |
| .env.example | reference for new devs | Every key with placeholder values, no secrets | Yes |
Only .env.example is in version control. Everything else is in .gitignore. This is the piece beginners get wrong: they commit .env once "by accident" and the secret lives in git history forever, even after they delete it.

How do you load env files cleanly with Pydantic settings?

Manual os.environ.get('OPENAI_API_KEY') works but it gives you no validation, no type coercion, and no clear contract for what your agent needs to run. Pydantic Settings (the v2 successor to pydantic.BaseSettings) is the right tool. It loads from a file, validates types, fails fast on missing keys, and lets you override with real environment variables in production.

# filename: app/config.py
# description: Single source of truth for agent configuration. Loads from
# the file matching APP_ENV, falls back to .env, then to real env vars.
import os
from functools import lru_cache
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    app_env: str = Field(default='development')
    debug: bool = Field(default=False)

    openai_api_key: str
    database_url: str
    vector_store_url: str

    log_level: str = Field(default='INFO')
    request_timeout_s: int = Field(default=60)

    model_config = SettingsConfigDict(
        env_file=(
            '.env',                     # always loaded as base
            f'.env.{os.getenv("APP_ENV", "development")}',  # overrides
        ),
        env_file_encoding='utf-8',
        extra='forbid',  # unknown env vars become hard errors
    )


@lru_cache
def get_settings() -> Settings:
    return Settings()

3 details here matter more than they look. The env_file tuple is loaded in order, so values in the more specific file win. extra='forbid' turns a typo in your .env.development into a startup error instead of a silent default. And lru_cache on get_settings means the file is parsed once per process, not once per request.

In production, set APP_ENV=production as a real environment variable on the container and let your secret manager (AWS Parameter Store, Doppler, Infisical, whatever) inject the actual values directly. Pydantic Settings reads real env vars before file values, so the secret manager always wins.
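The three-layer precedence can be illustrated with a stdlib-only sketch. This is not what pydantic-settings does internally (it also validates and coerces types), just the merge order it guarantees: base file, then the env-specific file, then real environment variables.

```python
# Minimal sketch of the precedence: base file -> env-specific file ->
# real environment variables. Illustrative only; the real project relies
# on pydantic-settings for this, plus validation.

def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines, ignoring blanks and comments."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        key, _, value = line.partition('=')
        result[key.strip()] = value.strip()
    return result

def load_config(base: str, overrides: str, environ: dict) -> dict:
    """Later sources win: overrides beat base, real env vars beat both."""
    config = parse_dotenv(base)
    config.update(parse_dotenv(overrides))
    config.update({k: v for k, v in environ.items() if k in config})
    return config

base = "DEBUG=False\nLOG_LEVEL=INFO\nDATABASE_URL=postgres://prod"
dev = "DEBUG=True\nDATABASE_URL=postgres://localhost/dev"
# A real env var (e.g. injected by a secret manager) wins over both files.
cfg = load_config(base, dev, {"LOG_LEVEL": "DEBUG"})
# cfg is {"DEBUG": "True", "LOG_LEVEL": "DEBUG",
#         "DATABASE_URL": "postgres://localhost/dev"}
```

The point of the sketch is the update order: each later source only ever overwrites, never removes, which is exactly why the secret manager always wins in production.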

Why do debug flags belong in the same file as secrets?

A common piece of advice is "debug flags are not secrets, put them in code." This is wrong for agentic systems. Your debug flag controls whether tracebacks leak into HTTP responses, whether your agent dumps prompts into logs, and whether it skips the rate limiter. Those are security-sensitive defaults. They belong in the same loader, with the same per-environment override, as your API keys.

# filename: app/agent.py
# description: A single config object decides whether to log prompts,
# whether to enforce rate limits, and which LLM client to use.
# rate_limiter and llm are module-level clients defined elsewhere in the app.
from app.config import get_settings

settings = get_settings()


async def run_agent(message: str) -> str:
    if settings.debug:
        # only ever true in .env.development and .env.test
        print('PROMPT:', message)

    if settings.app_env == 'production':
        await rate_limiter.check()

    return await llm.complete(message, timeout=settings.request_timeout_s)

Notice that app/agent.py does not read environment variables at all. It only reads settings. That makes the agent testable: in pytest, you instantiate Settings(...) with explicit values and inject it. No mocking os.environ. No conditional imports. This pattern is the same one we use throughout the Build Your Own Coding Agent course, where every tool, every prompt template, and every model client reads from a single typed config object.

If you want a refresher on the underlying reasoning loop before you wire up config, the free AI Agents Fundamentals resource covers the structure of a minimal agent end to end.

How do you manage env variables for an agent in production?

The answer is: you do not put production secrets in a .env file at all. The file pattern is for local development. In production, the workflow is:

  1. Your container starts with APP_ENV=production set by the orchestrator (Kubernetes, ECS, Fly, Render).
  2. A secret manager injects real values for OPENAI_API_KEY, DATABASE_URL, etc., as actual environment variables.
  3. Pydantic Settings reads those real env vars and never opens a file.

You keep the .env file pattern for development consistency. The same Settings class works in both modes. What changes is who owns the values.

This is the part that confuses people coming from Rails or Django, where secrets.yml was a thing you committed (encrypted). Modern cloud-native config means: code in git, values in a secret manager, no overlap.

What should you .gitignore for an agent project?

This is the file most agent repos get wrong. Here is the one I use; copy it as-is:

# filename: .gitignore (excerpt)
# description: Ignore every real env file. Commit only .env.example.
.env
.env.*
!.env.example

# Common offenders that leak the same secrets
.envrc
*.env.local
secrets.yaml
secrets.yml
config/local.json

The !.env.example line is what makes this safe. It re-includes the example file so new contributors can see every variable they need to set, with obvious placeholder values like OPENAI_API_KEY=sk-replace-me.
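For completeness, a minimal .env.example matching the Settings class above might look like this (all values are placeholders, and the variable set is whatever your own Settings class declares):

```ini
# .env.example — committed to git, contains no real secrets
APP_ENV=development
DEBUG=True
OPENAI_API_KEY=sk-replace-me
DATABASE_URL=postgresql://user:password@localhost:5432/agent_dev
VECTOR_STORE_URL=http://localhost:6333
LOG_LEVEL=INFO
REQUEST_TIMEOUT_S=60
```

A new contributor copies this to .env.development, fills in real dev values, and never touches production credentials.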

If you have ever shipped a .env to git and tried to remove it, you already know that git rm does not erase history. You have to rotate every secret it contained, force-push a rewritten history, and hope nobody cloned the repo in between. Prevention is the only real fix.
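You can see this for yourself in a throwaway repo: delete the file, commit, and git still finds it in history.

```shell
# Demonstration in a temporary repo: removing .env does not remove it
# from history. `git log --all -- .env` still finds both commits.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email demo@example.com
git config user.name demo
echo "OPENAI_API_KEY=sk-oops" > .env
git add .env && git commit -qm "add env"
git rm -q .env && git commit -qm "remove env"
git log --all --oneline -- .env   # shows two commits; the key is still reachable
```

Anyone who can read the repo can run the same command and recover the key, which is why rotation is mandatory once a secret lands in history.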

For a fuller picture of how config slots into a larger production stack, see the System Design: Building a Production-Ready AI Chatbot post, which shows where settings, lifespan, and secret loading meet.

What to do Monday morning

5 steps that take less than an hour:

  1. Rename your existing .env to .env.development. Strip any production keys out of it. Replace them with throwaway dev keys (most LLM providers let you create scoped keys for free).
  2. Create .env.example with every variable name and a placeholder value. Commit it.
  3. Add the .gitignore snippet above. Verify with git check-ignore -v .env that the file is actually ignored.
  4. Replace every os.environ.get(...) in your codebase with a Settings field. Add extra='forbid' so typos fail loudly.
  5. In your deploy pipeline, set APP_ENV=production and confirm your secret manager is the source of real values. Never copy .env to a server.

If your team has more than 2 people, also add a pre-commit hook that scans staged files for known secret patterns. gitleaks is the one I use. It catches the mistake before it reaches the remote.
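The gitleaks project ships a pre-commit hook, so the setup is a few lines of config. This is an assumed example: the hook id comes from the gitleaks repo, and you should check their README for the current rev before pinning.

```yaml
# .pre-commit-config.yaml — scan staged files for secrets before commit.
# Pin `rev` to the latest gitleaks release; the tag below is illustrative.
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4
    hooks:
      - id: gitleaks
```

Run pre-commit install once per clone and the scan happens automatically on every commit.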

Frequently asked questions

What is the difference between .env and .env.development?

.env.development holds local development values like test API keys and a local Postgres URL. .env is reserved for production values, and in most modern setups it does not exist on your laptop at all. The runtime decides which file to load based on an environment variable like APP_ENV or NODE_ENV. Keeping them separate prevents production secrets from sitting on every developer's machine.

Should you commit .env files to git?

No. Commit only .env.example, which lists every required variable with a placeholder value. Real .env, .env.development, .env.staging, and .env.production files must be in .gitignore. Once a real secret reaches git history, the only safe fix is to rotate the secret, because removing the file from history does not protect anyone who already cloned the repo.

How do you load environment variables in a FastAPI agent?

Use Pydantic Settings (pydantic-settings) and a single Settings class. Configure env_file as a tuple where the more specific file is listed last so it overrides defaults. Read settings through a cached get_settings() function so the file is parsed once per process. In production, real environment variables injected by your secret manager will take precedence over any file value automatically.

What is the 12-factor config principle?

12-Factor says configuration must be stored in the environment, never in code, and must vary cleanly across deploys. Concretely: no if env == "prod" branches, no committed secrets, and no shared config files between environments. The .env.development versus .env split is the file-based expression of that rule. In production, real environment variables replace the files entirely.

Is .env.development the same as .env.local?

Almost, but the convention differs by framework. Next.js uses .env.local for machine-specific overrides that should never be committed, on top of .env.development for shared dev defaults. Python and FastAPI projects usually skip the layered model and use .env.development directly. Pick one convention per repo, document it in README.md, and never mix both.

Key takeaways

  1. One .env file cannot represent 4 environments. Splitting them prevents the most common class of agent incidents: wrong database, leaked keys, debug flags in production.
  2. Use Pydantic Settings with an env_file tuple so the per-environment file overrides a base file. Set extra='forbid' so typos crash on startup.
  3. In production, do not ship a .env at all. Inject values from a secret manager and let the same Settings class read them from real environment variables.
  4. .gitignore every real env file and commit only .env.example. The !.env.example exception is the trick that makes it safe.
  5. Debug flags belong in the same loader as secrets. They are security-sensitive defaults, not "just dev tools."
  6. To see this config pattern wired into a working agent loop with tools and persistence, walk through the Build Your Own Coding Agent course, or start with the conceptual AI Agents Fundamentals primer if you are still building your first agent.

For the underlying methodology, see the 12-Factor App: Config chapter. It is short, opinionated, and still the best argument for why this matters.
