A working agent in a notebook is a demo. A working agent at 3am is a system. This course walks the architectural layers that turn one into the other: modular FastAPI codebase, rate limits, circuit breakers, Langfuse tracing, Prometheus metrics, Grafana dashboards, eval-in-CI, and a stress test that proves it scales.
Message a mentor about fit, prerequisites, or where to start. Replies come on WhatsApp, usually within a day.
Take an agent that works on your laptop and make it on-call-friendly: a modular FastAPI codebase, rate limiting, circuit breakers, a LangGraph workflow, Langfuse tracing, Prometheus metrics, Grafana dashboards, LLM-as-judge in CI, and a stress test that proves it scales. A guided curriculum that turns a notebook prototype into a service you would happily put on an on-call rotation.
Turn a notebook agent into a service you would be happy to be paged for at 3am.
What you'll ship
What you'll learn
Curriculum
Modular foundations
Project structure, configuration strategy, and the Dockerfile that turns code into a runtime.
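One configuration pattern the module builds toward can be sketched without any dependencies: read settings once at startup, fail fast on missing secrets, and keep the object immutable. The field and variable names below are hypothetical illustrations, not the course's actual schema.

```python
import os
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Settings:
    """Runtime configuration, read once at startup (hypothetical field names)."""
    llm_api_key: str
    llm_model: str = "gpt-4o-mini"
    request_timeout_s: float = 30.0
    debug: bool = False

def load_settings(env: Optional[dict] = None) -> Settings:
    """Build Settings from environment variables, failing fast on missing secrets."""
    env = dict(os.environ) if env is None else env
    if "LLM_API_KEY" not in env:
        raise RuntimeError("LLM_API_KEY is required")
    return Settings(
        llm_api_key=env["LLM_API_KEY"],
        llm_model=env.get("LLM_MODEL", "gpt-4o-mini"),
        request_timeout_s=float(env.get("REQUEST_TIMEOUT_S", "30")),
        debug=env.get("DEBUG", "0") == "1",
    )
```

Freezing the dataclass means no request handler can mutate configuration at runtime, which keeps the Docker image's behavior a pure function of its environment.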
Persistence and security
Data layer with SQLModel and DTOs, plus the security layer that keeps abusive traffic and risky prompts out.
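The "keeps abusive traffic out" piece is classically a per-client token bucket. Here is a minimal, dependency-free sketch of the idea, with an injectable clock so it is testable; the course's actual middleware may differ.

```python
import time
from typing import Optional

class TokenBucket:
    """Per-client token bucket: refills at `rate` tokens/sec, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so a new client can burst
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Spend one token if available; `now` is injectable for testing."""
        now = time.monotonic() if now is None else now
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In a FastAPI service this would sit in middleware keyed by API key or client IP, returning 429 when `allow()` is False.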
Service layer for agents
Wrap the LLM provider in connection pooling, exponential-backoff retries, and circuit-breaker protection so external failure does not cascade.
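The breaker-plus-backoff combination described above can be sketched in a few dozen lines. This is an illustrative implementation of the general pattern, not the course's source code; class and parameter names are made up.

```python
import random
import time

class CircuitOpenError(RuntimeError):
    """Raised when the breaker fails fast instead of calling the provider."""

class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; half-opens after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open, failing fast")
            self.opened_at = None  # half-open: let one probe request through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit
        return result

def call_with_retries(breaker, fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Exponential backoff with jitter, wrapped around the breaker-protected call."""
    for attempt in range(attempts):
        try:
            return breaker.call(fn)
        except CircuitOpenError:
            raise  # never retry into an open circuit
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt) * (0.5 + random.random()))
```

The key interaction is the `CircuitOpenError` re-raise: retries handle transient blips, while the breaker stops retry storms from hammering a provider that is actually down.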
Workflow and API gateway
A LangGraph multi-node workflow plus the FastAPI auth and streaming endpoints clients actually call.
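The shape of a multi-node workflow is easy to show without LangGraph itself: nodes are functions over a shared state dict, and edges decide which node runs next. This dependency-free analogue only illustrates the control flow; LangGraph's actual `StateGraph` API adds typed state, conditional edges, and checkpointing.

```python
from typing import Callable, Dict

State = dict
Node = Callable[[State], State]

def run_workflow(nodes: Dict[str, Node], edges: Dict[str, str],
                 start: str, state: State) -> State:
    """Run nodes in sequence, following `edges` until a node maps to 'END'."""
    current = start
    while current != "END":
        state = nodes[current](state)
        current = edges[current]
    return state

# Hypothetical three-node agent turn: classify -> answer -> guard.
def classify(s: State) -> State:
    return {**s, "intent": "question" if s["input"].endswith("?") else "chat"}

def answer(s: State) -> State:
    return {**s, "draft": f"[{s['intent']}] reply to: {s['input']}"}

def guard(s: State) -> State:
    return {**s, "output": s["draft"][:200]}  # crude output-length guardrail

nodes = {"classify": classify, "answer": answer, "guard": guard}
edges = {"classify": "answer", "answer": "guard", "guard": "END"}
```

Keeping every node a pure function of state is what makes each hop traceable later: a tracer can record the state before and after every node without touching node logic.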
Observability, evals, and stress
Langfuse traces on every agent turn, Prometheus + Grafana for the operational view, LLM-as-judge in CI, and a Locust stress test that proves it scales.
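The "LLM-as-judge in CI" idea reduces to: run the agent over a fixed case set, have a judge score each answer, and fail the build below a threshold. The sketch below stubs the judge with word overlap purely so it runs offline; in the course a real judge call replaces `judge_stub`, and these names are illustrative.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class EvalCase:
    prompt: str
    reference: str

def judge_stub(prompt: str, answer: str, reference: str) -> float:
    """Stand-in for an LLM judge; a real judge calls a model and parses a 0-1 score."""
    ref_words = set(reference.lower().split())
    overlap = set(answer.lower().split()) & ref_words
    return len(overlap) / max(len(ref_words), 1)

def ci_gate(agent: Callable[[str], str], cases: List[EvalCase],
            judge: Callable[[str, str, str], float] = judge_stub,
            threshold: float = 0.7) -> Tuple[bool, float]:
    """Score every case with the judge; fail the build below `threshold` mean score."""
    scores = [judge(c.prompt, agent(c.prompt), c.reference) for c in cases]
    mean = sum(scores) / len(scores)
    return mean >= threshold, mean
```

Wiring `ci_gate` into CI is one assert in a test file: a regression in agent quality then blocks the merge the same way a failing unit test would.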
Who it's for
Your agent is live and you have the scars to prove it. You want a coherent reference for the layers you have been bolting on ad hoc.
You know the agent works in a notebook. You do not yet know which layers to wrap around it before users find the edges.
A data scientist threw a notebook over the wall. You need a clean boundary between agent logic and the stack that runs it.
You want a reference architecture you can show the team and adapt for your own agent without reinventing the operational layer from scratch.
FAQ
FastAPI AI Deployment Patterns covers deployment patterns for AI services in general. This course covers the architectural layers around an agent specifically: Langfuse for tracing, Prometheus for metrics, circuit breakers for resilience, and an eval pipeline in CI. It assumes you already know how to deploy a FastAPI app and goes deeper into the operational layers production agents actually need.
Either works. The course uses Langfuse cloud for simplicity, but every Langfuse SDK call in the course is identical against a self-hosted instance. The Prometheus and Grafana stack is fully self-hosted via the included docker-compose.yml.
No. The course runs on docker-compose. Kubernetes is the natural next step but is out of scope. The Serving LLMs at Scale course covers the Kubernetes side.
The LLMRegistry abstraction in the source repo accepts any LangChain chat model. The course shows how to swap providers and how the connection pool, retry policy, and circuit breaker apply uniformly.
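A registry that makes provider swaps uniform can be as simple as a name-to-factory map: every client comes out of one place, so the pool, retries, and breaker wrap all providers identically. This is a hedged sketch of the shape such a registry might take; the course's real `LLMRegistry` accepts LangChain chat models and its API will differ.

```python
from typing import Any, Callable, Dict

class LLMRegistry:
    """Maps provider names to client factories so resilience policy applies uniformly."""

    def __init__(self):
        self._factories: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, factory: Callable[..., Any]) -> None:
        self._factories[name] = factory

    def get(self, name: str, **kwargs) -> Any:
        if name not in self._factories:
            raise KeyError(
                f"unknown provider {name!r}; registered: {sorted(self._factories)}"
            )
        # One construction point: pooling, retry, and breaker wrappers go here.
        return self._factories[name](**kwargs)
```

Because construction is centralized, swapping OpenAI for Anthropic (or a local model) is one `register` call, with no change to the resilience layer.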
Pricing
One subscription unlocks every paid course and workshop replay. Pick yearly or monthly.
Unlock with Pro
You save 47% with regional pricing
Billed annually. Cancel anytime.
Still deciding? Ask Param a question
The agent is the easy part. The layers around it are why your service stays up.
Production agentic systems with Langfuse
From $16/mo with Pro