A working agent in a notebook is a demo. A working agent at 3am is a system. This course walks the architectural layers that turn one into the other: modular FastAPI codebase, rate limits, circuit breakers, Langfuse tracing, Prometheus metrics, Grafana dashboards, eval-in-CI, and a stress test that proves it scales.
Message a mentor about fit, prerequisites, or where to start. Replies come on WhatsApp, usually within a day.
Take an agent that works on your laptop and make it on-call-friendly: a modular FastAPI codebase, rate limiting, circuit breakers, a LangGraph workflow, Langfuse tracing, Prometheus metrics, Grafana dashboards, LLM-as-judge in CI, and a stress test that proves it scales. A guided curriculum that turns a notebook prototype into a service you would happily put on an on-call rotation.
Turn a notebook agent into a service you would be happy to be paged for at 3am.
What you'll ship
What you'll learn
Curriculum
Modular foundations
Project structure, configuration strategy, and the Dockerfile that turns code into a runtime.
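One configuration pattern the module builds toward can be sketched without any dependencies: read settings once at startup, fail fast on missing secrets, and keep the object immutable. The field and variable names below are hypothetical illustrations, not the course's actual schema.

```python
import os
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Settings:
    """Runtime configuration, read once at startup (hypothetical field names)."""
    llm_api_key: str
    llm_model: str = "gpt-4o-mini"
    request_timeout_s: float = 30.0
    debug: bool = False

def load_settings(env: Optional[dict] = None) -> Settings:
    """Build Settings from environment variables, failing fast on missing secrets."""
    env = dict(os.environ) if env is None else env
    if "LLM_API_KEY" not in env:
        raise RuntimeError("LLM_API_KEY is required")
    return Settings(
        llm_api_key=env["LLM_API_KEY"],
        llm_model=env.get("LLM_MODEL", "gpt-4o-mini"),
        request_timeout_s=float(env.get("REQUEST_TIMEOUT_S", "30")),
        debug=env.get("DEBUG", "0") == "1",
    )
```

Freezing the dataclass means no request handler can mutate configuration at runtime, which keeps the Docker image's behavior a pure function of its environment.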
Persistence and security
Data layer with SQLModel and DTOs, plus the security layer that keeps abusive traffic and risky prompts out.
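The "keeps abusive traffic out" piece is classically a per-client token bucket. Here is a minimal, dependency-free sketch of the idea, with an injectable clock so it is testable; the course's actual middleware may differ.

```python
import time
from typing import Optional

class TokenBucket:
    """Per-client token bucket: refills at `rate` tokens/sec, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so a new client can burst
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Spend one token if available; `now` is injectable for testing."""
        now = time.monotonic() if now is None else now
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In a FastAPI service this would sit in middleware keyed by API key or client IP, returning 429 when `allow()` is False.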
Service layer for agents
Wrap the LLM provider in connection pooling, exponential-backoff retries, and circuit-breaker protection so external failure does not cascade.
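The breaker-plus-backoff combination described above can be sketched in a few dozen lines. This is an illustrative implementation of the general pattern, not the course's source code; class and parameter names are made up.

```python
import random
import time

class CircuitOpenError(RuntimeError):
    """Raised when the breaker fails fast instead of calling the provider."""

class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; half-opens after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open, failing fast")
            self.opened_at = None  # half-open: let one probe request through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit
        return result

def call_with_retries(breaker, fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Exponential backoff with jitter, wrapped around the breaker-protected call."""
    for attempt in range(attempts):
        try:
            return breaker.call(fn)
        except CircuitOpenError:
            raise  # never retry into an open circuit
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt) * (0.5 + random.random()))
```

The key interaction is the `CircuitOpenError` re-raise: retries handle transient blips, while the breaker stops retry storms from hammering a provider that is actually down.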
Workflow and API gateway
A LangGraph multi-node workflow plus the FastAPI auth and streaming endpoints clients actually call.
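The shape of a multi-node workflow is easy to show without LangGraph itself: nodes are functions over a shared state dict, and edges decide which node runs next. This dependency-free analogue only illustrates the control flow; LangGraph's actual `StateGraph` API adds typed state, conditional edges, and checkpointing.

```python
from typing import Callable, Dict

State = dict
Node = Callable[[State], State]

def run_workflow(nodes: Dict[str, Node], edges: Dict[str, str],
                 start: str, state: State) -> State:
    """Run nodes in sequence, following `edges` until a node maps to 'END'."""
    current = start
    while current != "END":
        state = nodes[current](state)
        current = edges[current]
    return state

# Hypothetical three-node agent turn: classify -> answer -> guard.
def classify(s: State) -> State:
    return {**s, "intent": "question" if s["input"].endswith("?") else "chat"}

def answer(s: State) -> State:
    return {**s, "draft": f"[{s['intent']}] reply to: {s['input']}"}

def guard(s: State) -> State:
    return {**s, "output": s["draft"][:200]}  # crude output-length guardrail

nodes = {"classify": classify, "answer": answer, "guard": guard}
edges = {"classify": "answer", "answer": "guard", "guard": "END"}
```

Keeping every node a pure function of state is what makes each hop traceable later: a tracer can record the state before and after every node without touching node logic.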
Observability, evals, and stress
Langfuse traces on every agent turn, Prometheus + Grafana for the operational view, LLM-as-judge in CI, and a Locust stress test that proves it scales.
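The "LLM-as-judge in CI" idea reduces to: run the agent over a fixed case set, have a judge score each answer, and fail the build below a threshold. The sketch below stubs the judge with word overlap purely so it runs offline; in the course a real judge call replaces `judge_stub`, and these names are illustrative.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class EvalCase:
    prompt: str
    reference: str

def judge_stub(prompt: str, answer: str, reference: str) -> float:
    """Stand-in for an LLM judge; a real judge calls a model and parses a 0-1 score."""
    ref_words = set(reference.lower().split())
    overlap = set(answer.lower().split()) & ref_words
    return len(overlap) / max(len(ref_words), 1)

def ci_gate(agent: Callable[[str], str], cases: List[EvalCase],
            judge: Callable[[str, str, str], float] = judge_stub,
            threshold: float = 0.7) -> Tuple[bool, float]:
    """Score every case with the judge; fail the build below `threshold` mean score."""
    scores = [judge(c.prompt, agent(c.prompt), c.reference) for c in cases]
    mean = sum(scores) / len(scores)
    return mean >= threshold, mean
```

Wiring `ci_gate` into CI is one assert in a test file: a regression in agent quality then blocks the merge the same way a failing unit test would.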
Who it's for
Your agent is live and you have the scars to prove it. You want a coherent reference for the layers you have been bolting on ad hoc.
You know the agent works in a notebook. You do not yet know which layers to wrap around it before users find the edges.
A data scientist threw a notebook over the wall. You need a clean boundary between agent logic and the stack that runs it.
You want a reference architecture you can show the team and adapt for your own agent without reinventing the operational layer from scratch.
FAQ
FastAPI AI Deployment Patterns covers deployment patterns for AI services in general. This course covers the architectural layers around an agent specifically: Langfuse for tracing, Prometheus for metrics, circuit breakers for resilience, and an eval pipeline in CI. It assumes you already know how to deploy a FastAPI app and goes deeper into the operational layers production agents actually need.
Either works. The course uses Langfuse cloud for simplicity, but every Langfuse SDK call in the course is identical against a self-hosted instance. The Prometheus and Grafana stack is fully self-hosted via the included docker-compose.yml.
No. The course runs on docker-compose. Kubernetes is the natural next step but is out of scope. The Serving LLMs at Scale course covers the Kubernetes side.
The LLMRegistry abstraction in the source repo accepts any LangChain chat model. The course shows how to swap providers and how the connection pool, retry policy, and circuit breaker apply uniformly.
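A registry that makes provider swaps uniform can be as simple as a name-to-factory map: every client comes out of one place, so the pool, retries, and breaker wrap all providers identically. This is a hedged sketch of the shape such a registry might take; the course's real `LLMRegistry` accepts LangChain chat models and its API will differ.

```python
from typing import Any, Callable, Dict

class LLMRegistry:
    """Maps provider names to client factories so resilience policy applies uniformly."""

    def __init__(self):
        self._factories: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, factory: Callable[..., Any]) -> None:
        self._factories[name] = factory

    def get(self, name: str, **kwargs) -> Any:
        if name not in self._factories:
            raise KeyError(
                f"unknown provider {name!r}; registered: {sorted(self._factories)}"
            )
        # One construction point: pooling, retry, and breaker wrappers go here.
        return self._factories[name](**kwargs)
```

Because construction is centralized, swapping OpenAI for Anthropic (or a local model) is one `register` call, with no change to the resilience layer.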
Pricing
One subscription unlocks every paid course and workshop replay. Pick yearly or monthly.
Unlock with Pro
You save 47% with regional pricing
Billed annually. Cancel anytime.
Still deciding? Ask Param a question
The agent is the easy part. The layers around it are why your service stays up.
Production agentic systems with Langfuse
From $16/mo with Pro