Building an agent's brain with LangGraph
In our last post, we proved that a simple, linear RAG pipeline is "brittle." It fails when a user's question requires information from outside its static knowledge base.
To fix this, we need to build a "smarter" system, an agent that can make decisions. Instead of a simple checklist, we'll build a "state machine" or a "graph" that can:
- Look at a question.
- Decide which tool to use (our vector store OR a web search).
- Check if the tool's output was any good.
- Loop back and try a different tool if it failed.
Today, we'll build the "blueprint" for this agent using a powerful library called LangGraph.
Why isn't a linear chain enough?
Our old pipeline was a simple chain.
```mermaid
graph TD
    A[Retrieve] --> B[Generate] --> C[Answer]
```
This is inflexible. If the Retrieve step fails, the whole chain fails.
What is a cyclic graph?
We need a graph that can loop, branch, and make decisions. This is our new blueprint:
```mermaid
graph TD
    A[User Query] --> B(Route to Tool)
    B -- "Internal Query" --> C[Retrieve from Vector Store]
    B -- "External Query" --> D[Search the Web]
    C --> E(Grade Documents)
    D --> E
    E -- "Good Docs" --> F[Generate Answer]
    E -- "Bad Docs" --> D
    F --> G[Final Answer]
    style B fill:#e3f2fd,stroke:#0d47a1
    style E fill:#e3f2fd,stroke:#0d47a1
```
This is a state machine. Route to Tool and Grade Documents are "conditional edges" (decision points). Bad Docs --> Web Search is a "cycle" (our self-correcting loop).
To build this, we'll use LangGraph.
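Before we reach for the library, it helps to see the control flow above as plain Python. The sketch below is a hypothetical simulation, not LangGraph code: the `route` and `grade` heuristics and the stand-in tools are invented for illustration. But it exercises both decision points and the self-correcting loop from the diagram:

```python
def route(state):
    # Decision point 1: pick a tool based on the question.
    # (A toy heuristic; the real router comes in the next post.)
    return "vector_store" if "product" in state["question"] else "web"

def grade(state):
    # Decision point 2: are the retrieved documents any good?
    return "good" if state["documents"] else "bad"

def run_agent(state, tools):
    # First hop: route to a tool and retrieve.
    node = route(state)
    state["documents"] = tools[node](state["question"])
    # The cycle: if grading fails, fall back to web search.
    while grade(state) == "bad":
        state["documents"] = tools["web"](state["question"])
    return f"Answer based on: {state['documents']}"

# Stand-in tools: the vector store "fails" (returns nothing),
# which forces the self-correcting loop to fall back to the web.
tools = {
    "vector_store": lambda q: [],
    "web": lambda q: ["web result"],
}
print(run_agent({"question": "product pricing"}, tools))
```

When the vector store comes back empty, the loop retries with web search: exactly the "Bad Docs" cycle in the diagram above.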
How do you build the agent's components?
LangGraph works by defining "nodes" (the steps) and "edges" (the connections). First, let's build our "nodes."
Brick 1: the "memory" (the GraphState)
Before we build the "nodes," we need to define our agent's "memory." A GraphState is a simple Python object (a TypedDict) that gets passed from node to node. Every node can read from and write to this "memory."
```python
from typing import List, TypedDict

# This is the "memory" of our agent.
# Every node will have access to this.
class GraphState(TypedDict):
    question: str         # The user's query
    documents: List[str]  # The retrieved documents
    generation: str       # The final answer
```
Observation: This is the most important concept. By defining a shared "state," our generate node can see what the retrieve node found.
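To see the mechanics, here is a minimal, self-contained sketch of that hand-off: each node returns a partial dictionary, and the shared state absorbs it. LangGraph performs this merge for you; we do it by hand here, and the document text and answer format are invented for illustration:

```python
from typing import List, TypedDict

class GraphState(TypedDict):
    question: str
    documents: List[str]
    generation: str

def retrieve(state: GraphState) -> dict:
    # Writes to shared memory: adds documents (hardcoded here).
    return {"documents": ["doc about refunds"]}

def generate(state: GraphState) -> dict:
    # Reads what retrieve wrote, writes the final answer.
    docs = state["documents"]
    return {"generation": f"Based on {len(docs)} doc(s): {docs[0]}"}

# LangGraph merges each node's returned dict into the state for us;
# here we do it by hand to show the mechanics.
state: GraphState = {"question": "What is the refund policy?",
                     "documents": [], "generation": ""}
for node in (retrieve, generate):
    state.update(node(state))

print(state["generation"])  # Based on 1 doc(s): doc about refunds
```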
Brick 2: the "tools" (our nodes)
Now, we define our "tools" as plain Python functions. Each function takes the current state as input and returns a dictionary to update that state.
We need two tools to fetch information:
- `retrieve`: Searches our internal ChromaDB (from Post 1).
- `web_search`: Searches the public internet.
The "How":
```python
from langchain_community.tools import DuckDuckGoSearchRun
import chromadb

# Initialize our tools
search_tool = DuckDuckGoSearchRun()
chroma_client = chromadb.Client()
collection = chroma_client.get_collection(name="product_docs")  # Get our DB from Post 1

# --- Node 1: The Internal Retriever ---
def retrieve(state):
    print("---NODE: RETRIEVE---")
    question = state["question"]
    # Retrieve from our internal vector store
    documents = collection.query(
        query_texts=[question],
        n_results=3
    )["documents"][0]
    return {"documents": documents, "question": question}

# --- Node 2: The Web Searcher ---
def web_search(state):
    print("---NODE: WEB_SEARCH---")
    question = state["question"]
    # Call the DuckDuckGo tool
    search_result = search_tool.run(question)
    # We wrap the single string in a list to match our state's type
    documents = [search_result]
    return {"documents": documents, "question": question}
```
Observation: We've built the "hands" of our agent. It now has two different ways to find information. But how does it know which one to use? And how does it generate the final answer?
Think About It: Our `web_search` node is a bit "dumb": it just returns one long string from DuckDuckGo. How could we make this node "smarter"? (Hint: What if it retrieved multiple search results and "chunked" them?)
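One possible answer, as a sketch: split the long string into overlapping chunks before storing it in state, so later nodes can work with smaller passages. The `chunk_text` helper and `smarter_web_search` wrapper below are hypothetical, with chunk sizes picked arbitrarily:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list:
    # Split one long string into overlapping chunks: a minimal
    # stand-in for a proper text splitter.
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def smarter_web_search(state, search_fn):
    # search_fn is any callable that takes a query and returns one
    # long string, e.g. DuckDuckGoSearchRun().run
    result = search_fn(state["question"])
    # Store a list of chunks instead of one giant string.
    return {"documents": chunk_text(result), "question": state["question"]}
```

Now `documents` holds several focused passages rather than one wall of text, which gives the grading node (next post) something meaningful to grade.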
Next step
We've built our agent's "memory" (GraphState) and its "tools" (retrieve, web_search).
In our next post, we'll build the "brain" itself:
- The `Router`: The decision node that chooses which tool to use first.
- The `Grader`: The "self-correction" node that grades the results.
- The `Generator`: The node that writes the final answer.
Frequently asked questions
Why is a linear chain not enough for RAG agents?
Linear chains fail when a single tool (vector search or web search) doesn't have the answer. Your pipeline stops. A graph lets you branch: try retrieval, check the result, loop back and try web search if it failed. That flexibility is non-negotiable for production systems that need to handle unexpected questions.
How does state get passed between nodes in LangGraph?
You define a GraphState object that acts as shared memory. Each node reads from and writes to this state as it runs. Your retrieval node populates what it found; your grading node reads that result and decides next steps. This shared state is how nodes coordinate decisions without each node having to call the next one directly.
When should I use conditional edges in my agent?
Use conditional edges when different paths need different behavior: like retrieving internally versus searching the web, or retrying versus accepting a poor result. Linear flows are simpler but they fail at scale. Conditional edges add complexity but catch the cases where your simple approach just returns wrong answers and keeps moving forward.
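As a concrete sketch: a conditional edge is driven by a plain function that inspects state and returns the name of the branch to follow. The `decide_next_step` function below is hypothetical (our real grader arrives in the next post), and the commented-out wiring shows roughly where it would plug into LangGraph's `add_conditional_edges`:

```python
def decide_next_step(state) -> str:
    # A conditional-edge function: look at the state and return
    # the name of the branch to take next.
    if state["documents"]:
        return "generate"    # good docs: go write the answer
    return "web_search"      # bad docs: loop back and retry

# Hypothetical LangGraph wiring (sketched, not run here):
# workflow.add_conditional_edges(
#     "grade_documents", decide_next_step,
#     {"generate": "generate", "web_search": "web_search"},
# )
```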
For the full reference, see the LangGraph documentation.
Key takeaways
- Linear chains are brittle: when one step fails, the entire pipeline fails; we need graphs that can branch and loop
- State management is critical: `GraphState` allows nodes to share information and make decisions based on previous steps
- Tools are just functions: each tool is a simple Python function that takes state and returns updated state
- LangGraph enables dynamic flow: unlike linear chains, graphs can have conditional edges and cycles for self-correction
- Foundation before logic: we build the tools first, then add the decision-making nodes that connect them
For more on LangGraph and agent frameworks, see our agent framework comparison.
For more on building production AI systems, check out our AI Bootcamp for Software Engineers.
Take the next step
- Agentic RAG & Text-to-SQL Workshop: Build agentic RAG systems with LangGraph hands-on
Continue Reading
Ready to go deeper?
Go beyond articles. Build production AI systems with hands-on workshops and our intensive AI Bootcamp.