When developing agents with LangGraph, one of the fundamental decisions developers face is choosing between Pydantic’s BaseModel and Python’s TypedDict for state management. Let’s explore these options to help you make the right choice for your agent implementation. For more background on Pydantic, see our Pydantic data validation blog.
Understanding the Fundamentals
BaseModel
Pydantic’s BaseModel offers a robust, class-based approach to data modeling with built-in validation. It’s like having a strict but helpful guardian for your agent’s state, ensuring that data remains consistent and valid throughout the agent’s lifecycle.
from pydantic import BaseModel

class AgentState(BaseModel):
    current_step: str
    memory: list[str]
    context: dict
TypedDict
TypedDict, on the other hand, provides a lighter, more streamlined approach. It’s Python’s native way of adding type hints to dictionaries, offering static type checking without the runtime overhead.
from typing import TypedDict

class AgentState(TypedDict):
    current_step: str
    memory: list[str]
    context: dict
Key Differences and Trade-offs
Validation and Type Checking
- BaseModel performs runtime validation and type checking, catching errors as they happen
- TypedDict only provides static type hints, which tools like mypy check before execution; nothing is enforced at runtime
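To see the difference concretely, here is a minimal sketch (assuming Pydantic v2 semantics; ValidatedState and HintedState are illustrative names, not from the examples above):

from pydantic import BaseModel, ValidationError
from typing import TypedDict

class ValidatedState(BaseModel):
    current_step: str

class HintedState(TypedDict):
    current_step: str

try:
    # Wrong type: Pydantic v2 rejects an int for a str field at runtime
    ValidatedState(current_step=123)
except ValidationError as exc:
    print(exc)  # reports that current_step should be a valid string

# No runtime error here; only a static checker like mypy flags the mismatch
bad_state: HintedState = {"current_step": 123}  # type: ignore[typeddict-item]
print(bad_state["current_step"])  # 123 slips through silently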
Feature Set
- BaseModel offers:
  - Automatic type coercion
  - Nested model support
  - Rich validation rules
  - Detailed error messages
- TypedDict provides:
  - Lightweight type definitions
  - Native Python integration
  - Minimal overhead
  - Basic type checking
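A short sketch of the first two BaseModel features, coercion and nesting (the Memory model and field names here are illustrative):

from pydantic import BaseModel

class Memory(BaseModel):
    entries: list[str]

class ResearchState(BaseModel):
    step_count: int
    memory: Memory

# Automatic type coercion: the string "3" becomes the int 3, and the
# nested dict is parsed into a Memory instance with its own validation
state = ResearchState(step_count="3", memory={"entries": ["greeted user"]})
print(state.step_count, type(state.step_count))  # 3 <class 'int'>
print(state.memory.entries)                      # ['greeted user']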
When to Use Each Approach
Choose BaseModel When:
- Your agent has complex state requirements
- You need runtime validation
- You’re working with external APIs or untrusted data
- You want rich error messages and debugging support
Choose TypedDict When:
- Performance is a priority
- Your state structure is simple
- Static type checking is sufficient
Performance Considerations
Performance differences become apparent in larger applications:
- BaseModel:
  - Higher memory usage
  - Additional validation overhead
  - Better for complex data structures
- TypedDict:
  - Minimal memory footprint
  - Fast instantiation
  - Ideal for simple data structures
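If you want to verify this on your own setup, here is a rough micro-benchmark sketch; exact numbers depend on your machine and Pydantic version, and ModelState/DictState are illustrative names:

import timeit
from typing import TypedDict
from pydantic import BaseModel

class ModelState(BaseModel):
    current_step: str
    memory: list[str]

class DictState(TypedDict):
    current_step: str
    memory: list[str]

# BaseModel instantiation validates every field; a TypedDict "call"
# just builds a plain dict, so there is no validation cost to pay
model_time = timeit.timeit(
    lambda: ModelState(current_step="plan", memory=["a", "b"]), number=100_000
)
dict_time = timeit.timeit(
    lambda: DictState(current_step="plan", memory=["a", "b"]), number=100_000
)
print(f"BaseModel: {model_time:.3f}s  TypedDict: {dict_time:.3f}s")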
Implementation Examples
Let’s walk through complete agent implementations using both approaches to demonstrate the practical differences.
BaseModel Implementation
from pydantic import BaseModel, Field
from typing import List, Dict, Literal
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.messages import AIMessage, HumanMessage

# Define state with BaseModel
class AgentState(BaseModel):
    conversation_history: List[HumanMessage | AIMessage] = Field(default_factory=list)
    research_findings: Dict[str, str] = Field(default_factory=dict)
    current_task: str = ""
    status: Literal["researching", "answering", "complete"] = "researching"
    # Pydantic validation ensures these fields maintain correct types,
    # and default_factory initializes empty collections
# Define our nodes (agent components)
def researcher(state: AgentState) -> AgentState:
    llm = ChatOpenAI(model="gpt-3.5-turbo")
    # We can access state attributes directly as properties
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a research assistant. Find information about: {task}"),
        ("human", "I need information about {task}. Provide key facts.")
    ])
    chain = prompt | llm
    response = chain.invoke({"task": state.current_task})
    # Update state using model methods
    updated_state = state.model_copy(deep=True)
    updated_state.research_findings[state.current_task] = response.content
    updated_state.conversation_history.append(HumanMessage(content=f"Research: {state.current_task}"))
    updated_state.conversation_history.append(AIMessage(content=response.content))
    updated_state.status = "answering"
    return updated_state
def answerer(state: AgentState) -> AgentState:
    llm = ChatOpenAI(model="gpt-3.5-turbo")
    research = state.research_findings.get(state.current_task, "No research found")
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a helpful assistant. Use the research to answer the question."),
        ("human", "Question: {task}\n\nResearch: {research}")
    ])
    chain = prompt | llm
    response = chain.invoke({"task": state.current_task, "research": research})
    # Create updated state with validation
    updated_state = state.model_copy(deep=True)
    updated_state.conversation_history.append(AIMessage(content=response.content))
    updated_state.status = "complete"
    return updated_state
# A router like this could drive conditional edges; this example uses fixed edges below
def router(state: AgentState) -> str:
    return state.status
# Build the graph
workflow = StateGraph(AgentState)
workflow.add_node("researcher", researcher)
workflow.add_node("answerer", answerer)
# Add edges
workflow.add_edge("researcher", "answerer")
workflow.add_edge("answerer", END)
workflow.set_entry_point("researcher")
# Compile the graph
agent = workflow.compile()
# Run the agent
result = agent.invoke({
    "current_task": "quantum computing basics",
    "status": "researching"
})
# Correct access for the result
# The result is returned as a dict-like object, not directly as our BaseModel
final_answer = result["conversation_history"][-1].content
print(final_answer)
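If you prefer validated attribute access on the final state, one option is to rehydrate the model from the returned mapping; this is a convenience, not something LangGraph requires:

# Re-validate the dict-like result into our BaseModel for attribute access
final_state = AgentState(**result)
print(final_state.status)                           # "complete"
print(final_state.conversation_history[-1].content)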
TypedDict Implementation
from typing import TypedDict, List, Dict, Literal, Union
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.messages import AIMessage, HumanMessage

# Define state with TypedDict
class AgentState(TypedDict, total=False):
    conversation_history: List[Union[HumanMessage, AIMessage]]
    research_findings: Dict[str, str]
    current_task: str
    status: Literal["researching", "answering", "complete"]
# Define our nodes (agent components)
def researcher(state: AgentState) -> AgentState:
    llm = ChatOpenAI(model="gpt-3.5-turbo")
    # With TypedDict, we access via dictionary style
    task = state.get("current_task", "")
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a research assistant. Find information about: {task}"),
        ("human", "I need information about {task}. Provide key facts.")
    ])
    chain = prompt | llm
    response = chain.invoke({"task": task})
    # Need to create a new dictionary for the updated state
    # No validation happens here - we must be careful with types
    conversation_history = state.get("conversation_history", [])
    research_findings = state.get("research_findings", {})
    return {
        "conversation_history": conversation_history + [
            HumanMessage(content=f"Research: {task}"),
            AIMessage(content=response.content)
        ],
        "research_findings": {
            **research_findings,
            task: response.content
        },
        "current_task": task,
        "status": "answering"
    }
def answerer(state: AgentState) -> AgentState:
    llm = ChatOpenAI(model="gpt-3.5-turbo")
    task = state.get("current_task", "")
    research_findings = state.get("research_findings", {})
    research = research_findings.get(task, "No research found")
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a helpful assistant. Use the research to answer the question."),
        ("human", "Question: {task}\n\nResearch: {research}")
    ])
    chain = prompt | llm
    response = chain.invoke({"task": task, "research": research})
    # Create updated state (manual copy required)
    conversation_history = state.get("conversation_history", [])
    return {
        "conversation_history": conversation_history + [
            AIMessage(content=response.content)
        ],
        "research_findings": research_findings,
        "current_task": task,
        "status": "complete"
    }
# A router like this could drive conditional edges; this example uses fixed edges below
def router(state: AgentState) -> str:
    return state.get("status", "researching")
# Build the graph
workflow = StateGraph(AgentState)
workflow.add_node("researcher", researcher)
workflow.add_node("answerer", answerer)
# Add edges
workflow.add_edge("researcher", "answerer")
workflow.add_edge("answerer", END)
workflow.set_entry_point("researcher")
# Compile the graph
agent = workflow.compile()
# Run the agent
result = agent.invoke({
    "current_task": "quantum computing basics",
    "status": "researching",
    "conversation_history": [],
    "research_findings": {}
})
# Access results with dictionary syntax
final_answer = result["conversation_history"][-1].content
print(final_answer)
Notice the key implementation differences:
- State Access and Modification: BaseModel uses attribute access and structured copying, while TypedDict uses dictionary-style access
- Default Values: BaseModel handles defaults elegantly with Field(default_factory=list), while TypedDict requires manual defaults with .get()
- Validation: BaseModel enforces types at runtime, while TypedDict won’t raise errors for mismatched types
- State Updates: BaseModel uses .model_copy() for proper state updates, while TypedDict requires manual dictionary construction
Best Practices and Recommendations
- Start Simple: Begin with TypedDict if your agent state is straightforward
- Scale Up: Migrate to BaseModel when you need more robust validation (see the migration sketch after this list)
- Consider Context: Match your choice to your use case requirements
- Balance Features: Weigh validation needs against performance requirements
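As referenced above, migrating from TypedDict to BaseModel is usually mechanical because both approaches can share the same field names. A hedged sketch of such a migration (AgentStateDict and AgentStateModel are illustrative names):

from pydantic import BaseModel, Field
from typing import TypedDict, List

# Before: lightweight TypedDict state
class AgentStateDict(TypedDict, total=False):
    current_task: str
    memory: List[str]

# After: the same shape, now with runtime validation and defaults
class AgentStateModel(BaseModel):
    current_task: str = ""
    memory: List[str] = Field(default_factory=list)

# Existing dict states validate directly into the new model
old_state: AgentStateDict = {"current_task": "summarize", "memory": []}
new_state = AgentStateModel(**old_state)
print(new_state.current_task)  # "summarize"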
Conclusion
Both BaseModel and TypedDict are valid choices for LangGraph agent state management. BaseModel offers robust validation and rich features at the cost of performance, while TypedDict provides lightweight, efficient state management with static type checking. Choose based on your specific needs for validation, performance, and complexity.