Multi-Agent Systems

Agent Design Patterns: ReAct, Reflexion, Plan-and-Execute, and Supervisor-Worker

Six proven agent design patterns for autonomous agents — ReAct loops, Reflexion self-critique, Plan-and-Execute, tool-use chains, supervisor-worker hierarchies, and memory-augmented reasoning — with Python code examples.

Inductivee Team · AI Engineering · February 25, 2026 (updated April 15, 2026) · 18 min read
TL;DR

ReAct (Reason + Act) is the baseline pattern for tool-using agents and the foundation of most LangChain agent implementations. Supervisor-Worker hierarchy is the production standard for enterprise multi-agent systems requiring specialized reasoning across domains. Memory-Augmented agents are required for any workflow that spans multiple sessions, days, or requires accumulated context — without episodic memory, every agent conversation starts from zero.

What Makes an Agent Autonomous

The defining characteristic of an autonomous agent is the perceive-reason-act loop. An agent observes its current state — which includes tool outputs from previous steps, items retrieved from memory, the user's original request, and any intermediate results accumulated so far. It then reasons about what action to take next by invoking an LLM with structured output, producing either a tool call specification or a final answer. If a tool call is produced, the agent executes it, observes the result, and loops back to the reasoning step. This continues until the agent determines the goal has been achieved or a stopping condition is triggered.

This loop-based architecture is the key distinction between an agent and a chain. A chain is a fixed sequence: step 1 always calls function A, step 2 always calls function B, step 3 always produces output. The execution path is predetermined. An agent has a dynamic execution path — the LLM decides at each step which action to take based on the accumulated state. This decision-making capability is what makes agents capable of handling novel situations that were not explicitly programmed, but it also introduces non-determinism that requires careful engineering to control in production.
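The perceive-reason-act loop described above can be sketched in a few lines. This is a minimal illustration, not a framework implementation; `llm_decide` and `run_tool` are hypothetical stand-ins for a real structured-output LLM call and a tool dispatcher.

```python
# Minimal perceive-reason-act loop. `llm_decide` and `run_tool` are
# hypothetical stand-ins for an LLM call and a tool dispatcher.
def run_agent(goal, llm_decide, run_tool, max_iterations=10):
    state = {"goal": goal, "observations": []}
    for _ in range(max_iterations):
        decision = llm_decide(state)  # reason: produce a tool call or a final answer
        if decision["type"] == "final_answer":
            return decision["content"]
        result = run_tool(decision["tool"], decision["args"])  # act
        state["observations"].append(result)                   # perceive, then loop
    return "Stopped: max iterations reached"
```

A chain, by contrast, would hard-code the sequence of `run_tool` calls at build time; here the LLM chooses the next action from the accumulated state on every iteration.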

Note that autonomy exists on a spectrum. A simple ReAct agent with two tools and a maximum of five iterations is far less autonomous than a multi-agent system with persistent memory, external API access, and the ability to spawn sub-agents. The design patterns below represent progressively more sophisticated points on this spectrum, each appropriate for different classes of enterprise workflow.

Six Core Agentic Design Patterns

Pattern 1: ReAct (Reason + Act)

ReAct is the foundational pattern for tool-using agents, introduced in the 2022 Yao et al. paper. The agent interleaves reasoning traces (Thought: I need to look up the current inventory level for SKU-4421) with tool calls (Action: inventory_lookup(sku='4421')) and observations (Observation: Current stock is 847 units, reorder threshold is 500). This thought-action-observation loop continues until the agent produces a final answer.

ReAct is the right pattern for single-agent, tool-augmented tasks: answering questions that require database lookups, performing calculations, retrieving current information, or executing a sequence of API calls. It forms the foundation of most LangChain create_react_agent implementations. The limitation is that a single ReAct agent has a single reasoning context — it cannot parallelize reasoning across independent subtasks or hand off to specialized agents.
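The thought-action-observation loop can be illustrated with a hand-rolled sketch. This is not the LangChain implementation — `call_llm` is a hypothetical stand-in for a model that emits `Thought`/`Action` lines or a `Final Answer`, and the parsing is deliberately naive.

```python
import re

# Hand-rolled ReAct loop sketch. `call_llm` and the `tools` dict are
# hypothetical stand-ins; real implementations use structured tool calls.
def react_loop(question, call_llm, tools, max_steps=5):
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = call_llm(transcript)   # emits Thought/Action or Final Answer text
        transcript += step + "\n"
        final = re.search(r"Final Answer:\s*(.+)", step)
        if final:
            return final.group(1).strip()
        action = re.search(r"Action:\s*(\w+)\((.*)\)", step)
        if action:
            name, arg = action.group(1), action.group(2).strip("'\"")
            observation = tools[name](arg)                 # execute the tool
            transcript += f"Observation: {observation}\n"  # feed result back
    return "Stopped: max steps reached"
```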

Pattern 2: Reflexion

Reflexion adds a self-evaluation step to the ReAct loop. After the agent produces an output (answer, code, analysis), a reflection prompt asks the LLM to critique its own output: What assumptions did I make? What might be wrong or incomplete? What would I do differently? The critique is appended to the context and the agent tries again.

This pattern improves accuracy on complex reasoning tasks, code generation, and structured output tasks by 15-25% compared to single-shot ReAct, particularly on tasks where errors are detectable programmatically (the code does not compile, the JSON is malformed, the math does not check out). For enterprise use cases, Reflexion is valuable in compliance analysis (does this contract clause violate any of our policies?), report generation (is this analysis internally consistent?), and code generation workflows.
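For tasks where errors are programmatically detectable, the Reflexion loop reduces to generate, validate, critique, retry. The sketch below uses valid-JSON output as the checkable criterion; `generate` is a hypothetical stand-in for an LLM call that receives the prompt plus any accumulated critiques.

```python
import json

# Reflexion sketch for a programmatically checkable task (valid JSON output).
# `generate` is a hypothetical LLM call taking the prompt and prior critiques.
def reflexion(prompt, generate, max_attempts=3):
    critiques = []
    output = ""
    for _ in range(max_attempts):
        output = generate(prompt, critiques)
        try:
            json.loads(output)  # programmatic self-check
            return output       # passes validation
        except json.JSONDecodeError as err:
            # The critique is appended to context so the next attempt can fix it
            critiques.append(f"Previous output was invalid JSON: {err}")
    return output  # best effort after max attempts
```

For criteria without a programmatic check, the `json.loads` step is replaced by a second LLM call that critiques the output in prose.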

Pattern 3: Plan-and-Execute

Plan-and-Execute separates planning from execution using two distinct LLM calls. A planner agent receives the goal and produces a structured, step-by-step plan as its output — not actions, but a sequence of steps with descriptions of what each step should accomplish. An executor agent then implements each step sequentially, using tools as needed.

The advantage is that the plan is reviewable: a human can inspect the plan before execution begins and modify or reject it. This is particularly valuable for workflows where errors are costly and the plan can be validated against business rules before execution starts. The plan also serves as the agent's context during execution — the executor always knows where it is in the overall goal and what comes next. LangChain's plan-and-execute agent and LangGraph's multi-step workflow patterns both implement variants of this architecture.
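The planner/executor split, including the reviewable-plan checkpoint, can be sketched as follows. `plan_llm` and `execute_llm` are hypothetical stand-ins for the two LLM roles; `approve` is where a human or rules engine would inspect the plan before anything runs.

```python
# Plan-and-Execute sketch: one LLM call plans, a second executes each step.
# `plan_llm` and `execute_llm` are hypothetical stand-ins for the two roles.
def plan_and_execute(goal, plan_llm, execute_llm, approve=lambda plan: True):
    plan = plan_llm(goal)      # list of step descriptions, not actions
    if not approve(plan):      # the plan is reviewable before execution begins
        return {"status": "rejected", "plan": plan}
    results = []
    for i, step in enumerate(plan):
        # The executor sees the full plan, so it knows where it is in the goal
        results.append(execute_llm(goal=goal, plan=plan, step_index=i))
    return {"status": "complete", "plan": plan, "results": results}
```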

Pattern 4: Tool-Use Chain

The Tool-Use Chain pattern is a simplified, more deterministic variant of ReAct. Rather than a fully dynamic loop, the agent follows a partially fixed sequence of tool calls with conditional branching based on tool outputs. For example: always call data_retrieval first, then call analysis_tool if the data meets a quality threshold, then call report_generator. The LLM provides conditional logic and output formatting but does not control the overall sequence.

This pattern is appropriate when the workflow structure is well-known and consistent, but individual steps require LLM reasoning to handle variability in inputs and outputs. It is simpler to debug and test than full ReAct, has lower latency (fewer LLM calls), and is more predictable in production. The tradeoff is reduced flexibility in handling novel situations.
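The example sequence above — retrieval, conditional analysis, report generation — reduces to ordinary control flow around the tools. The tool names here are illustrative stand-ins; in practice each would wrap an API or an LLM call.

```python
# Tool-Use Chain sketch: the sequence is fixed in code; the tools (stand-ins
# here) handle per-step variability. Only the branch condition is dynamic.
def tool_use_chain(query, data_retrieval, analysis_tool, report_generator,
                   quality_threshold=0.8):
    data = data_retrieval(query)                   # step 1: always runs
    if data["quality_score"] < quality_threshold:  # conditional branch on output
        return {"status": "skipped", "reason": "data below quality threshold"}
    analysis = analysis_tool(data)                 # step 2
    return {"status": "complete", "report": report_generator(analysis)}  # step 3
```

Because the control flow lives in code rather than in the LLM, the failure modes are the ordinary, testable kind.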

Pattern 5: Supervisor-Worker

The Supervisor-Worker pattern is the production standard for enterprise multi-agent deployments. A supervisor agent receives the task, decomposes it into subtasks, and routes each subtask to the appropriate specialized worker agent. Worker agents execute their subtasks using domain-specific tools and return structured results to the supervisor. The supervisor aggregates results, determines whether the goal has been achieved, and either routes additional subtasks or produces a final output.

The key engineering advantage is specialization: each worker agent has a focused role (data analyst, compliance reviewer, document writer) with a tailored system prompt, appropriate tool set, and bounded scope. Specialized agents consistently outperform generalist agents on domain tasks by 20-35%. The supervisor can also run workers in parallel for independent subtasks, reducing total latency. LangGraph implements this pattern cleanly with a StateGraph where the supervisor node uses conditional edges to route to worker nodes.

Pattern 6: Memory-Augmented

Memory-Augmented agents maintain state across multiple sessions using external memory stores. Episodic memory stores past interactions and their outcomes in a vector database, enabling the agent to retrieve relevant context from previous sessions (this user asked about vendor A last week and we found issue X). Semantic memory stores accumulated knowledge and learned facts that persist across all sessions. Working memory holds the current session state in the LLM context window.

This pattern is essential for long-running enterprise workflows: account management agents that need to remember customer context across months of interactions, compliance monitoring agents that accumulate regulatory change history, and supply chain agents that learn from historical exception patterns. Without episodic memory, every agent interaction starts from zero — the agent cannot improve based on past experience or maintain continuity across a workflow that spans days.
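The episodic-memory component can be sketched as a store with record and retrieve operations. This is a toy: a production version would embed episodes into a vector database and use similarity search, where this sketch substitutes keyword overlap.

```python
# Episodic memory sketch. A real implementation would embed episodes into a
# vector store; keyword overlap stands in for similarity search here.
class EpisodicMemory:
    def __init__(self):
        self.episodes = []  # persisted externally in a real system

    def record(self, interaction, outcome):
        self.episodes.append({"interaction": interaction, "outcome": outcome})

    def retrieve(self, query, k=3):
        words = set(query.lower().split())
        scored = [
            (len(words & set(ep["interaction"].lower().split())), ep)
            for ep in self.episodes
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [ep for score, ep in scored[:k] if score > 0]
```

At the start of a new session, the agent retrieves relevant episodes and injects them into its system prompt — that is what lets the vendor-A agent recall issue X a week later.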

LangGraph Supervisor-Worker Pattern

python
from typing import TypedDict, Literal
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

# --- State Definition ---
class AnalysisState(TypedDict):
    task: str                          # Original task description
    data_analysis: str                 # Output from data analyst worker
    report_draft: str                  # Output from report writer worker
    quality_feedback: str              # Output from quality checker worker
    final_report: str                  # Final assembled report
    next_step: str                     # Supervisor routing decision
    iterations: int                    # Guard against infinite loops
    status: str                        # "in_progress" | "complete" | "failed"

llm = ChatOpenAI(model="gpt-4o", temperature=0.1)

# --- Worker Node Functions ---
def data_analyst_node(state: AnalysisState) -> dict:
    """Specialized agent for quantitative data analysis."""
    response = llm.invoke([
        SystemMessage(content=(
            "You are a senior data analyst specializing in enterprise financial and operational data. "
            "Analyze the task provided and produce a structured quantitative analysis with key metrics, "
            "trends, and data-driven findings. Be specific and cite figures."
        )),
        HumanMessage(content=f"Task: {state['task']}\n\nProvide your data analysis.")
    ])
    return {"data_analysis": response.content}

def report_writer_node(state: AnalysisState) -> dict:
    """Specialized agent for structured business writing."""
    response = llm.invoke([
        SystemMessage(content=(
            "You are a senior business analyst who writes clear, executive-ready reports. "
            "Transform the provided data analysis into a well-structured report with an executive summary, "
            "key findings, and recommendations. Write for a CFO/CTO audience."
        )),
        HumanMessage(content=(
            f"Task: {state['task']}\n\n"
            f"Data Analysis:\n{state['data_analysis']}\n\n"
            "Write the report."
        ))
    ])
    return {"report_draft": response.content}

def quality_checker_node(state: AnalysisState) -> dict:
    """Specialized agent for report quality assurance."""
    response = llm.invoke([
        SystemMessage(content=(
            "You are a quality assurance specialist for business reports. "
            "Review the provided report draft for: factual consistency with the source data analysis, "
            "logical coherence of recommendations, completeness of required sections, and executive clarity. "
            "Return either APPROVED with brief rationale, or REVISE with specific feedback."
        )),
        HumanMessage(content=(
            f"Source Data Analysis:\n{state['data_analysis']}\n\n"
            f"Report Draft:\n{state['report_draft']}\n\n"
            "Quality assessment:"
        ))
    ])
    return {"quality_feedback": response.content}

def supervisor_node(state: AnalysisState) -> dict:
    """Routes task to the appropriate next worker based on current state."""
    iterations = state.get("iterations", 0) + 1

    # Hard stop: prevent infinite loops
    if iterations > 6:
        return {
            "next_step": "end",
            "status": "failed",
            "final_report": state.get("report_draft", "Max iterations reached without completion."),
            "iterations": iterations
        }

    # Routing logic based on state
    if not state.get("data_analysis"):
        return {"next_step": "data_analyst", "iterations": iterations, "status": "in_progress"}

    if not state.get("report_draft"):
        return {"next_step": "report_writer", "iterations": iterations, "status": "in_progress"}

    if not state.get("quality_feedback"):
        return {"next_step": "quality_checker", "iterations": iterations, "status": "in_progress"}

    # Check quality checker decision
    feedback = state.get("quality_feedback", "")
    if "APPROVED" in feedback.upper():
        return {
            "next_step": "end",
            "status": "complete",
            "final_report": state["report_draft"],
            "iterations": iterations
        }
    elif "REVISE" in feedback.upper() and iterations <= 4:
        # Request revision — clear draft to trigger re-write with feedback injected into task
        revised_task = (
            f"{state['task']}\n\n"
            f"QUALITY FEEDBACK REQUIRING REVISION:\n{feedback}"
        )
        return {
            "next_step": "report_writer",
            "task": revised_task,
            "report_draft": "",
            "quality_feedback": "",
            "iterations": iterations,
            "status": "in_progress"
        }
    else:
        # Accept current draft after max revision attempts
        return {
            "next_step": "end",
            "status": "complete",
            "final_report": state["report_draft"],
            "iterations": iterations
        }

# --- Router Function for Conditional Edges ---
def route_from_supervisor(state: AnalysisState) -> Literal["data_analyst", "report_writer", "quality_checker", "__end__"]:
    next_step = state.get("next_step", "data_analyst")
    if next_step == "end":
        return END
    return next_step

# --- Graph Construction ---
def build_analysis_graph():
    graph = StateGraph(AnalysisState)

    graph.add_node("supervisor", supervisor_node)
    graph.add_node("data_analyst", data_analyst_node)
    graph.add_node("report_writer", report_writer_node)
    graph.add_node("quality_checker", quality_checker_node)

    graph.set_entry_point("supervisor")

    # All workers route back to supervisor after completion
    graph.add_edge("data_analyst", "supervisor")
    graph.add_edge("report_writer", "supervisor")
    graph.add_edge("quality_checker", "supervisor")

    # Supervisor uses conditional edges to route to next worker or end
    graph.add_conditional_edges("supervisor", route_from_supervisor)

    return graph.compile(checkpointer=MemorySaver())

def main():
    graph = build_analysis_graph()
    config = {"configurable": {"thread_id": "procurement-analysis-001"}}

    initial_state = AnalysisState(
        task="Analyze Q1 2026 procurement spend across the top 5 vendor categories. Identify cost reduction opportunities and flag any vendors with deteriorating performance metrics.",
        data_analysis="",
        report_draft="",
        quality_feedback="",
        final_report="",
        next_step="",
        iterations=0,
        status="in_progress"
    )

    result = graph.invoke(initial_state, config)
    print(f"Status: {result['status']} (completed in {result['iterations']} supervisor iterations)")
    print("\n=== FINAL REPORT ===")
    print(result["final_report"])

if __name__ == "__main__":
    main()

A LangGraph Supervisor-Worker pattern with three specialized workers, quality assurance loop, hard stop at 6 iterations, and checkpointing via MemorySaver. In production, replace MemorySaver with langgraph-checkpoint-postgres for durable state across process restarts.

Warning

The most common production failure in autonomous agents is unbounded loops. Always implement three safeguards: (1) a maximum iteration counter that hard-stops the agent and returns the best available output rather than looping indefinitely; (2) progress detection — compare state at step N to state at step N-2 and abort if no meaningful change has occurred; (3) hard stop conditions for irreversible actions — any tool call that writes to an external system, sends a communication, or modifies a record must check a circuit breaker before executing. Without these safeguards, agents can recurse indefinitely, exhaust API rate limits, or cause cascading writes to downstream systems. These are not edge cases in production — they are the first failures you will encounter.
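The three safeguards compose naturally into one loop wrapper. The sketch below assumes `step_fn` returns an `(action, new_state)` pair and `breaker_open` reports the circuit-breaker status; both are hypothetical interfaces used only for illustration.

```python
# Sketch of the three loop safeguards: iteration cap, progress detection,
# and a circuit-breaker check before irreversible actions.
def run_with_safeguards(step_fn, initial_state, max_iterations=10,
                        breaker_open=lambda: False):
    states = [initial_state]
    for _ in range(max_iterations):  # safeguard 1: hard iteration cap
        action, state = step_fn(states[-1])
        # Safeguard 2: abort if state at step N matches state at step N-2
        if len(states) >= 2 and state == states[-2]:
            return {"status": "aborted", "reason": "no progress"}
        # Safeguard 3: irreversible actions must pass the circuit breaker
        if action.get("irreversible") and breaker_open():
            return {"status": "halted", "reason": "circuit breaker open"}
        if action.get("type") == "final":
            return {"status": "complete", "state": state}
        states.append(state)
    return {"status": "stopped", "reason": "max iterations", "state": states[-1]}
```

The N vs. N-2 comparison catches the common oscillation failure, where an agent alternates between two states (or keeps calling the same tool with the same arguments) without converging.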

Pattern Selection Guide

| Pattern | Use When | Avoid When | LangChain / LangGraph Primitive |
| --- | --- | --- | --- |
| ReAct | Single-agent tool use; tasks require dynamic tool selection; well-bounded scope with clear completion criteria | Multiple specialized domains required; parallel reasoning needed; workflow spans multiple sessions | create_react_agent(), AgentExecutor |
| Reflexion | Output quality can be validated programmatically; code generation; structured output generation; compliance analysis with verifiable rules | Latency-sensitive workflows (adds 1-2 extra LLM calls); tasks without objective quality criteria | Custom reflection loop with LangGraph StateGraph |
| Plan-and-Execute | Multi-step workflows where plan is reviewable; tasks where human approval of the plan adds value; predictable domain with known step types | Dynamic workflows where plan cannot be determined upfront; real-time requirements; tasks requiring tight feedback loops | plan_and_execute agent (LangChain experimental), LangGraph multi-step |
| Tool-Use Chain | Well-known workflow structure with LLM-variable steps; determinism is valued over flexibility; simpler debugging requirements | Truly dynamic tasks requiring novel tool sequences; complex conditional logic requiring deep reasoning | LCEL chain with RunnableBranch, LangGraph linear StateGraph |
| Supervisor-Worker | Multiple specialist domains; parallel execution of independent subtasks; enterprise production deployments; quality aggregation required | Simple single-domain tasks where specialization adds overhead; latency constraints below 2 seconds | LangGraph StateGraph with conditional edges, multi-agent swarm (LangGraph) |
| Memory-Augmented | Workflows spanning multiple sessions; user context must persist; agent must learn from past interactions; long-running operational agents | Stateless, one-shot tasks; strict data retention policies prohibit storing interaction history; simple lookup tasks | LangGraph checkpointing, LangChain VectorStoreRetrieverMemory, custom episodic memory |

5 Reliability Engineering Practices for Production Agents

  • Structured output schemas using Pydantic models for all tool inputs and outputs. Every tool call should validate its inputs against a Pydantic model before execution and validate its output before returning to the agent. This catches type errors, missing required fields, and out-of-range values before they propagate through the agent loop.
  • Distributed tracing with LangSmith or OpenTelemetry on every agent action. Every LLM call, tool call, and state transition should emit a trace event with inputs, outputs, latency, token counts, and the current state snapshot. This is the only way to debug non-deterministic agent failures in production.
  • Idempotent tool implementations that are safe to retry. Every tool that writes to an external system should be designed so that calling it twice with the same inputs produces the same result as calling it once. This enables safe retry logic when tool calls fail transiently, without the risk of double-writes or duplicate records.
  • Human-in-the-loop checkpoints for irreversible actions. Any tool call that sends a communication, approves a financial transaction, modifies a production record, or triggers an external workflow should pause execution and request explicit human approval before proceeding. Use LangGraph's interrupt() primitive or a task queue with human review UI.
  • Circuit breakers for external API calls with fallback behavior. Wrap every external API call in a circuit breaker (pybreaker or tenacity) that opens after N consecutive failures and routes the agent to a fallback — returning cached data, escalating to a human, or gracefully degrading to a manual workflow. An agent that retries a failing API indefinitely can exhaust rate limits and cause cascading failures.
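The circuit-breaker practice can be sketched in pure Python — this is a minimal illustration of the open/half-open state machine, not a replacement for pybreaker or tenacity, and the threshold and reset values are arbitrary.

```python
import time

# Minimal circuit breaker sketch; pybreaker/tenacity offer production versions.
class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, fallback):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                return fallback()                    # open: route to fallback
            self.opened_at, self.failures = None, 0  # half-open: try again
        try:
            result = fn()
            self.failures = 0
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()    # open the circuit
            return fallback()
```

While the breaker is open the failing API is never touched, which is exactly the property that stops a retrying agent from exhausting rate limits.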

Constitutional AI Guardrails at Inductivee

Across all Inductivee production deployments, every agent operates within a constitutional layer — a set of invariants that are evaluated before any irreversible action is executed. This is the engineering practice that separates a production agentic system from a demo.

The constitutional layer has three components. First, Guardrails.ai validators wrap every tool call with schema validation (is this a well-formed input?), semantic validation (does this action violate our business rules?), and safety checks (does this action fall outside defined operating parameters?). Second, an LLM-as-judge prompt performs semantic evaluation of high-stakes actions: given the business context, the agent's stated goal, and the proposed action, is this action appropriate? LLM judges catch semantic violations that rule-based validators miss — for example, an agent attempting to circumvent an approval workflow by reframing a purchase as an expense. Third, all agent actions are written to an append-only audit log — immutable, timestamped records of every decision, every tool call, every human approval, and every state transition. This audit trail is the evidence base for compliance review, debugging, and retrospective analysis.
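The gating structure — run every validator, audit-log every check, block on the first failure — can be sketched generically. The validators here are illustrative lambdas standing in for Guardrails.ai validators and an LLM-as-judge call; the function names are assumptions, not Inductivee's actual API.

```python
# Sketch of a constitutional pre-action gate: every validator runs before an
# irreversible tool call, and every check is written to the audit log.
def constitutional_gate(action, validators, audit_log):
    for validator in validators:  # each returns (ok: bool, reason: str)
        ok, reason = validator(action)
        audit_log.append({"action": action, "check": reason, "passed": ok})
        if not ok:
            return False          # block the action; nothing executes
    return True
```

In a real deployment `audit_log.append` would write to an append-only store rather than an in-memory list, so failed checks are as durable as approvals.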

This constitutional layer adds 50-150ms of latency per tool call. That cost is worth paying. The alternative — an agent that can take irreversible actions without validation — is not a production system, regardless of how impressive the underlying AI is.

Frequently Asked Questions

What is the ReAct agent pattern?

ReAct — short for Reason + Act — is the foundational pattern for tool-using agents, introduced in the Yao et al. 2022 paper. The agent alternates between reasoning traces, where it thinks through what action to take next, and tool calls, where it executes that action and observes the result, iterating until the task is complete or a stopping condition is triggered. Most LangChain and LangGraph agents use ReAct as their base loop, and it is the right choice for single-agent, tool-augmented tasks: database lookups, API calls, calculations, and bounded question-answering workflows. The key limitation of ReAct is that a single agent has a single reasoning context — it cannot parallelize across independent subtasks or hand off to specialized domain agents, which is why multi-agent patterns like Supervisor-Worker are required for enterprise-scale workflows.

What is the difference between a chain and an agent in LangChain?

A chain executes a fixed, predetermined sequence of operations: step 1 always calls function A, step 2 always calls function B, step 3 always produces output. The execution path is fully known at build time and does not change based on inputs. An agent uses an LLM to dynamically decide which action to take next based on the current state and available tools — it can loop, branch, call tools in novel sequences, and adapt its execution path based on what it observes. Agents are non-deterministic by design: the same input may produce different tool call sequences on different runs depending on LLM sampling. This flexibility makes agents capable of handling novel situations that chains cannot, but it also requires careful engineering — maximum iteration limits, structured output validation, and observability tooling — to control in production.

When should I use a Supervisor-Worker pattern vs a flat multi-agent crew?

Use the Supervisor-Worker pattern when tasks are heterogeneous and require routing logic — when different subtasks need agents with different specializations, tools, or system prompts, and a supervisor needs to decide which worker handles which subtask. Supervisor-Worker is the production standard for complex enterprise workflows like procurement, compliance review, and multi-domain analysis where specialized reasoning per domain consistently outperforms a generalist agent. Use a flat crew — sequential or round-robin — when all agents are peers doing similar work on the same task type, such as a research crew where multiple agents each gather information from different sources. Supervisor-Worker adds routing latency but gains the flexibility to handle heterogeneous, conditional workflows; flat crews are simpler and faster for homogeneous parallel tasks.

How do you prevent autonomous agents from looping indefinitely?

Three safeguards are non-negotiable in every production agent deployment. First, implement a maximum iteration limit on every agent loop — typically 10 to 25 iterations depending on workflow complexity — that hard-stops the agent and returns the best available output rather than looping further. Second, implement progress detection: compare the agent's state at step N to its state at step N-2, and abort if no meaningful change has occurred — this catches the failure mode where an agent keeps calling the same tool with the same arguments. Third, enforce hard stop conditions for all irreversible actions: tool calls that write to external systems, send communications, or trigger financial transactions must check a circuit breaker before executing and must not be retried in a loop. Without all three safeguards, agents will encounter unbounded loops in production — this is not an edge case, it is one of the first failure modes you will see.

How does Inductivee ensure safety in production agentic systems?

Every Inductivee deployment includes a constitutional layer — a set of invariants evaluated before any irreversible action is executed. The layer has three components: Guardrails.ai validators wrap every tool call with schema validation, semantic business rule validation, and safety parameter checks; an LLM-as-judge prompt performs semantic evaluation of high-stakes actions to catch violations that rule-based validators miss, such as an agent attempting to circumvent an approval workflow by reframing a purchase as an expense; and all agent actions are written to an append-only audit log with immutable, timestamped records of every decision, tool call, and state transition for compliance and retrospective analysis. Human-in-the-loop interrupt checkpoints are required before any irreversible action — financial transactions, external communications, and record modifications all require explicit approval before the agent proceeds. This constitutional layer adds 50 to 150ms of latency per tool call, a cost Inductivee treats as mandatory, not optional.

Written By

Inductivee Team — AI Engineering at Inductivee


Agentic AI Engineering Team

The Inductivee engineering team — a remote-first group of multi-agent orchestration specialists, RAG pipeline architects, and data liquidity engineers who have shipped 40+ agentic deployments across 25+ enterprises since 2012. Our writing is grounded in what we actually build, break, and operate in production.

Agentic AI Architecture · Multi-Agent Orchestration · LangChain · LangGraph · CrewAI · Microsoft AutoGen

Inductivee is a remote-first agentic AI engineering firm with 40+ production deployments across 25+ enterprises since 2012. Our engineering content is written by active practitioners and technically reviewed before publication. Compliance: SOC2 Type II, HIPAA, GDPR, ISO 27001.
