Building a Research Agent

By David Okpare • August 24, 2025

This weekend I decided to build a multi-agent research system based on Anthropic's approach to demonstrate how these architectures work in practice. The system breaks research queries into focused subtasks, runs specialized agents on each, then synthesizes results into structured reports.

The system uses an orchestrator-worker pattern. The lead agent analyzes queries, breaks them into subtasks, and coordinates subagents. Subagents handle focused research areas and return structured findings.

Key benefits:

  • Task separation: Each agent gets a specific research focus
  • Context isolation: Subagents maintain dedicated context windows
  • Parallel execution: Multiple research threads run simultaneously

This mirrors Anthropic's multi-agent research system approach: the orchestrator-worker pattern treats subagents as intelligent tool calls rather than autonomous agents.

# subagent as intelligent tool calls  
await sub_agent.run("Research current AI agent architectures")

This reframing changes everything. Instead of managing complex inter-agent communication, we have one lead agent that uses other agents as sophisticated reasoning tools.

The lead agent handles query analysis, task decomposition, and result synthesis. Subagents focus on specific research areas and return structured findings with their own context windows.

from pydantic_ai import Agent
from models import ResearchReport, SubagentFindings

# lead agent configuration
lead_agent = Agent[DateDeps, ResearchReport](
    model='openai:gpt-4o',
    output_type=ResearchReport,
    output_retries=2,
)

# subagent configuration  
sub_agent = Agent[DateDeps, SubagentFindings](
    model='openai:gpt-4o',
    output_type=SubagentFindings,
    output_retries=2,
)
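
The configuration above references a DateDeps dependency type and omits the prompts. As a hedged sketch only (the repo's actual definitions may differ), DateDeps could be a small dataclass carrying the current date, injected via Pydantic AI's dynamic system prompt decorator:

# Assumed definitions: the repo's DateDeps and prompts aren't shown here.
from dataclasses import dataclass

from pydantic_ai import RunContext

@dataclass
class DateDeps:
    today: str  # current date, so agents can judge the recency of sources

@lead_agent.system_prompt
def lead_instructions(ctx: RunContext[DateDeps]) -> str:
    return (
        f"Today's date is {ctx.deps.today}. You are a lead research agent: "
        "analyze the query, split it into focused subtasks, delegate them "
        "via the run_subagent tool, then synthesize a structured report."
    )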

The orchestration works through a tool called run_subagent where the lead agent provides a list of tasks. These tasks get distributed among subagents and run concurrently, with results gathered and returned to the lead agent as a tool result.

import asyncio

from pydantic_ai import RunContext

@lead_agent.tool
async def run_subagent(ctx: RunContext[DateDeps], tasks: SubagentTasks):
    """Run subagents concurrently. Each task should be specific and focused."""
    results = await asyncio.gather(
        *[
            sub_agent.run(
                f"Research Task: {task.description}\nFocus Area: {task.focus_area}",
                deps=ctx.deps,
            )
            for task in tasks.tasks
        ]
    )
    return [result.output for result in results]
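
The SubagentTasks type the tool receives isn't shown above. A plausible shape, inferred from the task.description and task.focus_area accesses (the repo's definition may differ):

from typing import List

from pydantic import BaseModel

class SubagentTask(BaseModel):
    description: str  # the specific question this subagent should answer
    focus_area: str   # the domain or angle to concentrate on

class SubagentTasks(BaseModel):
    tasks: List[SubagentTask]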

Execution Strategy

The subagents run concurrently, and each subagent has access to tools that are themselves designed for concurrent, parallel calls.

# each subagent can make multiple searches simultaneously
@sub_agent.tool_plain
async def web_search(query: str, count: int = 10):
    """Search the web for research information."""
    return await search_api(query, count)

@sub_agent.tool_plain  
async def web_fetch(url: str, timeout: int = 30):
    """Fetch content from a specific URL."""
    return await fetch(url, timeout)
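
The search_api and fetch helpers behind these tools aren't shown. A minimal sketch using httpx, with the endpoint and response shape as placeholders rather than the repo's actual search provider:

import os

import httpx

# Placeholder endpoint; a real provider will differ in URL, params, and auth.
SEARCH_ENDPOINT = os.environ.get("SEARCH_ENDPOINT", "https://example.com/search")

async def search_api(query: str, count: int) -> list[dict]:
    async with httpx.AsyncClient() as client:
        resp = await client.get(SEARCH_ENDPOINT, params={"q": query, "count": count})
        resp.raise_for_status()
        return resp.json().get("results", [])

async def fetch(url: str, timeout: int) -> str:
    # Fetch raw page content; production code would also extract readable text.
    async with httpx.AsyncClient(timeout=timeout, follow_redirects=True) as client:
        resp = await client.get(url)
        resp.raise_for_status()
        return resp.text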

Structured Outputs

Using Pydantic models for all agent outputs ensures consistency across multiple agents. When subagents return unstructured text, the lead agent must parse and interpret varying formats, which introduces errors and inconsistency. Structured outputs guarantee that each agent returns data in the expected format, making synthesis reliable. This becomes critical when coordinating multiple agents: the lead agent can confidently access specific fields like key_insights or confidence_level without parsing natural language responses.

from typing import List

from pydantic import BaseModel

class SubagentFindings(BaseModel):
    task_description: str
    summary: str
    key_insights: List[str]
    sources_found: int
    confidence_level: str  # "high", "medium", "low"

# Assumed shape; the repo's actual ResearchSection definition isn't shown here.
class ResearchSection(BaseModel):
    heading: str
    content: str

class ResearchReport(BaseModel):
    title: str
    executive_summary: str
    sections: List[ResearchSection]
    key_takeaways: List[str]

This approach provides consistent structure across all agents, type safety to catch errors early, streaming support for real-time updates, and predictable integration with frontend components.
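
As a hedged usage sketch (the repo's entry point may differ), the lead agent's result can be consumed as a validated ResearchReport with no string parsing:

result = await lead_agent.run(
    "What are the current approaches to AI agent orchestration?",
    deps=DateDeps(today="2025-08-24"),
)
report = result.output       # a validated ResearchReport, not free text
print(report.title)
for takeaway in report.key_takeaways:
    print(f"- {takeaway}")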

Key Insights

  • Treating subagents as intelligent tool calls creates more reliable coordination than autonomous inter-agent communication
  • Orchestration through a lead agent works better than distributed decision-making
  • Structured outputs become critical for consistent agent integration

Multi-agent orchestration proves ideal for open-ended research questions, tasks requiring multiple information sources, and complex problems with unpredictable subtasks. Single agents remain better suited for well-defined queries, linear workflows, and simple information retrieval.

Implementation

The system is built on Pydantic AI for structured agent outputs, FastAPI for streaming responses, and async processing throughout for efficient coordination. The complete implementation is available at https://github.com/daveokpare/deep-research.
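
As an illustration only (the repo's actual routes may differ), a streaming endpoint could look like this, using Pydantic AI's run_stream to emit partial report snapshots:

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

@app.get("/research")
async def research(q: str):
    async def event_stream():
        # Stream partially-validated ResearchReport snapshots as JSON lines.
        async with lead_agent.run_stream(q, deps=DateDeps(today="2025-08-24")) as run:
            async for partial in run.stream():
                yield partial.model_dump_json() + "\n"
    return StreamingResponse(event_stream(), media_type="application/x-ndjson")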

These patterns extend beyond research to any complex, multi-step AI tasks requiring reliable coordination between multiple agents.

Future Work

This project lacks proper citations and comprehensive evaluation metrics comparing single- versus multi-agent performance.

Adding evaluation metrics for research quality, completion time, and accuracy against baselines would make this a more complete demo project.
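
As a starting point (not part of the repo), even a crude harness that times both configurations on the same query would give a completion-time baseline; quality scoring would need a rubric or judge model:

import time

async def compare(query: str, deps: DateDeps) -> None:
    # Run a single subagent directly as the single-agent baseline.
    start = time.perf_counter()
    await sub_agent.run(query, deps=deps)
    single_time = time.perf_counter() - start

    # Run the full orchestrated system on the same query.
    start = time.perf_counter()
    await lead_agent.run(query, deps=deps)
    multi_time = time.perf_counter() - start

    print(f"single agent: {single_time:.1f}s, multi-agent: {multi_time:.1f}s")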