Agent Teams Are the Killer Feature of Opus 4.6. Here's How to Actually Use Them.
Agent teams sound futuristic. Multiple AI agents working in parallel, coordinating on subtasks, merging their work back together. The kind of thing you read about in research papers and assume is years away from being practical.
It is not years away. It shipped with Opus 4.6, and I have been running agent teams on real projects since the release. This is a practical guide to how they work, what patterns I have found useful, and where the whole thing falls apart.
What Agent Teams Actually Are
Agent teams in Claude Code let you spawn multiple Claude instances that work on a shared project simultaneously. Each agent gets its own context, its own set of tools, and its own piece of the problem. They coordinate through a shared task list and can send messages to each other directly.
The mental model is straightforward: instead of one Claude working through a long task sequentially, you have a small team working in parallel. One agent handles the frontend, another builds the API layer, a third writes tests. They each do their part and merge results back together.
Anthropic demonstrated the concept at scale with their C compiler project, where 16 parallel agents built a 100,000-line Rust-based C compiler capable of compiling the Linux kernel. That is the ceiling. For day-to-day development work, I have found that two to five agents is the sweet spot.
The key distinction from simply running multiple Claude Code sessions yourself is the coordination layer. Agents can see each other's task progress, send messages, and a team lead agent can assign and reassign work dynamically. It is structured collaboration, not just parallel execution.
Setting Up Your First Team
Setting up an agent team happens inside Claude Code. You do not need to configure anything externally. The workflow revolves around three core concepts: creating a team, spawning teammates, and managing tasks.
Start by asking Claude Code to create a team:
TeamCreate: team_name="my-feature", description="Building user dashboard feature"
This creates a shared team context and a task list that all agents can access. From there, you create tasks and spawn teammates to work on them.
TaskCreate: subject="Build dashboard API endpoints", description="Create REST endpoints for user stats, activity feed, and preferences"
TaskCreate: subject="Build dashboard UI components", description="React components for stats cards, activity timeline, preferences panel"
TaskCreate: subject="Write integration tests", description="Test API endpoints and component rendering with mock data"
Then you spawn agents using the Task tool with a team_name parameter. Each agent gets a name and a role:
Task: name="backend-dev", team_name="my-feature", prompt="You are working on backend API tasks"
Task: name="frontend-dev", team_name="my-feature", prompt="You are working on frontend UI tasks"
Task: name="test-writer", team_name="my-feature", prompt="You are writing tests for the dashboard feature"
Each spawned agent picks up tasks from the shared list, works on them independently, and marks them complete when done. The team lead (you, or a lead agent you designate) can monitor progress, reassign tasks, and coordinate when agents need to interact.
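Status updates flow through the same task list. The exact tool name for this may differ by release; in my sessions a completion report looks roughly like this (the task_id value is illustrative):

TaskUpdate: task_id="1", status="completed"

The lead sees the status change on the shared list and can assign the next task without the agent having to send a separate message.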
Communication between agents uses the SendMessage tool:
SendMessage: type="message", recipient="backend-dev", content="The API response format needs to include a timestamp field for the activity feed"
There is nothing magical about the setup. It is a task list, some agents, and a messaging system. The power comes from the parallel execution and the coordination patterns you build on top.
Patterns That Work
After running agent teams on several projects, a few patterns have consistently produced good results.
Divide by Domain, Not by Step
The most reliable pattern is giving each agent ownership of a domain rather than a step in a sequence. Instead of "Agent A writes the code, Agent B reviews it, Agent C deploys it," try "Agent A owns the API, Agent B owns the UI, Agent C owns the tests."
Domain ownership works because it minimizes dependencies between agents. Each agent can make progress independently without waiting for another agent to finish. When agents own sequential steps, the whole team moves at the speed of the slowest step.
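As a sketch, a domain-oriented breakdown for a dashboard feature like the one above would create one task per domain rather than one per pipeline stage (task wording here is illustrative):

TaskCreate: subject="Own the API layer", description="All endpoints, request validation, and response formats for the dashboard"
TaskCreate: subject="Own the UI layer", description="All dashboard components, built against mock data until the API is ready"
TaskCreate: subject="Own the test suite", description="Integration and unit tests covering both layers"

Each task maps to one domain and one agent, so no task has to wait on another to start.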
The Lead Plus Specialists Pattern
The pattern I use most often: one agent acts as the team lead, responsible for planning, task creation, and coordination. The other agents are specialists who focus on execution.
The lead agent runs in plan mode first, breaks the work into tasks, then spawns specialists to handle execution. When a specialist finishes a task, the lead reviews the work and assigns the next one. This avoids the chaos of fully autonomous agents all making independent decisions about what to work on.
Task: name="lead", team_name="my-feature", prompt="You are the team lead. Plan the work, create tasks, and coordinate the specialists."
Task: name="specialist-api", team_name="my-feature", prompt="You handle API and database tasks assigned to you."
Task: name="specialist-ui", team_name="my-feature", prompt="You handle frontend component tasks assigned to you."
Keep Tasks Small and Independent
The best tasks for agent teams are ones that can be completed in a single focused session: implement one endpoint, build one component, write tests for one module. When tasks are too large, agents lose context. When tasks depend on each other, agents block each other.
I aim for tasks that take a single agent roughly five to fifteen minutes. If a task is more complex than that, I break it down further before assigning it.
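In practice that means splitting along natural seams. A breakdown I might use, with the broad version for contrast (endpoint names are illustrative):

Too large:
TaskCreate: subject="Build the dashboard API", description="All endpoints plus validation and error handling"

Better:
TaskCreate: subject="Implement GET /stats", description="Return aggregate user stats with error handling"
TaskCreate: subject="Implement GET /activity", description="Return the paginated activity feed"
TaskCreate: subject="Implement PUT /preferences", description="Validate and persist preference updates"

Each of the smaller tasks fits in one focused session and can be completed, reviewed, and closed independently.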
Use Agents to Verify Each Other
One of the more useful patterns: after an agent finishes a task, have a different agent review or test the output. The "fresh eyes" effect works with AI agents just as it does with human developers. A test-writing agent that did not write the implementation will often catch edge cases the implementing agent missed.
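Routing the review is usually a single message from the lead after a completion report, something like:

SendMessage: type="message", recipient="test-writer", content="backend-dev finished the activity feed endpoint. Write tests against it, paying attention to pagination edge cases."

The reviewing agent works from the task description and the code itself, not from the implementer's assumptions, which is exactly where the fresh-eyes effect comes from.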
Patterns That Do Not Work
An honest account of the failure modes is at least as useful as the success stories. Here is what has gone wrong for me.
Parallelism Collapse
This is the most common failure mode, and Anthropic's own C compiler project demonstrated it at scale. When multiple agents encounter the same problem simultaneously, they all try to fix it independently. You end up with conflicting changes, wasted work, and a merge mess.
In the C compiler case, 16 agents hit the same bug when compiling the Linux kernel. All 16 fixed it independently, then overwrote each other's changes. Having 16 agents provided zero benefit because they were all stuck on the same bottleneck.
The fix is to keep agents working on genuinely independent pieces of the problem. If two agents are touching the same files, you probably need to restructure your task breakdown.
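One way to enforce this is to name the file boundaries in the task descriptions themselves, so each agent knows what is in scope (the paths here are illustrative):

TaskCreate: subject="Build API endpoints", description="Work only under src/api/. Do not modify shared type definitions; message the lead if a shared type needs to change."
TaskCreate: subject="Build UI components", description="Work only under src/components/. Consume shared types but do not edit them."

Shared files become a coordination point owned by the lead rather than a free-for-all that every agent tries to fix at once.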
Over-Communication Overhead
It is tempting to have agents constantly update each other on progress. In practice, excessive messaging creates noise that slows everyone down. Agents spend time reading and responding to messages instead of writing code.
I keep inter-agent communication to a minimum: the lead assigns tasks, specialists report completion, and direct messages only happen when there is a genuine dependency or blocker. Anything more than that and you start eating into the productivity gains that parallelism was supposed to provide.
Tightly Coupled Tasks
Agent teams are not a good fit for work where every piece depends on every other piece. If your frontend needs the exact API contract before it can start, and your API needs the database schema before it can start, and your schema depends on the UI requirements, then you do not have parallel work. You have sequential work dressed up as parallelism.
Before spawning a team, I ask myself: can these tasks genuinely run at the same time without blocking each other? If the answer is no, a single agent working sequentially is faster and cheaper.
When a Single Agent Is Just Better
Not every task benefits from a team. For debugging, exploratory coding, or tasks where context needs to accumulate across many small steps, a single agent with full context outperforms a team that has to rebuild context in each parallel session.
I use agent teams for breadth (multiple independent features, parallel test writing, multi-component builds) and single agents for depth (debugging, architecture decisions, complex refactoring of tightly integrated code).
A Real Example
Here is a concrete case where agent teams paid off. I had a project that needed a new feature with three independent pieces: a REST API with four endpoints, a set of React dashboard components, and an updated test suite covering both.
With a single agent, this would have been sequential: build the API, then build the UI, then write the tests. Roughly 40 minutes of wall-clock time.
With a three-agent team, I set it up like this:
TeamCreate: team_name="dashboard-feature"
TaskCreate: subject="Build API endpoints", description="Four REST endpoints: GET /stats, GET /activity, GET /preferences, PUT /preferences"
TaskCreate: subject="Build dashboard components", description="StatsCard, ActivityTimeline, PreferencesPanel components with props interfaces"
TaskCreate: subject="Write test suite", description="Integration tests for all API endpoints, unit tests for all components"
The API agent and the UI agent worked simultaneously on their respective pieces. The test agent started by writing test stubs based on the task descriptions, then filled in the real assertions as the other agents completed their work.
Total wall-clock time: about 18 minutes. The agents did not step on each other because their tasks were genuinely independent (the UI agent built components with mock data, not live API calls). The test agent needed some coordination when the API schema was finalized, which the lead agent handled through a quick message.
The token cost was higher. Running three agents in parallel cost roughly three times what a single agent would have for the same total work, largely because each agent carries its own copy of the project context. But the time savings were real, and for this particular task, the time savings mattered more than the cost.
Tips for Getting Started
If you are new to agent teams, here is what I would tell you before your first run.
Start with two or three agents, not sixteen. The C compiler project used 16 agents because it was a stress test at scale. For normal development work, two to three agents cover most cases. A lead plus two specialists is a strong starting point.
Define clear boundaries. Before spawning agents, write down exactly what each one is responsible for and what it is not responsible for. Overlapping responsibilities lead to conflicting changes.
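A boundary-heavy spawn prompt is more verbose than the earlier examples, but it pays for itself. A sketch of what I mean (the exact wording is mine, not a required format):

Task: name="specialist-api", team_name="my-feature", prompt="You own the API layer: endpoints, validation, and response formats. You do NOT touch UI components, tests, or shared type definitions. If a change outside your scope seems necessary, message the lead instead of making it."

Stating what the agent must not touch is at least as important as stating what it owns.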
Use plan mode for the lead agent. Have the lead agent plan the work first, create the task breakdown, and get your approval before spawning specialists. This prevents wasted work from a bad initial decomposition.
Monitor token costs. Agent teams consume tokens fast. Three agents running in parallel for 20 minutes can easily burn through more tokens than a single agent working for an hour. Keep an eye on your usage, especially during early experimentation when you are still learning what works.
Check the task list frequently. The shared task list is your source of truth for what is happening. Get in the habit of checking it to see what is in progress, what is blocked, and what is done.
Have agents verify each other. After an agent completes a task, route the output to a different agent for review or testing. This catches issues early and reduces the cost of mistakes.
Know when to stop. If agents are spending more time coordinating than coding, you have too many agents or your tasks are too coupled. Collapse back to a single agent and regroup.
Looking Forward
Agent teams are not a gimmick. They are a genuine multiplier for the right kind of work: tasks with clear boundaries, independent pieces, and enough scope to justify the coordination overhead.
They are also not a silver bullet. The failure modes are real, the token costs add up, and there is a learning curve to structuring work effectively for parallel execution. But the underlying capability is there, and it improves with each release.
The projects where I have gotten the most value from agent teams share a common trait: they would have been boring to do sequentially. Multiple similar but independent components. Broad test coverage across a codebase. Multi-service feature builds. These are the tasks where parallelism shines and where the coordination overhead is worth the trade.
If you have been curious about agent teams but have not tried them yet, start small. Two agents, one clear task split, plan mode for the lead. See what happens. Then scale up from there.