Parallel AI coding agents: the workflow pattern emerging in October 2025

Engineers are starting to run multiple AI coding agents in parallel on separate tasks. The pattern is real, the tooling is rough, and the human-in-the-loop question is unsettled.

Charles Lin · October 31, 2025

A small post on r/ChatGPTCoding on October 22 captured a workflow pattern that’s quietly becoming common: “we had 2 weeks to build 5 microservices with 3 devs, tried running multiple AI agents in parallel”. The thread got 53 upvotes — not viral, but the comments are dense and reveal what’s actually happening at AI-coding-heavy shops. The pattern is not one engineer asking one AI agent for help one task at a time. The pattern is one engineer orchestrating 3-5 AI agents in parallel, each on a different task, with the human’s job being to review and merge rather than to write.

This piece is a working analysis of where the pattern stands, where it works, where it breaks, and what tooling actually supports it today.

The shape of the pattern

The most common implementation in October 2025 looks like this:

Git worktrees as the isolation primitive. Each parallel agent gets its own worktree of the same repository — a separate working directory with a separate branch. Agents don’t trip over each other’s file edits because they’re not in the same directory.
A coordination layer (often just markdown or a tasklist). A shared TODO/AGENTS.md describes what each agent is supposed to do. Some teams use Linear or GitHub Issues for this; many just have a flat markdown file.
3-5 parallel terminal sessions. One Claude Code or Codex CLI per worktree, each pointed at the same project but a different task scope. The engineer cycles between them.
Review at the diff level. When an agent finishes a task, the engineer reviews the resulting diff (often in a separate tmux pane or a quick git diff in a fifth terminal) and decides to merge, request revisions, or kill the work.
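The setup described above can be sketched as a small script. This is a minimal sketch under stated assumptions, not something any of the cited threads describe: the task names, the agent/ branch prefix, and the sibling-directory layout are all hypothetical. It only builds the git commands, one worktree and one branch per task, and leaves running them to the reader.

```python
from pathlib import PurePosixPath


def worktree_commands(repo_dir, tasks, base="main"):
    """Build one `git worktree add` command per task.

    Each task gets its own branch (agent/<task>, a made-up naming scheme)
    and its own sibling directory, so no two agents share a working copy.
    """
    repo = PurePosixPath(repo_dir)
    commands = []
    for task in tasks:
        branch = f"agent/{task}"
        path = repo.parent / f"{repo.name}-{task}"
        # -b creates a new branch starting at `base`. Checking out `base`
        # itself would fail if it is already checked out in another worktree.
        commands.append(
            ["git", "worktree", "add", "-b", branch, str(path), base]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical repository path and task names, for illustration only.
    for cmd in worktree_commands("/work/shop", ["billing", "auth", "search"]):
        print(" ".join(cmd))
```

Each returned list can be passed to subprocess.run(cmd, cwd=repo_dir, check=True) from inside the repository. Keeping command construction separate from execution makes the plan easy to inspect before anything touches the disk.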

The October 22 thread describes a startup version of this — three devs, five microservices, two weeks. The OP claims it worked. The most-upvoted comment is the appropriate eyebrow-raise: “Now your boss knows he can dump shit on you last minute.” A more substantive 24-upvote reply gets at the real critique: “Honestly have a hard time believing you’d walked into most of that had you planned it out properly… Auto generated API contracts and less relying on test output and more on real one (you can have AI do this).”

The interesting thing is that the skepticism in the comments is mostly about scope/feasibility, not about whether the parallel-agent pattern itself is real or workable. By late 2025, the pattern is enough of an established thing that the debate has moved on from “does this work?” to “what are the limits, and what’s the human role?”

The earlier signal: the multi-agent swarm post

A July 2025 thread titled “I cancelled my Cursor subscription. I built multi-agent swarms with Claude Code instead” was an early signal of where this was heading. The pattern was less mature in July — fewer tools, less convention — but the framing was already “I run multiple agents on different parts of the codebase concurrently, then I become the integrator.” Three months later, that’s just how serious users do AI-heavy coding.

A separate September thread, “RooCode + parallel agents + LSP tools + runtime debugging”, showed the tooling side maturing. RooCode (an open-source VS Code extension) added explicit parallel-agent support around the same time. Claude Code’s sub-agent system, which I wrote about elsewhere, gave first-party parallelism: a parent session can delegate work to sub-agents. The Codex CLI side is rougher, with no equivalent to sub-agents, but running multiple codex terminal sessions in parallel works.

Where parallel agents actually work

Three task categories where I’ve found parallel agents to be a clear win after running this workflow for three months:

1. Multiple independent feature implementations from a clear spec

If you have 4 features specced out — each with a clear input/output, each touching mostly disjoint code paths, each with a defined test surface — running them in parallel works. Agent 1 builds feature A, Agent 2 builds feature B, etc. The integration step at the end is usually fast because the changes don’t conflict. This is the microservices case from the October 22 thread, and it’s the cleanest fit for the pattern.

The signal that you’re in this regime: you can write each task as a self-contained markdown brief that doesn’t reference the other tasks. If the briefs cross-reference, the tasks aren’t independent, and parallelism will hurt more than it helps.
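The independence test above can be roughed out in code. This is a crude heuristic sketch, assuming each brief is stored as text under a short task name (the names and brief contents below are invented): it flags any brief that mentions another task by name. A clean result does not prove the tasks are independent, but a hit is a cheap early warning.

```python
import re


def cross_references(briefs):
    """Return {task: [other tasks mentioned in its brief]}.

    `briefs` maps a task name to the text of its markdown brief. A brief
    that names another task hints that the two are not independent.
    """
    found = {}
    for name, text in briefs.items():
        others = [
            other
            for other in briefs
            if other != name
            and re.search(rf"\b{re.escape(other)}\b", text, re.IGNORECASE)
        ]
        if others:
            found[name] = others
    return found


if __name__ == "__main__":
    # Hypothetical briefs, for illustration only.
    briefs = {
        "billing": "Expose POST /invoices. Reuse the token format from auth.",
        "auth": "Issue and verify JWTs.",
        "search": "Index products for full-text queries.",
    }
    print(cross_references(briefs))
```

In this made-up example the billing brief leans on auth, so those two tasks would be better run in sequence or given an agreed contract before both agents start.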

2. Wide refactors across loosely coupled modules

Renaming an internal API, migrating a logger, updating a dependency that’s imported in 30 files — these are perfect for parallel agents. Each agent gets a slice (Agent 1 handles files in src/auth/*, Agent 2 handles src/payments/*, etc.) and they don’t conflict because the file paths are disjoint. A 4-hour refactor becomes a 1-hour refactor + 30 minutes of review.
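One way to carve up such a refactor is sketched below. It is an illustrative assumption, not a described tool: files are grouped by their leading directory components, and whole directories are handed to agents greedily, largest first. Because a directory is never split, the slices are path-disjoint by construction. Disjoint paths do not rule out semantic conflicts, such as two slices both depending on a renamed symbol.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def slice_by_directory(paths, depth=2):
    """Group file paths by their first `depth` directory components."""
    groups = defaultdict(list)
    for p in paths:
        parts = PurePosixPath(p).parts
        # Files shallower than `depth` are grouped under their own directory.
        dirs = parts[:depth] if len(parts) > depth else parts[:-1]
        groups["/".join(dirs)].append(p)
    return dict(groups)


def assign_slices(paths, n_agents, depth=2):
    """Assign whole directories to agents, largest directory first."""
    groups = slice_by_directory(paths, depth)
    agents = [[] for _ in range(n_agents)]
    for key in sorted(groups, key=lambda k: (-len(groups[k]), k)):
        # Give the next directory to the agent with the fewest files so far.
        min(agents, key=len).extend(groups[key])
    return agents


if __name__ == "__main__":
    # Hypothetical file list, for illustration only.
    files = [
        "src/auth/login.py",
        "src/auth/token.py",
        "src/payments/charge.py",
        "src/payments/refund.py",
        "src/payments/webhook.py",
        "src/search/index.py",
    ]
    for i, files_for_agent in enumerate(assign_slices(files, 2), start=1):
        print(f"agent-{i}: {files_for_agent}")
```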

3. Spike + implementation in parallel

When you don’t know the right approach yet, run two agents on the same problem with different approaches and pick the one whose result you prefer. Agent 1 tries approach A, Agent 2 tries approach B. You read both diffs, take the better one, discard the other. This is “AI-assisted comparison shopping” and it works when the problem is well-defined enough to make the comparison meaningful.

Where parallel agents break

Three failure modes that I’ve consistently hit and that show up in the Reddit discussions:

1. The human becomes the bottleneck

The most-quoted comment from the October 22 thread (5 upvotes but heavily reposted in other threads): “The bottleneck is a human to review the work. And not just glance at the diffs and checking if the concurrency looks right, but understanding the data flow and staying on top of the architecture.”

This is the central tension of the parallel-agent pattern. The agents can be fast in parallel; the human reviewing their output cannot be. By the time you’re running 5 agents simultaneously, either (a) you’re reviewing carelessly and accepting work you’d reject if it were yours, or (b) the agents are sitting idle waiting for you to finish reviewing the previous batch. Neither is good. The honest cap is somewhere between 3 and 5 agents per engineer for tasks that need real review, and lower for high-risk changes.
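The bottleneck argument can be made concrete with a toy model. This is a back-of-the-envelope sketch, not a measurement: it assumes agents work fully in parallel, one human reviews diffs one at a time, and every task takes the same time. The 45-minute and 15-minute figures are invented for illustration.

```python
import math


def merged_per_hour(n_agents, agent_minutes_per_task, review_minutes_per_diff):
    """Steady-state merged tasks per hour with a single reviewer.

    Agents produce diffs in parallel; the human reviews them serially.
    Throughput is capped by whichever side is slower.
    """
    produced = n_agents * 60 / agent_minutes_per_task
    reviewed = 60 / review_minutes_per_diff
    return min(produced, reviewed)


def saturation_point(agent_minutes_per_task, review_minutes_per_diff):
    """Smallest agent count at which the reviewer becomes the bottleneck."""
    return math.ceil(agent_minutes_per_task / review_minutes_per_diff)


if __name__ == "__main__":
    # Assumed numbers: 45 minutes of agent time per task, 15 minutes of
    # careful review per diff.
    for n in range(1, 8):
        print(n, "agents:", round(merged_per_hour(n, 45, 15), 2), "merged/hour")
    print("reviewer saturates at", saturation_point(45, 15), "agents")
```

Under these assumed numbers, throughput stops improving at three agents. Adding more only lengthens the review queue, unless review gets faster, and faster review is where the rubber-stamping risk comes in.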

2. Shared-state coordination is still hard

When agents need to coordinate on a shared resource (a single test database, a single migration sequence, a single piece of architecturally central code), parallelism fights you. The agents will step on each other. The October 22 OP avoided this in their microservices case by having truly disjoint services; if the services had needed shared types or shared DB schemas, the wins would have evaporated.

Some experimental tooling is trying to fix this — shared markdown coordination files, lock-based access to specific files, MCP servers exposing a shared task queue — but none of it is robust enough to recommend as a default in October 2025.
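To make the lock-based idea concrete, here is a minimal sketch of an advisory lock file, assuming all agents run on one machine and are instructed to take the lock before touching the shared resource. It does not reproduce any existing tool, and the lock name and owner labels are hypothetical. It also shares the fragility noted above: a crashed agent leaves a stale lock behind, and nothing forces an agent to honor the lock.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def file_lock(lock_path, owner):
    """Hold an advisory lock, or raise FileExistsError if it is taken."""
    # O_CREAT | O_EXCL makes creation atomic: the call fails if the file
    # already exists, so only one process can win the lock.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(owner)  # record who holds the lock, for debugging
        yield
    finally:
        os.remove(lock_path)


if __name__ == "__main__":
    # Hypothetical lock guarding a shared migration sequence.
    lock = os.path.join(tempfile.mkdtemp(), "migrations.lock")
    with file_lock(lock, "agent-1"):
        try:
            with file_lock(lock, "agent-2"):
                pass
        except FileExistsError:
            print("agent-2 must wait: migrations are locked by agent-1")
```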

3. Quality drift across the swarm

Each agent is independently doing the “best it can” on its task. When you compose their work into a single PR, the quality dispersion is visible. Some files are over-engineered, some are under-engineered, some have subtle style inconsistencies that wouldn’t have happened if one engineer (or one agent) had written all of them. The integration step is genuinely more work than reviewing a single agent’s output — not less.

The 23-upvote “My experience in AI coding” thread captured the candid version of this: an engineer ran multiple agents, did significant cleanup work afterward, and still finished faster than working serially, though not as fast as the naive math would suggest.

The tooling that supports this pattern in October 2025

What’s working today:

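Git worktrees as isolation. The clean way to give each agent its own working copy without separate clones. git worktree add -b agent-1 ../agent-1 main and you’re done; the -b flag creates a new branch for the agent, which is needed because Git refuses to check out main in a second worktree while it is already checked out in the first.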
Claude Code’s sub-agent system. First-party parallelism inside one Claude Code session — sub-agents can dispatch concurrently for research or implementation tasks, with the parent agent integrating.
Multiple parallel terminal sessions. Plain old tmux or iTerm panes, one Claude Code or Codex CLI per pane. Low-tech and reliable.
Cline / Roo Code / Kilo Code extensions. VS Code extension tier — some have explicit multi-agent UI, others just let you run multiple instances.
Shared markdown coordination files. AGENTS.md, CLAUDE.md, TODO.md — primitive but workable.
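A shared coordination file can be as simple as a markdown checklist that the engineer, or a script, reads to see what is still unclaimed. The layout below is a hypothetical convention invented for this sketch (a checkbox, a task description, and an optional @owner tag). It is not a format defined by AGENTS.md, CLAUDE.md, or any of the tools listed.

```python
import re

# Assumed line format: "- [ ] task description @owner" (owner optional).
TASK_LINE = re.compile(
    r"^- \[(?P<done>[ xX])\] (?P<task>.+?)(?: @(?P<owner>[\w-]+))?$"
)


def parse_tasks(markdown):
    """Parse checklist lines into dicts with task, owner, and done status."""
    tasks = []
    for line in markdown.splitlines():
        m = TASK_LINE.match(line.strip())
        if m:
            tasks.append(
                {
                    "task": m["task"],
                    "owner": m["owner"],
                    "done": m["done"].lower() == "x",
                }
            )
    return tasks


def unclaimed(tasks):
    """Tasks that are neither finished nor assigned to an agent."""
    return [t["task"] for t in tasks if not t["done"] and t["owner"] is None]


if __name__ == "__main__":
    # Hypothetical TODO.md contents, for illustration only.
    todo = """# TODO
- [ ] Migrate logger in src/auth @agent-1
- [x] Update README
- [ ] Add retry to payments client
"""
    print(unclaimed(parse_tasks(todo)))
```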

What’s not yet working but is being attempted:

Centralized agent orchestration tools. Several startups are trying to build this. None are credibly winning yet. The space is wide open.
First-party multi-session UIs. Anthropic and OpenAI both have hooks for this in their CLI tools but nothing fully baked.
Shared-state coordination primitives. A small but real research area — MCP servers that expose locks, queues, and consensus state to multiple agents — but nothing battle-tested.

The honest stance on whether you should use this pattern

If you’re doing serious AI-heavy coding and you haven’t tried parallel agents at all, you’re probably leaving 30-50% productivity on the table on tasks that fit the well-separated-feature mold. Adopting the pattern is worth the upfront learning cost.

If you’re already using parallel agents and you’re trying to push past 5-7 concurrent agents, stop and think hard about the human review bottleneck. Past a certain point, more agents doesn’t equal more output.

If you’re tempted to use parallel agents on a single tightly-coupled feature with shared state — don’t. The serialization cost will exceed the parallelism win.

The honest take: parallel agents are real, the productivity gains are real, but the failure mode where the human stops actually reviewing and starts rubber-stamping is also real and is the central risk of the pattern. The skill the parallel-agent-fluent engineer develops is not orchestration — it’s discipline about what to review carefully versus what to glance at versus what to send back. That’s the part nobody’s automated yet and probably won’t soon.

The two YouTube videos that actually show the workflow

The cleanest single explanation of the isolation primitive the pattern depends on is Samuel Gregory’s “Git Worktrees: The secret sauce to multi-agent workflows!” (4 min, 44K views, August 2025). His framing is exactly the one that makes the workflow click for engineers who have not seen it before: “What are you doing while Claude Code is running those long-horizon tasks? Sometimes if I’m not thinking about the next features and making notes on things I notice, I’m often just twiddling my thumbs. But what if we could have multiple Claude Code instances running on separate branches, working on multiple features?” The mechanic he walks through is git worktree add -B feature/add-mfa ../add-mfa origin/main: one command that gives you a separate working directory on a separate branch sharing the same .git storage. He then opens a Warp terminal in the new directory, runs claude in it, and works on the new feature while the original Claude Code session continues in the original directory. His description (“it’s just sharing a lot of the resources, you don’t have to wait for it to download that clone anymore, it keeps both branches up to date with any changes that happen on the remote”) is close to right: the worktrees share one object store and one set of remote-tracking refs, so a single git fetch is visible from every worktree, though each local branch still has to be merged or rebased on its own. Git worktrees are not new; using them as the isolation primitive for parallel AI agents is the 2025 idea.
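Once several worktrees exist, it helps to see which directory is on which branch. The sketch below parses the output of git worktree list --porcelain, which prints one blank-line-separated record per worktree with worktree, HEAD, and branch lines. The sample text is made up in that format; the paths and hashes are placeholders.

```python
def parse_worktree_list(porcelain):
    """Parse `git worktree list --porcelain` output into a list of dicts."""
    worktrees, current = [], {}
    for line in porcelain.splitlines():
        if not line.strip():
            # A blank line ends the current worktree record.
            if current:
                worktrees.append(current)
                current = {}
            continue
        key, _, value = line.partition(" ")
        current[key] = value
    if current:
        worktrees.append(current)
    return worktrees


def branch_by_path(worktrees):
    """Map each worktree path to its short branch name (None if detached)."""
    prefix = "refs/heads/"
    result = {}
    for wt in worktrees:
        ref = wt.get("branch", "")
        result[wt["worktree"]] = ref[len(prefix):] if ref.startswith(prefix) else None
    return result


# Made-up sample in the porcelain format; paths and hashes are placeholders.
SAMPLE = """worktree /work/app
HEAD 1111111111111111111111111111111111111111
branch refs/heads/main

worktree /work/add-mfa
HEAD 2222222222222222222222222222222222222222
branch refs/heads/feature/add-mfa
"""

if __name__ == "__main__":
    for path, branch in branch_by_path(parse_worktree_list(SAMPLE)).items():
        print(f"{path}: {branch}")
```

To run it against a real repository, the text can be obtained with subprocess.run(["git", "worktree", "list", "--porcelain"], capture_output=True, text=True, check=True).stdout.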

The harder-to-watch but more informative video is BridgeMind’s “I Ran 7 GPT 5 Codex Agents at One Time” (27 min, 4K views, September 2025). The view count is low because the video is long and the cool factor is muted, but the substantive content is rare. The author opens with the migration story (“this past week I’ve transitioned completely from using Cursor to using Codex… I think Cursor is a thing of the past”) and then walks through running seven simultaneous Codex CLI sessions across a single repo, organised by code area: two scoped to specific subdirectories (UI and database), four with full-repo context for cross-cutting tasks. His specific orchestration pattern is worth noting: he writes a TODO list with seven tasks, drops UI screenshots in for each agent, uses voice-to-text (Wispr Flow) to dictate each prompt, then cycles between the sessions reviewing and merging as each finishes. The honest framing he lands on is the one this article keeps returning to: Codex CLI sessions are slower per task than Cursor, but running them in parallel changes the equation entirely. “When I first started using Codex, I didn’t like it because it took too long to respond. But then I realized that really what you need to do is you just need to launch multiple agents at once.”

The integration cost he glosses over is the part the Reddit threads surface more honestly. Seven agents finishing in parallel means seven diffs to review in roughly the same window, which is exactly the human-bottleneck problem.

What YouTube usually gets wrong

YouTube videos about multi-agent coding workflows in late 2025 tend to focus on the cool factor — five terminals open, agents working in parallel, “look at all this code being written.” Reddit’s response is more grounded: the work is real, the tooling is rough, the human review bottleneck is the central limitation, and most of the videos do not show the integration / cleanup step that is where the actual time goes.

The reconciliation between YouTube and Reddit on this one: YouTube captures the moment that parallel agents are working. Reddit captures the work that happens after. Both are true. If you are new to the pattern, watch Samuel Gregory’s 4-minute git-worktree primer to understand the mechanic, then BridgeMind’s longer video to see what it actually looks like in practice. Then read the Reddit threads to understand what to expect when you try it on real work.

The decade of “one engineer, one tool, one task at a time” is closing for AI-heavy coding. The next pattern is here, it’s rough, and the engineers who learn to work with it now are setting up the practice that’ll be standard in 2026.

Sources

Every reference behind this piece. If we make a claim, it's because at least one of these said so — or we lived it ourselves.

Firsthand: Three months running parallel Claude Code + Codex CLI sessions on real projects with git worktrees
Reddit (r/ChatGPTCoding): "We had 2 weeks to build 5 microservices with 3 devs, tried running multiple AI agents in parallel" (53 upvotes)
Reddit (r/ChatGPTCoding): "I cancelled my Cursor subscription. I built multi-agent swarms with Claude Code instead" (33 upvotes)
Reddit (r/ChatGPTCoding): "RooCode + parallel agents + LSP tools + runtime debugging" (22 upvotes)
Reddit (r/ChatGPTCoding): "My experience in AI coding: brief summary" (23 upvotes, multi-tool stack)
YouTube: "Git Worktrees: The secret sauce to multi-agent workflows!", Samuel Gregory
YouTube: "I Ran 7 GPT 5 Codex Agents at One Time", BridgeMind