AI Coding Agents Compared (2026): Claude Code vs Cursor vs Copilot vs Codex & More

Freshness warning. This category changes monthly — prices, model names, and even owners shift (Windsurf became Devin Desktop; Roo Code shut down; Copilot moved to credits; Gemini retired its individual hosted CLI path). Every figure is tagged (as of mid-2026); confirm current pricing on the vendor's page before committing. Model names reflect mid-2026 (Claude Opus 4.8 / Sonnet 4.6, OpenAI GPT-5.5, Google Gemini 3.1 Pro).

Quick Reference — pick by what you're doing

If you want to…	Start with	Why
Drive agentic work from the terminal	Claude Code, OpenAI Codex CLI, Aider	Terminal-native, scriptable, strong autonomy; Aider is BYO-key/open-source.
Stay in a full AI-first IDE	Cursor, Devin Desktop, Zed	Inline edits, multi-file agent, tab-completion in one editor.
Add AI to your existing editor	GitHub Copilot, Cline, JetBrains Junie	Extensions for VS Code / JetBrains / etc.; keep your setup.
Delegate whole tasks asynchronously	Devin, Copilot coding agent, Codex cloud, Jules	Assign an issue → it works in a sandbox → opens a PR.
Bring your own API key / any model	Aider, Cline, Zed, Cursor	Pay model cost directly; mix providers; run local models.
Maximize MCP / tool integrations	Cline, Claude Code	Cline's one-click MCP marketplace; Claude Code's deep MCP + sub-agents.
Spend $0	Copilot Free, Gemini CLI (OSS), Aider, Cline, Zed Personal	Free tiers or BYO-key open-source (you still pay model API if BYO).
Spec-driven / enterprise AWS	Kiro	Spec→plan→build workflow (Amazon Q is being sunset toward Kiro).

Fundamentals
How to think about the category

Four form factors (and the autonomy spectrum)

Almost every tool is one (or several) of these. Many products now span multiple surfaces sharing one engine.

CLI Terminal agent — runs in your shell, edits files, runs commands, scriptable into CI. Claude Code, Codex CLI, Aider, Gemini CLI.
IDE Standalone AI IDE — a full editor (often a VS Code fork) built around AI. Cursor, Devin Desktop, Kiro, Zed.
EXT Editor extension — bolts AI onto your current editor. GitHub Copilot, Cline, JetBrains Junie/AI Assistant.
CLOUD Autonomous cloud agent — you assign a task; it works async in a sandbox and returns a PR. Devin, Copilot coding agent, Codex cloud, Jules, Cursor background agents.

Autonomy spectrum: autocomplete (Tab/Copilot completions) → chat/edit (ask, apply a diff) → in-loop agent (multi-step, you approve) → autonomous (delegated, returns a PR). More autonomy means more leverage, but also more tokens, more review burden, and more ways to go wrong unsupervised.

MCP, BYO-key, and the cost model that actually bites

MCP (Model Context Protocol) — the now-standard way agents connect to external tools/data (your DB, GitHub, Slack, docs). Broad support is table stakes in 2026; Cline's one-click MCP marketplace and Claude Code's deep MCP are standouts.
Subscription vs BYO-key — a flat plan is a cap (predictable, but rate limits / rolling windows can block bursts). Bring-your-own-key is a meter (no floor, no ceiling — a single complex agentic run can outspend a week of chat).

Cost rule of thumb: BYO-key tends to be cheaper for light/intermittent use (especially if you take advantage of ~90%-off cache reads and 50%-off batch), while a subscription is cheaper and more predictable for sustained heavy agentic sessions. An agent can burn ~1M tokens/day versus a chat user's ~10k, which is exactly why the whole industry repriced in 2026 (see below).
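To make the rule of thumb concrete, here is a rough sketch. It assumes the Claude Sonnet 4.6 list prices from the reference table ($3 in / $15 out per million tokens), 22 working days a month, and a hypothetical traffic mix (20% output tokens, 70% of input served from cache). The function name and the mix are illustrative assumptions, not measured figures.

```python
# Rough BYO-key monthly estimate. Prices and the traffic mix are assumptions
# for illustration, not vendor-confirmed figures.

INPUT_PER_MTOK = 3.00        # USD per million input tokens (Sonnet 4.6 row)
OUTPUT_PER_MTOK = 15.00      # USD per million output tokens
CACHE_READ_DISCOUNT = 0.90   # the "~90% off" cache-read figure

def byo_monthly_cost(tokens_per_day, days=22, output_share=0.2, cached_share=0.7):
    """Estimate monthly BYO-key spend in USD for a given daily token volume."""
    total = tokens_per_day * days
    output_tokens = total * output_share
    input_tokens = total - output_tokens
    cached = input_tokens * cached_share      # input tokens read from cache
    fresh = input_tokens - cached             # input tokens billed at full price
    cost = (
        fresh / 1e6 * INPUT_PER_MTOK
        + cached / 1e6 * INPUT_PER_MTOK * (1 - CACHE_READ_DISCOUNT)
        + output_tokens / 1e6 * OUTPUT_PER_MTOK
    )
    return round(cost, 2)

chat_user = byo_monthly_cost(10_000)       # light, chat-style use
agent_user = byo_monthly_cost(1_000_000)   # sustained agentic use
```

Under these assumptions the chat-style user pays under a dollar a month on a meter, while the agent workload lands around $85, well above a $20 flat plan. Change the mix and the crossover moves, so treat the numbers as a template rather than a forecast.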

Comparison matrix (as of mid-2026 — verify prices)

Tool	Type	Entry price	Models	BYO key	MCP
Claude Code	CLI +IDE/web	Pro $17/mo (annual); Max from $100/mo; API PAYG	Claude Opus 4.8 / Sonnet 4.6 / Haiku 4.5 / Fable 5 (1M ctx)	Yes (3rd-party)	Yes
Cursor	IDE	Hobby $0; Pro $20; Pro+ $60; Ultra $200; Teams $40/seat	Multi-vendor + in-house Composer 2.5; Auto mode	Yes	Yes
GitHub Copilot	EXT +CLOUD	Free; Pro $10; Pro+ $39; Max $100; Business $19/seat	Picker: Claude, OpenAI GPT-5.x, Gemini, MS MAI	No (managed)	Yes
OpenAI Codex	CLI +CLOUD +IDE	In ChatGPT Go $8 / Plus $20 / Pro $100; API PAYG	GPT-5.5 (default), GPT-5.4	No (OpenAI only)	Yes
Devin Desktop (ex-Windsurf)	IDE	Free; Pro $20; Teams $40/seat; Max $200	In-house SWE-1.6 + Claude/GPT/Gemini/Grok	Unconfirmed	Yes
Gemini (CLI / Code Assist / Jules)	CLI EXT CLOUD	Code Assist Std ~$19, Ent ~$45/seat; Jules in AI Pro ~$20	Gemini 3.1 Pro / 3.5 Flash	Yes (CLI)	Yes
Aider	CLI (OSS)	Free — you pay model API	Any LLM (incl. local / OpenAI-compatible)	Yes	Limited
Cline	EXT (OSS)	Free — BYO key (optional hosted PAYG)	Any (200+ via OpenRouter, Ollama, LM Studio)	Yes	Yes ★
Kiro (AWS)	IDE	Free; Pro $20; Pro+ $40; Pro Max $100; Power $200	Claude + open-weight + Auto router	No (managed)	Yes
Devin (Cognition)	CLOUD	Free; Pro $20; Max $200; Teams $80 + $40/seat	In-house SWE-1.6 + configurable frontier	Config	Yes
JetBrains Junie + AI Assistant	EXT	AI Pro $10/mo (personal); AI Ultimate $30	Claude/GPT/Gemini/Grok + Mellum; local	Yes	Yes
Zed	IDE (OSS)	Personal $0 (BYO unlimited); Pro $10; Business $30/seat	Hosted Claude/GPT/Gemini; BYO no-markup	Yes	Yes

★ Cline ships a one-click MCP marketplace and on-the-fly tool creation — the strongest MCP onboarding. "Type" pills: CLI terminal · IDE standalone editor · EXT editor extension · CLOUD autonomous cloud agent.

Tool profiles

Claude Code — Anthropic · terminal-first, multi-surface

What: A terminal coding agent that also runs as a VS Code extension, JetBrains plugin (beta), desktop app, and web (claude.ai/code) — all sharing one engine, CLAUDE.md, MCP config, and settings. Strong at multi-step autonomous work, sub-agents, and automation (GitHub Actions, Slack→PR, scheduled jobs).

Price (mid-2026): not included in the Free plan; Pro $17/mo (billed annually, $20 monthly); Max from $100/mo (5× usage) up to $200/mo (20×); Team/Enterprise per-seat; or API pay-as-you-go. Usage is pooled across Claude chat + Code on a 5-hour rolling window plus weekly caps.

Models: Opus 4.8, Sonnet 4.6, Haiku 4.5, Fable 5; 1M-token context on the larger models; opusplan mode (Opus plans, Sonnet executes). BYO via third-party providers supported.

Strength: cross-surface continuity and deep automation. Watch-out: limits are pooled with chat and not published as exact numbers; the JetBrains plugin is still beta.

install

curl -fsSL https://claude.ai/install.sh | bash
# or:  npm install -g @anthropic-ai/claude-code   (needs Node 18+)

Cursor — Anysphere · the popular AI IDE

What: A VS Code-fork IDE built around AI: unlimited Tab completions, an Agent/Composer for multi-file changes, background (cloud) agents, and a BugBot code reviewer.

Price (mid-2026): Hobby $0; Pro $20/mo; Pro+ $60; Ultra $200; Teams $40/user. In June 2025 it moved from request-counts to a dollar-credit / token-cost model with a separate, much cheaper pool for Auto + the in-house Composer 2.5 model — the repricing drew backlash and a CEO apology.

Models: OpenAI, Anthropic, Google, xAI, and more, plus proprietary Composer 2.5; Auto mode picks by cost/reliability. BYO key supported (Teams adds a small BYOK platform fee).

Strength: a polished all-in-one AI IDE with multi-model + BYOK and a cheap Composer pool. Watch-out: usage-based costs can quietly exceed the flat fee on heavy days.

GitHub Copilot — Microsoft/GitHub · the incumbent, everywhere

What: The broadest-reach assistant — extensions for VS Code, Visual Studio, JetBrains, Eclipse, Xcode; web, mobile, CLI; an in-IDE agent mode and an async coding agent (assign an Issue → it works in a GitHub Actions sandbox → opens a PR, with a ~59-min session cap).

Price (mid-2026): Free (2,000 completions/mo); Pro $10; Pro+ $39; new Max $100; Business $19/seat; Enterprise $39/seat. On June 1, 2026 it switched from "premium requests" to token-based AI Credits (1 credit = $0.01); code completions + Next Edit Suggestions stay unlimited/free.

Models: the widest catalog — Claude (to Opus 4.8/Fable 5), OpenAI GPT-5.x, Google Gemini 3.x, Microsoft MAI — selectable, with reasoning levels.

Strength: deepest GitHub integration, broadest IDE + model coverage. Watch-out: heavy agentic users saw bills spike under usage billing.

OpenAI Codex — OpenAI · CLI + cloud + IDE, ChatGPT-bundled

What: A relaunched suite — an open-source Rust Codex CLI, a cloud agent that runs parallel sandboxed tasks → PRs, an IDE extension, a desktop app, plus GitHub @codex and Slack. Sandboxed by default; agent-phase internet off by default.

Price (mid-2026): included in ChatGPT Free/Go $8/Plus $20/Pro $100 (and Business/Enterprise), with rolling 5-hour usage windows; API pay-as-you-go also available.

Models: GPT-5.5 is the recommended default, with GPT-5.4 variants (the older codex-1/GPT-5-codex names are retired).

Strength: strong agentic coding, parallel cloud tasks, good sandbox/approval controls, bundled into ChatGPT plans. Watch-out: OpenAI-model lock-in — no Claude/Gemini, no self-host.

install

npm install -g @openai/codex

Devin Desktop (formerly Windsurf) — Cognition · the renamed AI IDE

What: The AI IDE formerly called Windsurf/Codeium. After a turbulent 2025 (an OpenAI acquisition fell through; Google DeepMind hired its leadership; Cognition acquired the remaining product/team), it was rebranded "Devin Desktop" on June 2, 2026. Its Cascade agent became "Devin Local."

Price (mid-2026): Free; Pro $20; Teams $40/seat; Max $200; Enterprise custom — now on auto-refreshing daily/weekly quotas (credits retired).

Models: in-house SWE-1.5/1.6 (0-cost) and an Adaptive router, plus Claude/GPT-5.x/Gemini 3.x/Grok/others. MCP + ACP support.

Watch-out: the rename + ownership churn means a lot of stale "Windsurf" docs online; verify against Cognition's current pages, and note legacy Cascade is being phased out.

Google Gemini: CLI · Code Assist · Jules — Google · three different products

What: Three distinct things. Gemini CLI (open-source, strong MCP); Gemini Code Assist (IDE extension); and Jules (async cloud agent → GitHub PRs).

Price (mid-2026): Code Assist Standard ~$19/seat, Enterprise ~$45/seat. Jules has Free (15 tasks/day) / Pro (100/day) / Ultra (300/day) tiers, bundled into Google AI Pro (~$20/mo) / Ultra.

Big change: the individual hosted Gemini CLI + Code Assist free path stopped serving requests on June 18, 2026 and migrated to "Antigravity CLI"; Standard/Enterprise and Jules are unaffected. Check Google's current docs for the live path.

Models: Gemini 3.1 Pro (1M context) flagship, 3.5 Flash. Strength: huge context, generous quotas, deep Google Cloud ties.

Aider — open-source · terminal pair programmer, BYO key

What: A beloved open-source (Apache-2.0) terminal pair programmer. Architect/editor mode, auto-commits with generated messages, 100+ languages, works with virtually any model including local ones.

Price: free — your only cost is the model API you point it at (BYO key). Maximum control and transparency.

Strength: model-agnostic, scriptable, no lock-in, cheap if you use efficient models. Watch-out: fewer "product" niceties; you manage keys and cost yourself. (Its public leaderboard is a useful but point-in-time signal — read its date.)

install

uv tool install aider-chat     # or: pipx install aider-chat

Cline (& the Roo Code note) — open-source · VS Code/JetBrains agent

What: An open-source agent extension for VS Code (and JetBrains) with Plan/Act modes and the strongest MCP story — a one-click MCP Marketplace and on-the-fly tool creation. BYO key across Anthropic/OpenAI/Gemini/Bedrock/Vertex/OpenRouter (200+)/Ollama/LM Studio; an optional hosted PAYG provider and an Enterprise tier exist.

Note: the popular fork Roo Code was discontinued (repo archived) in May 2026; if you want its role-based modes, use Cline or the community fork "ZooCode" — not Roo Code.

Kiro & Amazon Q Developer — AWS · spec-driven IDE

What: Kiro is AWS's spec-driven agentic IDE (a Code-OSS fork, GA Nov 2025): you write a spec, it plans, then builds — with hooks, steering, and subagents. Runs Claude (Sonnet/Haiku/Opus) + open-weight models + an Auto router; MCP supported.

Price (mid-2026): Free (50 credits); Pro $20 (1,000); Pro+ $40 (2,000); Pro Max $100 (5,000); Power $200 (10,000); overage $0.04/credit.
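Because every paid tier works out to $0.02 per included credit while overage costs $0.04, the cheapest tier depends on how far past an allowance you expect to go. A minimal sketch of that arithmetic follows, using the tier figures quoted above. Whether the Free tier can buy overage is not stated here, so the code assumes it cannot.

```python
# Pick the cheapest Kiro tier for an expected monthly credit usage.
# Tier figures are the mid-2026 numbers quoted above. Assumption: the Free
# tier cannot purchase overage credits.

PLANS = [  # (name, monthly USD, included credits)
    ("Free", 0, 50),
    ("Pro", 20, 1_000),
    ("Pro+", 40, 2_000),
    ("Pro Max", 100, 5_000),
    ("Power", 200, 10_000),
]
OVERAGE_PER_CREDIT = 0.04

def plan_cost(price, included, credits_needed):
    """Monthly cost in USD: base price plus overage beyond the allowance."""
    extra = max(0, credits_needed - included)
    return price + extra * OVERAGE_PER_CREDIT

def cheapest_plan(credits_needed):
    """Return (cost, plan name) for the lowest-cost tier."""
    options = []
    for name, price, included in PLANS:
        if name == "Free" and credits_needed > included:
            continue  # assumption: no overage on Free
        options.append((plan_cost(price, included, credits_needed), name))
    return min(options)

best_cost, best_name = cheapest_plan(1_400)
```

At 1,400 credits, Pro plus overage ($36) still beats Pro+ ($40). The Pro to Pro+ break-even sits at 500 credits over the Pro allowance.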

Note: Amazon Q Developer is being sunset (new signups blocked May 2026; end-of-service Apr 2027) with newest models steered to Kiro. If you're picking fresh in the AWS ecosystem, choose Kiro.

Devin — Cognition · the autonomous SWE agent

What: The original "fully autonomous software engineer" — a cloud agent you delegate scoped tasks to (web, CLI, Slack, VS Code/JetBrains, GitHub/Linear/Jira), which works in its own environment and opens PRs. MCP marketplace with 30+ servers.

Price (mid-2026): Free; Pro $20; Max $200; Teams $80/mo min + $40/seat; Enterprise custom. The old PAYG "ACU" self-serve model was retired (Apr 2026) in favor of quota tiers; ACUs (≈15 min of work each) survive only for Enterprise.

Strength: hands-off delegation of well-scoped tickets. Watch-out: best on bounded, well-specified tasks; treat headline autonomy claims and third-party benchmark figures skeptically.

JetBrains Junie + AI Assistant · Zed — IDE-native options

JetBrains Junie (GA June 2026) is an autonomous agent across the IDE + tool window + CLI, sharing a credit pool with AI Assistant. Price: AI Pro $10/mo personal (10 credits, ~$1 each), AI Ultimate $30/mo (35 credits); models incl. Claude/GPT/Gemini/Grok + in-house Mellum + local via Ollama; BYO key; MCP. Junie burns credits fast — Ultimate is realistic for heavy use.

Zed is a fast Rust editor (open-source core) with an Agent Panel, parallel agents, an open-source edit-prediction model, MCP, and external-agent support via ACP. Price: Personal $0 (unlimited with your own keys), Pro $10, Business $30/seat; BYO key with no markup on all tiers; Mac/Linux/Windows. Strength: speed + generous BYOK; watch-out: smaller ecosystem than VS Code.

The 2026 pricing shift — read this before you subscribe

Across 2025–2026, nearly every major tool abandoned flat/fixed-request pricing for usage / token / credit / quota models, clustering around mid-2026:

Copilot: premium requests → token-based AI Credits (Jun 1, 2026; 1 credit = $0.01).
Cursor: request counts → dollar-credit/token usage (Jun 2025; backlash + CEO apology + refunds).
Devin Desktop & Devin: credits/ACU PAYG → auto-refreshing quota tiers (2026).
Claude Code: added weekly limits on Pro/Max (Aug 2025); 5-hour limits doubled (May 2026).
OpenAI Codex: per-message → API-token-aligned usage (Apr 2026).

Why: frontier-model inference is expensive and agents burn tokens at ~100× a chat user's rate, so flat plans were subsidizing heavy users; GPU/compute supply is the binding constraint (Anthropic explicitly tied a limit increase to a compute deal). Real-world reports include Copilot bills jumping ~25× and orgs capping per-engineer agent spend in the thousands/month.
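A per-engineer cap of the kind mentioned above can be sketched as a small guard that refuses any run that would cross the limit. This is an illustrative stand-in only: it calls no vendor API, the class and method names are hypothetical, and where a tool or provider offers a built-in spend limit that is usually the better control.

```python
# A tiny spend guard. Nothing here talks to a real billing system; the caller
# reports each run's estimated cost and the guard blocks anything over the cap.

class BudgetExceeded(RuntimeError):
    """Raised when a run would push spend past the configured cap."""

class SpendGuard:
    def __init__(self, monthly_cap_usd):
        self.cap = monthly_cap_usd
        self.spent = 0.0

    def charge(self, cost_usd):
        """Record a run's cost and return the remaining budget, or raise."""
        if self.spent + cost_usd > self.cap:
            raise BudgetExceeded(
                f"run of ${cost_usd:.2f} would exceed cap "
                f"(${self.spent:.2f} of ${self.cap:.2f} used)"
            )
        self.spent += cost_usd
        return self.cap - self.spent

guard = SpendGuard(monthly_cap_usd=50.0)
remaining = guard.charge(12.5)      # allowed; 37.5 left
try:
    guard.charge(40.0)              # would bring the total to 52.5
    blocked = False
except BudgetExceeded:
    blocked = True
```

Checking before the run rather than after is the design choice that matters: a rejected run leaves the recorded spend unchanged.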

Reference: API prices per million tokens (in / out) — for BYO-key math

Model (mid-2026)	Input /MTok	Output /MTok
Claude Opus 4.8	$5	$25
Claude Sonnet 4.6	$3	$15
Claude Haiku 4.5	$1	$5
Claude Fable 5	$10	$50
OpenAI GPT-5.5	$5	$30
OpenAI GPT-5.4	$2.50	$15
Google Gemini 3.1 Pro (≤200K)	$2	$12
Google Gemini 3.5 Flash	$1.50	$9

All three providers offer roughly 90%-off cache reads and 50%-off batch — material for BYO-key cost. Note: newer Claude tokenizers can consume ~30–35% more tokens for the same text, which offsets some per-token savings. Figures as of mid-2026; verify on each provider's pricing page.
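For per-run BYO-key math, the table and discounts above can be folded into one small calculator. The sketch below makes several assumptions: the dictionary keys are informal labels (not real API model identifiers), the cache and batch discounts are applied at exactly the "roughly" figures quoted, the two discounts are allowed to stack, and the tokenizer overhead is modeled as a flat multiplier on all tokens.

```python
# Per-run BYO-key cost from the table above. Discount rates are the rough
# figures quoted in the text; real rates vary by provider and model.

PRICES = {  # informal label: (input USD/MTok, output USD/MTok)
    "claude-opus-4.8": (5.00, 25.00),
    "claude-sonnet-4.6": (3.00, 15.00),
    "claude-haiku-4.5": (1.00, 5.00),
    "gpt-5.5": (5.00, 30.00),
    "gpt-5.4": (2.50, 15.00),
    "gemini-3.1-pro": (2.00, 12.00),
}

def run_cost(model, input_tokens, output_tokens, cached_input=0,
             batch=False, token_inflation=0.0):
    """USD cost of one run; cached_input is the part of input read from cache."""
    in_price, out_price = PRICES[model]
    scale = 1 + token_inflation               # e.g. 0.3 for a ~30% heavier tokenizer
    fresh = (input_tokens - cached_input) * scale
    cached = cached_input * scale
    out = output_tokens * scale
    cost = (fresh * in_price + cached * in_price * 0.10 + out * out_price) / 1e6
    if batch:
        cost *= 0.5                           # assumed to stack with caching
    return cost

baseline = run_cost("claude-sonnet-4.6", 800_000, 200_000)
with_cache = run_cost("claude-sonnet-4.6", 800_000, 200_000, cached_input=600_000)
batched = run_cost("claude-sonnet-4.6", 800_000, 200_000, batch=True)
heavier = run_cost("claude-sonnet-4.6", 800_000, 200_000, token_inflation=0.3)
```

For this one-million-token run the baseline is $5.40. Serving 600K of the input from cache brings it to $3.78, batch halves the baseline to $2.70, and a 30% token overhead raises it to about $7.02.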

Benchmarks reality check (SWE-bench)

Don't pick a tool from a single benchmark percentage. SWE-bench Verified scores split into non-comparable buckets (different scaffolds, thinking budgets, parallel test-time compute), ~99% are vendor self-reported, and the benchmark is saturating — OpenAI publicly argued it "no longer measures frontier coding capabilities," and an audit found a large share of remaining tasks are mis-scored.

Credible self-reported highs (mid-2026): Claude Opus 4.8 ≈ 88.6% and OpenAI GPT-5.5 ≈ 88.7% on SWE-bench Verified; Gemini 3.1 Pro lands in the low 80s.
Treat ~95% "leaderboard" claims with suspicion — some aggregator figures circulating for the newest models aren't corroborated by the vendors' own published numbers.
The field is moving on to harder, contamination-resistant suites: SWE-bench Pro (where leaders score in the 60s), Multilingual, and Terminal-Bench.

Better signal than a leaderboard: run a 1–2 week trial on your codebase and tasks. Judge by accepted-diff rate, review burden, how it handles your stack/tests, and total cost — not a headline percent.
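One way to keep such a trial honest is to log each task and compute the metrics named above. The sketch assumes a hand-kept log; the record fields and the `summarize` helper are hypothetical and should be adapted to whatever you actually track.

```python
# Summarize a tool trial from a hand-kept log. Field names are hypothetical.
from dataclasses import dataclass

@dataclass
class TrialTask:
    accepted: bool          # did the diff merge without a rewrite?
    review_minutes: float   # human time spent reviewing or fixing
    cost_usd: float         # model or plan cost attributed to the task

def summarize(tasks):
    """Return accepted-diff rate, average review time, and cost per accepted diff."""
    if not tasks:
        raise ValueError("no tasks logged")
    accepted = [t for t in tasks if t.accepted]
    total_cost = sum(t.cost_usd for t in tasks)
    return {
        "accepted_diff_rate": len(accepted) / len(tasks),
        "avg_review_minutes": sum(t.review_minutes for t in tasks) / len(tasks),
        # Rejected work still costs money, so divide total cost by accepted diffs.
        "cost_per_accepted_diff": total_cost / len(accepted) if accepted else float("inf"),
    }

log = [
    TrialTask(True, 10, 0.5),
    TrialTask(False, 30, 1.5),
    TrialTask(True, 20, 1.0),
    TrialTask(True, 20, 1.0),
]
report = summarize(log)
```

Running the same log format against two tools on similar tasks gives a like-for-like comparison that a leaderboard number cannot.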

How to choose

Solo dev / hobbyist: start free — Copilot Free, Aider or Cline with a BYO key, or Zed Personal. Add Claude Code Pro or Cursor Pro when AI becomes core to your day.

Terminal-native / power user: Claude Code or Codex CLI for autonomy; Aider if you want full model choice and no lock-in.

Team standardizing in an IDE: Cursor or Copilot (Business) for shared config and admin; JetBrains Junie if you live in IntelliJ/PyCharm.

Delegating tickets at scale: Devin, Copilot coding agent, Codex cloud, or Jules — assign well-scoped issues and review the PRs.

Privacy / on-prem: Cline, Aider, or Zed with local models (Ollama/LM Studio) or BYO keys. With a local model nothing leaves your machine; with a BYO key, code goes only to the provider you choose.

AWS / spec-driven shops: Kiro. (Don't start new on Amazon Q — it's being retired.)

Don't over-commit: most tools have free tiers or trials, MCP makes switching cheaper, and the leaders leapfrog every few months. Optimize for "easy to try and leave," and re-evaluate quarterly.

Common mistakes & anti-patterns

Choosing on a single benchmark %. Self-reported, non-comparable, saturating. Trial on your own code instead.

Ignoring usage billing. Flat plans now have caps; BYO-key has none. A runaway agent can cost real money — set spend limits.

Merging agent PRs unreviewed. Autonomy ≠ correctness. Review diffs, run tests, watch for subtle/security bugs.

Trusting stale docs. Windsurf→Devin Desktop, Roo Code gone, Gemini CLI path changed — verify against the live vendor page.

Deep lock-in too early. Leaders change quarterly. Favor MCP-friendly, BYO-key-capable tools so switching is cheap.

Skipping a project context file. A CLAUDE.md, rules, or AGENTS.md file dramatically improves output, yet most users never write one.

One giant prompt for a huge change. Scope tasks tightly; agents do best on bounded, well-specified work with tests.

Leaking secrets/keys. Don't paste credentials into prompts; use env vars/secret managers and sandboxed/approval modes.
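For the last point, a minimal sketch of keeping a key out of prompts and source files: read it from the environment and mask it before it reaches any log. The variable name `MODEL_API_KEY` and both helper names are hypothetical; use whatever name your tool or provider expects.

```python
import os

def load_api_key(var_name="MODEL_API_KEY"):
    """Return the key from the environment, failing loudly if it is missing."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it or use a secret manager")
    return key

def redact(secret, visible=4):
    """Mask a secret for logs, keeping only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Typical use is `key = load_api_key()` at startup and `redact(key)` anywhere the value might be printed, so the full key never appears in a prompt, transcript, or committed file.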
