Cursor Composer 2.5 Launch
TECH


Strategic Overview

  01. Cursor released Composer 2.5 on May 18, 2026, framing it as a substantial step up from Composer 2 with stronger sustained work on long-running tasks and more reliable complex instruction following.
  02. The model is now the default in Cursor's model picker, with Composer 2 retained as an opt-in fallback.
  03. To drive adoption, Cursor is doubling included Composer 2.5 usage for the first week after launch.
  04. Composer 2.5 is built on the same open-source Moonshot Kimi K2.5 checkpoint as Composer 2, but with 85% of total compute devoted to Cursor's own post-training and reinforcement learning.

The mechanism: post-training did the heavy lifting

Figure: Composer 2.5 vs Opus 4.7 vs GPT-5.5 on SWE-Bench Multilingual, Terminal-Bench 2.0, and CursorBench v3.1. Cursor's model lands within a point of Opus on the first two and edges past it on the third.

Composer 2.5 didn't get a new brain — it got a much better tutor. The model still rides on Moonshot's open-source Kimi K2.5 checkpoint, the same base Cursor used for Composer 2 [3]. What changed is how Cursor spent its compute on top: roughly 85% of total compute went into Cursor's own post-training and reinforcement learning, including 25x more synthetic tasks than Composer 2 [1][6]. The signature trick is what Cursor calls Targeted RL with Textual Feedback: instead of grading an entire multi-step coding rollout against a single end-of-episode reward, the system inserts localized hints precisely at the point where the model errs, sharpening credit assignment over rollouts that span hundreds of thousands of tokens [1].
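
The credit-assignment difference described above can be illustrated with a toy sketch. This is not Cursor's training code: the `Step` class, the `hint` field, and the fixed `penalty` value are hypothetical names and choices made for illustration, and turning a textual hint into a scalar penalty is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    action: str
    # Hypothetical critic note; None means no error was flagged at this step.
    hint: Optional[str] = None


def outcome_only_credit(steps: List[Step], final_reward: float) -> List[float]:
    # One scalar for the whole rollout: every step receives the same signal.
    return [final_reward for _ in steps]


def targeted_credit(steps: List[Step], final_reward: float,
                    penalty: float = -1.0) -> List[float]:
    # Assumption: a step carrying a hint is penalized locally,
    # while all other steps keep the end-of-episode reward.
    return [penalty if s.hint is not None else final_reward for s in steps]


rollout = [
    Step("read failing test"),
    Step("edit lexer.py", hint="The failing assertion is raised in parser.py."),
    Step("run test suite"),
]

print(outcome_only_credit(rollout, final_reward=0.0))  # [0.0, 0.0, 0.0]
print(targeted_credit(rollout, final_reward=0.0))      # [0.0, -1.0, 0.0]
```

With a single end-of-episode reward, the failed rollout gives all three steps the same signal. With localized feedback, only the step that went wrong is singled out, which is the property that matters more as rollouts grow to hundreds of thousands of tokens.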

Cursor also leans on a 'feature deletion' technique where the agent is forced to reimplement removed code against an existing test suite — a synthetic curriculum that pushes the model toward sustained, verifiable work rather than one-shot completions [1][5]. The whole stack runs through a Sharded Muon optimizer that hits a 0.2-second optimizer step time on a 1T-parameter model, which is what makes throwing 25x more synthetic curriculum at the model economically tractable in the first place [1][6].
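
A minimal sketch of how a feature-deletion task could be constructed and scored, assuming a single-file module and expression-level tests. The helper names `delete_function` and `pass_rate`, the sample `SOURCE`, and the `TESTS` list are hypothetical; the source does not describe Cursor's actual pipeline at this level of detail.

```python
import ast

SOURCE = '''
def add(a, b):
    return a + b

def clamp(x, lo, hi):
    return max(lo, min(x, hi))
'''

# Existing test suite, expressed as (expression, expected value) pairs.
TESTS = [
    ("clamp(5, 0, 10)", 5),
    ("clamp(-3, 0, 10)", 0),
    ("clamp(42, 0, 10)", 10),
]


def delete_function(source: str, name: str) -> str:
    # Remove one top-level function so the agent has to reimplement it.
    tree = ast.parse(source)
    tree.body = [
        node for node in tree.body
        if not (isinstance(node, ast.FunctionDef) and node.name == name)
    ]
    return ast.unparse(tree)  # requires Python 3.9+


def pass_rate(source: str, tests) -> float:
    # Fraction of tests that pass; a missing function counts as a failure.
    namespace = {}
    exec(source, namespace)
    passed = 0
    for expr, expected in tests:
        try:
            if eval(expr, namespace) == expected:
                passed += 1
        except Exception:
            pass
    return passed / len(tests)


stripped = delete_function(SOURCE, "clamp")
# Stand-in for the agent's reimplementation of the removed feature.
candidate = stripped + (
    "\n\ndef clamp(x, lo, hi):\n"
    "    return lo if x < lo else hi if x > hi else x\n"
)

print(pass_rate(stripped, TESTS))   # 0.0
print(pass_rate(candidate, TESTS))  # 1.0
```

The pass rate gives a verifiable reward without a human grader: the task is solved only when the reimplemented code satisfies the tests that the original code already passed.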

The economics: a 10x price gap, with caveats

Composer 2.5's headline pricing is $0.50 per million input tokens and $2.50 per million output tokens, with a faster, more expensive variant at $3/$15 [1]. That lands roughly an order of magnitude below Opus 4.7's published rate card for the same tier of work [6][7].
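
The per-token arithmetic can be sketched as follows. The prices are the ones quoted above; the tier labels `standard` and `fast`, the function name `request_cost`, and the example token counts are illustrative assumptions.

```python
# USD per million tokens (input, output), as quoted in the text.
PRICES = {
    "standard": (0.50, 2.50),
    "fast": (3.00, 15.00),
}


def request_cost(input_tokens: int, output_tokens: int,
                 tier: str = "standard") -> float:
    # Cost in USD for one request at the given tier.
    in_price, out_price = PRICES[tier]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical request: 200k input tokens, 20k output tokens.
print(request_cost(200_000, 20_000))          # 0.15
print(request_cost(200_000, 20_000, "fast"))  # 0.9
```

For this example the fast variant costs six times the standard one, matching the 6x ratio between the two rate cards on both input and output.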

For Cursor, the strategic upside is COGS: every developer who flips the default from Claude to Composer 2.5 turns an external API bill into amortized post-training and inference cost on a model they own. The r/cursor community framed it as roughly a 10x cost gap alongside the double-usage promo, but the discussion underneath is less convinced — practitioners running long agentic sessions argued that when each turn re-processes tens of thousands of context tokens, the realized cost gap narrows considerably, blunting the per-token math.
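
One way the realized gap could narrow is sketched below. It rests on an assumption the thread does not spell out: that the cheaper model needs more turns to finish the same task, so it re-processes a growing context more often. The rival's price is modeled as a flat 10x of Composer 2.5's rate card rather than any published figure, and all token counts and turn counts are hypothetical.

```python
def session_cost(turns: int, in_price: float, out_price: float,
                 base_context: int = 20_000, growth_per_turn: int = 4_000,
                 output_per_turn: int = 1_000) -> float:
    # Each turn re-sends the whole context, which grows by a fixed amount.
    input_tokens = sum(base_context + t * growth_per_turn for t in range(turns))
    output_tokens = turns * output_per_turn
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


COMPOSER = (0.50, 2.50)  # USD per million tokens, from the text
RIVAL = (5.00, 25.00)    # assumed: exactly 10x Composer's rate card

# Same number of turns: the per-token ratio carries over unchanged.
same = session_cost(10, *RIVAL) / session_cost(10, *COMPOSER)

# Assumed: the cheaper model needs 30 turns where the rival needs 10.
uneven = session_cost(10, *RIVAL) / session_cost(30, *COMPOSER)

print(round(same, 2))    # 10.0
print(round(uneven, 2))  # 1.73
```

Because input tokens accumulate roughly quadratically with the number of turns, a model that takes three times as many turns pays far more than three times the input cost, which is how a 10x rate-card gap can shrink to under 2x in this toy scenario.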

Strategic decoupling from Anthropic

Cursor and Anthropic have always had an awkward relationship: Claude has been the default coding model inside Cursor while Anthropic ships Claude Code as a direct rival. Making Composer 2.5 the default in the model picker — with Composer 2 still selectable but Claude no longer privileged — is the clearest signal yet that Cursor wants to control its own destiny [4].

The internal proof point Cursor highlights is that 35% of merged pull requests at the company are now created by autonomous agents, an implicit pitch that the model is good enough to run their own dogfood loop [6]. Hacker News commenters reading the launch framed it the same way: Cursor is hedging against a single-vendor coding stack [8].

The skeptics' read: benchmaxxing and no moat

Reddit's reception was sharper than the official narrative. On r/vibecoding, the launch screenshot drew accusations of benchmaxxing — commenters argued that the chart's framing buried Opus 4.7 leading on most cells, and at least one developer reported that Composer 2.5 broke their codebase before they rolled back to Opus and got the task done in a single shot.

The harshest community read — that Cursor 'just has absolutely no moat' as a VS Code fork — is exactly the critique Composer 2.5 is meant to rebut by turning the IDE into a vertically integrated model-plus-harness product. Whether that argument lands depends on how Composer 2.5 performs in the long, messy sessions that benchmarks underweight, not on the screenshots [8].

The xAI tell: Composer 2.5 is the appetizer

Buried in the technical write-up is the most interesting forward signal: Cursor's next-generation model is being co-trained with SpaceXAI on the Colossus 2 cluster using roughly 10x more total compute — million-H100-equivalent scale [1][7]. Combined with Cursor's Sharded Muon optimizer hitting a 0.2-second optimizer step time on a 1T-parameter model, the operational picture is that Composer 2.5 is a proof point on a borrowed base, and the real bet is a much larger Cursor-native model still in training [1][6].

If that ships within Cursor's typical cadence, the Kimi attribution debate that dogged Composer 2 stops mattering, because the next model won't need the base. The transparency concession Aman Sanger made about under-disclosing Kimi on Composer 2 [5] reads less like an apology and more like tidying up an issue that is about to become irrelevant.

Historical Context

2026-03
Released Composer 2, the first Cursor in-house model built on the Kimi K2.5 base, priced at the same $0.50/$2.50 per M input/output token tier later reused for Composer 2.5.
2026-05-18
Released Composer 2.5 with matching base pricing, a one-week double-usage promotion, and benchmarks within touching distance of Opus 4.7 and GPT-5.5 on Cursor's published suite.

Power Map

Key Players

Cursor

Developer and deployer of Composer 2.5; positions it as the new default model inside the Cursor IDE, replacing Claude for many users by shipping a cheaper in-house alternative.

Moonshot AI

Provides the open-source Kimi K2.5 base checkpoint on which Composer 2.5 is fine-tuned; gains visibility but receives no direct revenue from Cursor's commercial deployment.

Anthropic (Claude)

Incumbent default coding model inside Cursor; loses share as Composer 2.5 ships at benchmarks within striking distance of Opus 4.7 at roughly one-tenth the per-token price.

OpenAI (GPT-5.5)

Frontier rival; Composer 2.5 matches or trails GPT-5.5 across the published benchmarks but undercuts it dramatically on cost.

SpaceXAI / xAI

Compute partner; co-training a Cursor successor model on the Colossus 2 cluster with roughly 10x more total compute (million-H100-equivalent scale).

Aman Sanger (Cursor co-founder)

Public spokesperson; conceded that Cursor under-disclosed the Kimi base in the original Composer 2 messaging, a concession that frames the transparency discussion around the 2.5 release.

Fact Check

8 cited
  [1] Composer 2.5
  [2] Composer 2.5 Changelog
  [3] Composer 2
  [4] Composer 2.5 is now live
  [5] Cursor Releases Composer 2.5, Saying It's Better at Sustained Coding Work
  [6] Cursor Composer 2.5 Benchmarks: How Does It Compare To Opus 4.7 And GPT-5.5
  [7] Cursor's Composer 2.5 matches Opus 4.7 and GPT-5.5 benchmarks at a fraction of the cost
  [8] Cursor Composer 2.5 (Hacker News discussion)

THE SIGNAL.

Analysts

Conceded that the original Composer 2 announcement should have disclosed the Kimi K2.5 base from the start, addressing transparency criticism going into the 2.5 release: "It was a miss to not mention the Kimi base in our blog from the start."

Aman Sanger
Co-founder, Cursor

"Frames Composer 2.5 as a test of whether superior post-training alone can meaningfully improve a coding agent without changing the base model or raising entry prices."

Winbuzzer analysis
Industry publication

The Crowd

"Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we're doubling the included usage of the model."

@cursor_ai

"Composer 2.5 just dropped and it's matching Claude Opus 4.7 across the board. Terminal-Bench: 69.3% vs 69.4%. SWE-Bench Multilingual: 79.8% vs 80.5%. Nearly identical scores from a Cursor proprietary model. Cursor built their own model that competes with Anthropic's."

@bridgemindai

"Composer 2.5 is a significant step up from Composer 2. This is the very start of our work with SpaceXAI. Hope to have more improvements out soon."

@mntruell

"Composer 2.5 has been released (2x usage for the next week)"

u/lrobinson2011

Broadcast
Cursor Kimi K2.5 Drama Explained

[This is Incredible] Cursor's New "Composer 2.5" Model: Has It Evolved to Opus-Level? Massive Per...

Cursor Composer 2.5: What's Hype, What's Real