Meituan LongCat-2.0: a 1.6-trillion-parameter model trained end-to-end on Chinese chips

Strategic Overview

01.
On June 30, 2026, Meituan open-sourced LongCat-2.0, a 1.6-trillion-parameter Mixture-of-Experts model purpose-built for agentic coding, releasing it under an MIT license on GitHub, Hugging Face, and its own platform. The model activates roughly 48 billion parameters per token on average (a dynamic range of 33B to 56B) and ships with a native 1-million-token context window.
02.
The headline claim is geopolitical, not just technical: Meituan says LongCat-2.0 ran end-to-end on a 50,000-card cluster of domestic Chinese AI ASICs, covering both pre-training and inference, with no Nvidia A100/H100 or AMD MI300X in the loop. If the claim holds under third-party scrutiny, LongCat-2.0 would be the first trillion-parameter model to complete a full training run without U.S. accelerators.
03.
Before the reveal, LongCat-2.0-Preview had been quietly leading OpenRouter's developer charts for roughly two months under the anonymous handle Owl Alpha, accumulating around 10 trillion tokens of monthly throughput while nobody knew it was Chinese. The June 30 open-source release also unmasked that identity.
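The parameter counts in the overview imply how sparse the model is per token. The short Python sketch below only does that arithmetic on the figures quoted above (1.6 trillion total, 33B to 56B active, about 48B on average). The function name is illustrative, and the source does not say whether the 1.6 trillion total includes the separate embedding layer, so the ratios are rough.

```python
def active_fraction(active_params: float, total_params: float) -> float:
    """Share of the model's parameters that are used for a single token."""
    return active_params / total_params


TOTAL_PARAMS = 1.6e12  # total parameter count quoted in the article

# Active-parameter figures quoted in the article (dynamic range and average).
QUOTED_ACTIVE = {"low": 33e9, "average": 48e9, "high": 56e9}

if __name__ == "__main__":
    for label, active in QUOTED_ACTIVE.items():
        frac = active_fraction(active, TOTAL_PARAMS)
        print(f"{label:>7}: {frac:.2%} active, {1 - frac:.2%} inactive")
```

On these figures the model activates about 3 percent of its parameters per token on average, and between roughly 2.1 and 3.5 percent across the stated range, which is consistent with the roughly 97 percent sparsity figure quoted later in the piece.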

The stealth run: how Owl Alpha topped OpenRouter before anyone knew it was Chinese

The most striking part of this launch is that the market had already voted before it knew who was on the ballot. LongCat-2.0-Preview spent roughly two months leading OpenRouter's developer charts as an anonymous model called Owl Alpha, pulling in around 10 trillion tokens of monthly throughput ^[4]. Developers were routing real coding workloads to it on merit, unaware it had been trained in China. The June 30 open-source release doubled as the unmasking: Meituan confirmed that Owl Alpha was LongCat-2.0-Preview all along ^[3]. That sequencing matters. A model that wins blind, on a neutral marketplace, is a harder data point to dismiss than a benchmark table published by its own maker - which is exactly why the reveal landed as more than a routine release.

HCCL in place of NCCL: the software plumbing that made non-Nvidia training work

The reason large-scale training is normally chained to Nvidia is not only the silicon - it is the software stack, especially NCCL, Nvidia's collective-communication library, which orchestrates how thousands of chips exchange gradients. To train across a 50,000-card domestic ASIC cluster without it, Meituan integrated Huawei's Collective Communication Library (HCCL) for chip-to-chip communication, adding exception handling and elastic card scaling so the run could survive hardware faults ^[6]. The claim that carries weight here is operational: the training run reportedly completed with no rollbacks and no catastrophic loss spikes, throughput exceeded 1 trillion tokens per day, and training MFU (model FLOPs utilization) improved 1.5x ^[2]. Meituan says this covered both full pre-training and inference on domestic hardware, which is the specific bar earlier Chinese efforts had not publicly cleared ^[1].
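To put the throughput claim in per-chip terms, here is a back-of-envelope Python sketch. It assumes the full 50,000-card cluster was used for training, treats 1 trillion tokens per day as a floor, and uses the common rule of thumb of about 6 FLOPs per active parameter per trained token (which ignores attention cost). The peak-FLOPs value is purely a placeholder, since the chip is unnamed in the source, so the printed MFU is illustrative only and is not a reported figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tokens_per_card_per_second(tokens_per_day: float, num_cards: int) -> float:
    """Average training tokens processed per accelerator per second."""
    return tokens_per_day / (SECONDS_PER_DAY * num_cards)


def achieved_flops_per_card(tokens_per_day: float, num_cards: int,
                            active_params: float) -> float:
    """Rough sustained training FLOP/s per card.

    Uses the ~6 FLOPs per active parameter per token rule of thumb
    (forward plus backward pass); attention FLOPs are ignored.
    """
    return 6.0 * active_params * tokens_per_card_per_second(tokens_per_day, num_cards)


def mfu(achieved_flops: float, peak_flops: float) -> float:
    """Model FLOPs utilization: achieved FLOP/s divided by hardware peak FLOP/s."""
    return achieved_flops / peak_flops


if __name__ == "__main__":
    rate = tokens_per_card_per_second(1e12, 50_000)
    flops = achieved_flops_per_card(1e12, 50_000, 48e9)
    # HYPOTHETICAL peak: the source does not name the chip or its peak FLOP/s.
    hypothetical_peak = 4e14
    print(f"{rate:.1f} tokens/s per card")
    print(f"{flops / 1e12:.1f} TFLOP/s per card (6N approximation)")
    print(f"illustrative MFU at placeholder peak: {mfu(flops, hypothetical_peak):.1%}")
```

Under these assumptions the quoted figures work out to roughly 231 tokens per second per card, or about 67 TFLOP/s of useful training compute per card. Turning that into an actual MFU requires the real peak throughput of the unnamed accelerator.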

Architecture: sparse attention plus an N-gram embedding layer to stretch a 1M-token window

LongCat-2.0's architecture is built around keeping a 1-million-token context window affordable. It uses LongCat Sparse Attention (LSA), which attacks the memory bottleneck of long context through streaming-aware, cross-layer, and hierarchical indexing rather than attending densely across the full sequence ^[5]. On top of the MoE experts - which average around 48 billion active parameters per token out of 1.6 trillion total - it adds a separate 135-billion-parameter N-gram Embedding layer (n-gram size 5) that expands the embedding space roughly 100x, improving parameter efficiency and local-context semantic capture while keeping inference sparse ^[5]. Reddit's more technical readers pegged the design as an evolution of DeepSeek Sparse Attention paired with 3-step multi-token prediction and around 97 percent MoE sparsity (about 3 percent of parameters active per token, consistent with 48 billion out of 1.6 trillion) - a lineage that fits the view of LongCat-2.0 as following the DeepSeek playbook.
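The source does not describe how the N-gram Embedding layer is wired. As a generic illustration of the idea only, and not Meituan's design, the Python sketch below assumes a hashed lookup: the n-gram ending at each position is hashed into a bucket of a large table, and that bucket's vector is added to the ordinary token embedding. Every name here (`ngram_bucket_ids`, `embed`, `HASH_MULTIPLIER`) and the hashing scheme itself are hypothetical.

```python
from typing import List

HASH_MULTIPLIER = 1_000_003  # arbitrary multiplier for the illustrative rolling hash


def ngram_bucket_ids(tokens: List[int], n: int, num_buckets: int) -> List[int]:
    """Hash the n-gram ending at each position into a bucket id.

    Early positions with fewer than n tokens of history use the shorter
    window that is available.
    """
    ids = []
    for i in range(len(tokens)):
        window = tokens[max(0, i - n + 1): i + 1]
        h = 0
        for t in window:
            h = (h * HASH_MULTIPLIER + t + 1) % num_buckets
        ids.append(h)
    return ids


def embed(tokens: List[int], token_table: List[List[float]],
          ngram_table: List[List[float]], n: int = 5) -> List[List[float]]:
    """Sum each token's embedding with the embedding of its hashed n-gram bucket."""
    buckets = ngram_bucket_ids(tokens, n, len(ngram_table))
    out = []
    for tok, bucket in zip(tokens, buckets):
        out.append([a + b for a, b in zip(token_table[tok], ngram_table[bucket])])
    return out


if __name__ == "__main__":
    tokens = [7, 1, 2, 3, 4, 5, 9, 1, 2, 3, 4, 5]
    ids = ngram_bucket_ids(tokens, n=5, num_buckets=64)
    # Positions 5 and 11 both end the 5-gram (1, 2, 3, 4, 5), so they share a bucket.
    print(ids[5] == ids[11])
```

The point of the sketch is that only one row of the large table is read per token, so the table can hold many parameters without adding much compute at inference time, which matches the article's description of a layer that grows the embedding space while keeping inference sparse.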

The contrarian read: DUV 'good enough' chips and a networking edge, not a 4nm breakthrough

The chip-sovereignty headline deserves a colder look, and the sharpest version of it came from the community rather than the press releases. The likely hardware is Huawei Ascend 910C-class superpods, and the harder-nosed argument is that these are DUV-era 'good enough' ASICs leaning heavily on Huawei's networking and interconnect edge - not domestic 4nm nodes, and perhaps still 10 to 15 years from EUV parity. Independent verification is also still pending: the chip maker was left unnamed, and benchmarking firms such as Artificial Analysis had not yet assessed LongCat-2.0 at release ^[2]. So the accurate framing is narrower than 'China caught up': a well-engineered software stack plus abundant older-node accelerators can, at least once, brute-force a trillion-parameter run - which is still enough to strain the export-control thesis ^[1].

Mixed benchmarks and a weights-not-yet-released caveat

On coding, LongCat-2.0 is genuinely competitive: it posts SWE-bench Pro 59.5 (topping Gemini 3.1 Pro and GPT-5.5), SWE-bench Multilingual 77.3, Terminal-Bench 2.1 70.8, and strong reasoning scores like GPQA-diamond 88.9 and IMO-AnswerBench 81.8 ^[2]. But it is not a clean sweep - it falls short of Claude Opus 4.7/4.8 on SWE-bench and trails on broad general-agent tests such as FORTE (73.2) and BrowseComp (79.9) ^[5]. Two caveats keep the win asterisked: some third-party benchmark numbers appear to simply mirror Meituan's own blog rather than independent runs, and at release the open weights had not actually been uploaded to Hugging Face yet, listed as 'coming soon' ^[2]. For now the honest label is near-frontier on coding, unverified on hardware, and API-first until the weights land.

Historical Context

2025-08

Meituan released LongCat-Flash (a 560B MoE) alongside LongCat-Flash-Thinking, marking its entry into the frontier AI race.

2026-04-24

LongCat-2.0-Preview entered public testing and surfaced on OpenRouter days later under the anonymous handle Owl Alpha.

2026-06-30

Meituan open-sourced the full LongCat-2.0 and revealed that it was the model behind Owl Alpha.

Power Map

Key Players

Meituan (LongCat team)

Developer and open-source publisher - a Chinese food-delivery giant with an AI team only about two years old, now shipping a near-frontier model under an MIT license and expanding from logistics into frontier AI.

Huawei

Implied hardware and software backbone - LongCat leans on Huawei's HCCL communication library and a reported Atlas-950/Ascend SuperPod ASIC cluster. Bernstein estimated Huawei held roughly 40 percent of China's AI-chip market in 2025.

OpenRouter

Distribution platform where LongCat-2.0-Preview, posing as Owl Alpha, ranked first globally and accumulated roughly 10 trillion or more tokens of monthly throughput before its identity was revealed.

Nvidia

Incumbent AI-chip supplier whose GPUs are export-restricted to China. LongCat-2.0 undercuts the premise that cutting off Nvidia chips bottlenecks Chinese frontier training; Nvidia's China share is projected to fall about 8 percent in 2026.

US frontier labs (Anthropic, OpenAI, Google)

Competitive reference points - LongCat-2.0 is benchmarked against Claude Opus, GPT-5.5, and Gemini 3.1 Pro, leading some coding tests while trailing on broader general-agent benchmarks.

THE SIGNAL.

Analysts

Amodei has argued that efficient Chinese model releases make U.S. export controls more important, not less, because efficiency gains tend to get reinvested into ever more expensive frontier training rather than changing the underlying economics. His verbatim line on the earlier DeepSeek episode: 'DeepSeek does not do for $6 million what cost US AI companies billions'.

Dario Amodei, CEO, Anthropic

General stance on Chinese model releases and export controls (not a LongCat-2.0-specific reaction).

The Crowd

"Introducing LongCat-2.0 🐱 1.6T parameters · MoE with ~48B active · 1M context The full model behind Owl Alpha on @OpenRouter — now available. Built for agentic coding from the ground up: ◆ LongCat Sparse Attention (LSA) — scales efficiently for 1M-context tokens"

@Meituan_LongCat

"Meituan's LongCat-2.0 reportedly lands near GPT-5.5 on SWE-bench. So I threw 5 HTML canvas animation prompts at both. 🥷 Paper sliced fruit-ninja style. 💧 An ink drop diffusing in water. 🔥 A letter burning. 🗑️ Paper crumpling into a ball. ✂️ A strip-cut shredder. Here's how"

@stevibe

"a chinese model, longcat-2.0, is here and it's beating gpt-5.5 at coding.. china's meituan (yes, the food delivery company) trained it without a single nvidia chip. meet longcat-2.0, from meituan, china's doordash, basically. their AI team is barely two years old and just"

@heyrobinai

"Introducing LongCat-2.0 - a large-scale MoE language model with 1.6 trillion total parameters and ~48 billion activated per token. This was the stealth model that was on Openrouter under the name 'owl-alpha'."

u/AnticitizenPrime

Broadcast

Meituan LongCat 2.0 (Tested): China's 1.6T OPEN MODEL looks CRAZY!

LongCat-2.0: China Breaks Free From Nvidia to Train a 1.6T Model

Meituan LongCat 2.0 is HERE (Real Tests and Review)