TECH

Zhipu AI releases GLM-5.2 open-weight model

Strategic Overview

01.
Zhipu AI (Z.ai) released GLM-5.2, a 753-billion-parameter Mixture-of-Experts model with a 1-million-token context window and full MIT-licensed open weights.
02.
GLM-5.2 reached GLM Coding Plan subscribers on June 13, 2026, with wider open weights and benchmarks following the week of June 16.
03.
It leads the Artificial Analysis Intelligence Index among open models and ranks second only to Claude Opus 4.8 across three multi-hour coding benchmarks.
04.
Zhipu's Hong Kong-listed parent, Knowledge Atlas Technology, saw its stock surge as much as 48% on the release.

Open weights are now within a few coding-benchmark points of the closed frontier

The headline isn't that another open model shipped; it's how little daylight is left at the top. On Terminal-Bench 2.1, GLM-5.2 scores 81.0, a large jump from GLM-5.1's 63.5 and within four points of Claude Opus 4.8's 85.0 ^[2]^[3]. On FrontierSWE it posts 74.4% against GPT-5.5's 72.6% and Opus 4.8's 75.1%, and on SWE-bench Pro it edges both closed rivals at 62.1 versus 58.6 and 58.4 ^[2]^[5]. DeepLearning.AI's The Batch summarizes the result cleanly: GLM-5.2 ranks second only to Claude Opus 4.8 across three multi-hour coding benchmarks, and substantially ahead of its own predecessor ^[3]. It also tops the Artificial Analysis Intelligence Index v4.1 among open models at 51, well above the runners-up, MiniMax-M3 and DeepSeek V4 Pro, which max out around 44 ^[1]^[2].

The caveat from The Decoder is that the gap is narrowest on coding marathons specifically; on broader reasoning GLM-5.2 still trails Opus 4.8 and Gemini 3.1 Pro ^[2]. Developer reception tracked this story closely: hands-on reviewers on YouTube ran their own coding gauntlets and arrived at numbers a few points below Opus rather than at parity, and community testers probed the 'beats GPT-5.5' claim directly rather than taking it on faith. The shape of the conversation was less 'is it the best' and more 'how close, and on which tasks', a question that would have been unthinkable for an open model a year ago.

The cost paradox: cheap per token, expensive per task

GLM-5.2's pricing is the loudest selling point and its quietest catch. API rates are $1.40 per million input tokens and $4.40 per million output tokens, and the entry Coding Plan Lite runs about $12.60 a month billed annually, roughly a tenth of Anthropic's premium tiers. VentureBeat frames the same comparison as beating GPT-5.5 for about one-sixth the cost ^[1]^[4]^[5]. But Simon Willison's measurement complicates the math: GLM-5.2 burns about 43k output tokens per Intelligence Index task, of which roughly 37k is reasoning, up from 26k for GLM-5.1 ^[1]. The Decoder reaches the same conclusion from another angle, calling it one of the least token-efficient models among the open competition ^[2].

The implication for builders is that a low per-token price does not guarantee a low per-task bill; a model that thinks more to reach the same answer can quietly erase its sticker advantage on long agentic workloads. Reddit's developer threads made cost the dominant conversation. The most-shared anecdotes paired GLM-5.2 with cheap FP8 hosting to run millions of tokens for a couple of dollars, alongside caveats that the FP8 precision tradeoff is part of what makes that math work.
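The per-task arithmetic can be made concrete with the figures quoted above. The following is a minimal sketch, not a billing model: it counts output tokens only, assumes GLM-5.1 is billed at the same output rate (as the Z.ai announcement quoted later in this piece suggests), and the 10k-token rival is purely hypothetical.

```python
# Figures quoted in the article.
OUTPUT_PRICE_PER_M = 4.40   # USD per million output tokens (GLM-5.2 API rate)
GLM_52_TOKENS = 43_000      # output tokens per Intelligence Index task
GLM_52_REASONING = 37_000   # portion of those tokens spent on reasoning
GLM_51_TOKENS = 26_000      # GLM-5.1 output tokens per task


def output_cost_per_task(tokens: int, price_per_million: float) -> float:
    """Output-side cost in USD of one task; input tokens are ignored."""
    return tokens * price_per_million / 1_000_000


def break_even_price(tokens: int, rival_tokens: int, price_per_million: float) -> float:
    """Per-million price at which a rival emitting rival_tokens per task
    costs exactly as much per task as this model does."""
    return price_per_million * tokens / rival_tokens


glm_52_cost = output_cost_per_task(GLM_52_TOKENS, OUTPUT_PRICE_PER_M)
# Assumption: GLM-5.1 is billed at the same output rate.
glm_51_cost = output_cost_per_task(GLM_51_TOKENS, OUTPUT_PRICE_PER_M)
reasoning_share = GLM_52_REASONING / GLM_52_TOKENS
# Hypothetical rival that needs only 10k output tokens per task.
rival_break_even = break_even_price(GLM_52_TOKENS, 10_000, OUTPUT_PRICE_PER_M)

print(f"GLM-5.2 output cost per task: ${glm_52_cost:.4f}")
print(f"GLM-5.1 output cost per task: ${glm_51_cost:.4f}")
print(f"Reasoning share of GLM-5.2 output: {reasoning_share:.0%}")
print(f"Rival break-even price: ${rival_break_even:.2f} per million output tokens")
```

Under these assumptions, a rival that finishes the same task in 10k output tokens could charge more than four times GLM-5.2's output rate and still match it per task, which is the sense in which token efficiency can offset a low sticker price.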

Timing and license made this a geopolitical event, not just a release

The release read as a market move as much as a model launch. According to SCMP and The Decoder, Anthropic reportedly suspended flagship model access citing a US export-control directive, and Zhipu open-sourced GLM-5.2 within 24 hours under an MIT license with no regional restrictions ^[2]^[4]. The combination of a permissive license, no geographic gating, and Coding Plan pricing around a tenth of Anthropic's premium tiers is explicitly aimed at developers shut out of or priced out of Western frontier models ^[4]^[5].

Markets responded immediately: Knowledge Atlas Technology stock spiked as much as 48% to HK$1,620 and closed up 32.8% at HK$1,457, now up roughly 820% since its January IPO, with JPMorgan lifting its price target to HK$1,400 from HK$950 ^[4]. The asterisk is governance, not capability: TechTimes notes that using the hosted Z.ai API carries China data-handling risk, which is precisely why the open weights matter for anyone who needs to self-host rather than route sensitive code through a hosted endpoint ^[6].

A 'win for local AI' that isn't actually local

The open-source community celebrated GLM-5.2 as a win for local AI, but the framing deserves a reality check. At 753B parameters and roughly 1.51TB of weights, the model is far too large for home hardware, and even the most enthusiastic community threads conceded that an ~800GB download puts true local inference out of reach for nearly everyone ^[1]. The genuine win is structural: the MIT license with no regional restrictions makes the weights freely distillable and deployable, and the community's excitement centered on distillation potential (fine-tuning smaller models on GLM-5.2's outputs) rather than on running the full model at home.
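The two size figures above are easier to read next to the raw arithmetic. This is a rough sketch under stated assumptions: the article does not say which numeric precision either figure refers to, so the byte widths below are standard storage sizes chosen for illustration, and overhead such as embeddings stored at higher precision or file metadata is ignored.

```python
PARAMS = 753e9  # total parameter count, from the article

# Assumed storage widths in bytes per parameter (not stated in the article).
BYTES_PER_PARAM = {
    "16-bit (BF16/FP16)": 2.0,
    "8-bit (FP8)": 1.0,
    "4-bit (Int4)": 0.5,
}


def weight_size_tb(params: float, bytes_per_param: float) -> float:
    """Raw weight storage in decimal terabytes (1 TB = 1e12 bytes)."""
    return params * bytes_per_param / 1e12


sizes = {name: weight_size_tb(PARAMS, width) for name, width in BYTES_PER_PARAM.items()}

for name, size in sizes.items():
    print(f"{name}: {size:.2f} TB")
```

At 16 bits per parameter the weights come to about 1.51 TB, which matches the figure quoted above; at 8 bits they come to about 0.75 TB, in the neighborhood of the ~800GB download. That is one plausible reading of the two numbers, not something the sources confirm.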

Architecturally, GLM-5.2 leans on a Mixture-of-Experts design that activates only about 40B of its 753B parameters per token, keeping inference far cheaper than the raw size suggests, plus an efficiency trick called IndexShare that reuses one indexer across every four sparse attention layers to cut per-token compute 2.9x at the 1M-token context length ^[1]^[3]. Two limitations recurred in community discussion: it is text-input only with no vision or multimodal support, a dealbreaker for screenshot-driven workflows, and its reliability at the full 1M-token window remains an open question that The Decoder flags as easy to claim and harder to keep reliable in practice ^[2].
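The indexer-sharing idea described above can be illustrated with a toy schedule. This is a minimal sketch of the description, not Zhipu's implementation: `toy_indexer`, `run_layers`, the score list, and the top-k selection rule are all hypothetical, and the sketch only counts indexer calls. It does not reproduce the reported 2.9x compute reduction.

```python
SHARE_GROUP = 4            # layers served by one indexer, from the article
TOTAL_PARAMS = 753e9       # from the article
ACTIVE_PARAMS = 40e9       # approximate active parameters per token, from the article


def toy_indexer(scores, k):
    """Hypothetical indexer: positions of the k highest-scoring tokens, in order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def run_layers(num_layers, scores, k, share_group=SHARE_GROUP):
    """Return (selected positions per layer, number of indexer calls).

    The indexer runs on the first layer of each group; the remaining
    layers in the group reuse its selection instead of recomputing it.
    """
    selections = []
    indexer_calls = 0
    shared = None
    for layer in range(num_layers):
        if layer % share_group == 0:
            shared = toy_indexer(scores, k)
            indexer_calls += 1
        selections.append(shared)
    return selections, indexer_calls


# Toy relevance scores for eight token positions. A real model would derive
# these from each layer's hidden states; they are fixed here for clarity.
scores = [0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.05, 0.6]

selections, shared_calls = run_layers(8, scores, k=3)
_, unshared_calls = run_layers(8, scores, k=3, share_group=1)
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"Indexer calls with sharing: {shared_calls}, without: {unshared_calls}")
print(f"Active parameter fraction per token: {active_fraction:.1%}")
```

With eight layers and a group size of four, the indexer runs twice instead of eight times. The trade-off, in this simplified picture, is that three of every four layers attend over a selection computed for an earlier layer.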

Historical Context

2024-06-05

Released GLM-4, the foundation release of the GLM-4 family.

2025-09

GLM-4.6 shipped with the first FP8/Int4 quantization on domestic Cambricon chips.

2025-12

Released GLM-4.6V and GLM-4.7, adding a vision tier and an agentic coding tier.

2026-01-08

IPO'd on the Hong Kong Stock Exchange as Knowledge Atlas Technology, the first publicly listed Chinese AI lab.

2026-02-11

Launched GLM-5 (744B MoE, 40B active) with a 30% price increase; stock jumped roughly 30-34%.

2026-04-08

Released GLM-5.1 as an open-weight model with a 200K-token context window.

2026-06-13

GLM-5.2 reached Coding Plan subscribers with a 1M-token context; MIT open weights followed the week of June 16.

Power Map

Key Players

Zhipu AI / Z.ai (Knowledge Atlas Technology)

Developer of GLM-5.2; HKEX-listed Chinese AI lab whose stock surged on the release.

Anthropic

Closed-source benchmark leader (Claude Opus 4.8); reportedly suspended flagship model access citing a US export-control directive, opening a gap Zhipu filled within 24 hours.

OpenAI (GPT-5.5)

Competitor; GLM-5.2 reportedly beats GPT-5.5 on multiple long-horizon coding benchmarks at roughly one-sixth the cost.

JPMorgan

Raised its price target on Knowledge Atlas Technology to HK$1,400 from HK$950 following the release.

THE SIGNAL.

Analysts

"Calls GLM-5.2 probably the most powerful text-only open-weight LLM and the new leader on the Artificial Analysis Intelligence Index, but flags that it consumes far more output tokens per task than peers, at roughly 43k tokens versus GLM-5.1's 26k."

Simon Willison

Independent AI developer and analyst, simonwillison.net

"Describes GLM-5.2 as the strongest open-source model, closing most of the gap to closed-source leaders on coding marathons, while noting it trails Claude Opus 4.8 and Gemini 3.1 Pro on reasoning and is among the least token-efficient models."

The Decoder

AI publication

"Reports GLM-5.2 ranks second only to Claude Opus 4.8 across three multi-hour coding benchmarks and lands substantially ahead of GLM-5.1."

DeepLearning.AI (The Batch)

AI newsletter led by Andrew Ng

The Crowd

"Introducing GLM-5.2: Frontier Intelligence, Open Weights - Significant improvements in coding and agentic tasks - Strong long-horizon capabilities with a 1M context window - Two levels of reasoning effort: GLM-5.2 (max) pushes the limits, while GLM-5.2 (high) strikes a strong https://t.co/SjGPSVhePJ"

@Zai_org

"GLM-5.2 (Max) by @Zai_org ranks #10 on the new Agent Arena leaderboard, closely matching Claude-Opus-4.8 (non-thinking) and is the #1 open model by a wide margin! In Agent Arena, we measure models on millions of real-world, long-horizon agentic tasks from a global community of https://t.co/J1EDyJSc6A"

@arena

"ZAI: GLM-5.2 is now available on huggingface! > It comes with a 1M context window and 2 levels of reasoning effort, max and high. MIT license and same pricing as GLM-5.1. > GLM-5.2 scores 46.2% on DeepSWE, the SOTA score among open-weight models. https://t.co/mPB0KqhF87"

@testingcatalog

"GLM-5.2 is a win for local AI"

@u/Wrong_Mushroom_7350989

Broadcast

GLM-5.2 (Fully Tested): I got EARLY ACCESS & This MODEL is CRAZY!

GLM 5.2 - The Top NEW Open Weights Model

Testing GLM 5.2 on Easy, Medium, and Hard Coding Tasks