DeepSeek V4 model release
Strategic Overview

  • 01.
    DeepSeek released a preview of its V4 series on April 24, 2026, including DeepSeek-V4-Pro (1.6T total / 49B active params) and DeepSeek-V4-Flash (284B total / 13B active params), both supporting a 1-million-token context window.
  • 02.
    V4 introduces a novel hybrid attention mechanism combining token-wise compression and DeepSeek Sparse Attention (DSA) to enable efficient 1M-token context processing.
  • 03.
    V4-Pro is priced at $1.74 per million input tokens and $3.48 per million output tokens; V4-Flash comes in at $0.14/$0.28 per million tokens.
  • 04.
    V4 is DeepSeek's first flagship model optimized for domestic Chinese chips, particularly Huawei Ascend, with model weights available on Hugging Face under the MIT License.

How V4 collapsed the price floor: a compute-efficiency story, not a margin story

Single-token inference FLOPs and KV cache footprint at 1M-token context, indexed to V3.2 = 100 (MIT Tech Review)

The headline number — V4-Pro at roughly one-sixth the per-token cost of Claude Opus 4.7 and one-seventh the cost of GPT-5.5 — is easy to read as a market-share land grab funded by patient Chinese capital. The architecture tells a more interesting story. DeepSeek paired a token-wise compression layer with its DeepSeek Sparse Attention (DSA) mechanism so that a 1.6-trillion-parameter MoE activates only 49 billion parameters per token, and at the 1-million-token context setting V4-Pro consumes just 27% of the single-token inference FLOPs and 10% of the KV cache footprint that V3.2 needed for the same workload. That is the actual reason the price sheet looks the way it does. DeepSeek is not selling intelligence at a loss — it has rebuilt the cost basis underneath it.
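To see why the 10% KV-cache figure is the one that matters at 1M tokens, here is a generic back-of-the-envelope estimate. The layer counts and head dimensions below are hypothetical round numbers for illustration, not DeepSeek's actual architecture, which has not been published at this level of detail:

```python
# Generic KV-cache sizing: 2 tensors (K and V) per layer, per token.
# All dimensions here are made-up illustrative values, NOT V4's real config.
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   tokens: int, bytes_per_elem: int = 2) -> int:
    """Total KV-cache footprint in bytes for a dense-attention model."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem * tokens

# Hypothetical dense baseline at the full 1M-token context, fp16:
dense = kv_cache_bytes(layers=60, kv_heads=8, head_dim=128, tokens=1_000_000)
sparse = dense * 0.10  # the 10%-of-V3.2 figure reported for V4-Pro

print(f"dense baseline: {dense / 2**30:.0f} GiB")   # ~229 GiB with these dims
print(f"at 10% footprint: {sparse / 2**30:.0f} GiB")  # ~23 GiB
```

Even with modest made-up dimensions, a dense 1M-token cache would not fit on a single accelerator; a 10x reduction is the difference between multi-node serving and a single device, which is where the pricing headroom comes from.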

For builders, this matters more than the benchmark scores. A 1M-token context is largely theatrical at GPT-5.5 prices because nobody can afford to fill it; at $1.74 per million input tokens the long-context regime becomes the default rather than the special case. DeepSeek has also signaled further price cuts as Huawei's Ascend 950 supernode supply ramps in the second half of 2026, which means the V4 sticker is a ceiling rather than a floor. Frontier closed-source labs now have to decide whether to defend price or defend gross margin — and the open-weights MIT license under V4 means even the choice to defend price will not stop self-hosters from undercutting them in private deployments.
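To make "the long-context regime becomes the default" concrete, a quick cost comparison. Only the DeepSeek prices come from the release; the Claude and GPT figures are inferred from the "one-sixth" and "one-seventh" ratios quoted above, not published rate cards:

```python
# Cost of filling the full 1M-token context once, per model.
# V4 prices are from the release; competitor prices are ASSUMPTIONS
# back-derived from the one-sixth / one-seventh ratios in the text.
V4_PRO_INPUT = 1.74    # USD per million input tokens
V4_FLASH_INPUT = 0.14

def cost_to_fill_context(price_per_m_tokens: float,
                         context_tokens: int = 1_000_000) -> float:
    """USD cost to submit `context_tokens` of input at a given rate."""
    return price_per_m_tokens * context_tokens / 1_000_000

v4_pro = cost_to_fill_context(V4_PRO_INPUT)      # $1.74 per full 1M prompt
v4_flash = cost_to_fill_context(V4_FLASH_INPUT)  # $0.14
opus_est = v4_pro * 6    # implied ~$10.44 if the one-sixth ratio holds
gpt55_est = v4_pro * 7   # implied ~$12.18 if the one-seventh ratio holds

print(f"V4-Pro ${v4_pro:.2f} | V4-Flash ${v4_flash:.2f} | "
      f"implied Opus ~${opus_est:.2f} | implied GPT-5.5 ~${gpt55_est:.2f}")
```

At roughly $1.74 per fully packed prompt, an agent can re-read an entire codebase on every turn; at an implied ~$12 per prompt, that same loop is a budget line item.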

The Huawei Ascend story: real, but smaller than the headline

The framing in much of the U.S. coverage is that V4 was trained on Huawei chips, full stop. The reality, surfaced by Tsinghua's Liu Zhiyuan, is more measured: 'DeepSeek appears to have adapted only part of V4's training process for Chinese chips.' Some training stages still leaned on Nvidia silicon; what changed is that Huawei Ascend 950 hardware is now a credible part of the pipeline, and Huawei has publicly committed to full supernode deployment support around V4. SMIC shares jumping 10% on the news suggests the market read the signaling — not the engineering — as the headline event.

That distinction matters because the geopolitics flow from it. If V4 had been trained end-to-end on Ascend, U.S. export controls would be effectively obsolete on day one. What actually happened is closer to a credible threat: DeepSeek demonstrated that a flagship-scale model can incorporate Chinese chips into a frontier training run without collapsing in quality, and that subsequent generations can plausibly migrate further. Jensen Huang's reaction — 'the day that [DeepSeek comes out on Huawei] first, that is a horrible outcome for [the U.S.]' — reads less like a comment on this release and more like a forecast about the next one.

The hallucination paradox: top of the open-weights leaderboard, bottom of the trust scoreboard

Independent benchmarking from Artificial Analysis places V4-Pro as the #2 open-weights reasoning model behind Kimi K2.6 and crowns it the top open-weight model on the GDPval-AA agentic benchmark with a score of 1554. Vals AI separately calls V4-Pro the '#1 open-weight model on our Vibe Code Benchmark, and it's not close.' By any normal reading, this is a state-of-the-art release. Then comes the asterisk: Artificial Analysis measured hallucination rates of 94% on V4-Pro and 96% on V4-Flash on probes designed to test whether a model knows what it does not know.

That is not a rounding error — it is a different kind of model than the benchmarks suggest. V4 appears to be optimized for performing well when it has the answer and confidently fabricating when it does not, which is exactly the failure mode that breaks agentic workflows in production. The practical read: V4 is excellent for tasks where outputs are immediately verifiable (code that compiles or fails, math that can be checked) and dangerous for tasks where the user assumes the model will say 'I don't know.' Anyone building retrieval pipelines on top of V4-Pro to exploit the cheap 1M context should pair it aggressively with grounding and citation checks rather than treating it as a closed-book oracle.
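One way to operationalize "pair it aggressively with grounding and citation checks" is a gate that rejects any answer whose claimed citations do not appear in the retrieved sources. The sketch below is entirely illustrative: the substring check stands in for a real entailment or citation-verification model, and none of these names come from any actual V4 tooling:

```python
# Minimal sketch of a grounding gate for a high-hallucination model.
# Illustrative only: a crude verbatim-substring test stands in for a
# real citation-verification or entailment step.
from dataclasses import dataclass

@dataclass
class GroundedAnswer:
    text: str
    citations: list[str]  # verbatim spans the model claims to quote

def is_grounded(answer: GroundedAnswer, retrieved_docs: list[str]) -> bool:
    """Accept only if every claimed citation appears verbatim in a source."""
    if not answer.citations:
        return False  # no evidence offered: treat as ungrounded, not as fine
    return all(any(c in doc for doc in retrieved_docs)
               for c in answer.citations)

docs = ["V4-Pro activates 49B of 1.6T parameters per token."]
good = GroundedAnswer("Only 49B params are active.",
                      ["activates 49B of 1.6T parameters"])
bad = GroundedAnswer("It activates 80B params.",
                     ["activates 80B parameters"])
print(is_grounded(good, docs))  # True
print(is_grounded(bad, docs))   # False
```

The important design choice is the empty-citations branch: with a model that hallucinates at 94% on unknown-unknowns probes, an answer that offers no evidence should fail closed rather than pass by default.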

Why this is not the R1 moment, even though it looks like one

When DeepSeek-R1 dropped in January 2025, Nvidia lost roughly $600 billion of market capitalization in a single trading session and the entire U.S. AI capex thesis was briefly in question. The V4 release is structurally similar — a Chinese open-weights model leapfrogging the public price/performance frontier — but the market reaction has been an order of magnitude smaller. Nvidia fell only 1.41% intraday, SMIC rose 10%, and Chinese rivals MiniMax and Knowledge Atlas dropped 9% or more. Investors are repricing the competitive landscape inside Chinese AI, not the global compute thesis.

The difference is that the surprise has already been priced. Everyone now expects that DeepSeek and its peers will keep shipping cost-efficient frontier models on constrained compute; the existence of such a model is no longer the news. What V4 adds is the Huawei chip narrative and the IP-theft escalation from the Trump administration, both of which are political stories rather than market-structure stories. Michael Kratsios's framing — 'There is nothing innovative about systematically extracting and copying the innovations of American industry' — and the parallel rollout of new IP-theft accusations suggest the U.S. response will increasingly happen through trade and policy channels rather than through Nvidia's stock price. The shock is migrating from Wall Street to Washington.

Historical Context

  • 2024-12: Released DeepSeek-V3, a 671B-parameter MoE base model that established the V-series architecture.
  • 2025-01-20: Launched DeepSeek-R1, the reasoning model that wiped roughly $600B off Nvidia's market cap and shocked global markets.
  • 2025-08: Released V3.1, a hybrid model combining V3 and R1 capabilities.
  • 2025-12-01: Released V3.2 (and the V3.2-Speciale reasoning variant), the immediate predecessor to V4.
  • 2026-04-24: Released the V4 preview (V4-Pro 1.6T and V4-Flash 284B) with a 1M context window, optimized for Huawei Ascend chips.

Power Map

Key Players

  • DeepSeek: Chinese AI lab; released the V4 preview, controlling pricing and the open-source release strategy; reportedly seeking funding at a $20B valuation from Tencent and Alibaba.
  • Huawei: Provided the Ascend 950 chips used in parts of V4 training and pledged full Ascend supernode deployment support, accelerating China's chip self-sufficiency narrative.
  • Nvidia: Faces a credibility hit as DeepSeek shifts away from its GPUs; the stock fell 1.41% on V4 release day amid stalled H200 sales to China.
  • OpenAI / Anthropic / Google: Frontier closed-source competitors (GPT-5.5, Claude Opus 4.7, Gemini 3.1-Pro) whose pricing V4-Pro dramatically undercuts.
  • SMIC: Chinese chip foundry; shares jumped 10% on the V4/Huawei integration news.
  • U.S. government (Trump administration): Escalating IP-theft accusations against DeepSeek and Chinese AI firms in parallel with the V4 launch.

Source Articles

Analysts

"Tempers expectations on chip independence — only part of V4 training was actually adapted to Chinese chips. 'DeepSeek appears to have adapted only part of V4's training process for Chinese chips.'"

Liu Zhiyuan
Computer Science Professor, Tsinghua University

"Acknowledges that compute-constrained Chinese researchers innovate efficient algorithms, and views a Huawei-first DeepSeek as a bad outcome for U.S. competitiveness: 'the day that [DeepSeek comes out on Huawei] first, that is a horrible outcome for [the U.S.].'"

Jensen Huang
CEO, Nvidia

"Frames DeepSeek's progress as IP appropriation rather than independent innovation: 'There is nothing innovative about systematically extracting and copying the innovations of American industry.'"

Michael Kratsios
Science Advisor to the Trump Administration

"V4-Pro is the largest open-weights model and 'the cheapest of the larger frontier models', though it slightly trails GPT-5.4 and Gemini-3.1-Pro on raw intelligence benchmarks."

Simon Willison
Independent AI researcher and writer

"V4-Pro is the #2 open-weights reasoning model behind Kimi K2.6, but flags a critical reliability caveat: 'V4 Pro and V4 Flash both have a very high hallucination rate of 94% and 96% respectively.'"

Artificial Analysis
AI benchmarking firm

"V4 leads the open-weight pack in coding-style 'vibe' benchmarks by a clear margin: '#1 open-weight model on our Vibe Code Benchmark, and it's not close.'"

Vals AI
AI evaluation firm
The Crowd

"🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length. 🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models. 🔹 DeepSeek-V4-Flash: 284B total / 13B active params."

@deepseek_ai

"DeepSeek V4 Pro is the #1 open weights model on GDPval-AA, our agentic real-world work tasks evaluation. @deepseek_ai has released V4 Pro (1.6T total / 49B active) and V4 Flash (284B total / 13B active). V4 is DeepSeek's first new size since V3, with all intermediate models…"

@ArtificialAnlys

"🎉 Day-0 support for @deepseek_ai V4 Pro and Flash on vLLM — a new generation of DeepSeek model, purpose-built for tasks up to 1M tokens. Alongside the release, we're publishing a first-principles walkthrough of the new long-context attention and how we implemented it in vLLM."

@vllm_project
Broadcast
My Honest Thoughts about Deepseek

GPT 5.5 Arrives, DeepSeek V4 Drops, and the Compute War Intensifies

Deepseek V4, GPT-5.5, Kimi K2.6, MiMo Pro, video game agents, 4K editing: AI NEWS