DeepSeek makes 75% price cut on V4-Pro permanent
TECH

DeepSeek makes 75% price cut on V4-Pro permanent

28+
Signals

Strategic Overview

  • 01.
    On May 23, 2026, DeepSeek converted what had been a promotional 75% discount on its flagship V4-Pro model into the standing price, locking developer API costs at roughly a quarter of launch levels.
  • 02.
    V4-Pro now ranges from $0.003625 per million tokens for cached input to $0.87 per million output tokens, down from $0.0145-$3.48; the cut also drops input cache-hit pricing across the entire API line to roughly one-tenth of launch levels.
  • 03.
    For context, OpenAI charges roughly $30 and Anthropic roughly $25 per million output tokens — DeepSeek is now undercutting both by about 30x.
  • 04.
    The mechanism is structural, not promotional: V4 was rewritten to run natively on Huawei Ascend 950PR chips via the CANN framework, and Huawei plans to ship roughly 750,000 Ascend 950 units in 2026.

The $30-to-$0.87 Gap Isn't a Discount, It's a Different Cost Curve

The temptation is to read V4-Pro's $0.87 per million output tokens as DeepSeek bleeding margin to grab share. The numbers say otherwise. OpenAI charges roughly $30 per million output tokens and Anthropic roughly $25 [1]— DeepSeek is now about 30x cheaper than either. A discount that deep cannot be sustained on goodwill; it requires a structurally different cost stack.

That stack is Huawei Ascend 950PR, which DeepSeek rewrote V4 to target natively via the CANN framework [1]. Western labs pay Nvidia-tier margins on H100/H200 capacity; DeepSeek pays domestic-chip economics, and Huawei plans to ship roughly 750,000 Ascend 950 units in 2026, split 70% PR / 30% DT [5]. The implication is uncomfortable: even if OpenAI or Anthropic matched the headline price, they would burn cash doing so, because the price reflects a hardware supply chain Western providers cannot access. The 'asymmetric pricing war' framing in industry coverage [4]is not hyperbole — it is accounting.

Cache Hits Are Almost Free — That's Where the Real Story Hides

Most coverage anchors on the 75% headline. The quieter line in the announcement matters more for working developers: DeepSeek cut input cache-hit pricing across the entire API line to roughly one-tenth of launch levels [3]. V4-Pro cache-hit input is now $0.003625 per million tokens [2]. For any application that re-sends the same system prompt, tools schema, or document context across turns — that is, almost every agent and coding tool in production — this line item collapses to effectively zero.

Combined with V4-Pro's 1.6T-parameter MoE backbone and 1M-token context window [1], the cache tier turns previously cost-prohibitive patterns into default behavior: full-codebase context for coding agents, multi-document RAG without aggressive trimming, long-running multi-turn dialogues. The cache tier is not 75% off, it is roughly 90% off, and it rewards prompt-engineering discipline (stable prefix, variable suffix) more than any other model on the market right now.

Why Permanence Is the Real News, Not the Percentage

The 75% cut already existed before May 23 — as a promo originally set to expire May 31, 2026 [2][4]. What changed is that DeepSeek converted it into the standing price. Three downstream effects follow.

First, application developers can now build business models on V4-Pro pricing without a sunset-clause risk premium. Second, competing labs — both Chinese and Western — face the new floor as a permanent benchmark rather than a temporary land-grab they can wait out. Third, and most strategically, it telegraphs DeepSeek's confidence in the Huawei supply ramp. The dev.to analysis puts it bluntly: 'DeepSeek is not cutting prices because it has spare compute. It's cutting prices to lock in routing commitments before the hardware arrives' [5]. With demand already at roughly 140 trillion tokens per day as of March 2026 — a 1,000x jump in two years, growing about 40% per quarter — the company is pre-selling capacity to chips Huawei has not yet shipped [5]. That is an AWS-style move: pricing for the scale you plan to reach, not the scale you have.

The China Commoditization Playbook, Now Pointed at Inference

Solar panels. Electric vehicles. Batteries. Drones. Consumer electronics. In each case China followed the same arc: aggressive cost leadership, domestic supply-chain vertical integration, foreign competitors either retreat or restructure around the new floor. Multiple analysts now place DeepSeek squarely in that lineage [7][8].

The cost-engineering culture is itself a product of export controls — limited access to top-end Nvidia hardware forced Chinese labs to obsess over MoE efficiency and inference optimization, which structurally lowered the per-token cost they must charge [1][9]. Everest Group's Oishi Mazumder summarizes the moat erosion: 'DeepSeek's 685B open-source model accelerates commoditization of raw AI capability, eroding the closed-model moat of US players' [6]. The short-term collateral damage is visible — shares in MiniMax and Knowledge Atlas dropped more than 9% after V4 pricing moves [10], suggesting the cut squeezes Chinese rivals harder than cash-rich Western incumbents. The long-term effect is geopolitical: pricing power built on Huawei chips ratifies the side effects of U.S. export controls, accelerating Chinese self-sufficiency in the AI stack rather than slowing China's frontier progress.

What Developers Are Actually Doing With It

The community reaction skews bullish on the developer side but is sharper than 'cheap model good.' On X, the DeepSeek announcement post drew substantial engagement, and aggregator accounts immediately broke out the cache-hit pricing tier as the underrated headline. On developer YouTube, walkthroughs frame the cut as the death of the $200-per-month AI-coding-subscription ceiling, with sub-$1 entry-tier framings dominating titles and thumbnails.

Reddit's r/DeepSeek is where the real nuance lives. The top discussion thread — titled 'If DeepSeek V4 can do the same coding task for $5, why are people still paying $100 for Claude Code?' — does not arrive at full replacement. A widely upvoted community benchmark places V4-Pro on its Max-reasoning setting at roughly 80% of Claude Opus 4.7 quality on coding tasks, so the dominant pattern is hybridization: V4-Pro for the bulk of completions, Opus for the hard 20%. Contrarian threads flag real frictions repeatedly raised by users: token bloat erodes the per-token advantage on verbose reasoning chains; latency and throughput can lag Western frontier models; data-sovereignty concerns persist for enterprise (API inputs/outputs are stored in China unless explicit opt-out); and harness ecosystem stickiness — Claude Code is the model plus the skills, MCPs, and tooling around it — keeps many subscribers from defecting wholesale. The structural news is the price; the adoption pattern is hybridization, not replacement.

Historical Context

2025-01
The original 'DeepSeek shock' — an open-source reasoning model triggered a global rerating of AI training costs and Chinese AI capability.
2026-04-24
Unveiled V4 family (Pro + Flash) with rock-bottom prices and full Huawei Ascend integration, pricing V4 roughly 97% below OpenAI's GPT-5.5.
2026-04
Ascend 950PR entered mass production with plans to ship roughly 750,000 units through the year, supplying DeepSeek and ByteDance.
2026-05-23
Made the 75% V4-Pro discount permanent, converting a promo set to expire May 31, 2026 into the standing price.
2026-05
DeepSeek's pricing has triggered a global price war, prompting OpenAI, Google, and Anthropic to cut API costs.

Power Map

Key Players
Subject

DeepSeek makes 75% price cut on V4-Pro permanent

DE

DeepSeek

Chinese AI lab making the permanent cut; positioning API cost as a durable competitive moat while raising a ~70B yuan (~$10B) round at a reported $20B valuation.

HU

Huawei

Chip supplier whose Ascend 950PR processors run V4-Pro natively via CANN; mass production of ~750,000 Ascend 950 units in 2026 underwrites further cost reductions.

OP

OpenAI, Anthropic, Google

Western frontier model providers facing structural pricing pressure; OpenAI's ~$30 and Anthropic's ~$25 per-million-output-token rates now sit roughly 30x above V4-Pro.

NV

Nvidia

U.S. chipmaker affected by export controls; restricted access to top-end GPUs in China has enabled Huawei's chip-share gains that underpin DeepSeek's cost structure.

MI

MiniMax & Knowledge Atlas

Chinese competitor labs whose shares dropped more than 9% on V4 pricing news — collateral damage from DeepSeek's domestic price war.

BY

ByteDance

Reportedly placed a ~40B yuan Ascend chip order — anchor demand validating the domestic chip stack DeepSeek now runs on.

Fact Check

11 cited
  1. [1] DeepSeek V4 AI model: price, performance, China, open source
  2. [2] DeepSeek permanently reduces the price of its flagship V4 model by 75 percent
  3. [3] DeepSeek V4 Pro price cut 75 percent — what it means
  4. [4] DeepSeek Makes 75% V4-Pro API Cut Permanent
  5. [5] 75,000 Chips, 140 Trillion Tokens: The Math Behind DeepSeek's Permanent Price Cut
  6. [6] DeepSeek AI Agent Ambitions: Strategic Threat and Catalyst for Innovation in the AI Ecosystem
  7. [7] EV #571
  8. [8] How DeepSeek Made China Investible
  9. [9] DeepSeek Inference Cost Explained
  10. [10] DeepSeek V4 Price Cut + Huawei Chip + AI
  11. [11] DeepSeek API — Pricing

Source Articles

Top 3

THE SIGNAL.

Analysts

"Argues DeepSeek's open-source push is accelerating commoditization of raw AI capability and eroding the closed-model moat of U.S. providers: 'DeepSeek's 685B open-source model accelerates commoditization of raw AI capability, eroding the closed-model moat of US players.'"

Oishi Mazumder
Senior Analyst, Everest Group

"Frames the cut as a routing-commitment land-grab modeled on AWS's early underpricing — 'DeepSeek is not cutting prices because it has spare compute. It's cutting prices to lock in routing commitments before the hardware arrives.'"

Lanternproton (DEV Community technical analysis)
Industry commentator, DEV Community
The Crowd

"We are making our discount permanent! 🎉 Enjoy building with DeepSeek-V4-Pro and bring your innovative ideas to life! 🚀 https://t.co/V8atbTaogH"

@@deepseek_ai21318

"DeepSeek permanently reduced pricing for DeepSeek V4 Pro by 75%! > $0.003625 per million input tokens (with cache) > $0.435 per million input tokens. > $0.87 per million output tokens. Cache is almost free 👀 https://t.co/6XrrA8wgaU"

@@testingcatalog1786

"The DeepSeek-V4-Pro discount has been extended until May 31, 2026, 15:59 UTC! https://t.co/kd0taLhZhH"

@@deepseek_ai6753

"deepseek v4 pro price will be reduced to the current price permanently accoding to official website"

@u/InsideElk6329650
Broadcast
UNLIMITED FREE Deepseek-V4 PRO AI Coder: THIS IS CRAZY!

UNLIMITED FREE Deepseek-V4 PRO AI Coder: THIS IS CRAZY!

DeepSeek v4 Pro Only For $1?!

DeepSeek v4 Pro Only For $1?!

Why DeepSeek V4 Has Everyone Freaking Out

Why DeepSeek V4 Has Everyone Freaking Out