Google Launches Gemma 4 Open-Weight AI Models Under Apache 2.0
TECH

Strategic Overview

  1. Google DeepMind released Gemma 4 on April 2, 2026, a family of open-weight AI models built on the same research and technology as Gemini 3, available in four sizes: E2B, E4B, 26B MoE, and 31B Dense.
  2. Gemma 4 is the first in the Gemma family released under the Apache 2.0 license, a significant shift from Google's previous proprietary licensing, enabling unrestricted commercial and research use.
  3. The models support multimodal inputs including images, video, and audio across 140+ languages, with context windows up to 256K tokens, native function calling, and structured JSON outputs for autonomous agents.
  4. The 26B MoE variant activates only 3.8B parameters during inference, while the 31B dense model ranked 3rd on the Arena AI text leaderboard with a score of 1452, demonstrating strong performance relative to much larger models.

Why This Matters

Gemma 4 represents a pivotal strategic shift for Google in the open-weight AI race. The move to Apache 2.0 licensing — abandoning the proprietary terms that governed every previous Gemma release — signals that Google recognizes the competitive threat posed by Chinese open-weight models like DeepSeek and Qwen, which have been rapidly attracting developer mindshare. By removing licensing friction, Google is betting that a permissive ecosystem will generate more value through adoption than restrictive terms ever could.

The timing is equally significant. With 400 million downloads and over 100,000 community-created variants already in the Gemma ecosystem, Google is capitalizing on existing momentum while addressing the single biggest complaint developers had: the licensing terms. As Nathan Lambert of Interconnects noted, Gemma 4's success will be "entirely determined by ease of use," and Apache 2.0 removes one of the most significant barriers to enterprise and startup adoption. This release effectively rebalances the U.S. versus Chinese open-weight competition, giving Western developers a top-tier permissively licensed alternative.

How It Works

Gemma 4 employs a two-tier architecture designed for different deployment scenarios. The workstation tier includes the 31B dense model and the 26B Mixture-of-Experts (MoE) model, both targeting developers and researchers with powerful hardware. The edge tier features the E2B and E4B models optimized for smartphones, IoT devices, and other resource-constrained environments, delivering up to 4x faster inference and 60% less battery consumption compared to previous versions, according to Google's Android AI Core developer preview announcement.

The 26B MoE model is particularly notable for its efficiency: while it contains 26 billion total parameters, it activates only 3.8 billion during any given inference pass. This sparse activation pattern allows it to achieve near-parity with the 31B dense model's quality while requiring dramatically less compute. All models support multimodal inputs spanning images, video, and audio across more than 140 languages, with context windows reaching up to 256K tokens. Native function calling and structured JSON output capabilities make the models immediately suitable for building autonomous AI agents — a key growth area for the industry.
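
For a concrete sense of how those agentic features are typically exercised, here is a minimal sketch using the Hugging Face transformers chat interface to request structured JSON from a Gemma 4 instruction-tuned checkpoint. The model ID, prompt, and output schema are illustrative assumptions rather than details from Google's documentation; the official model card is the authority on real checkpoint names and prompt formats.

    # Minimal sketch: structured JSON generation with Hugging Face transformers.
    # The checkpoint name is an assumption; check the official model card.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "google/gemma-4-26b-it"  # hypothetical ID, not confirmed
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    # Ask for a JSON-only reply so an agent loop can parse it directly.
    messages = [{
        "role": "user",
        "content": (
            'Reply only with JSON of the form {"city": string, "date": string} '
            "for: 'Book me a flight to Tokyo on May 3rd.'"
        ),
    }]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(inputs, max_new_tokens=64)
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))

Recent transformers releases also accept a tools argument on the chat template for native function calling; whether Gemma 4's template wires that through is something to verify against the model card.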

By The Numbers

Chart: Gemma 4 31B benchmark scores across six major AI evaluation suites

Gemma 4's benchmark results, as published on the official DeepMind model page, demonstrate a remarkable leap over its predecessor. On AIME 2026, the flagship models scored 89.2%, compared to roughly 20% for Gemma 3 27B — as @kimmonismus noted on X, the jump is "absolutely enormous" over just 12 months, with GPQA scores also roughly doubling. LiveCodeBench v6 performance reached 80.0%, MMMU Pro hit 76.9%, GPQA Diamond scored 84.3%, and MMLU Multilingual achieved 85.2%. The tau2-bench agentic evaluation scored 86.4%, underscoring the models' strength in tool-use scenarios.

On the competitive Arena AI text leaderboard, the 31B dense model ranked 3rd overall with a score of 1452, while the 26B MoE model placed 6th — both remarkable achievements for models of their size. Google's official announcement on X garnered over 19K likes, reflecting massive community interest. On the hardware front, @FrameworkPuter endorsed the 26B MoE as the best model for the 32GB Framework Desktop, highlighting its practical appeal for local deployment. The ecosystem metrics are equally striking: Gemma models have accumulated over 400 million downloads to date, with more than 100,000 community-created variants. The models are available across six major platforms: Hugging Face, Kaggle, Ollama, LM Studio, Google AI Studio, and Google Cloud.

Impacts and What's Next

The immediate impact of Gemma 4 will be felt across several fronts. Enterprise adoption should accelerate significantly thanks to Apache 2.0 licensing, which eliminates the legal review overhead that proprietary licenses impose. Companies that previously hesitated to build products on Gemma can now do so without restrictions on commercial use, modification, or redistribution. VentureBeat specifically highlighted that the license change "may matter" more than the technical improvements for driving real-world adoption.

However, early adopters have flagged practical challenges. Community reports on r/LocalLLaMA identified that the 31B model's KV cache at full 262K context requires approximately 22GB of VRAM — a significant constraint for local deployment. Day-one tooling gaps, including a tokenizer bug in llama.cpp that required an urgent fix, underscore Nathan Lambert's point about ease of use being the decisive factor. Hardware partners NVIDIA and AMD are working to smooth deployment with optimized runtimes across their product lines, from NVIDIA's RTX GPUs and Jetson Orin Nano to AMD's Instinct, Radeon, and Ryzen AI platforms.
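
The 22GB figure squares with simple back-of-the-envelope KV-cache arithmetic: keys and values for every transformer layer must be cached for all 262,144 positions in 16-bit precision. The sketch below reproduces the number under assumed architecture parameters; the layer count, grouped-query KV head count, and head dimension are illustrative guesses, not published Gemma 4 specs.

    # Rough KV-cache sizing; layer/head figures are illustrative assumptions,
    # not values from the Gemma 4 model card.
    def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                       seq_len: int, bytes_per_value: int = 2) -> int:
        # Factor of 2 covers keys plus values; bf16 stores 2 bytes per value.
        return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value

    size = kv_cache_bytes(layers=44, kv_heads=4, head_dim=128, seq_len=262_144)
    print(f"{size / 2**30:.1f} GiB")  # ~22.0 GiB at the full 256K context

Quantizing the cache to 8-bit or capping the context window shrinks that footprint roughly proportionally, which is why the complaint centers on full-context local use.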

The YouTube creator ecosystem has responded rapidly, with Google for Developers publishing an official "What's new in Gemma 4" overview (269K views), Sam Witteveen releasing a hands-on walkthrough with Colab demos (83K views), and Matthew Berman capturing the community's enthusiasm in his reaction video (69K views). This swift tutorial proliferation signals strong developer interest and should accelerate the learning curve for new adopters.

The Bigger Picture

Gemma 4 arrives at a moment when the open-weight AI landscape is being reshaped by geopolitical competition and shifting developer expectations. Google's licensing pivot was reportedly motivated in part by losing developer attention to DeepSeek and other Chinese models that offered comparable capabilities with fewer strings attached. The Seoul Economic Daily reported that Google "opens up Gemma 4 after losing developers to DeepSeek," framing the release as a direct competitive response.

This move reflects a broader industry trend toward more permissive open-weight licensing. Meta's Llama models and Google's previous Gemma releases both used restrictive custom licenses that Nathan Lambert characterized as "horrible" — and he expressed hope that these would prove to be merely an "~18-month transient dynamic." If Gemma 4's Apache 2.0 approach proves successful in driving adoption and ecosystem growth, it could pressure other major AI labs to follow suit. Combined with the two-tier deployment strategy spanning cloud workstations to edge devices, Google is positioning Gemma 4 not just as a model release but as the foundation for a ubiquitous AI development platform that extends from data centers to the Android devices in billions of pockets worldwide.

Historical Context

2024-02-21
Released the original Gemma models in 2B and 7B parameter sizes, marking Google's first entry into open-weight AI models.
2024-06-27
Released Gemma 2 in 9B and 27B parameter sizes, scaling up the open-weight model family.
2025-03-10
Released Gemma 3 in four sizes (1B, 4B, 12B, 27B) with 128K context windows and multimodal capabilities.
2026-04-02
Released Gemma 4 in four sizes (E2B, E4B, 26B MoE, 31B Dense) under Apache 2.0, the first Gemma release with a fully permissive open-source license.

Power Map

Key Players

Google DeepMind

Developer and releaser of Gemma 4; shifted from proprietary to Apache 2.0 licensing to compete with open-source rivals and grow its developer ecosystem of 400M+ downloads and 100K+ community variants.

NVIDIA

Key hardware partner providing day-one optimization for RTX GPUs, DGX Spark, and Jetson Orin Nano, enabling local and edge deployment of Gemma 4 models.

AMD

Hardware partner offering Day Zero support across its Instinct, Radeon, and Ryzen AI product lines, broadening Gemma 4's deployment footprint beyond NVIDIA hardware.

DeepSeek and Qwen

Chinese open-weight model competitors whose rising popularity among developers pressured Google into adopting the permissive Apache 2.0 license for Gemma 4.

Hugging Face, Kaggle, and Ollama

Primary distribution platforms making Gemma 4 accessible to developers and researchers, alongside Google AI Studio, LM Studio, and Google Cloud.

THE SIGNAL.

Analysts

""Google is building its lead in AI, not only by pushing Gemini, but also open models with the Gemma 4 family. These are important for building an ecosystem of AI developers, and will help the company to tap into functional and vertical use cases on different device form factors.""

Holger Mueller
Analyst, Constellation Research

""Gemma 4's success is going to be entirely determined by ease of use, to a point where a 5-10% swing on benchmarks wouldn't matter at all." Also noted: "I will personally be so happy if the horrible Llama licenses and Gemma terms of service were an ~18-month transient dynamic.""

Nathan Lambert
Author, Interconnects
The Crowd

"We just released Gemma — our most intelligent open models to date. Built from the same world-class research as Gemini 3, Gemma brings breakthrough intelligence directly to your own hardware for advanced reasoning and agentic workflows. Released under a commercially permissive Apache 2.0 license."

@Google

"A 12-month time difference between Gemma 3 27b and Gemma 4 31b. The jump is absolutely enormous. Just look at the evaluations between the two models. GPQA doubled, AIME 2026 went from ~20% to ~90%, and so on. Crazy."

@kimmonismus

"Google's new Gemma 4 is excellent, and the 26B MoE version is likely the best model to run on a 32GB Framework Desktop. It's fast, smart, and also great for tool calling if you use it with @openclaw or other local agent platforms."

@FrameworkPuter

"My biggest Issue with the Gemma-4 Models is the Massive KV Cache!!"

u/unknown
Broadcast
What's new in Gemma 4

Gemma 4 Has Landed!

Google just dropped Gemma 4... (WOAH)