Meta TRIBE v2 brain foundation model
TECH

Strategic Overview

  • 01.
    On March 26, 2026, Meta released TRIBE v2 (TRImodal Brain Encoder), an open-source foundation model that draws on 500+ hours of fMRI recordings from 700+ people to predict how the human brain responds to visual, auditory, and linguistic stimuli.
  • 02.
    The model achieves a 70-fold increase in spatial resolution over its predecessor and 2-3x improvement in zero-shot prediction accuracy for new subjects, enabling reliable brain response predictions without any retraining.
  • 03.
    TRIBE v2 combines three existing Meta AI models — V-JEPA 2 for video, Wav2Vec2-BERT for audio, and Llama 3.2 for text — and focuses on encoding (predicting brain reactions to stimuli) rather than decoding private thoughts.
  • 04.
    The model, code, weights, and an interactive demo were released under a Creative Commons BY-NC (non-commercial) license, enabling researchers worldwide to conduct virtual brain experiments without costly fMRI infrastructure.

Why This Matters

TRIBE v2 represents a significant milestone in computational neuroscience: the first open-source foundation model capable of predicting whole-brain fMRI responses across three sensory modalities simultaneously. Unlike previous approaches that required training separate models for each individual subject, TRIBE v2 generalizes to entirely new individuals in a zero-shot manner, eliminating the need for costly per-subject fMRI calibration sessions.

The practical implications are substantial. By providing a reliable digital proxy for brain responses, TRIBE v2 could allow researchers to run preliminary virtual experiments at near-zero marginal cost, reserving physical fMRI sessions only for validation of the most promising hypotheses. This democratizes neuroscience research, particularly for institutions in resource-constrained settings that lack access to expensive imaging infrastructure.

The open-source release under a CC BY-NC license signals Meta's strategy of building research credibility and ecosystem influence in neuroscience AI, an area with long-term implications for brain-computer interfaces, accessibility technology, and next-generation human-computer interaction paradigms.

How It Works

TRIBE v2's architecture is built on a trimodal encoding pipeline that processes video, audio, and text stimuli through three specialized foundation models before mapping their representations to predicted brain activity patterns. Video input passes through V-JEPA 2, Meta's self-supervised video encoder. Audio is processed by Wav2Vec2-BERT, a speech and sound representation model. Text stimuli are encoded using Llama 3.2, Meta's large language model.
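A rough sketch of how such a pipeline can be wired is shown below. The encoder wrappers, the `hidden_size` attribute, and the projection layers are illustrative assumptions for exposition, not the interfaces in Meta's released codebase:

```python
import torch
import torch.nn as nn

class TrimodalFeatureExtractor(nn.Module):
    """Illustrative sketch only: pull per-timestep features from three frozen
    pretrained encoders (video, audio, text) and project them to a shared width.
    The encoder objects are assumed to expose a `hidden_size` attribute and to
    return (time, hidden) feature tensors; the real TRIBE v2 code may differ."""

    def __init__(self, video_encoder, audio_encoder, text_encoder, dim=768):
        super().__init__()
        # Pretrained encoders stay frozen; only the light projection heads train.
        self.video_encoder = video_encoder.eval()
        self.audio_encoder = audio_encoder.eval()
        self.text_encoder = text_encoder.eval()
        self.proj_v = nn.Linear(video_encoder.hidden_size, dim)
        self.proj_a = nn.Linear(audio_encoder.hidden_size, dim)
        self.proj_t = nn.Linear(text_encoder.hidden_size, dim)

    def forward(self, frames, waveform, tokens):
        with torch.no_grad():
            v = self.video_encoder(frames)      # e.g. V-JEPA 2 features
            a = self.audio_encoder(waveform)    # e.g. Wav2Vec2-BERT features
            t = self.text_encoder(tokens)       # e.g. Llama 3.2 hidden states
        # Map each modality into a common feature space before fusion.
        return self.proj_v(v), self.proj_a(a), self.proj_t(t)
```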

The key innovation lies in how these three modality-specific representations are fused and mapped to whole-brain fMRI predictions. Rather than simply concatenating features, the model learns cross-modal alignment layers that capture how the brain integrates information across senses. This is particularly important for high-level associative cortices — brain regions that process abstract, multi-sensory concepts — where the multimodal approach significantly outperforms unimodal models.
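In code, that fusion step can be pictured as a small transformer running attention over the concatenated modality streams, conditioned on a subject embedding, with a linear readout to voxels. The layer sizes, subject handling, and pooling below are assumptions for illustration, not the published architecture:

```python
import torch
import torch.nn as nn

class CrossModalBrainEncoder(nn.Module):
    """Illustrative sketch, not the released TRIBE v2 architecture: fuse
    projected video/audio/text features with a small transformer and read
    out a whole-brain fMRI prediction."""

    def __init__(self, dim=768, n_voxels=8000, n_subjects=700, n_layers=4):
        super().__init__()
        # Modality embeddings tag each timestep with its source stream.
        self.modality_emb = nn.Embedding(3, dim)
        # Subject embeddings absorb stable individual differences so the
        # shared backbone carries what generalizes across people.
        self.subject_emb = nn.Embedding(n_subjects, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.readout = nn.Linear(dim, n_voxels)   # one output per voxel or parcel

    def forward(self, v, a, t, subject_id):
        # v, a, t: (batch, time, dim) projected features from the three encoders.
        streams = []
        for idx, feats in enumerate((v, a, t)):
            tag = self.modality_emb(
                torch.full(feats.shape[:2], idx, dtype=torch.long, device=feats.device))
            streams.append(feats + tag)
        x = torch.cat(streams, dim=1)                      # stack modalities along time
        x = x + self.subject_emb(subject_id)[:, None, :]   # condition on the subject
        x = self.fusion(x)                                 # cross-modal attention
        # Pool to a single fMRI time point here for simplicity; real pipelines
        # align features to the scanner's sampling rate (TR) instead.
        return self.readout(x.mean(dim=1))                 # (batch, n_voxels)
```

For zero-shot use on a new person, the subject embedding would have to be dropped or replaced with a shared default; that is one plausible reading of how cross-subject generalization is achieved, with the released paper and code as the authoritative reference.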

Critically, TRIBE v2 is an encoding model, not a decoding model. It predicts how the brain will respond to a given stimulus (input to brain activity), rather than reconstructing what a person is thinking from their brain scans (brain activity to output). This distinction is important both technically and ethically: the model cannot read thoughts or reconstruct private mental content.
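The contrast is easiest to see in the direction of the learned mapping. The toy linear functions below are purely illustrative; the actual models are nonlinear:

```python
import numpy as np

# Toy illustration of mapping direction only.

def encoding_model(stimulus_features: np.ndarray, W: np.ndarray) -> np.ndarray:
    """What TRIBE v2 does: stimulus features -> predicted brain activity."""
    return stimulus_features @ W        # (time, features) @ (features, voxels)

def decoding_model(brain_activity: np.ndarray, W: np.ndarray) -> np.ndarray:
    """What TRIBE v2 does not do: measured brain activity -> stimulus estimate."""
    return brain_activity @ W           # (time, voxels) @ (voxels, features)
```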

By The Numbers

TRIBE v2 was trained on fMRI data from over 700 subjects, a massive scale-up from the original TRIBE's 4-subject training set drawn from the Courtois NeuroMod dataset. The model processes 500+ hours of fMRI recordings spanning visual, auditory, and linguistic stimuli including movies, TV shows, and audiobooks.

The spatial resolution improvement is dramatic: TRIBE v2 achieves 70x finer spatial granularity than its predecessor, allowing it to predict brain activity at a much more detailed anatomical level. In zero-shot prediction accuracy — predicting brain responses for subjects the model has never seen — TRIBE v2 delivers a 2-3x improvement over previous state-of-the-art methods for both movie and audiobook stimuli.

According to a LessWrong community paper review of the original TRIBE architecture, the model had 1B parameters and achieved normalized Pearson correlations of 0.54 ± 0.1, explaining approximately 54% of the explainable variance in brain activity. The original TRIBE won first place at the Algonauts 2025 competition, beating 262 competing teams. TRIBE v2 builds on this foundation with substantially larger training data and architectural refinements that push prediction accuracy significantly higher across all three modalities.
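The reported metric is worth unpacking: the raw correlation between predicted and measured activity is divided by a noise ceiling, so a score of 0.54 means roughly 54% of the reliably measurable signal is captured. Below is a generic sketch of such a score, assuming two repeated recordings of the same stimulus are available; the exact normalization used in the TRIBE papers may differ:

```python
import numpy as np

def pearson(a: np.ndarray, b: np.ndarray) -> float:
    """Pearson correlation between two 1-D timecourses."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def normalized_encoding_score(pred: np.ndarray,
                              run1: np.ndarray,
                              run2: np.ndarray) -> float:
    """Generic noise-normalized encoding score for one voxel (illustrative).

    pred        : model-predicted timecourse for a held-out stimulus
    run1, run2  : two measured fMRI timecourses for the same stimulus,
                  used to estimate how reliable the measurement itself is.
    """
    observed = (run1 + run2) / 2.0
    raw_r = pearson(pred, observed)
    # Split-half reliability with Spearman-Brown correction as a noise ceiling:
    # the best correlation any model could reach against the averaged runs.
    r_half = pearson(run1, run2)
    ceiling = np.sqrt(max(2.0 * r_half / (1.0 + r_half + 1e-8), 1e-8))
    return raw_r / ceiling
```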

Impacts and What's Next

The immediate impact of TRIBE v2 falls into three categories: research acceleration, clinical potential, and commercial applications. For researchers, the ability to run virtual brain experiments using the open-source model could accelerate hypothesis iteration, allowing rapid testing of how the brain processes multi-sensory information before committing to expensive physical fMRI studies.

In clinical neuroscience, models like TRIBE v2 could eventually help identify atypical brain response patterns in conditions such as aphasia or sensory processing disorders — though significant clinical validation work remains before any diagnostic applications would be viable. The model's ability to generalize across subjects without retraining is particularly valuable here, as clinical populations are inherently diverse.

For Meta specifically, TRIBE v2 aligns with Reality Labs' long-term vision for brain-computer interfaces and AR/VR systems that adapt to user perception. The non-commercial license, however, limits direct commercial adoption by third parties, suggesting Meta views TRIBE v2 primarily as a research asset and ecosystem play rather than a commercial product.

The Bigger Picture

TRIBE v2 arrives at a moment when the intersection of AI and neuroscience is accelerating rapidly. Foundation models have transformed natural language processing, computer vision, and code generation over the past several years, and neuroscience is now emerging as the next frontier for this paradigm. The core insight — that large-scale pretraining on diverse data enables powerful generalization — applies to brain data just as it does to text and images.

Meta's open-source approach mirrors its strategy with LLaMA and other AI releases: build credibility, attract researcher adoption, and establish ecosystem influence while retaining commercial optionality. By releasing TRIBE v2 under CC BY-NC, Meta ensures broad academic adoption while preventing competitors from directly commercializing the technology.

Social media reception on X.com has been strong, with the official @AIatMeta announcement garnering approximately 9,600 likes and 2,200 retweets. YouTube and Reddit coverage has not yet emerged, likely due to the recency of the March 26, 2026 announcement. As the research community digests the release and begins experimenting with the open-source tools, broader public discussion and independent evaluations are expected to follow in the coming weeks.

Historical Context

2025-05
Meta FAIR's Brain & AI team won first place at the Algonauts 2025 brain modeling competition with the original TRIBE, a 1B parameter trimodal brain encoder, beating 262 competing teams.
2025-07
Original TRIBE research paper published, demonstrating whole-brain fMRI response prediction using trimodal encoding trained on 4 subjects from the Courtois NeuroMod dataset and achieving normalized Pearson correlations of 0.54, roughly 54% of the explainable variance in brain activity.
2026-03-26
Meta released TRIBE v2, scaling from 4 subjects to 700+ subjects and achieving 70x higher spatial resolution, with full open-source release of model weights, code, research paper, and interactive demo under CC BY-NC license.

Power Map

Key Players

Subject: Meta TRIBE v2 brain foundation model

Meta FAIR (Fundamental AI Research): Developer of TRIBE v2; Meta's Brain & AI research team built and released the model as open source under a CC BY-NC license

Meta Reality Labs: Meta's AR/VR division whose brain-computer interface research aligns with TRIBE v2's capabilities for predicting user perception in next-generation interfaces

Courtois NeuroMod Project: Provided the dense fMRI training dataset used for the original TRIBE model, including approximately 80 hours of recordings per person watching TV shows and films

Algonauts 2025 Competition: Brain modeling competition where the original TRIBE model took first place, beating 262 competing teams and establishing the foundation for TRIBE v2

Global Neuroscience Research Community: Target users of the open-source release, who can use TRIBE v2 as a digital test subject to run virtual experiments without expensive fMRI sessions

THE SIGNAL.

Analysts

"Without any retraining, TRIBE v2 can reliably predict the brain responses of individuals it has never seen before, achieving a nearly 2-3x improvement over previous methods for both movies and audiobooks. We are releasing the model, codebase, paper, and demo to help researchers."

AI at Meta
Official Meta AI Account, X.com

"Highlighted that TRIBE is the first pipeline that is simultaneously non-linear, multi-subject, and multi-modal. Noted that unimodal models can reliably predict their corresponding cortical networks but are systematically outperformed by the multimodal model in high-level associative cortices."

LessWrong Community Analysis
Community Paper Review, LessWrong

"Noted important limitations: fMRI has relatively low temporal resolution compared to actual neural firing, meaning the model works with indirect and averaged measurements. Emphasized that the model is currently focused on encoding rather than decoding."

Neuroscience News
Editorial Analysis, Neuroscience News
The Crowd

"Today we're introducing TRIBE v2 (Trimodal Brain Encoder), a foundation model trained to predict how the human brain responds to almost any sight or sound."

@AIatMeta · 9,600

"This is so freaking awesome, Meta just released TRIBE v2, a foundation model that predicts human brain activity across vision, sound and language."

@ai_for_success · 423

"Without any retraining, TRIBE v2 can reliably predict the brain responses of individuals it has never seen before, achieving a nearly 2-3x improvement over previous methods for both movies and audiobooks"

@AIatMeta · 609