The Research Desk.

The most upvoted and starred AI research crossing the community today.

Last Updated: May 7, 2026, 9:41 AM PT

X.com Research Buzz

Who's in Charge? Disempowerment Patterns in Real-World LLM Usage
X.com
4735

Mrinank Sharma, Miles McCain, Raymond Douglas +1 more

#nlp #MrinankSharma

Agents of Chaos
X.com
4104

Natalie Shapira, Chris Wendler, Avery Yen +7 more

#reinforcement-learning #openclaw

Language models transmit behavioural traits through hidden signals in data
X.com
2101

Alex Cloud, Minh Le, James Chua +6 more

#nlp #MinhxLe

AlphaXiv Trending

Thinking with Visual Primitives
AlphaXiv
219

Ruijie Lu, Yiyang Ma, Xiaokang Chen

#computer-vision #mitkox

Let ViT Speak: Generative Language-Image Pre-training
AlphaXiv
97

ByteDance, Yan Fang, Mengcheng Lan +1 more

#machine-learning #computer-vision #YanFangCS

MolmoAct2: Action Reasoning Models for Real-world Deployment
AlphaXiv
61

Haoquan Fang, Jiafei Duan, Donovan Clay

#reasoning #allenai

Model Spec Midtraining: Improving How Alignment Training Generalizes
AlphaXiv
57

Chloe Li, Sara Price, Samuel Marks

#machine-learning #safety-alignment #chloeli-15

On-Policy Distillation
AlphaXiv
51

Thinking Machines, Kevin Lu

#reinforcement-learning #efficiency

A Theory of Generalization in Deep Learning
AlphaXiv
46

Elon Litman, Gabe Guo

#machine-learning

HuggingFace Daily Papers

MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction
HuggingFace
4

Junbo Cui, Bokai Xu, Chongyi Wang +2 more

#OpenBMB

SWE-WebDevBench: Evaluating Coding Agent Application Platforms as Virtual Software Agencies
HuggingFace
2

Siddhant Saxena, Nilesh Trivedi, Vinayaka Jyothi

#reinforcement-learning #reasoning #snowmountainAi

TT4D: A Pipeline and Dataset for Table Tennis 4D Reconstruction From Monocular Videos
HuggingFace
1

Nima Rahmanian, Daniel Kienzle, Thomas Gossard +2 more

#computer-vision

The First Token Knows: Single-Decode Confidence for Hallucination Detection
HuggingFace
1

Mina Gabriel

#safety-alignment

When to Think, When to Speak: Learning Disclosure Policies for LLM Reasoning
HuggingFace
1

Jiaqi Wei, Xuehang Guo, Pengfei Yu +2 more

#nlp #reasoning

CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing
HuggingFace
1

Cheng Qian, Hyeonjeong Ha, Jiayu Liu +2 more

#reinforcement-learning #reasoning #CreativityBench