The Research Desk.

The most upvoted and starred AI research crossing the community today.

Last Updated: Mar 21, 2026, 6:03 AM PT

HuggingFace Daily Papers

Tinted Frames: Question Framing Blinds Vision-Language Models
HuggingFace
13

Wan-Cyuan Fan, Jiayun Luo, Declan Kutscher +2 more

#nlp #computer-vision #multimodal #davidhalladay
SimulU: Training-free Policy for Long-form Simultaneous Speech-to-Speech Translation
HuggingFace
13

Amirbek Djanibekov, Luisa Bentivogli, Matteo Negri +1 more

#machine-learning #reinforcement-learning #speech-audio
What Really Controls Temporal Reasoning in Large Language Models: Tokenisation or Representation of Time?
HuggingFace
1

Gagan Bhatia, Ahmad Muhammad Isa, Maxime Peyrard +1 more

#nlp #reasoning #gagan3012
PARSA-Bench: A Comprehensive Persian Audio-Language Model Benchmark
HuggingFace
1

Mohammad Javad Ranjbar Kalahroodi, Mohammad Amini, Parmis Bathayan +2 more

#nlp #speech-audio #reasoning
DreamPartGen: Semantically Grounded Part-Level 3D Generation via Collaborative Latent Denoising
HuggingFace
0

Tianjiao Yu, Xinzhuo Li, Muntasir Wahed +2 more

#computer-vision
VID-AD: A Dataset for Image-Level Logical Anomaly Detection under Vision-Induced Distraction
HuggingFace
0

Hiroto Nakata, Yawen Zou, Shunsuke Sakai +2 more

#computer-vision #reasoning #nkthiroto