TEMPO: Scaling Test-time Training for Large Reasoning Models Paper • 2604.19295 • Published 2 days ago • 26
Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items Paper • 2604.19748 • Published 2 days ago • 85
GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification Paper • 2604.14258 • Published 8 days ago • 22
MathNet: A Global Multimodal Benchmark for Mathematical Reasoning and Retrieval Paper • 2604.18584 • Published 3 days ago • 8
DR³-Eval: Towards Realistic and Reproducible Deep Research Evaluation Paper • 2604.14683 • Published 7 days ago • 32
TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment Paper • 2604.12012 • Published 10 days ago • 6
QuantCode-Bench: A Benchmark for Evaluating the Ability of Large Language Models to Generate Executable Algorithmic Trading Strategies Paper • 2604.15151 • Published 7 days ago • 12
UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards Paper • 2604.14967 • Published 7 days ago • 14
From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space Paper • 2604.14142 • Published 8 days ago • 28
WebCompass: Towards Multimodal Web Coding Evaluation for Code Language Models Paper • 2604.18224 • Published 3 days ago • 20
MultiWorld: Scalable Multi-Agent Multi-View Video World Models Paper • 2604.18564 • Published 3 days ago • 39
OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation Paper • 2604.18486 • Published 3 days ago • 76
Elucidating the SNR-t Bias of Diffusion Probabilistic Models Paper • 2604.16044 • Published 6 days ago • 70
TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration Paper • 2604.14116 • Published 8 days ago • 13