P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published Nov 17 • 134
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6 • 210
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization Paper • 2510.08540 • Published Oct 9 • 109