Language Server CLI Empowers Language Agents with Process Rewards Paper • 2510.22907 • Published Oct 27, 2025 • 4
Robust Layerwise Scaling Rules by Proper Weight Decay Tuning Paper • 2510.15262 • Published Oct 17, 2025 • 5
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization Paper • 2507.06181 • Published Jul 8, 2025 • 44
On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning Paper • 2505.17508 • Published May 23, 2025 • 8
FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models Paper • 2505.02735 • Published May 5, 2025 • 33