Region-Constraint In-Context Generation for Instructional Video Editing Paper • 2512.17650 • Published 16 days ago • 49
Finch: Benchmarking Finance & Accounting across Spreadsheet-Centric Enterprise Workflows Paper • 2512.13168 • Published 20 days ago • 49
SWE-QA: Can Language Models Answer Repository-level Code Questions? Paper • 2509.14635 • Published Sep 18, 2025 • 34
RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation Paper • 2509.16198 • Published Sep 19, 2025 • 126
MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML Paper • 2509.06806 • Published Sep 8, 2025 • 63
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR Paper • 2508.14029 • Published Aug 19, 2025 • 118
VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos Paper • 2505.23693 • Published May 29, 2025 • 53
Table-R1: Inference-Time Scaling for Table Reasoning Paper • 2505.23621 • Published May 29, 2025 • 93
Sherlock: Self-Correcting Reasoning in Vision-Language Models Paper • 2505.22651 • Published May 28, 2025 • 48
BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs Paper • 2504.18415 • Published Apr 25, 2025 • 47
Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark Paper • 2504.16427 • Published Apr 23, 2025 • 18
DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models Paper • 2504.15716 • Published Apr 22, 2025 • 12
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published Apr 17, 2025 • 93
SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models Paper • 2503.07605 • Published Mar 10, 2025 • 66
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13, 2025 • 191
NousResearch/DeepHermes-3-Llama-3-8B-Preview Text Generation • 8B • Updated Apr 10, 2025 • 270 • • 354