What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity Paper • 2511.15593 • Published Nov 19 • 57
DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion Paper • 2510.20766 • Published Oct 23 • 34
Advancing Speech Understanding in Speech-Aware Language Models with GRPO Paper • 2509.16990 • Published Sep 21 • 18
Auto-Regressive vs Flow-Matching: a Comparative Study of Modeling Paradigms for Text-to-Music Generation Paper • 2506.08570 • Published Jun 10 • 33
StressTest Collection Model and data from the paper "StressTest: Can YOUR Speech LM Handle the Stress?" • 5 items • Updated Nov 11 • 1
Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning Paper • 2505.17813 • Published May 23 • 58
WHISTRESS: Enriching Transcriptions with Sentence Stress Detection Paper • 2505.19103 • Published May 25 • 13
Follow the Flow: On Information Flow Across Textual Tokens in Text-to-Image Models Paper • 2504.01137 • Published Apr 1 • 21
SIMS Collection Models and evaluation data from the paper: "Scaling Analysis of Interleaved Speech-Text Language Models" • 4 items • Updated Apr 4 • 3
Scaling Analysis of Interleaved Speech-Text Language Models Paper • 2504.02398 • Published Apr 3 • 31
RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling Paper • 2503.09601 • Published Mar 12 • 16
Slamming: Training a Speech Language Model on One GPU in a Day Paper • 2502.15814 • Published Feb 19 • 69
Slam Collection All resources for SpeechLMs from "Slamming: Training a Speech Language Model on One GPU in a Day". We provide the tokeniser, LM, and datasets • 7 items • Updated May 22 • 13
Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights Paper • 2502.09619 • Published Feb 13 • 35
Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation Paper • 2501.03059 • Published Jan 6 • 22