The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published 19 days ago • 62
Emu3.5: Native Multimodal Models are World Learners Paper • 2510.26583 • Published Oct 30, 2025 • 108
Uniform Discrete Diffusion with Metric Path for Video Generation Paper • 2510.24717 • Published Oct 28, 2025 • 40
🐻 URSA Collection • URSA: Uniform Discrete Diffusion with Metric Path for Video Generation • 6 items • Updated Nov 2, 2025 • 6
From Pixels to Words -- Towards Native Vision-Language Primitives at Scale Paper • 2510.14979 • Published Oct 16, 2025 • 66
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation Paper • 2504.02782 • Published Apr 3, 2025 • 57
Position: Interactive Generative Video as Next-Generation Game Engine Paper • 2503.17359 • Published Mar 21, 2025 • 61
EVEv2: Improved Baselines for Encoder-Free Vision-Language Models Paper • 2502.06788 • Published Feb 10, 2025 • 13
Autoregressive Video Generation without Vector Quantization Paper • 2412.14169 • Published Dec 18, 2024 • 14