The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published 17 days ago • 62
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation Paper • 2510.08673 • Published Oct 9, 2025 • 125
Learning to See and Act: Task-Aware View Planning for Robotic Manipulation Paper • 2508.05186 • Published Aug 7, 2025
ObjectClear: Complete Object Removal via Object-Effect Attention Paper • 2505.22636 • Published May 28, 2025 • 2
Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration Paper • 2406.18516 • Published Jun 26, 2024 • 4
ObjCtrl-2.5D: Training-free Object Control with Camera Poses Paper • 2412.07721 • Published Dec 10, 2024 • 9
Image Conductor: Precision Control for Interactive Video Synthesis Paper • 2406.15339 • Published Jun 21, 2024 • 9
FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models Paper • 2406.16863 • Published Jun 24, 2024 • 11