N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models Paper • 2512.16561 • Published 8 days ago • 19
RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing Paper • 2512.16864 • Published 8 days ago • 10
Mask Transfiner for High-Quality Instance Segmentation Paper • 2111.13673 • Published Nov 26, 2021
Cascade-DETR: Delving into High-Quality Universal Object Detection Paper • 2307.11035 • Published Jul 20, 2023
Gaussian Grouping: Segment and Edit Anything in 3D Scenes Paper • 2312.00732 • Published Dec 1, 2023 • 3
DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos Paper • 2405.02280 • Published May 3, 2024
SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking Paper • 2409.11235 • Published Sep 17, 2024
RobotArena ∞: Scalable Robot Benchmarking via Real-to-Sim Translation Paper • 2510.23571 • Published Oct 27 • 8
MotionEdit: Benchmarking and Learning Motion-Centric Image Editing Paper • 2512.10284 • Published 16 days ago • 25