LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling Paper • 2511.20785 • Published Nov 25, 2025 • 182
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published Nov 20, 2025 • 92
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization Paper • 2507.14683 • Published Jul 19, 2025 • 134
EgoLife Collection CVPR 2025 - EgoLife: Towards Egocentric Life Assistant. Homepage: https://egolife-ai.github.io/ • 10 items • Updated Mar 7, 2025 • 20
Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos Paper • 2501.13826 • Published Jan 23, 2025 • 23
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models Paper • 2411.14982 • Published Nov 22, 2024 • 19
Multimodal-SAE Collection Sparse autoencoders (SAEs) hooked on LLaVA • 5 items • Updated Mar 4, 2025 • 8
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures Paper • 2410.13754 • Published Oct 17, 2024 • 75
LLaVA-Critic Collection A general evaluator for assessing model performance • 6 items • Updated Oct 6, 2024 • 10
LLaVA-Video Collection Models focused on video understanding (previously known as LLaVA-NeXT-Video). • 8 items • Updated Feb 21, 2025 • 64
LLaVA-Onevision Collection LLaVA-OneVision models for single-image, multi-image, and video scenarios • 9 items • Updated Sep 18, 2024 • 16
LongVA Collection Long Context Transfer from Language to Vision: https://lmms-lab.github.io/posts/longva/ • 5 items • Updated Oct 4, 2024 • 13
LLaVA-OneVision Collection Models that handle arbitrary types of visual input • 17 items • Updated Sep 17, 2025 • 31