RLVR-World
RLVR-World: Training World Models with Reinforcement Learning
Paper • 2505.13934 • Published • 16
thuml/rt1-frame-tokenizer
Updated • 20
thuml/rt1-world-model-single-step-base
0.1B • Updated • 17
thuml/rt1-world-model-single-step-rlvr
thuml/rt1-compressive-tokenizer
Updated • 17
thuml/rt1-world-model-multi-step-base
0.1B • Updated • 130
thuml/rt1-world-model-multi-step-rlvr
thuml/webarena-world-model-cot
Updated • 6.41k • 134
thuml/webarena-world-model-sft
2B • Updated • 8
thuml/webarena-world-model-rlvr
2B • Updated • 4
thuml/bytesized32-world-model-cot
Updated • 304k • 58 • 3
thuml/bytesized32-world-model-sft
2B • Updated • 7
thuml/bytesized32-world-model-rlvr-binary-reward
2B • Updated • 5
thuml/bytesized32-world-model-rlvr-task-specific-reward
2B • Updated • 5