Models: 73,473

Active filters: reinforcement-learning

Simplified-Reasoning/SU-01

Text Generation • 31B • Updated 2 days ago • 880 • 21

zghhui/OmniNFT

Any-to-Any • Updated 3 days ago • 59 • 25

twnlp/ChineseErrorCorrector4-4B

Text Generation • 4B • Updated 3 days ago • 172 • 4

Kwai-Klear/GoLongRL-4B

Text Generation • 4B • Updated about 1 hour ago • 82 • 4

nvidia/NitroGen

Reinforcement Learning • Updated Feb 5 • 536

Mercury7353/MetaAgent-X

Reinforcement Learning • 8B • Updated 7 days ago • 87 • 4

6kplus/PhyMotion-CausalForcing-1.3B

Text-to-Video • Updated 6 days ago • 3

JohnRoger/SU-01-Q4_K_M-GGUF

Reinforcement Learning • 31B • Updated 7 days ago • 250 • 3

MeiGen-AI/GenEvolve

Image-Text-to-Text • 9B • Updated about 4 hours ago • 2

leorc/Simulus

Reinforcement Learning • Updated Feb 21, 2025 • 1

NousResearch/DeepHermes-ToolCalling-Specialist-Atropos

Reinforcement Learning • 8B • Updated Apr 28, 2025 • 73 • 18

Arc-Intelligence/ATLAS-8B-Thinking

Text Generation • 8B • Updated Sep 12, 2025 • 12 • 6

gagansuie/oxidize-models

Other • Updated about 1 hour ago • 2.64k • 4

AQ-MedAI/PulseMind-72B

Image-Text-to-Text • 73B • Updated Jan 30 • 29 • 2

nvidia/GEAR-SONIC

Reinforcement Learning • Updated Apr 11 • 43

nvidia/EGM-8B

Image-Text-to-Text • 9B • Updated Apr 10 • 478 • 9

XunmeiLiu/VFIG-4B

Reinforcement Learning • 4B • Updated Mar 27 • 134 • 6

bue0912/ToolOmni-Qwen3-4B

Text Generation • 4B • Updated Apr 16 • 12 • 3

lllyx/Qwen3-4B-Base-GRPO

Text Generation • 4B • Updated 19 days ago • 233 • 3

Jincenzi/SocialR1-8B

Text Generation • 4B • Updated 10 days ago • 48 • 2

mradermacher/SocialR1-8B-GGUF

Reinforcement Learning • 4B • Updated 10 days ago • 789 • 1

mradermacher/SocialR1-8B-i1-GGUF

Reinforcement Learning • 4B • Updated 10 days ago • 3.5k • 1

ccnets/causal-gpt-rl

Reinforcement Learning • Updated 3 days ago • 70 • 2

YuvrajSingh9886/LFM2.5-350M-grpo-summarization-quality-bleu

Summarization • 0.4B • Updated 8 days ago • 263 • 2

JosedelaPepe/dqn-SpaceInvadersNoFrameskip-v4

Reinforcement Learning • Updated 7 days ago • 47 • 1

axi0mX/SU-01-GGUF

Text Generation • 31B • Updated 4 days ago • 2.05k • 1

Alopezcordero/ppo-LunaLander-v3

Reinforcement Learning • Updated 6 days ago • 146 • 1

mradermacher/AgentHijack-Agent-GGUF

Reinforcement Learning • 8B • Updated 4 days ago • 717 • 1

svbk2012/therapist

Reinforcement Learning • Updated about 4 hours ago • 1

Igriscodes/qwen3-4b-tool

Text Generation • 4B • Updated about 17 hours ago • 1