Models

217

Active filters: GRPO

zghhui/OmniNFT

Any-to-Any • Updated 5 days ago • 59 • 28

beyoru/Luna-Ethos

Text Generation • 4B • Updated 26 days ago • 482 • 3

mradermacher/Luna-Ethos-GGUF

4B • Updated Apr 10 • 49 • 2

Jincenzi/SocialR1-8B

Text Generation • 4B • Updated 12 days ago • 48 • 2

Kimhi/AWARES-Qwen2.5-VL-7B

Image-Text-to-Text • 8B • Updated 12 days ago • 57 • 2

mradermacher/SocialR1-8B-GGUF

Reinforcement Learning • 4B • Updated 12 days ago • 835 • 1

mradermacher/SocialR1-8B-i1-GGUF

Reinforcement Learning • 4B • Updated 12 days ago • 3.65k • 1

mradermacher/AWARES-Qwen2.5-VL-7B-GGUF

8B • Updated 11 days ago • 516 • 1

mradermacher/AWARES-Qwen2.5-VL-7B-i1-GGUF

8B • Updated 11 days ago • 766 • 1

Ihor/Text2Graph-R1-Qwen2.5-0.5b

Text Generation • 0.5B • Updated Aug 18, 2025 • 14 • 24

prithivMLmods/Bellatrix-Tiny-1B-R1

Text Generation • 1B • Updated Feb 2, 2025 • 30 • 1

mradermacher/Bellatrix-Tiny-1B-R1-GGUF

1B • Updated Feb 3, 2025 • 100

mradermacher/Bellatrix-Tiny-1B-R1-i1-GGUF

1B • Updated Feb 3, 2025 • 157

Novaciano/Bellatrix-1B-R1_Erotiquant3_IQ4_XS-GGUF

Text Generation • 1B • Updated Feb 3, 2025 • 20

Novaciano/Bellatrix-1B-R1_Erotiquant3_Q5_K_M-GGUF

Text Generation • 1B • Updated Feb 3, 2025 • 19

tecosys/Nutaan-RL1

Reinforcement Learning • Updated Feb 7, 2025

mradermacher/Text2Graph-R1-Qwen2.5-0.5b-GGUF

0.5B • Updated Aug 18, 2025 • 112 • 1

mradermacher/Text2Graph-R1-Qwen2.5-0.5b-i1-GGUF

0.5B • Updated Aug 18, 2025 • 258 • 1

alpha-ai/Deep-Reason-SMALL-V0-GGUF

3B • Updated Feb 26, 2025 • 32 • 1

alpha-ai/Deep-Reason-SMALL-V0

Text Generation • 3B • Updated Feb 26, 2025 • 8 • 2

mradermacher/Deep-Reason-SMALL-V0-GGUF

3B • Updated Feb 9, 2025 • 84 • 2

mradermacher/Deep-Reason-SMALL-V0-i1-GGUF

3B • Updated Feb 9, 2025 • 228 • 1

alpha-ai/qwen2.5-reason-thought-lite-GGUF

3B • Updated Apr 28, 2025 • 43

alpha-ai/qwen2.5-reason-thought-lite

Text Generation • 3B • Updated Apr 28, 2025 • 5

alpha-ai/llama-3.2-3B-Reason-Reflect-Lite-GGUF

3B • Updated Feb 26, 2025 • 39 • 2

alpha-ai/llama-3.2-3B-Reason-Reflect-Lite

Text Generation • 3B • Updated Feb 26, 2025 • 7

mradermacher/Cogito-R1-GGUF

33B • Updated Jul 31, 2025 • 164

accuracy-maker/Llama-3.2-1B-GRPO-gsm8k

Text Generation • 1B • Updated Feb 12, 2025 • 4

mradermacher/Cogito-R1-i1-GGUF

33B • Updated Feb 13, 2025 • 704

AaryanK/Qwen_2.5_3B_GRPO_Reasoning_XIOSERV

3B • Updated Feb 17, 2025 • 140 • 1