Models (101)

Active filters: quark

matmelis/Llama_3.2_1B_w_uint4_qronos

0.4B • Updated Aug 7, 2025 • 2

matmelis/Llama_3.2_3B_w_mxfp4_a_mxfp4_qronos

2B • Updated Aug 7, 2025 • 3

EliovpAI/Qwen3-0.6B-FP8-KV

Text Generation • 0.6B • Updated Aug 2, 2025 • 3

matmelis/Llama_3.2_3B_w_mxfp4_a_mxfp4_gptq

2B • Updated Aug 6, 2025 • 1

matmelis/Llama_3.2_1B_w_int3_qronos

0.6B • Updated Aug 12, 2025 • 2

matmelis/Llama_3.2_3B_w_uint4_qronos

0.8B • Updated Aug 7, 2025 • 2

matmelis/Llama_3.2_3B_w_int3_qronos

1B • Updated Aug 12, 2025 • 2

matmelis/Llama_3.2_3B_w_int2_gptq

3B • Updated Aug 7, 2025 • 2

matmelis/Llama_3.2_3B_w_int3_gptq

3B • Updated Aug 7, 2025 • 1

matmelis/Llama_3.2_3B_w_int2_qronos

3B • Updated Aug 7, 2025 • 2

matmelis/Llama_3.2_1B_w_int3_gptq

0.6B • Updated Aug 11, 2025 • 2

matmelis/Llama_3.2_3B_w_uint4_gptq

0.8B • Updated Aug 7, 2025

matmelis/Llama_3.2_1B_w_int2_qronos

1B • Updated Aug 7, 2025 • 2

matmelis/Llama_3.2_1B_w_int2_gptq

1B • Updated Aug 7, 2025 • 2

matmelis/Llama_3.2_1B_w_uint4_smoothquant_qronos

0.4B • Updated Aug 7, 2025 • 2

amd/DeepSeek-R1-0528-MXFP4-ASQ

342B • Updated 27 days ago • 43 • 1

haoyang-amd/output_oss_120b_moe_w_mxfp4_a_mxfp4

174B • Updated Aug 19, 2025 • 2

haoyang-amd/output_oss_20b_moe_w_mxfp4_a_bfloat16

11B • Updated Aug 20, 2025 • 2

matmelis/Llama_3.2_1B_w_mxfp4_a_mxfp4_gptq

0.8B • Updated Aug 20, 2025 • 2

matmelis/Llama_3.2_1B_w_mxfp4_a_mxfp4_qronos

0.8B • Updated Aug 20, 2025 • 2

amd/Mixtral-8x7B-Instruct-v0.1-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8

37B • Updated Nov 3, 2025 • 31

amd/Llama-2-70b-chat-hf-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8

55B • Updated Sep 26, 2025 • 7

Keozon/GLM-4.5-Air-fp8_e4m3-quark-gfx1100

107B • Updated Sep 1, 2025 • 5 • 1

amd/Qwen3-8B-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8

6B • Updated Sep 26, 2025 • 5.02k • 2

amd/Qwen2.5-1.5B-Instruct-ptpc-Quark-ts

2B • Updated Sep 11, 2025 • 2.02k

EmbeddedLLM/Qwen3-Coder-480B-A35B-Instruct-FP8-Dynamic

480B • Updated Sep 13, 2025 • 5

Quark-NPU-Workshop/Phi-3-mini-4k-instruct

0.6B • Updated Oct 7, 2025 • 5

Quark-NPU-Workshop/Hermes-3-Llama-3.2-3B-awq-g128-int4-asym-bf16-onnx-hybrid

0.8B • Updated Oct 8, 2025 • 1

Quark-NPU-Workshop/po-phi3-mini-4k-ins

0.6B • Updated Oct 8, 2025 • 2

playable/playable1-int4-bfloat16

1B • Updated Oct 14, 2025 • 5