Active filter: quark
aigdat/Llama-3.2-3B-Instruct-awq-uint4-float16 • 0.8B • Updated • 1
aigdat/Phi-3.5-mini-instruct-awq-uint4-float16 • 0.6B • Updated • 1
aigdat/DeepSeek-R1-Distill-Qwen-1.5B_quantized_int4_bfloat16 • 0.4B • Updated • 2
aigdat/Qwen3-0.6B_quantized_int4_float16 • 0.2B • Updated • 2
aigdat/Arch-Function-Chat-3B_quantized_int4_float16 • 0.7B • Updated • 4
aigdat/DeepCoder-14B-Preview_quantized_int4_float16 • 3B • Updated • 4
aigdat/Qwen2.5-Coder-1.5B-Instruct_quantized_int4_bfloat16 • 0.4B • Updated • 3
aigdat/Qwen2.5-Coder-7B-Instruct_quantized_int4_bfloat16 • 1B • Updated • 4
aigdat/Qwen2.5-3B-Instruct_quantized_int4_bfloat16 • 0.7B • Updated • 3
aigdat/Qwen2.5-Coder-32B-Instruct_quantized_int4_bfloat16 • 5B • Updated • 2
aigdat/Llama-xLAM-2-8b-fc-r_quantized_int4_bfloat16 • 2B • Updated • 3
fxmarty/qwen_1.5-moe-a2.7b-mxfp4 • 8B • Updated • 40
amd/Llama-3.3-70B-Instruct-MXFP4-Preview • 38B • Updated • 57k • 2
fxmarty/deepseek_r1_3_layers_mxfp4 • 8B • Updated • 4 • 1
fxmarty/Llama-4-Scout-17B-16E-Instruct-2-layers-mxfp4 • 5B • Updated • 11 • 1
amd/DeepSeek-R1-MXFP4-Preview • 371B • Updated • 1.89k • 5
mohitsha/Llama-2-7b-hf-w_mx_fp4_per_group_sym • 4B • Updated • 3
amd/Llama-3.1-405B-Instruct-MXFP4-Preview • 218B • Updated • 347 • 1
amd/DeepSeek-R1-MXFP4-ASQ • 363B • Updated • 21 • 1
haoyang-amd/qwen1.5-0.5B-ptpc • 0.5B • Updated • 63
amd/DeepSeek-R1-0528-MXFP4-Preview • 363B • Updated • 9.41k • 1
fxmarty/Llama-3.1-70B-Instruct-2-layers-mxfp6 • 3B • Updated • 4
fxmarty/qwen1.5_moe_a2.7b_chat_w_fp4_a_fp6_e2m3 • 8B • Updated • 40
fxmarty/qwen1.5_moe_a2.7b_chat_w_fp6_e2m3_a_fp6_e2m3 • 11B • Updated • 3
fxmarty/qwen1.5_moe_a2.7b_chat_w_fp6_e3m2_a_fp6_e3m2 • 11B • Updated • 33
amd/Llama-2-70b-chat-hf-WMXFP4-AMXFP4-KVFP8-Scale-UINT8-MLPerf-GPTQ • 37B • Updated • 125
sudhab1988/rakuten-7b-awq-g128-int4-asym-fp16-hf • 1B • Updated • 3
matmelis/Llama_3.2_1B_w_uint4_gptq • 0.4B • Updated • 4
EliovpAI/Qwen3-14B-FP8-KV • Text Generation • 15B • Updated • 13 • 2
matmelis/Llama_3.2_1B_w_uint4_autosmoothquant_gptq • 0.4B • Updated • 5