Active filters: quark
matmelis/Llama_3.2_1B_w_uint4_qronos • 0.4B • 2
matmelis/Llama_3.2_3B_w_mxfp4_a_mxfp4_qronos • 2B • 3
EliovpAI/Qwen3-0.6B-FP8-KV • Text Generation • 0.6B • 3
matmelis/Llama_3.2_3B_w_mxfp4_a_mxfp4_gptq • 2B • 1
matmelis/Llama_3.2_1B_w_int3_qronos • 0.6B • 2
matmelis/Llama_3.2_3B_w_uint4_qronos • 0.8B • 2
matmelis/Llama_3.2_3B_w_int3_qronos • 1B • 2
matmelis/Llama_3.2_3B_w_int2_gptq • 3B • 2
matmelis/Llama_3.2_3B_w_int3_gptq • 3B • 1
matmelis/Llama_3.2_3B_w_int2_qronos • 3B • 2
matmelis/Llama_3.2_1B_w_int3_gptq • 0.6B • 2
matmelis/Llama_3.2_3B_w_uint4_gptq • 0.8B
matmelis/Llama_3.2_1B_w_int2_qronos • 1B • 2
matmelis/Llama_3.2_1B_w_int2_gptq • 1B • 2
matmelis/Llama_3.2_1B_w_uint4_smoothquant_qronos • 0.4B • 2
amd/DeepSeek-R1-0528-MXFP4-ASQ • 342B • 43 • 1
haoyang-amd/output_oss_120b_moe_w_mxfp4_a_mxfp4 • 174B • 2
haoyang-amd/output_oss_20b_moe_w_mxfp4_a_bfloat16 • 11B • 2
matmelis/Llama_3.2_1B_w_mxfp4_a_mxfp4_gptq • 0.8B • 2
matmelis/Llama_3.2_1B_w_mxfp4_a_mxfp4_qronos • 0.8B • 2
amd/Mixtral-8x7B-Instruct-v0.1-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8 • 37B • 31
amd/Llama-2-70b-chat-hf-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8 • 55B • 7
Keozon/GLM-4.5-Air-fp8_e4m3-quark-gfx1100 • 107B • 5 • 1
amd/Qwen3-8B-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8 • 6B • 5.02k • 2
amd/Qwen2.5-1.5B-Instruct-ptpc-Quark-ts • 2B • 2.02k
EmbeddedLLM/Qwen3-Coder-480B-A35B-Instruct-FP8-Dynamic • 480B • 5
Quark-NPU-Workshop/Phi-3-mini-4k-instruct • 0.6B • 5
Quark-NPU-Workshop/Hermes-3-Llama-3.2-3B-awq-g128-int4-asym-bf16-onnx-hybrid • 0.8B • 1
Quark-NPU-Workshop/po-phi3-mini-4k-ins • 0.6B • 2
playable/playable1-int4-bfloat16 • 1B • 5