Firworks/Qwen3-Coder-30B-A3B-Instruct-nvfp4

8-bit precision

compressed-tensors


Firworks committed on Oct 25

Commit 0644434 · verified · 1 Parent(s): 8053f73

Update README.md

Files changed (1)

README.md +2 -2


@@ -7,8 +7,8 @@ base_model:
 ---
 # Qwen3-Coder-30B-A3B-Instruct-nvfp4
-**Format:** NVFP4 — weights & activations quantized to FP4 with dual scaling.
-**Base model:** `Qwen/Qwen3-Coder-30B-A3B-Instruct`
+**Format:** NVFP4 — weights & activations quantized to FP4 with dual scaling.
+**Base model:** `Qwen/Qwen3-Coder-30B-A3B-Instruct`
 **How it was made:** One-shot calibration with LLM Compressor (NVFP4 recipe), long-seq calibration with nvidia/OpenCodeInstruct.
 > Notes: Keep `lm_head` in high precision; calibrate on long, domain-relevant sequences.
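
For readers of this diff, the "dual scaling" mentioned in the Format line can be illustrated with a small pure-Python sketch. It assumes the usual NVFP4 layout: 16-element blocks of FP4 (E2M1) values, one scale per block, and one scale per tensor. The function names are hypothetical, the block scale is kept as an ordinary Python float rather than rounded to FP8, and this is not the code LLM Compressor runs.

```python
from typing import List, Tuple

# Magnitudes representable by an FP4 E2M1 value (the sign is a separate bit).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 16  # assumed NVFP4 block size


def round_to_e2m1(x: float) -> float:
    """Round x to the nearest representable E2M1 value (ties go to the smaller magnitude)."""
    mag = min(E2M1_GRID, key=lambda g: abs(g - abs(x)))
    return -mag if x < 0 else mag


def quantize_block(block: List[float], global_scale: float) -> Tuple[List[float], float]:
    """Quantize one block; returns (FP4 values, block scale).

    The block scale is stored relative to the per-tensor global scale,
    so the effective scale is block_scale * global_scale ("dual scaling").
    """
    amax = max(abs(v) for v in block)
    # Map the largest magnitude in the block onto 6.0, the largest E2M1 value.
    block_scale = (amax / 6.0) / global_scale if amax > 0 else 1.0
    effective = block_scale * global_scale
    return [round_to_e2m1(v / effective) for v in block], block_scale


def dequantize_block(codes: List[float], block_scale: float, global_scale: float) -> List[float]:
    """Reconstruct approximate values from FP4 values and both scales."""
    return [c * block_scale * global_scale for c in codes]


if __name__ == "__main__":
    block = [3.0, -1.5, 0.75] + [0.0] * (BLOCK_SIZE - 3)
    codes, scale = quantize_block(block, global_scale=1.0)
    print(codes[:3], scale)
    print(dequantize_block(codes, scale, 1.0)[:3])
```

The per-block scale keeps a single outlier from flattening the rest of the tensor onto a handful of FP4 values, while the per-tensor scale keeps the block scales themselves in a range a low-precision format can store.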