NVFP4-W4A16 quantized version of LGAI-EXAONE/K-EXAONE-236B-A23B. Only the unshared expert layers are quantized.
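In NVFP4-W4A16, weights are stored as 4-bit E2M1 floats with a shared scale per 16-value block, while activations stay in 16-bit. The rounding step can be sketched as below; this is an illustrative toy only, with an assumed function name, and it keeps the block scale as a plain float rather than the FP8 (E4M3) scale the real format uses:

```python
import numpy as np

# Magnitudes representable in the FP4 E2M1 format used by NVFP4.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_nvfp4_block(w, block=16):
    """Round weights to block-scaled FP4 (toy sketch, not the
    compressed-tensors kernel).

    Real NVFP4 stores 4-bit codes plus an FP8 (E4M3) scale per
    16-value block; here the scale is kept as a float for clarity.
    """
    w = np.asarray(w, dtype=np.float64)
    out = np.empty_like(w)
    for start in range(0, w.size, block):
        chunk = w[start:start + block]
        # Scale so the largest magnitude in the block maps to 6.0,
        # the top of the E2M1 grid (fall back to 1.0 for all-zero blocks).
        scale = float(np.abs(chunk).max()) / FP4_GRID[-1] or 1.0
        # Snap each magnitude to the nearest grid point, keep the sign.
        idx = np.abs(np.abs(chunk)[:, None] / scale - FP4_GRID).argmin(axis=1)
        out[start:start + block] = np.sign(chunk) * FP4_GRID[idx] * scale
    return out
```

Because only the weights are quantized (W4A16), dequantized values like these multiply full-precision 16-bit activations at inference time.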
## Evaluation
The following package versions can be used to evaluate the quantized model:

- vllm: 0.15.1
- compressed-tensors: 0.13.0
- transformers: 5.1.0
```shell
vllm serve FuriosaAIShareNotaAI/K-EXAONE-236B-A23B-NVFP4A16-GPTQ \
  --reasoning-parser deepseek_v3 \
  --tensor-parallel-size 8 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes \
  --max-model-len 131072 \
  --max-num-seqs 8
```
```shell
python -m simple-evals.simple_evals --eval gpqa --model FuriosaAIShareNotaAI/K-EXAONE-236B-A23B-NVFP4A16-GPTQ --custom --temperature 1.0 --top_p 0.95 --max_tokens None --extra_body '{"chat_template_kwargs": {"enable_thinking": true}}' --n-threads 8
```
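The `--extra_body` JSON is merged verbatim into each request the harness sends to the vLLM OpenAI-compatible server. For reference, the equivalent raw chat-completions payload looks roughly like the sketch below (the prompt is a placeholder; the field layout is an assumption based on the command above):

```python
import json

# Hypothetical request body for vLLM's OpenAI-compatible
# /v1/chat/completions endpoint; model name and sampling parameters
# mirror the evaluation command above.
payload = {
    "model": "FuriosaAIShareNotaAI/K-EXAONE-236B-A23B-NVFP4A16-GPTQ",
    "messages": [{"role": "user", "content": "..."}],  # prompt elided
    "temperature": 1.0,
    "top_p": 0.95,
    # Fields from --extra_body are passed through unchanged, which is
    # how enable_thinking reaches the model's chat template:
    "chat_template_kwargs": {"enable_thinking": True},
}
body = json.dumps(payload)
```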
### Accuracy Results
The evaluation was run on 8× H100 PCIe GPUs.
| Benchmark | furiosa-ai/K-EXAONE-236B-A23B-NVFP4A16 | LGAI-EXAONE/K-EXAONE-236B-A23B (report) |
|---|---|---|
| GPQA-DIAMOND (Reasoning) | 76.21 ± 1.51% (N=10) | 79.1 |