Original GGUF shards from unsloth: https://huggingface.co/unsloth/Qwen3.5-122B-A10B-GGUF

I simply took the UD-IQ4_XS quant and merged all the shards (0001.gguf, 0002.gguf, 0003.gguf) into a single .gguf using the llama.cpp gguf-split tool (see the sketch below): https://github.com/ggml-org/llama.cpp/tree/master/tools/gguf-split
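
For reference, the merge is a single command; a minimal sketch, assuming llama.cpp is built locally and using placeholder shard/output filenames (point --merge at the first shard of the set):

```bash
# Merge a multi-shard GGUF back into one file with llama.cpp's gguf-split tool.
# Filenames below are placeholders, not the exact shard names from the repo.
./llama-gguf-split --merge \
    Qwen3.5-122B-A10B-UD-IQ4_XS-00001-of-00003.gguf \
    Qwen3.5-122B-A10B-UD-IQ4_XS.gguf
```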

Useful, for example, for vLLM, which does not support multi-shard GGUF files.
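
For instance, the merged file can be loaded directly with vLLM's Python API; a minimal sketch, assuming a vLLM install with (experimental) GGUF support, a placeholder local path to the merged file, and an assumed tokenizer repo name for the base model:

```python
from vllm import LLM, SamplingParams

# Load the merged single-file GGUF (local path is a placeholder).
# GGUF support in vLLM is experimental; pointing --tokenizer / tokenizer=
# at the original model repo is recommended (repo name is an assumption).
llm = LLM(
    model="./Qwen3.5-122B-A10B-UD-IQ4_XS.gguf",
    tokenizer="Qwen/Qwen3.5-122B-A10B",
)

outputs = llm.generate(
    ["Explain what a GGUF file is in one sentence."],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```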

Feel free to check out my website https://cheapllm.shop for unlimited FREE inference of this model during the beta; after the beta, pricing will be $0.02/M input tokens and $0.10/M output tokens, making it the cheapest provider by a big margin.

If you're interested in D&D/RP, you can also check out https://fablia.fr for free D&D/RP experiences.
