From unsloth: https://huggingface.co/unsloth/Qwen3.5-122B-A10B-GGUF
I took the UD-IQ4_XS quant and merged all of its shards (0001.gguf, 0002.gguf, 0003.gguf) into a single .gguf, using llama.cpp's gguf-split tool: https://github.com/ggml-org/llama.cpp/tree/master/tools/gguf-split
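The merge step above can be sketched roughly as follows. The exact shard filenames are an assumption (they depend on how the repo names its splits); the key point is that you pass the *first* shard and the tool locates the rest automatically:

```shell
# Sketch: merge GGUF shards into one file with llama.cpp's gguf-split tool
# (built from https://github.com/ggml-org/llama.cpp).
# Input/output filenames below are placeholders, not the exact repo names.
llama-gguf-split --merge \
  Qwen3.5-122B-A10B-UD-IQ4_XS-00001-of-00003.gguf \
  Qwen3.5-122B-A10B-UD-IQ4_XS-merged.gguf
```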
This is useful, for example, with vLLM, which doesn't support multi-shard GGUF files.
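As a rough sketch of that use case, a merged single-file GGUF can be served with vLLM along these lines (the local filename is an assumption; GGUF models typically need a `--tokenizer` pointing at the original Hugging Face repo):

```shell
# Sketch: serve the merged single-file GGUF with vLLM.
# vLLM loads single .gguf files but not multi-shard splits,
# which is why the merge step above is needed.
vllm serve ./Qwen3.5-122B-A10B-UD-IQ4_XS-merged.gguf \
  --tokenizer Qwen/Qwen3.5-122B-A10B
```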
Feel free to check out my website, https://cheapllm.shop, for unlimited free inference of this model during the beta. After that, pricing will be $0.02/M input tokens and $0.10/M output tokens, making it the cheapest provider by a wide margin.
If you're interested in D&D/RP, you can also check out https://fablia.fr for free D&D/RP experiences
Downloads last month: 609
Hardware compatibility: 4-bit
Model tree for Volko76/Qwen3.5-122B-A10B-UD-IQ4_XS-GGUF-MERGED
- Base model: Qwen/Qwen3.5-122B-A10B