Efficient Intelligence and Systems

community

Efficient-ML

AI & ML interests

Low-bit Quantization of Large Language Models (LLMs)

Recent Activity

AaronHuangWei authored a paper about 14 hours ago

MC#: Mixture Compressor for Mixture-of-Experts Large Models

AaronHuangWei authored a paper about 14 hours ago

Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

AaronHuangWei submitted a paper 1 day ago

Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

Efficient-ML's models (52)

Efficient-ML/GPTQ-for-Qwen3

Efficient-ML/Qwen3-awq

Efficient-ML/Qwen3-8B-gptq-w8-perchannel

Efficient-ML/Qwen3-14B-gptq-w4-perchannel

Efficient-ML/Qwen3-14B-gptq-w4-128

Updated May 7 • 1

Efficient-ML/Qwen3-14B-gptq-w8-perchannel

Efficient-ML/Qwen3-14B-gptq-w8-128

Efficient-ML/Qwen3-14B-base-gptq-w8-perchannel

Efficient-ML/Qwen3-14B-base-gptq-w8-128

Efficient-ML/Qwen3-8B-gptq-w8-128

Efficient-ML/Qwen3-8B-gptq-w4-perchannel

Efficient-ML/Qwen3-8B-gptq-w4-128

Efficient-ML/Qwen3-4B-gptq-w8-perchannel

Efficient-ML/Qwen3-4B-gptq-w8-128

Efficient-ML/Qwen3-4B-gptq-w4-perchannel

Efficient-ML/Qwen3-4B-gptq-w4-128

Efficient-ML/Qwen3-1.7B-gptq-w8-perchannel

Efficient-ML/Qwen3-1.7B-gptq-w8-128

Efficient-ML/Qwen3-1.7B-gptq-w4-perchannel

Efficient-ML/Qwen3-1.7B-gptq-w4-128

Efficient-ML/Qwen3-0.6B-gptq-w8-perchannel

Efficient-ML/Qwen3-0.6B-gptq-w8-128

Efficient-ML/Qwen3-0.6B-gptq-w4-perchannel

Efficient-ML/Qwen3-0.6B-gptq-w4-128

Efficient-ML/Qwen3-14B-base-gptq-w4-perchannel

Efficient-ML/Qwen3-14B-base-gptq-w4-128

Efficient-ML/Qwen3-8B-base-gptq-w8-perchannel

Efficient-ML/Qwen3-8B-base-gptq-w8-128

Efficient-ML/Qwen3-8B-base-gptq-w4-perchannel

Efficient-ML/Qwen3-8B-base-gptq-w4-128
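
The GPTQ repository names above appear to follow a regular pattern: model size, an optional "base" marker, weight bit-width, and a quantization granularity. The sketch below parses that pattern. It rests on assumptions the page does not state: that "w4"/"w8" mean 4-bit and 8-bit weights, that a trailing "128" is a group size, and that "perchannel" means one scale per output channel. The names QuantSpec and parse_repo_id are hypothetical helpers, not part of any library.

```python
import re
from typing import NamedTuple, Optional


class QuantSpec(NamedTuple):
    """Fields inferred from a repository name (interpretation is an assumption)."""
    size: str                  # e.g. "14B", "0.6B"
    base: bool                 # True if the name contains "-base"
    bits: int                  # assumed weight bit-width (from "w4" / "w8")
    group_size: Optional[int]  # assumed group size; None for "perchannel"


# Matches names such as "Efficient-ML/Qwen3-14B-base-gptq-w4-128".
_PATTERN = re.compile(
    r"^Efficient-ML/Qwen3-(?P<size>[\d.]+B)(?P<base>-base)?"
    r"-gptq-w(?P<bits>\d+)-(?P<group>perchannel|\d+)$"
)


def parse_repo_id(repo_id: str) -> Optional[QuantSpec]:
    """Return the parsed fields, or None if the name does not fit the pattern."""
    m = _PATTERN.match(repo_id)
    if m is None:
        return None
    group = m.group("group")
    return QuantSpec(
        size=m.group("size"),
        base=m.group("base") is not None,
        bits=int(m.group("bits")),
        group_size=None if group == "perchannel" else int(group),
    )
```

Names that do not fit the pattern, such as Efficient-ML/GPTQ-for-Qwen3 and Efficient-ML/Qwen3-awq, return None rather than a guess.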