HLWQ Large MoE (100B+)

updated 2 days ago

Massive MoE models (≥100B) quantized with HLWQ · consumer-GPU deployment via vLLM expert offload
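The expert-offload deployment path mentioned above can be sketched with vLLM's CPU offload option. This is an illustrative invocation, not a tested recipe: the repo name is taken from this collection, and the offload budget and context length are assumed values you would tune to your GPU and system RAM.

```shell
# Sketch: serve an HLWQ-quantized MoE on a single consumer GPU by
# offloading part of the weights (e.g. expert parameters) to CPU RAM.
# --cpu-offload-gb is vLLM's offload budget in GiB; 24 is an assumed
# value for a system with ample RAM but a ~16 GB GPU.
vllm serve caiovicentino1/Nemotron-Cascade-2-30B-A3B-HLWQ-Q5 \
  --cpu-offload-gb 24 \
  --max-model-len 8192
```

With offload enabled, only the layers that fit remain resident on the GPU; the rest are streamed from host memory, trading throughput for the ability to run a 30B-class MoE on consumer hardware.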


  • caiovicentino1/Qwopus-MoE-35B-A3B-HLWQ-Q5

    Text Generation • 35B • Updated about 14 hours ago • 1.86k • 4

    Note 35B · 128 × top-8 · 16 GB HLWQ Q5 (legacy name)


  • caiovicentino1/Nemotron-Cascade-2-30B-A3B-HLWQ-Q5

    Text Generation • 20B • Updated 1 day ago • 2.56k • 7

    Note 30B MoE · consumer GPU via expert offload


  • caiovicentino1/Gemopus-4-26B-A4B-it-HLWQ-Q5

    Image-Text-to-Text • Updated 1 day ago • 120 • 3

    Note 27B Gemma-4 MoE · 128 × top-8 · multimodal · 16.6 GB HLWQ Q5
