bartowski/allura-forge_Llama-3.3-8B-Instruct-GGUF • Text Generation • 8B • Updated 6 days ago • 5.24k • 20
Cerebras REAP • Collection • Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 19 items • Updated 16 days ago • 77
VTP • Collection • Towards Scalable Pre-training of Visual Tokenizers for Generation • 4 items • Updated 20 days ago • 39
Teacher Logits • Collection • Logits captured from large models to act as the teacher for distillation • 3 items • Updated 20 days ago • 7