view article Article Provence: efficient and robust context pruning for retrieval-augmented generation Jan 28 • 24
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates Paper • 2502.06772 • Published Feb 10 • 21
FastKV: KV Cache Compression for Fast Long-Context Processing with Token-Selective Propagation Paper • 2502.01068 • Published Feb 3 • 18
mHuBERT-147 models Collection Compact yet powerful multilingual speech representation models based on the HuBERT architecture. • 3 items • Updated Jun 4, 2024 • 8