NanoBEIR 🍺 Collection A collection of smaller versions of BEIR datasets with 50 queries and up to 10K documents each. • 13 items • Updated Sep 11, 2024 • 27
Splade-Code Collection Learned Sparse Retrieval Models for Code Retrieval (internship Naver Labs Europe) • 3 items • Updated Apr 14 • 1
ChatR1: Reinforcement Learning for Conversational Reasoning and Retrieval Augmented Question Answering Paper • 2510.13312 • Published Oct 15, 2025 • 2
ChatR1 Collection [Main ACL 2026] Corpus, Index, and dataset to train ChatR1. • 17 items • Updated 27 days ago • 1
BidirLM: From Text to Omnimodal Bidirectional Encoders by Adapting and Composing Causal LLMs Paper • 2604.02045 • Published Apr 2 • 37
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 145
Self-Hinting Language Models Enhance Reinforcement Learning Paper • 2602.03143 • Published Feb 3 • 31
Article mmBERT: ModernBERT goes Multilingual mmarone, orionweller, will-fleshman, eugene-yang, dlawrie, vandurme • Sep 9, 2025 • 147
Amharic Text Embedding Models Collection Text Embedding and ColBERT models based on Amharic RoBERTa and BERT for Amharic passage retrieval • 10 items • Updated Jun 11, 2025 • 6
Article Train 400x faster Static Embedding Models with Sentence Transformers tomaarsen • Jan 15, 2025 • 230
Parallel Sentences Datasets Collection These datasets all have "english" and "non_english" columns for numerous datasets. They can be used to make embedding models multilingual. • 14 items • Updated Dec 10, 2025 • 23
Search-R1 Collection Preliminary checkpoints with outcome-only RL. • 15 items • Updated Aug 12, 2025 • 18
Article Visual Document Retrieval Goes Multilingual marco, cheesyFishes • Jan 10, 2025 • 78
Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 manu • Jul 5, 2024 • 317
Article BM25 for Python: Achieving high performance while simplifying dependencies with *BM25S*⚡ xhluca • Jul 9, 2024 • 83
DyVo: Dynamic Vocabularies for Learned Sparse Retrieval with Entities Paper • 2410.07722 • Published Oct 10, 2024 • 15