PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing Paper • 2512.02589 • Published 15 days ago • 60
Wikontic: Constructing Wikidata-Aligned, Ontology-Aware Knowledge Graphs with Large Language Models Paper • 2512.00590 • Published 18 days ago • 41
CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning Paper • 2511.18659 • Published 24 days ago • 16
O-Mem: Omni Memory System for Personalized, Long Horizon, Self-Evolving Agents Paper • 2511.13593 • Published 30 days ago • 24
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published 27 days ago • 91
Nemotron-Personas Collection A collection of multilingual, region-specific synthetic persona datasets that support sovereign AI development across many countries and regions. • 3 items • Updated about 16 hours ago • 14
Article Open ASR Leaderboard: Trends and Insights with New Multilingual & Long-Form Tracks +2 27 days ago • 23
GroupRank: A Groupwise Reranking Paradigm Driven by Reinforcement Learning Paper • 2511.11653 • Published Nov 10 • 55
Qari-OCR: A High-Accuracy Model for Arabic Optical Character Collection Built on the powerful Qwen2 VL 2B and fine-tuned on an Arabic OCR dataset, Qari v0.1 is • 7 items • Updated Jun 25 • 12
Nemotron Elastic: Towards Efficient Many-in-One Reasoning LLMs Paper • 2511.16664 • Published 27 days ago • 25
LightOnOCR Collection The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR • 7 items • Updated Nov 13 • 14
RedOne 2.0: Rethinking Domain-specific LLM Post-Training in Social Networking Services Paper • 2511.07070 • Published Nov 10 • 18
Jan-v2-VL Collection Jan-v2-VL: an 8B VLM focused on reliable, many-step task execution. • 6 items • Updated Nov 13 • 37
Waqfeya Library Collection Waqfeya is one of the primary online resources for Islamic books, similar to Shamela. It hosts more than 10,000 PDF books across over 80 categories. • 3 items • Updated Apr 23 • 2
Article Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO) Jan 19 • 38
Pre-training Dataset Samples Collection A collection of pre-training dataset samples of sizes 10M, 100M and 1B tokens. Ideal for use in quick experimentation and ablations. • 19 items • Updated Nov 11 • 16
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models Paper • 2510.04618 • Published Oct 6 • 125