Arabic Speech Datasets Collection Best Datasets for Arabic Speech Tasks • 16 items • Updated 3 days ago • 14
I have updated my https://huggingface.co/collections/MohamedRashad/arabic-speech-datasets with new datasets, bringing the full audio data to more than 3000 hours of good Arabic speech. Feel free to use it in your new innovations, and happy new year!
unsloth/Nemotron-3-Nano-30B-A3B-GGUF Text Generation • 32B • Updated 4 days ago • 86.3k • 194
Check out your 2025 Hugging Face Wrapped, a small experimental recap: hf-wrapped/2025
PatchDNA, a DNA foundation model based on Meta's BLT tokenization strategy: https://www.biorxiv.org/content/10.1101/2025.11.28.691095v1
MLEB is the largest, most diverse, and most comprehensive benchmark for legal text embedding models. https://huggingface.co/blog/isaacus/introducing-mleb
METAGENE-1: Metagenomic Foundation Model for Pandemic Monitoring Paper • 2501.02045 • Published Jan 3, 2025 • 22
Bio LLMs train on many genomes, but can we encode differences within a species? TomatoTomato adds pangenome tokens to represent a domestic tomato and a wild tomato in one sequence: monsoon-nlp/tomatotomato-gLM2-150M-v0.1