Sahm_Benchmark

university

https://huggingface.co/Raniahossam33

Recent Activity

SarfrazAhmad739

authored 6 papers 4 days ago

UrduMMLU: A Massive Multitask Benchmark for Urdu Language Understanding

Paper • 2606.07167 • Published about 1 month ago • 1

TABVERSE: Benchmarking Cross-Format Table Understanding in LLMs and VLMs

Paper • 2606.09578 • Published 28 days ago

Cultural Benchmarking of LLMs in Standard and Dialectal Arabic Dialogues

Paper • 2605.00119 • Published Apr 30

SAHM: A Benchmark for Arabic Financial and Shari'ah-Compliant Reasoning

Paper • 2604.19098 • Published Apr 30

NeuralNexus at BEA 2025 Shared Task: Retrieval-Augmented Prompting for Mistake Identification in AI Tutors

Paper • 2506.10627 • Published Jun 12, 2025

A Parallel Cross-Lingual Benchmark for Multimodal Idiomaticity Understanding

Paper • 2601.08645 • Published Feb 24

authored a paper 3 months ago

Can LLM Agents Be CFOs? A Benchmark for Resource Allocation in Dynamic Enterprise Environments

Paper • 2603.23638 • Published Mar 24 • 11

authored 4 papers 5 months ago

FinAuditing: A Financial Taxonomy-Structured Multi-Document Benchmark for Evaluating LLMs

Paper • 2510.08886 • Published Oct 10, 2025 • 20

When Agents Trade: Live Multi-Market Trading Benchmark for LLM Agents

Paper • 2510.11695 • Published Oct 13, 2025 • 3

FinCriticalED: A Visual Benchmark for Financial Fact-Level OCR Evaluation

Paper • 2511.14998 • Published Nov 19, 2025

Ebisu: Benchmarking Large Language Models in Japanese Finance

Paper • 2602.01479 • Published Feb 1 • 17

submitted a paper to Daily Papers 5 months ago

Ebisu: Benchmarking Large Language Models in Japanese Finance

Paper • 2602.01479 • Published Feb 1 • 17

published and updated a model 6 months ago

SahmBenchmark/arabic-merged-sft-SILMA

Text Generation • 9B • Updated Jan 17 • 5

updated a dataset 6 months ago

SahmBenchmark/financial-reports-extractive-summarization_eval

Viewer • Updated Jan 5 • 80 • 22

published and updated a model 6 months ago

SahmBenchmark/arabic-merged-sft-2

Text Generation • 7B • Updated Dec 31, 2025 • 3

updated 3 datasets 7 months ago

SahmBenchmark/Sentiment_Analysis_MCQ_train

Viewer • Updated Dec 19, 2025 • 120 • 23

SahmBenchmark/fatwa-training_standardized_new

Viewer • Updated Dec 19, 2025 • 9.95k • 30

SahmBenchmark/Islamic_Finance_QnA_train

Viewer • Updated Dec 19, 2025 • 1.22k • 34 • 1