Minhyuk Kim

torchtorchkimtorch

AI & ML interests

NLP

Recent Activity

upvoted 2 papers 7 days ago

XL-SafetyBench: A Country-Grounded Cross-Cultural Benchmark for LLM Safety and Cultural Sensitivity

Paper • 2605.05662 • Published 12 days ago • 11

Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs

Paper • 2605.09063 • Published 10 days ago • 77

updated a model 19 days ago

torchtorchkimtorch/up_model_score_specialized

7B • Updated 19 days ago • 251

published a model 19 days ago

torchtorchkimtorch/up_model_score_specialized

7B • Updated 19 days ago • 251

updated a model 19 days ago

torchtorchkimtorch/up_model

7B • Updated 19 days ago • 265

published a model 19 days ago

torchtorchkimtorch/up_model

7B • Updated 19 days ago • 265

liked a dataset 23 days ago

nvidia/OpenMathInstruct-2

Viewer • Updated Nov 25, 2024 • 22M • 37.9k • 243

upvoted a paper 3 months ago

Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math

Paper • 2602.06291 • Published Feb 6 • 24

liked 2 datasets 4 months ago

HAERAE-HUB/Ko-PIQA

Viewer • Updated Jan 13 • 441 • 26 • 3

AIM-Intelligence/COMPASS-Policy-Alignment-Testbed-Dataset

Viewer • Updated Jan 6 • 5.92k • 30 • 11

liked 2 models 5 months ago

google/functiongemma-270m-it

Text Generation • Updated Jan 14 • 136k • 990

upstage/Solar-Open-100B

Text Generation • Updated Jan 30 • 35.5k • 475

liked a dataset 6 months ago

bespokelabs/Bespoke-Stratos-17k

Viewer • Updated Jan 31, 2025 • 16.7k • 8.66k • 343

upvoted a paper 7 months ago

Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought

Paper • 2510.04230 • Published Oct 5, 2025 • 27

liked a dataset 8 months ago

facebook/collaborative_agent_bench

Preview • Updated Mar 20, 2025 • 52 • 60

liked 2 datasets 9 months ago

nvidia/Nemotron-Pretraining-SFT-v1

Viewer • Updated Dec 23, 2025 • 299M • 1.14k • 65

google-research-datasets/mbpp

Viewer • Updated Jan 4, 2024 • 1.4k • 202k • 231

liked 3 models 9 months ago
