ZHANGYUXUAN-zR burtenshaw HF Staff committed on
Commit 2b2bd73 · verified · 1 Parent(s): a930807

Add GPQA evaluation result (#54)

- Add GPQA evaluation result (0ae26748999a1179b83e7f50e653b32ea404af5a)
- Fix task_id to match benchmark eval.yaml (7524de5db643949641e1bafa14f2d9c37a170763)


Co-authored-by: ben burtenshaw <burtenshaw@users.noreply.huggingface.co>

Files changed (1)
  1. .eval_results/gpqa.yaml +9 -0
.eval_results/gpqa.yaml ADDED
@@ -0,0 +1,9 @@
+ - dataset:
+     id: Idavidrein/gpqa
+     task_id: diamond
+   value: 75.2
+   date: '2026-01-27'
+   source:
+     url: https://huggingface.co/zai-org/GLM-4.7-Flash
+     name: Model Card
+     user: burtenshaw
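
A rough sanity check for an entry like the one added here, sketched in standard-library Python. The `ENTRY` dict is a hand-written stand-in for what a YAML loader would be expected to return for `.eval_results/gpqa.yaml`. The nesting (`id` and `task_id` under `dataset`; `url`, `name`, and `user` under `source`) and the set of checked fields are assumptions inferred from this one file, not the Hub's actual schema. `check_entry` is a hypothetical helper.

```python
from datetime import date

# Hypothetical in-memory form of the entry added in this commit.
# The nesting is an assumption; the page extraction lost the indentation.
ENTRY = {
    "dataset": {"id": "Idavidrein/gpqa", "task_id": "diamond"},
    "value": 75.2,
    "date": "2026-01-27",
    "source": {
        "url": "https://huggingface.co/zai-org/GLM-4.7-Flash",
        "name": "Model Card",
        "user": "burtenshaw",
    },
}


def check_entry(entry):
    """Return a list of problems found in one eval-result entry (empty if none)."""
    problems = []

    # dataset.id and dataset.task_id should be non-empty strings.
    dataset = entry.get("dataset") or {}
    for key in ("id", "task_id"):
        if not isinstance(dataset.get(key), str) or not dataset.get(key):
            problems.append(f"dataset.{key} missing or not a string")

    # The score should be a number (bool is excluded explicitly).
    value = entry.get("value")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        problems.append("value missing or not numeric")

    # The date is stored as a quoted string, so parse it as an ISO date.
    try:
        date.fromisoformat(str(entry.get("date")))
    except ValueError:
        problems.append("date is not an ISO date string")

    # The source should link somewhere over https.
    url = (entry.get("source") or {}).get("url", "")
    if not str(url).startswith("https://"):
        problems.append("source.url missing or not https")

    return problems


if __name__ == "__main__":
    print(check_entry(ENTRY))
```

A check of this kind would not catch the issue fixed by the second commit in this PR (a `task_id` that does not match the benchmark's `eval.yaml`), since that requires comparing against the benchmark repository itself.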