Update README.md
README.md
CHANGED
@@ -54,10 +54,11 @@ The benchmarks and corresponding scores listed in the table below are taken dire
 |HumanEval|0-shot|30.5|68.3|+123.93%|
 |MBPP|3-shot|47.5|60.3|+26.95%|
 |MATH|4-shot, maj@4|13.1|40.2*|+206.87%|
-|GSM8K|8-shot, maj@8|52.2|
+|GSM8K|8-shot, maj@8|52.2|75.66**|+44.94%|
 ||||**Average**|**+34.25%**|
 
 \* : We report the 4-shot, maj@1 score instead of the 4-shot, maj@4.
+\** : We report the 8-shot, maj@1 score instead of the 8-shot, maj@8.
 
 ### Comparison to the Gemma series by Google
 
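
The last column of each table row appears to be the relative change from the first score to the second, i.e. `(new - baseline) / baseline * 100`. The column headers are outside this hunk, so that reading is an assumption. A minimal sketch that reproduces the four per-benchmark percentages visible here, using a hypothetical helper `relative_improvement`:

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Percentage change from baseline to new, rounded to two decimals.

    Assumed formula; the README hunk does not state how the column is computed.
    """
    return round((new - baseline) / baseline * 100, 2)


# (baseline score, new score) pairs copied from the rows in this hunk.
rows = {
    "HumanEval": (30.5, 68.3),
    "MBPP": (47.5, 60.3),
    "MATH": (13.1, 40.2),
    "GSM8K": (52.2, 75.66),
}

for name, (baseline, new) in rows.items():
    print(f"{name}: {relative_improvement(baseline, new):+.2f}%")
```

The **Average** row is not recomputed here because it presumably also covers benchmark rows that fall outside this hunk.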