wandb run: https://wandb.ai/eleutherai/pythia-rlhf/runs/e0drjcsz?workspace=user-yongzx

Model Evals:

| Task         |Version|Filter| Metric   |  Value|   |Stderr|
|--------------|-------|------|----------|------:|---|-----:|
|arc_challenge |Yaml   |none  |acc       | 0.1877|±  |0.0114|
|              |       |none  |acc_norm  | 0.2372|±  |0.0124|
|arc_easy      |Yaml   |none  |acc       | 0.4390|±  |0.0102|
|              |       |none  |acc_norm  | 0.4082|±  |0.0101|
|logiqa        |Yaml   |none  |acc       | 0.1889|±  |0.0154|
|              |       |none  |acc_norm  | 0.2473|±  |0.0169|
|piqa          |Yaml   |none  |acc       | 0.6213|±  |0.0113|
|              |       |none  |acc_norm  | 0.6279|±  |0.0113|
|sciq          |Yaml   |none  |acc       | 0.7230|±  |0.0142|
|              |       |none  |acc_norm  | 0.6840|±  |0.0147|
|winogrande    |Yaml   |none  |acc       | 0.5162|±  |0.0140|
|lambada_openai|Yaml   |none  |perplexity|58.9478|±  |2.7662|
|              |       |none  |acc       | 0.2602|±  |0.0061|
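
The `Value ± Stderr` pairs above can be turned into rough confidence intervals when comparing runs. The snippet below is a minimal sketch, not part of the original evaluation: it assumes the `Stderr` column is a standard error and that a normal approximation (multiplier 1.96) is acceptable. The `results` dictionary and the `interval` helper are hypothetical names, and the numbers are copied by hand from three rows of the table.

```python
# Approximate 95% intervals from "Value ± Stderr" pairs copied from the table.
# Assumption: Stderr is a standard error and a normal approximation is adequate.
Z_95 = 1.96  # two-sided 95% multiplier under the normal approximation

# Hypothetical container; values are taken verbatim from the table above.
results = {
    "arc_challenge/acc": (0.1877, 0.0114),
    "piqa/acc": (0.6213, 0.0113),
    "winogrande/acc": (0.5162, 0.0140),
}


def interval(value, stderr, z=Z_95):
    """Return (lower, upper) bounds of value ± z * stderr."""
    margin = z * stderr
    return value - margin, value + margin


for name, (value, stderr) in results.items():
    lower, upper = interval(value, stderr)
    print(f"{name}: {value:.4f} [{lower:.4f}, {upper:.4f}]")
```

Under these assumptions the winogrande interval is roughly 0.489 to 0.544, which includes 0.5, the chance level for a two-option task.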