---
license: cc-by-nc-4.0
datasets:
- teknium/openhermes
---
| Task | Version | Metric | Value |   | Stderr |
|---|---|---|---|---|---|
| hendrycksTest-logical_fallacies | 1 | acc | 0.3067 | ± | 0.0362 |
|  |  | acc_norm | 0.3067 | ± | 0.0362 |
| hendrycksTest-global_facts | 1 | acc | 0.3000 | ± | 0.0461 |
|  |  | acc_norm | 0.3000 | ± | 0.0461 |
| hendrycksTest-abstract_algebra | 1 | acc | 0.2700 | ± | 0.0446 |
|  |  | acc_norm | 0.2700 | ± | 0.0446 |
| hendrycksTest-college_chemistry | 1 | acc | 0.3100 | ± | 0.0465 |
|  |  | acc_norm | 0.3100 | ± | 0.0465 |
| hendrycksTest-college_physics | 1 | acc | 0.2157 | ± | 0.0409 |
|  |  | acc_norm | 0.2157 | ± | 0.0409 |
| hendrycksTest-formal_logic | 1 | acc | 0.2857 | ± | 0.0404 |
|  |  | acc_norm | 0.2857 | ± | 0.0404 |
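The table follows the output format of EleutherAI's lm-evaluation-harness. Below is a minimal reproduction sketch, assuming harness v0.3.x (which uses the `hendrycksTest-*` task names shown above); the model repo id and few-shot setting are placeholders, not values stated in this card.

```python
# Hypothetical reproduction sketch, assuming lm-evaluation-harness v0.3.x.
from lm_eval import evaluator

tasks = [
    "hendrycksTest-logical_fallacies",
    "hendrycksTest-global_facts",
    "hendrycksTest-abstract_algebra",
    "hendrycksTest-college_chemistry",
    "hendrycksTest-college_physics",
    "hendrycksTest-formal_logic",
]

results = evaluator.simple_evaluate(
    model="hf-causal",
    model_args="pretrained=<this-model-repo>",  # placeholder repo id
    tasks=tasks,
    num_fewshot=0,    # assumption; the card does not state the few-shot setting
    device="cuda:0",
)

# Print acc / acc_norm per task, matching the columns above.
for task, metrics in results["results"].items():
    print(task, metrics["acc"], metrics["acc_norm"])
```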
Compared to TinyLlama-1.1B-Chat-v1.0:
- Abstract Algebra: up 17.4%
- Formal Logic: up 24.2%
- Logical Fallacies: up 35.4%
Template Format: Alpaca
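For reference, a minimal sketch of building a prompt in the standard Alpaca format; the preamble wording below is the common Alpaca default and is an assumption, not quoted from this model's training script.

```python
# Standard Alpaca-style prompt (assumed default wording).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(instruction="Name one common logical fallacy.")
print(prompt)  # feed this to the model and let it continue after "### Response:"
```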
Training took about 4 hours for 1 epoch on a single RTX 3090.
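For context, a hypothetical one-epoch fine-tuning sketch with the Hugging Face `Trainer`, sized for a 24 GB RTX 3090. The dataset field names, sequence length, batch size, and learning rate are assumptions, not the author's actual script.

```python
# Hypothetical one-epoch SFT sketch on teknium/openhermes (not the author's script).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # make sure a pad token is set
model = AutoModelForCausalLM.from_pretrained(base)

def format_and_tokenize(example):
    # Assumption: openhermes rows carry "instruction" and "output" fields.
    text = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['output']}"
    )
    return tokenizer(text, truncation=True, max_length=1024)

dataset = load_dataset("teknium/openhermes", split="train").map(format_and_tokenize)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="tinyllama-openhermes-sft",
        num_train_epochs=1,               # one epoch, as stated above
        per_device_train_batch_size=4,    # assumed; sized for a 24 GB RTX 3090
        gradient_accumulation_steps=8,    # assumed
        learning_rate=2e-5,               # assumed
        bf16=True,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```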