Evaluation results
- judge_match on squad_answerable (self-reported): 0.624
- judge_match on context_has_answer (self-reported): 0.849
- judge_match on jail_break (self-reported): 0.076
- judge_match on harmless_prompt (self-reported): 0.883
- judge_match on harmful_prompt (self-reported): 0.409
- acc on truthfulqa (self-reported): 0.525
- exact_match on gsm8k (self-reported): 0.603
- acc on mmlu (self-reported): 0.625