---
license: afl-3.0
library_name: transformers
tags:
  - UNA
  - juanako
---


# UNA-ThePitbull 21.4B v1

Introducing the best LLM in the industry: nearly as good as a 70B, yet just a 21.4B model, based on saltlux/luxia-21.4b-alignment-v1.0.

This model has not been poisoned to score high on benchmarks while being useless. We release it because it is the real deal: EQ and IQ together in a powerful, smart, and conversational model. As of 25/5/2024, it ranks #1 among comparable models.

A quantized (GGUF) version is available at bartowski/UNA-ThePitbull-21.4-v1-GGUF.

For better performance, check out our v2 at fblgit/UNA-ThePitbull-21.4B-v2.
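
The model should work with standard `transformers` loading code. Below is a minimal sketch; the dtype, device map, chat-style prompt, and generation settings are assumptions for illustration, not settings shipped with this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fblgit/UNA-ThePitbull-21.4B-v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 21.4B params: bf16 weights need roughly 43 GB
    device_map="auto",           # shard across available GPUs
)

# Assumed chat-style prompt; adjust to the template the tokenizer ships with.
messages = [{"role": "user", "content": "Explain what UNA stands for."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```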

## Evaluations

It can only be fairly compared with its non-UNA base model: the original luxia-21.4b.

### UNA (vLLM) Evaluations

|    Tasks     |Version|     Filter     |n-shot|  Metric   |Value |   |Stderr|
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|gsm8k         |      3|strict-match    |     5|exact_match|0.7566|±  |0.0118|
|              |       |flexible-extract|     5|exact_match|0.7582|±  |0.0118|
|hellaswag     |      1|none            |    10|acc        |0.8168|±  |0.0039|
|              |       |none            |    10|acc_norm   |0.9188|±  |0.0027|
|winogrande    |      1|none            |     5|acc        |0.8635|±  |0.0097|
|mmlu          |    N/A|none            |     0|acc        |0.6444|±  |0.0038|
|arc_challenge |      1|none            |    25|acc        |0.7747|±  |0.0122|
|              |       |none            |    25|acc_norm   |0.7850|±  |0.0120|
|truthfulqa_mc2|      2|none            |     0|acc        |0.7902|±  |0.0134|
|mathqa        |      1|none            |     0|acc        |0.4030|±  |0.0090|
|              |       |none            |     0|acc_norm   |0.4034|±  |0.0090|
|pubmedqa      |      1|none            |     0|acc        |0.6860|±  |0.0208|
|boolq         |      2|none            |     0|acc        |0.8401|±  |0.0064|

### Original (vLLM) Evaluations

|    Tasks     |Version|     Filter     |n-shot|  Metric   |Value |   |Stderr|
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|gsm8k         |      3|strict-match    |     5|exact_match|0.7528|±  |0.0119|
|              |       |flexible-extract|     5|exact_match|0.7521|±  |0.0119|
|hellaswag     |      1|none            |    10|acc        |0.8117|±  |0.0039|
|              |       |none            |    10|acc_norm   |0.9167|±  |0.0028|
|winogrande    |      1|none            |     5|acc        |0.8682|±  |0.0095|
|mmlu          |    N/A|none            |     0|acc        |0.6448|±  |0.0038|
|arc_challenge |      1|none            |    25|acc        |0.7688|±  |0.0123|
|              |       |none            |    25|acc_norm   |0.7730|±  |0.0122|
|truthfulqa_mc2|      2|none            |     0|acc        |0.7895|±  |0.0133|
|mathqa        |      1|none            |     0|acc        |0.4000|±  |0.0090|
|              |       |none            |     0|acc_norm   |0.4003|±  |0.0090|
|pubmedqa      |      1|none            |     0|acc        |0.6680|±  |0.0211|
|boolq         |      2|none            |     0|acc        |0.8346|±  |0.0065|
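
Both tables follow the output format of EleutherAI's lm-evaluation-harness with a vLLM backend. A minimal reproduction sketch follows; the model args are assumptions, and per-task n-shot values must be set to match the tables, since harness defaults can differ:

```python
# pip install "lm_eval[vllm]"
import lm_eval

# One call per n-shot group so the few-shot counts match the tables above.
results = lm_eval.simple_evaluate(
    model="vllm",
    model_args="pretrained=fblgit/UNA-ThePitbull-21.4B-v1,dtype=bfloat16",
    tasks=["gsm8k", "winogrande"],  # both evaluated 5-shot in the tables
    num_fewshot=5,
)

for task, metrics in results["results"].items():
    print(task, metrics)
```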

## UNA Details

Only the MLP layers were uniformed, leaving room for further optimisation. You should be able to run another SFT+DPO pass on this model at moderate learning rates (1e-4, 2e-5, etc.).
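
For illustration, here is a minimal sketch of such a further SFT pass using Hugging Face TRL. The dataset is a placeholder, and every hyperparameter besides the learning-rate hint above is an assumption:

```python
# pip install trl datasets
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: swap in your own instruction data.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="fblgit/UNA-ThePitbull-21.4B-v1",
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="pitbull-sft",
        learning_rate=1e-4,  # the "1e-4" hint above; 2e-5 would suit a later DPO pass
        num_train_epochs=1,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
    ),
)
trainer.train()
```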