---
library_name: transformers
license: apache-2.0
language:
- en
---
# Model Card for Alpaca Dragon 72B V1
Fine-tune of [Smaug 72B v0.1](https://huggingface.co/abacusai/Smaug-72B-v0.1) using an Alpaca-style dataset I had on hand. The data covers planning and reasoning, and I use it to help a model break a set of asks down into a logical plan. For some odd reason it bumps the MMLU and Winogrande scores. I would have expected ARC to go up rather than those two, but this is often more of an art form than a science. All thanks to [Abacus.AI](https://huggingface.co/abacusai) for sharing their work.
I used the same dataset to train [Strix Rufipes 70B](https://huggingface.co/ibivibiv/strix-rufipes-70b), one of my owl series models, which has worked well for planning out development tasks and other technical work.
![img](./alpaca_dragon.png)
## How to Get Started with the Model
Use the code below to get started with the model.
```python
# Load the tokenizer and model directly from the Hub
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ibivibiv/alpaca-dragon-72b-v1")
model = AutoModelForCausalLM.from_pretrained("ibivibiv/alpaca-dragon-72b-v1")

# Alpaca-style prompt: an instruction followed by an empty response section
prompt = "### Instruction: Create a plan for developing the game of snake in python using pygame.\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt", return_attention_mask=False)

# max_length counts the prompt tokens plus the generated tokens
outputs = model.generate(**inputs, max_length=200)
text = tokenizer.batch_decode(outputs)[0]
print(text)
```
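The example above hard-codes the prompt string. If you want to send several instructions, a small helper keeps the layout consistent. This is a minimal sketch: `build_prompt` is a hypothetical name, not part of this model or of `transformers`, and it simply reproduces the layout from the example above. Whether the training data used exactly this template is an assumption.

```python
def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the prompt layout shown in the example above."""
    # Hypothetical helper: the template is copied from the example,
    # not taken from any published spec for this model.
    return f"### Instruction: {instruction.strip()}\n### Response:\n"


prompt = build_prompt("Create a plan for developing the game of snake in python using pygame.")
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the literal prompt.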
## Evaluation
| Test Name | Accuracy (%) |
|---------------------------------|--------------|
| All | 77.31 |
| arc:challenge | 70.82 |
| hellaswag | 69.84 |
| hendrycksTest-abstract_algebra | 42.00 |
| hendrycksTest-anatomy | 71.85 |
| hendrycksTest-astronomy | 86.84 |
| hendrycksTest-business_ethics | 82.00 |
| hendrycksTest-clinical_knowledge| 84.53 |
| hendrycksTest-college_biology | 93.06 |
| hendrycksTest-college_chemistry | 54.00 |
| hendrycksTest-college_computer_science | 65.00 |
| hendrycksTest-college_mathematics | 52.00 |
| hendrycksTest-college_medicine | 75.14 |
| hendrycksTest-college_physics | 55.88 |
| hendrycksTest-computer_security | 82.00 |
| hendrycksTest-conceptual_physics| 80.43 |
| hendrycksTest-econometrics | 60.53 |
| hendrycksTest-electrical_engineering | 79.31 |
| hendrycksTest-elementary_mathematics | 70.37 |
| hendrycksTest-formal_logic | 58.73 |
| hendrycksTest-global_facts | 54.00 |
| hendrycksTest-high_school_biology | 88.39 |
| hendrycksTest-high_school_chemistry | 66.01 |
| hendrycksTest-high_school_computer_science | 82.00 |
| hendrycksTest-high_school_european_history | 84.24 |
| hendrycksTest-high_school_geography | 94.44 |
| hendrycksTest-high_school_government_and_politics | 98.96 |
| hendrycksTest-high_school_macroeconomics | 82.05 |
| hendrycksTest-high_school_mathematics | 45.93 |
| hendrycksTest-high_school_microeconomics | 86.13 |
| hendrycksTest-high_school_physics | 54.97 |
| hendrycksTest-high_school_psychology | 92.84 |
| hendrycksTest-high_school_statistics | 68.98 |
| hendrycksTest-high_school_us_history | 91.67 |
| hendrycksTest-high_school_world_history | 89.87 |
| hendrycksTest-human_aging | 78.03 |
| hendrycksTest-human_sexuality | 89.31 |
| hendrycksTest-international_law | 90.91 |
| hendrycksTest-jurisprudence | 87.96 |
| hendrycksTest-logical_fallacies | 84.05 |
| hendrycksTest-machine_learning | 58.93 |
| hendrycksTest-management | 87.38 |
| hendrycksTest-marketing | 95.30 |
| hendrycksTest-medical_genetics | 86.00 |
| hendrycksTest-miscellaneous | 92.21 |
| hendrycksTest-moral_disputes | 83.53 |
| hendrycksTest-moral_scenarios | 69.72 |
| hendrycksTest-nutrition | 85.62 |
| hendrycksTest-philosophy | 83.60 |
| hendrycksTest-prehistory | 87.04 |
| hendrycksTest-professional_accounting | 65.96 |
| hendrycksTest-professional_law | 60.69 |
| hendrycksTest-professional_medicine | 82.72 |
| hendrycksTest-professional_psychology | 81.86 |
| hendrycksTest-public_relations | 75.45 |
| hendrycksTest-security_studies | 82.04 |
| hendrycksTest-sociology | 88.56 |
| hendrycksTest-us_foreign_policy | 94.00 |
| hendrycksTest-virology | 57.23 |
| hendrycksTest-world_religions | 89.47 |
| truthfulqa:mc | 72.6 |
| winogrande | 86.03 |
| gsm8k | 77.63 |
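
The table lists the MMLU subtasks (the `hendrycksTest-*` rows) one by one without a combined figure. If you want a single number for them, one option is an unweighted mean over those rows. The sketch below assumes that equal weighting is what you want, which may differ from how a given leaderboard aggregates. `parse_rows` and `mean_accuracy` are hypothetical helper names, and the three sample rows are copied from the table above; paste in the remaining rows to cover the full set.

```python
from statistics import mean

# Three rows copied from the table above; extend with the remaining rows as needed.
ROWS = """
| hendrycksTest-abstract_algebra | 42.00 |
| hendrycksTest-anatomy | 71.85 |
| hendrycksTest-astronomy | 86.84 |
"""


def parse_rows(table: str) -> dict[str, float]:
    """Parse two-column markdown table rows into {test name: accuracy}."""
    scores = {}
    for line in table.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 2:
            continue
        name, value = cells
        try:
            scores[name] = float(value)
        except ValueError:
            continue  # header or separator row, not a score
    return scores


def mean_accuracy(scores: dict[str, float], prefix: str) -> float:
    """Unweighted mean of all rows whose name starts with `prefix`."""
    selected = [value for name, value in scores.items() if name.startswith(prefix)]
    if not selected:
        raise ValueError(f"no rows start with {prefix!r}")
    return mean(selected)


scores = parse_rows(ROWS)
print(round(mean_accuracy(scores, "hendrycksTest-"), 2))
```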
## Environmental Impact
- **Hardware Type:** A100s (more than I wanted to use, since it's all on my own $$$)
- **Hours used:** 8
- **Cloud Provider:** runpod.io
- **Compute Region:** US
- **Carbon Emitted:** Unknown
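
The carbon figure is left open above. A common back-of-the-envelope approach is to multiply the energy used by the carbon intensity of the local grid. The sketch below is only that: `estimate_co2_kg` is a hypothetical helper, and of its inputs only the 8 hours comes from this card. The GPU count, per-GPU power draw, datacenter overhead (PUE), and grid carbon intensity are not stated here, so the values in the usage line are placeholders, not measurements.

```python
def estimate_co2_kg(hours, num_gpus, gpu_watts, pue, kg_co2_per_kwh):
    """Rough CO2 estimate in kg: energy drawn times grid carbon intensity."""
    if min(hours, num_gpus, gpu_watts, pue, kg_co2_per_kwh) < 0:
        raise ValueError("all inputs must be non-negative")
    # Energy in kWh, scaled by datacenter overhead (PUE)
    energy_kwh = hours * num_gpus * gpu_watts / 1000.0 * pue
    return energy_kwh * kg_co2_per_kwh


# Placeholder inputs purely to show the call; replace every value except
# `hours` with real figures for your own run before quoting a result.
placeholder_estimate = estimate_co2_kg(
    hours=8, num_gpus=1, gpu_watts=1000, pue=1.0, kg_co2_per_kwh=0.5
)
print(placeholder_estimate)
```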