---
license: apache-2.0
datasets:
- WizardLM/WizardLM_evol_instruct_V2_196k
- icybee/share_gpt_90k_v1
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- nlp
- llm
---

# AmberChat

We present AmberChat, an instruction-following model fine-tuned from [LLM360/Amber](https://huggingface.co/LLM360/Amber).

## Model Description

- **Model type:** Language model with the same architecture as LLaMA-7B
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Resources for more information:**
  - [Research paper](https://arxiv.org/)
  - [GitHub Repo](https://github.com/LLM360)
  - [Amber pretraining data](https://huggingface.co/)

# Loading AmberChat

```python
from transformers import LlamaTokenizer, LlamaForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = LlamaTokenizer.from_pretrained("LLM360/AmberChat")
model = LlamaForCausalLM.from_pretrained("LLM360/AmberChat")

# Tokenize the prompt into a batch of input IDs
input_text = "How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# Generate a continuation and decode it back to text
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
```
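
The loading snippet calls `generate` without a length limit and decodes the prompt together with the reply, so the printed text may be short and will repeat the question. A minimal sketch of a wrapper that caps the reply length and returns only the new tokens is shown here. The helper name `generate_reply` and the default of 128 new tokens are illustrative assumptions, not part of the AmberChat release, and the prompt is passed through unchanged because this card does not specify a chat template.

```python
def generate_reply(model, tokenizer, prompt, max_new_tokens=128):
    """Return only the newly generated text for `prompt`.

    `generate_reply` is a hypothetical helper; `max_new_tokens=128` is an
    arbitrary illustrative default, not a value recommended by the authors.
    """
    # Tokenize the prompt; the result maps names such as "input_ids" to batches
    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_length = len(inputs["input_ids"][0])

    # For decoder-only models, generate() returns prompt + continuation
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the reply is decoded
    new_tokens = outputs[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With the `model` and `tokenizer` objects from the loading snippet, a call would look like `print(generate_reply(model, tokenizer, "How old are you?"))`.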

# AmberChat Finetuning Details

## DataMix

| Subset | Number of rows | License |
| ----------- | ----------- | ----------- |
| WizardLM/WizardLM_evol_instruct_V2_196k | 143k | |
| icybee/share_gpt_90k_v1 | 90k | cc0-1.0 |
| Total | 233k | |

## Hyperparameters

| Hyperparameter | Value |
| ----------- | ----------- |
| Total Parameters | 6.7B |
| Hidden Size | 4096 |
| Intermediate Size (MLPs) | 11008 |
| Number of Attention Heads | 32 |
| Number of Hidden Layers | 32 |
| RMSNorm ɛ | 1e-6 |
| Max Seq Length | 2048 |
| Vocab Size | 32000 |
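
As a sanity check, the 6.7B total can be reproduced from the other rows of the hyperparameter table. The sketch below assumes the usual LLaMA-style layout (no bias terms, four attention projections and three MLP projections per layer, two RMSNorm vectors per layer, and an output head that is not tied to the input embeddings). The card only states that the architecture matches LLaMA-7B, so these layout details are assumptions, and `llama_param_count` is a hypothetical helper.

```python
# Hyperparameters copied from the table above.
HIDDEN = 4096
INTERMEDIATE = 11008
LAYERS = 32
VOCAB = 32000


def llama_param_count(hidden, intermediate, layers, vocab):
    """Count weights for a LLaMA-style decoder (assumed layout, no biases)."""
    embed = vocab * hidden             # input token embeddings
    attn = 4 * hidden * hidden         # query, key, value, and output projections
    mlp = 3 * hidden * intermediate    # gate, up, and down projections
    norms = 2 * hidden                 # two RMSNorm weight vectors per layer
    per_layer = attn + mlp + norms
    final_norm = hidden                # RMSNorm before the output head
    lm_head = vocab * hidden           # assumed untied from the embeddings
    return embed + layers * per_layer + final_norm + lm_head


total = llama_param_count(HIDDEN, INTERMEDIATE, LAYERS, VOCAB)
print(f"{total:,} parameters (~{total / 1e9:.1f}B)")
# prints: 6,738,415,616 parameters (~6.7B)
```

Under these assumptions the count rounds to the 6.7B listed in the table. The number of attention heads does not affect the total, since the heads only partition the hidden dimension.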

# Evaluation

| Model | MT-Bench |
| ----------- | ----------- |
| LLM360/Amber 359 | 2.48750 |
| **LLM360/AmberChat** | **5.428125** |

# Citation

**BibTeX:**

```bibtex
@article{xxx,
  title={XXX},
  author={XXX},
  journal={XXX},
  year={2023}
}
```