---
license: apache-2.0
datasets:
- WizardLM/WizardLM_evol_instruct_V2_196k
- leemeng/ShareGPT90K_ja_1392
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- nlp
- llm
---

# AmberChat

We present AmberChat, an instruction-following model finetuned from [LLM360/Amber](https://huggingface.co/LLM360/Amber).

## Model Description

- **Model type:** Language model with the same architecture as LLaMA-7B
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Original Checkpoints:** [AWS bucket with the AmberChat checkpoint and all available optimizer states](https://aws.amazon.com/)
- **Resources for more information:**
  - [Research paper](https://arxiv.org/)
  - [GitHub Repo](https://github.com/LLM360)
  - [Amber pretraining data](https://huggingface.co/)

# Loading AmberChat

```python
from transformers import LlamaTokenizer, LlamaForCausalLM

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = LlamaTokenizer.from_pretrained("LLM360/AmberChat")
model = LlamaForCausalLM.from_pretrained("LLM360/AmberChat")

# Tokenize a prompt and generate a completion
input_text = "translate English to German: How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
```

# AmberChat Finetuning Details

## DataMix

| Subset | Number of rows |
| ----------- | ----------- |
| WizardLM/WizardLM_evol_instruct_V2_196k | 143k |
| ShareGPT-90k | 90k |
| Total | 233k |

## Hyperparameters

| Hyperparameter | Value |
| ----------- | ----------- |
| Total Parameters | 6.7B |
| Hidden Size | 4096 |
| Intermediate Size (MLPs) | 11008 |
| Number of Attention Heads | 32 |
| Number of Hidden Layers | 32 |
| RMSNorm ε | 1e-6 |
| Max Seq Length | 2048 |
| Vocab Size | 32000 |

# Evaluation

| Model | MT-Bench |
|------------------------------------------------------|------------------------------------------------------------|
| LLM360/Amber 359 | 2.48750 |
| **LLM360/AmberChat** | **5.428125** |

# Citation

**BibTeX:**

```bibtex
@article{xxx,
  title={XXX},
  author={XXX},
  journal={XXX},
  year={2023}
}
```
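
# Chat-Style Usage Example

The loading snippet above uses a generic prompt. Since AmberChat is an instruction-following model, a single-turn instruction prompt is usually more appropriate. The sketch below is a minimal example, assuming a simple `### Human:` / `### Assistant:` template and hypothetical sampling settings; the exact prompt format used during finetuning is not specified here, so treat the template as an assumption.

```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

tokenizer = LlamaTokenizer.from_pretrained("LLM360/AmberChat")
model = LlamaForCausalLM.from_pretrained(
    "LLM360/AmberChat",
    torch_dtype=torch.float16,  # assumes a GPU with enough memory for fp16 weights
    device_map="auto",
)

# Assumed single-turn prompt template; the template used during finetuning may differ.
prompt = "### Human: How do I stay focused while working from home?\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Hypothetical sampling settings; tune max_new_tokens/temperature for your use case.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens (the reply), not the prompt.
reply = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(reply)
```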