metadata
license: apache-2.0
datasets:
- WizardLM/WizardLM_evol_instruct_V2_196k
- leemeng/ShareGPT90K_ja_1392
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- nlp
- llm
AmberChat
We present AmberChat, an instruction-following model finetuned from LLM360/Amber.
Model Description
- Model type: Language model with the same architecture as LLaMA-7B
- Language(s) (NLP): English
- License: Apache 2.0
- Resources for more information:
Loading AmberChat
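from transformers import LlamaTokenizer, LlamaForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = LlamaTokenizer.from_pretrained("LLM360/AmberChat")
model = LlamaForCausalLM.from_pretrained("LLM360/AmberChat")
# Tokenize the prompt into a PyTorch tensor of input IDs
input_text = "translate English to German: How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
# Generate a continuation with the default settings and decode it back to text
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))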
AmberChat Finetuning Details
DataMix
Subset | Number of rows |
---|---|
WizardLM/WizardLM_evol_instruct_V2_196k | 143k |
ShareGPT-90k | 90k |
Total | 233k |
Hyperparameters
Hyperparameter | Value |
---|---|
Total Parameters | 6.7B |
Hidden Size | 4096 |
Intermediate Size (MLPs) | 11008 |
Number of Attention Heads | 32 |
Number of Hidden Layers | 32 |
RMSNorm ɛ | 1e-6 |
Max Seq Length | 2048 |
Vocab Size | 32000 |
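
As a rough cross-check, the 6.7B total can be reproduced from the other values in the table. The sketch below assumes a standard LLaMA-style layout, which the card implies but does not spell out: no bias terms, four square attention projections per layer, a gated MLP with three projections, two RMSNorm weight vectors per layer plus a final one, and separate input and output embedding matrices. The helper name `llama_param_count` is hypothetical and is not part of any library.

```python
# Hypothetical helper: counts weights for a LLaMA-style decoder.
# Assumes no bias terms and untied input/output embeddings.
def llama_param_count(vocab_size, hidden_size, intermediate_size, num_layers):
    embed = vocab_size * hidden_size            # token embedding matrix
    attention = 4 * hidden_size * hidden_size   # Q, K, V and output projections
    mlp = 3 * hidden_size * intermediate_size   # gate, up and down projections
    norms = 2 * hidden_size                     # two RMSNorm weight vectors per layer
    per_layer = attention + mlp + norms
    final_norm = hidden_size                    # RMSNorm before the output head
    lm_head = vocab_size * hidden_size          # output projection to the vocabulary
    return embed + num_layers * per_layer + final_norm + lm_head


# Values taken from the hyperparameter table above.
total = llama_param_count(
    vocab_size=32000,
    hidden_size=4096,
    intermediate_size=11008,
    num_layers=32,
)
print(f"{total:,} parameters (~{total / 1e9:.1f}B)")
```

Under these assumptions the count comes to 6,738,415,616, which rounds to the 6.7B listed. The number of attention heads does not enter the count, because 32 heads of dimension 128 together span the full hidden size of 4096.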
Evaluation
Model | MT-Bench |
---|---|
LLM360/Amber 359 | 2.48750 |
LLM360/AmberChat | 5.428125 |
Citation
BibTeX:
@article{xxx,
title={XXX},
author={XXX},
journal={XXX},
year={2023}
}