license: apache-2.0
datasets:
  - WizardLM/WizardLM_evol_instruct_V2_196k
  - leemeng/ShareGPT90K_ja_1392
language:
  - en
library_name: transformers
pipeline_tag: text-generation
tags:
  - nlp
  - llm

AmberChat

We present AmberChat, an instruction-following model fine-tuned from LLM360/Amber.

Model Description

Loading AmberChat

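from transformers import LlamaTokenizer, LlamaForCausalLM

# Load the AmberChat tokenizer and weights from the Hugging Face Hub
tokenizer = LlamaTokenizer.from_pretrained("LLM360/AmberChat")
model = LlamaForCausalLM.from_pretrained("LLM360/AmberChat")

# Tokenize the prompt into a PyTorch tensor of input IDs
input_text = "translate English to German: How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# Generate a continuation; max_new_tokens bounds the reply length
# (the library default stops after a very short output)
outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))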

AmberChat Finetuning Details

DataMix

| Subset | Number of rows |
| --- | --- |
| WizardLM/WizardLM_evol_instruct_V2_196k | 143k |
| Sharegpt-90k | 90k |
| Total | 233k |
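As a quick check on the mix, the share of each subset can be computed from the row counts above. This is a minimal sketch: the counts are the rounded figures from the table (143k and 90k), so the percentages are approximate, and the variable names are illustrative.

```python
# Row counts as listed in the DataMix table (rounded to the nearest thousand).
rows = {
    "WizardLM/WizardLM_evol_instruct_V2_196k": 143_000,
    "Sharegpt-90k": 90_000,
}

total = sum(rows.values())

# Fraction of the fine-tuning mix contributed by each subset.
shares = {name: count / total for name, count in rows.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"Total rows: {total:,}")
```

Under these rounded counts, roughly three fifths of the mix comes from the WizardLM subset and the rest from ShareGPT.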

Hyperparameters

| Hyperparameter | Value |
| --- | --- |
| Total Parameters | 6.7B |
| Hidden Size | 4096 |
| Intermediate Size (MLPs) | 11008 |
| Number of Attention Heads | 32 |
| Number of Hidden Layers | 32 |
| RMSNorm ɛ | 1e-6 |
| Max Seq Length | 2048 |
| Vocab Size | 32000 |
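The total parameter count can be cross-checked against the other values in the table. The sketch below assumes a standard LLaMA-style layout, which the table does not spell out: no bias terms, four attention projections per layer, a three-matrix gated MLP, two RMSNorm weights per layer plus one final norm, and untied input and output embeddings. The function name `llama_param_count` is hypothetical.

```python
def llama_param_count(vocab_size, hidden_size, intermediate_size,
                      num_layers, tie_embeddings=False):
    """Estimate parameters for an assumed LLaMA-style decoder (no biases)."""
    embed = vocab_size * hidden_size              # token embedding matrix
    attn = 4 * hidden_size * hidden_size          # q, k, v, o projections
    mlp = 3 * hidden_size * intermediate_size     # gate, up, down projections
    norms = 2 * hidden_size                       # two RMSNorm weights per layer
    per_layer = attn + mlp + norms
    final_norm = hidden_size                      # RMSNorm before the LM head
    lm_head = 0 if tie_embeddings else vocab_size * hidden_size
    return embed + num_layers * per_layer + final_norm + lm_head


# Values taken from the hyperparameter table.
total = llama_param_count(
    vocab_size=32000,
    hidden_size=4096,
    intermediate_size=11008,
    num_layers=32,
)
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Under these assumptions the estimate rounds to the 6.7B listed in the table. The number of attention heads does not enter the count, because with standard multi-head attention the heads only partition the hidden dimension.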

Evaluation

| Model | MT-Bench |
| --- | --- |
| LLM360/Amber 359 | 2.48750 |
| LLM360/AmberChat | 5.428125 |

Citation

BibTeX:

@article{xxx,
  title={XXX},
  author={XXX},
  journal={XXX},
  year={2023}
}