---
license: apache-2.0
datasets:
- WizardLM/WizardLM_evol_instruct_V2_196k
- leemeng/ShareGPT90K_ja_1392
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- nlp
- llm
---
# AmberChat

We present AmberChat, an instruction-following model finetuned from [LLM360/Amber](https://huggingface.co/LLM360/Amber).

## Model Description

- **Model type:** Language model with the same architecture as LLaMA-7B
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Original Checkpoints:** [AWS bucket with the AmberChat checkpoint and all available optimizer states](https://aws.amazon.com/)
- **Resources for more information:**
  - [Research paper](https://arxiv.org/)
  - [GitHub Repo](https://github.com/LLM360)
  - [Amber pretraining data](https://huggingface.co/)

# Loading AmberChat

```python
from transformers import LlamaTokenizer, LlamaForCausalLM

# Load the AmberChat tokenizer and weights from the Hugging Face Hub
tokenizer = LlamaTokenizer.from_pretrained("LLM360/AmberChat")
model = LlamaForCausalLM.from_pretrained("LLM360/AmberChat")

# Tokenize an instruction and generate a response
input_text = "How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0]))
```
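
Since AmberChat is finetuned largely on ShareGPT-style conversations, wrapping the instruction in a Vicuna-style chat template is a plausible way to prompt it. The template below is an assumption for illustration, not an official prompt format documented in this card:

```python
# Hypothetical Vicuna-style template: an assumption based on the
# ShareGPT-derived finetuning data, not a format stated in this card.
TEMPLATE = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's "
    "questions. USER: {instruction} ASSISTANT:"
)

prompt = TEMPLATE.format(instruction="How old are you?")
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the echoed prompt
print(tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))
```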

# AmberChat Finetuning Details

## DataMix
| Subset | Number of rows |
| ----------- | ----------- |
| WizardLM/WizardLM_evol_instruct_V2_196k | 143k |
| ShareGPT-90k | 90k |
| Total | 233k |
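
For reference, a minimal sketch of pulling the two subsets with the Hugging Face `datasets` library. The `train` split names are assumptions, and the card does not state how the rows were filtered or formatted, so treat this as a starting point only:

```python
from datasets import load_dataset

# Assumes both repos load with a default "train" split.
# Despite its name, the WizardLM V2 196k repo holds roughly the 143k rows
# counted in the table above.
wizardlm = load_dataset("WizardLM/WizardLM_evol_instruct_V2_196k", split="train")
sharegpt = load_dataset("leemeng/ShareGPT90K_ja_1392", split="train")

print(len(wizardlm) + len(sharegpt))  # expected: ~233k rows
```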

## Hyperparameters
| Hyperparameter | Value |
| ----------- | ----------- |
| Total Parameters | 6.7B |
| Hidden Size | 4096 |
| Intermediate Size (MLPs) | 11008 |
| Number of Attention Heads | 32 |
| Number of Hidden Layers | 32 |
| RMSNorm ɛ | 1e-6 |
| Max Seq Length | 2048 |
| Vocab Size | 32000 |
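
As a sanity check, these values map directly onto a `transformers` `LlamaConfig`. The sketch below builds a randomly initialized model with the listed shape (the config field names come from the `transformers` library, not from this card) and confirms the parameter count:

```python
from transformers import LlamaConfig, LlamaForCausalLM

# Shape taken from the hyperparameter table above
config = LlamaConfig(
    hidden_size=4096,
    intermediate_size=11008,
    num_attention_heads=32,
    num_hidden_layers=32,
    rms_norm_eps=1e-6,
    max_position_embeddings=2048,
    vocab_size=32000,
)

# Randomly initialized, for shape-checking only; this allocates the
# full model in memory (~27 GB as float32).
model = LlamaForCausalLM(config)
print(f"{model.num_parameters() / 1e9:.1f}B parameters")  # ~6.7B
```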

# Evaluation

| Model | MT-Bench |
|-------------------------------|------------|
| LLM360/Amber (checkpoint 359) | 2.48750 |
| **LLM360/AmberChat** | **5.428125** |

# Citation

**BibTeX:**

```bibtex
@article{xxx,
  title={XXX},
  author={XXX},
  journal={XXX},
  year={2023}
}
```