ehristoforu committed on
Commit
4b87986
1 Parent(s): 20600e2

Update README.md

Files changed (1)
  1. README.md +75 -23
README.md CHANGED
@@ -1,41 +1,93 @@
  ---
  base_model:
  - cognitivecomputations/dolphin-2.7-mixtral-8x7b
  - alpindale/WizardLM-2-8x22B
  library_name: transformers
  tags:
- - mergekit
  - merge
-
  ---
- # merge

- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

- ## Merge Details
- ### Merge Method

- This model was merged using the passthrough merge method.

- ### Models Merged

- The following models were included in the merge:
- * [cognitivecomputations/dolphin-2.7-mixtral-8x7b](https://huggingface.co/cognitivecomputations/dolphin-2.7-mixtral-8x7b)
- * [alpindale/WizardLM-2-8x22B](https://huggingface.co/alpindale/WizardLM-2-8x22B)

- ### Configuration

- The following YAML configuration was used to produce this model:

- ```yaml
- slices:
- - sources:
-   - model: alpindale/WizardLM-2-8x22B
-     layer_range: [0, 32]
- - sources:
-   - model: cognitivecomputations/dolphin-2.7-mixtral-8x7b
-     layer_range: [20, 32]
- merge_method: passthrough
- dtype: bfloat16

  ```
  ---
  base_model:
+ - mistralai/Mixtral-8x22B-Instruct-v0.1
+ - mistralai/Mixtral-8x7B-Instruct-v0.1
  - cognitivecomputations/dolphin-2.7-mixtral-8x7b
  - alpindale/WizardLM-2-8x22B
+ datasets:
+ - ehartford/dolphin
+ - jondurbin/airoboros-2.2.1
+ - ehartford/dolphin-coder
+ - migtissera/Synthia-v1.3
+ - teknium/openhermes
+ - ise-uiuc/Magicoder-OSS-Instruct-75K
+ - ise-uiuc/Magicoder-Evol-Instruct-110K
+ - LDJnr/Pure-Dove
  library_name: transformers
  tags:
+ - mixtral
+ - mixtral-8x22b
+ - mixtral-8x7b
+ - instruct
  - merge
+ pipeline_tag: text-generation
+ license: apache-2.0
+ language:
+ - en
+ - fr
+ - de
+ - es
+ - it
  ---
 
+ # Gixtral (Mixtral from 8x22B & 8x7B to 100B)
+
+ ![logo](assets/logo.png)
+
+ Gixtral combines several Mixtral-8x22B- and Mixtral-8x7B-based models into a single ~100B-parameter merge.
+
+ ## Model Details
+
+ ### Model Description
+
+ - **Developed by:** [@ehristoforu](https://huggingface.co/ehristoforu)
+ - **Model type:** Text Generation (conversational)
+ - **Language(s) (NLP):** English, French, German, Spanish, Italian
+ - **Merged from:** [mistralai/Mixtral-8x22B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1) & [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)
+

+ ## How to Get Started with the Model

+ Use the code below to get started with the model.

+ ```py
+ from transformers import AutoModelForCausalLM, AutoTokenizer

+ model_id = "ehristoforu/Gixtral-100B"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)

+ # device_map="auto" shards the weights across all available GPUs.
+ model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

+ messages = [
+     {"role": "user", "content": "What is your favourite condiment?"},
+     {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
+     {"role": "user", "content": "Do you have mayonnaise recipes?"}
+ ]

+ # Format the conversation with the model's chat template and move it to the GPU.
+ inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to("cuda")

+ outputs = model.generate(inputs, max_new_tokens=20)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```
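
The quickstart above loads full-precision weights; at roughly 100B parameters that far exceeds a single GPU. A minimal 4-bit loading sketch, assuming the bitsandbytes package is installed (the card itself does not document a quantized setup):

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "ehristoforu/Gixtral-100B"

# Assumed quantization setup; not part of the original card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # shard the quantized weights across available GPUs
)
```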
+
+ ## About the merge
+
+ Base models: mistralai/Mixtral-8x22B-Instruct-v0.1 & mistralai/Mixtral-8x7B-Instruct-v0.1
+
+ Models merged:
+ - mistralai/Mixtral-8x22B-Instruct-v0.1
+ - mistralai/Mixtral-8x7B-Instruct-v0.1
+ - cognitivecomputations/dolphin-2.7-mixtral-8x7b
+ - alpindale/WizardLM-2-8x22B
+
+ Datasets used to train the merged models:
+ - ehartford/dolphin
+ - jondurbin/airoboros-2.2.1
+ - ehartford/dolphin-coder
+ - migtissera/Synthia-v1.3
+ - teknium/openhermes
+ - ise-uiuc/Magicoder-OSS-Instruct-75K
+ - ise-uiuc/Magicoder-Evol-Instruct-110K
+ - LDJnr/Pure-Dove
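
This commit also drops the mergekit passthrough configuration that the previous revision documented. For reference, a merge of this shape could be reproduced roughly as follows; the YAML mirrors the removed config, while the command and output path are illustrative assumptions:

```yaml
# Sketch based on the configuration removed by this commit.
slices:
  - sources:
      - model: alpindale/WizardLM-2-8x22B
        layer_range: [0, 32]
  - sources:
      - model: cognitivecomputations/dolphin-2.7-mixtral-8x7b
        layer_range: [20, 32]
merge_method: passthrough  # stack the selected layer ranges unchanged
dtype: bfloat16
```

Saved as `config.yaml`, this would be run with mergekit's CLI, e.g. `mergekit-yaml config.yaml ./Gixtral-100B`.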