GGUF
English
Inference Endpoints
tsunemoto commited on
Commit
c7471e1
1 Parent(s): 2c69cb0

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,17 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ minichat-1.5-3b.Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
+ minichat-1.5-3b.Q3_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+ minichat-1.5-3b.Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ minichat-1.5-3b.Q3_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ minichat-1.5-3b.Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
+ minichat-1.5-3b.Q4_1.gguf filter=lfs diff=lfs merge=lfs -text
+ minichat-1.5-3b.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ minichat-1.5-3b.Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ minichat-1.5-3b.Q5_0.gguf filter=lfs diff=lfs merge=lfs -text
+ minichat-1.5-3b.Q5_1.gguf filter=lfs diff=lfs merge=lfs -text
+ minichat-1.5-3b.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ minichat-1.5-3b.Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ minichat-1.5-3b.Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+ minichat-1.5-3b.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
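The hunk above adds one LFS rule per quantized artifact; Git matches these `.gitattributes` patterns against file paths much like shell globs. A minimal sketch of checking which paths the rules capture, using Python's `fnmatch` as a rough stand-in for Git's matcher (Git's real pattern semantics differ in edge cases such as directory separators):

```python
from fnmatch import fnmatch

# A subset of the patterns from the .gitattributes rules above; every rule
# carries the same LFS attribute triple (filter/diff/merge = lfs).
lfs_patterns = [
    "*.zip",
    "*.zst",
    "*tfevents*",
    "minichat-1.5-3b.Q2_K.gguf",  # exact-name rule added by this commit
    "minichat-1.5-3b.Q8_0.gguf",
]

def tracked_by_lfs(path):
    """Return True if any .gitattributes pattern matches the file name."""
    return any(fnmatch(path, pat) for pat in lfs_patterns)

print(tracked_by_lfs("minichat-1.5-3b.Q2_K.gguf"))  # True: exact rule
print(tracked_by_lfs("events.out.tfevents.123"))    # True: *tfevents*
print(tracked_by_lfs("README.md"))                  # False: no rule matches
```

This is why the commit can add the `.gguf` rules before uploading the files: any path matching a rule is stored as an LFS pointer instead of a raw blob.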
README.md ADDED
@@ -0,0 +1,75 @@
+ ---
+ title: "MiniChat-1.5-3B Quantized in GGUF"
+ tags:
+ - GGUF
+ language: en
+ ---
+ # GGUFs of MiniChat-1.5-3B
+
+ This is a GGUF quantization of MiniChat-1.5-3B.
+
+ ## Original Model Card:
+ ---
+
+ ## MiniChat-1.5-3B
+
+ 📑 [arXiv](https://arxiv.org/abs/2311.07052) | 👻 [GitHub](https://github.com/GeneZC/MiniMA) | 🤗 [HuggingFace-MiniMA](https://huggingface.co/GeneZC/MiniMA-3B) | 🤗 [HuggingFace-MiniChat](https://huggingface.co/GeneZC/MiniChat-3B) | 🤗 [HuggingFace-MiniChat-1.5](https://huggingface.co/GeneZC/MiniChat-1.5-3B) | 🤖 [ModelScope-MiniMA](https://modelscope.cn/models/GeneZC/MiniMA-3B) | 🤖 [ModelScope-MiniChat](https://modelscope.cn/models/GeneZC/MiniChat-3B)
+
+ 🆕 **Updates from MiniChat-3B**:
+ - better data mixture;
+ - use of [NEFTune](https://arxiv.org/abs/2310.05914);
+ - use of [DPO](https://arxiv.org/abs/2305.18290).
+
+ ❗ Users must comply with the LICENSE of LLaMA2, since this model is derived from LLaMA2.
+
+ A language model distilled and finetuned from an adapted version of LLaMA2-7B following "Towards the Law of Capacity Gap in Distilling Language Models".
+
+ It outperforms a wide range of 3B competitors in GPT4 evaluation and even competes with several 7B chat models.
+
+ <img src="./teaser_b.jpg" alt="teaser_b" width="687" />
+
+ The following is an example code snippet to use MiniChat-3B:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # `conversation.py` is provided in the model's GitHub repository.
+ from conversation import get_default_conv_template
+
+ # MiniChat
+ tokenizer = AutoTokenizer.from_pretrained("GeneZC/MiniChat-3B", use_fast=False)
+ # GPU.
+ model = AutoModelForCausalLM.from_pretrained("GeneZC/MiniChat-3B", use_cache=True, device_map="auto", torch_dtype=torch.float16).eval()
+ # CPU.
+ # model = AutoModelForCausalLM.from_pretrained("GeneZC/MiniChat-3B", use_cache=True, device_map="cpu", torch_dtype=torch.float16).eval()
+
+ conv = get_default_conv_template("minichat")
+
+ question = "Implement a program to find the common elements in two arrays without using any extra data structures."
+ conv.append_message(conv.roles[0], question)
+ conv.append_message(conv.roles[1], None)
+ prompt = conv.get_prompt()
+ input_ids = tokenizer([prompt]).input_ids
+ output_ids = model.generate(
+     torch.as_tensor(input_ids).cuda(),  # drop `.cuda()` when running on CPU
+     do_sample=True,
+     temperature=0.7,
+     max_new_tokens=1024,
+ )
+ output_ids = output_ids[0][len(input_ids[0]):]
+ output = tokenizer.decode(output_ids, skip_special_tokens=True).strip()
+ # output: "def common_elements(arr1, arr2):\n    if len(arr1) == 0:\n        return []\n    if len(arr2) == 0:\n        return arr1\n\n    common_elements = []\n    for element in arr1:\n        if element in arr2:\n            common_elements.append(element)\n\n    return common_elements"
+ # Multiturn conversation could be realized by continuously appending questions to `conv`.
+ ```
+
+ ## Bibtex
+
+ ```bibtex
+ @article{zhang2023law,
+     title={Towards the Law of Capacity Gap in Distilling Language Models},
+     author={Zhang, Chen and Song, Dawei and Ye, Zheyu and Gao, Yan},
+     year={2023},
+     url={https://arxiv.org/abs/2311.07052}
+ }
+ ```
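The transformers snippet in the README runs the original fp16 checkpoint; the GGUF files in this repository are instead meant for llama.cpp and compatible runtimes. A hypothetical invocation sketch (the binary name and flags follow llama.cpp's `main` example; check `--help` on your build, as the CLI changes between versions):

```shell
# Run a downloaded quantization (Q4_K_M is a common quality/size tradeoff)
# with llama.cpp's example CLI, mirroring the README's sampling settings.
./main -m ./minichat-1.5-3b.Q4_K_M.gguf \
    --temp 0.7 \
    -n 1024 \
    -p "Implement a program to find the common elements in two arrays without using any extra data structures."
```

Note that this raw `-p` prompt does not apply MiniChat's conversation template, so chat-tuned behavior may degrade; a faithful chat session would need the same prompt format `get_default_conv_template("minichat")` produces.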
minichat-1.5-3b.Q2_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0fc12ed2817faa2bef3aa5b3282a0775e3192362375604624b4e7376667d6a20
+ size 1297187936
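What Git actually stores for each `.gguf` is a three-line LFS pointer file like the one above, not the multi-gigabyte binary; the blob itself lives in LFS storage keyed by the `oid`. A small sketch parsing that pointer format (one `key value` pair per line, per the git-lfs pointer spec):

```python
def parse_lfs_pointer(text):
    """Parse a git-lfs pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The Q2_K pointer committed above, verbatim.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:0fc12ed2817faa2bef3aa5b3282a0775e3192362375604624b4e7376667d6a20
size 1297187936
"""

info = parse_lfs_pointer(pointer)
print(info["oid"])                           # content hash of the real blob
print(round(int(info["size"]) / 1e9, 2))     # ≈1.3 GB once the blob is fetched
```

The `size` fields in the remaining pointers below give a quick way to compare the quantizations: Q2_K is the smallest at ~1.3 GB and Q8_0 the largest at ~3.2 GB.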
minichat-1.5-3b.Q3_K_L.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0ec83f845378db87af095ab1695dc4831cf89646f14fb10f8ff0149d227a69fa
+ size 1631048288
minichat-1.5-3b.Q3_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2d86dc1e3bdc29671cea85d4a6fbcc683d426b8fb7285dea387640e397999f92
+ size 1507578464
minichat-1.5-3b.Q3_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c19418e7dbba08cb9e4041f9f226c6411660bad5b2f530519a58aea39ee3d28a
+ size 1358549600
minichat-1.5-3b.Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f4b6f9bb849621a46fff06046c1c81f332dee308267015d572daec7deffc7d74
+ size 1739602016
minichat-1.5-3b.Q4_1.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:051729d88bb30f3027ad4f19d2e3d6d11d547f814e8e44294b4f6470c0cf1a38
+ size 1918920800
minichat-1.5-3b.Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9e1fd530238689e418a047c91fb4449836897a79a705dfb57c7ccac9d0758484
+ size 1846655072
minichat-1.5-3b.Q4_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:425bbaade022c98e20e683eea1c0de06611a1b0ded25d7d40511fd78d619acc7
+ size 1756903520
minichat-1.5-3b.Q5_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:919f73f5dbacd7310fdbb03593d93738268dcb21e4dab3c8bcadbba978f0e389
+ size 2098239584
minichat-1.5-3b.Q5_1.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:106a624b3363b01d6429c62a7632c683c53cb5795a2664ac28417aad12b6d123
+ size 2277558368
minichat-1.5-3b.Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:313951b28a7ffc0e70b6cadf2e5f6383b3d51c8898c7de06835067c1189ee0e2
+ size 2153388128
minichat-1.5-3b.Q5_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:29d6cbf3e33c2608efbb2f1d24c76d2a0290bf8b6cdec56da605c993dff8e748
+ size 2098239584
minichat-1.5-3b.Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:93a1cd1b33ee884e285907d0870155f5d0dbfdb3999878c2f828c7a036666eb7
+ size 2479292000
minichat-1.5-3b.Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:28c909c1f1c2f3affc5607439bc7219f4d23e4778d1a13c6bfe7d9d18d54df73
+ size 3210768992