AIronMind committed
Commit d8b43df
1 Parent(s): c481d23

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md +177 -0
---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
tags:
- language
- granite-3.0
- llama-cpp
- gguf-my-repo
base_model: ibm-granite/granite-3.0-1b-a400m-instruct
new_version: ibm-granite/granite-3.1-1b-a400m-instruct
model-index:
- name: granite-3.0-1b-a400m-instruct
  results:
  - task:
      type: text-generation
    dataset:
      name: IFEval
      type: instruction-following
    metrics:
    - type: pass@1
      value: 32.39
      name: pass@1
    - type: pass@1
      value: 6.17
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: AGI-Eval
      type: human-exams
    metrics:
    - type: pass@1
      value: 20.35
      name: pass@1
    - type: pass@1
      value: 32
      name: pass@1
    - type: pass@1
      value: 12.21
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: OBQA
      type: commonsense
    metrics:
    - type: pass@1
      value: 38.4
      name: pass@1
    - type: pass@1
      value: 47.55
      name: pass@1
    - type: pass@1
      value: 65.59
      name: pass@1
    - type: pass@1
      value: 61.17
      name: pass@1
    - type: pass@1
      value: 49.11
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: BoolQ
      type: reading-comprehension
    metrics:
    - type: pass@1
      value: 70.12
      name: pass@1
    - type: pass@1
      value: 1.27
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: ARC-C
      type: reasoning
    metrics:
    - type: pass@1
      value: 41.21
      name: pass@1
    - type: pass@1
      value: 23.07
      name: pass@1
    - type: pass@1
      value: 31.77
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: HumanEvalSynthesis
      type: code
    metrics:
    - type: pass@1
      value: 30.18
      name: pass@1
    - type: pass@1
      value: 26.22
      name: pass@1
    - type: pass@1
      value: 21.95
      name: pass@1
    - type: pass@1
      value: 15.4
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: GSM8K
      type: math
    metrics:
    - type: pass@1
      value: 26.31
      name: pass@1
    - type: pass@1
      value: 10.88
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: PAWS-X (7 langs)
      type: multilingual
    metrics:
    - type: pass@1
      value: 45.84
      name: pass@1
    - type: pass@1
      value: 11.8
      name: pass@1
---

# AIronMind/granite-3.0-1b-a400m-instruct-Q4_K_M-GGUF
This model was converted to GGUF format from [`ibm-granite/granite-3.0-1b-a400m-instruct`](https://huggingface.co/ibm-granite/granite-3.0-1b-a400m-instruct) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/ibm-granite/granite-3.0-1b-a400m-instruct) for more details on the model.
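
If you only want the quantized weights file itself, one option is to pull it with the `huggingface_hub` CLI. A minimal sketch, assuming `pip install -U huggingface_hub` has been run first:

```bash
# Download only the Q4_K_M GGUF file from this repo into the current directory
huggingface-cli download AIronMind/granite-3.0-1b-a400m-instruct-Q4_K_M-GGUF \
  granite-3.0-1b-a400m-instruct-q4_k_m.gguf --local-dir .
```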

## Use with llama.cpp
Install llama.cpp through brew (works on macOS and Linux):

```bash
brew install llama.cpp
```
Invoke the llama.cpp server or the CLI.

### CLI:
```bash
llama-cli --hf-repo AIronMind/granite-3.0-1b-a400m-instruct-Q4_K_M-GGUF --hf-file granite-3.0-1b-a400m-instruct-q4_k_m.gguf -p "The meaning to life and the universe is"
```
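
The command above runs a one-shot completion. For an interactive chat that applies the model's chat template, `llama-cli` also offers a conversation mode; a minimal sketch, where `-p` sets the system prompt:

```bash
# -cnv starts an interactive, multi-turn chat session using the model's chat template
llama-cli --hf-repo AIronMind/granite-3.0-1b-a400m-instruct-Q4_K_M-GGUF \
  --hf-file granite-3.0-1b-a400m-instruct-q4_k_m.gguf \
  -cnv -p "You are a concise, helpful assistant."
```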

### Server:
```bash
llama-server --hf-repo AIronMind/granite-3.0-1b-a400m-instruct-Q4_K_M-GGUF --hf-file granite-3.0-1b-a400m-instruct-q4_k_m.gguf -c 2048
```
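
The running server exposes an OpenAI-compatible HTTP API. A minimal sketch of querying it with `curl`, assuming llama-server's default address `http://127.0.0.1:8080`:

```bash
# Send a chat completion request to the local llama-server instance
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "Explain what a GGUF file is in one sentence."}
        ],
        "temperature": 0.7
      }'
```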

Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.
```bash
git clone https://github.com/ggerganov/llama.cpp
```

Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with any hardware-specific flags (e.g. `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
```bash
cd llama.cpp && LLAMA_CURL=1 make
```
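
Note that newer llama.cpp revisions have moved from the Makefile to CMake; a rough CMake equivalent of the build above, assuming the `LLAMA_CURL` and `GGML_CUDA` options available at the time of writing:

```bash
# CMake build: CURL enables --hf-repo downloads; GGML_CUDA enables NVIDIA GPU offload
cmake -B build -DLLAMA_CURL=ON -DGGML_CUDA=ON
cmake --build build --config Release -j
```

With this build the binaries land under `build/bin/` instead of the repository root.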

Step 3: Run inference through the main binary.
```bash
./llama-cli --hf-repo AIronMind/granite-3.0-1b-a400m-instruct-Q4_K_M-GGUF --hf-file granite-3.0-1b-a400m-instruct-q4_k_m.gguf -p "The meaning to life and the universe is"
```
or
```bash
./llama-server --hf-repo AIronMind/granite-3.0-1b-a400m-instruct-Q4_K_M-GGUF --hf-file granite-3.0-1b-a400m-instruct-q4_k_m.gguf -c 2048
```