c2p-cmd committed on
Commit
b351e33
·
verified ·
1 Parent(s): 40e7b80

Update README.md

Files changed (1)
  1. README.md +19 -295
README.md CHANGED
@@ -10,306 +10,30 @@ license_link: https://ai.google.dev/gemma/terms
10
 
11
  This model card corresponds to the GGUF builds of the 2B and 7B Instruct versions of the Gemma model.
12
 
13
- ***The contents of this card have been copied from [Google's Gemma](https://huggingface.co/google/gemma-7b) Page***
14
-
15
- **Resources and Technical Documentation**:
16
-
17
- * [Responsible Generative AI Toolkit](https://ai.google.dev/responsible)
18
- * [Gemma on Kaggle](https://www.kaggle.com/models/google/gemma)
19
- * [Gemma on Vertex Model Garden](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/335?version=gemma-7b-gg-hf)
20
-
21
  **Terms of Use**: [Terms](https://www.kaggle.com/models/google/gemma/license/consent)
22
 
23
- **Authors**: Google
24
-
25
- ## Model Information
26
-
27
- Summary description and a brief definition of inputs and outputs.
28
-
29
  ### Description
30
 
31
  Gemma is a family of lightweight, state-of-the-art open models from Google,
32
  built from the same research and technology used to create the Gemini models.
33
- They are text-to-text, decoder-only large language models, available in English,
34
- with open weights, pre-trained variants, and instruction-tuned variants. Gemma
35
- models are well-suited for a variety of text-generation tasks, including
36
- question-answering, summarization, and reasoning. Their relatively small size
37
- makes it possible to deploy them in environments with limited resources such as
38
- a laptop, desktop, or your cloud infrastructure, democratizing access to
39
- state-of-the-art AI models and helping foster innovation for everyone.
40
 
41
  #### Model Usage
42
- - Since this is a `guff`, it can be run locally using
43
- - Ollama
44
- - Llama.cpp
45
- - LM Studio
46
- - And Many More
47
-
48
- ### Inputs and outputs
49
-
50
- * **Input:** Text string, such as a question, a prompt, or a document to be
51
- summarized.
52
- * **Output:** Generated English-language text in response to the input, such
53
- as an answer to a question, or a summary of a document.
54
-
55
- ## Model Data
56
-
57
- Data used for model training and how the data was processed.
58
-
59
- ### Training Dataset
60
-
61
- These models were trained on a dataset of text data that includes a wide variety
62
- of sources, totaling 6 trillion tokens. Here are the key components:
63
-
64
- * Web Documents: A diverse collection of web text ensures the model is exposed
65
- to a broad range of linguistic styles, topics, and vocabulary. Primarily
66
- English-language content.
67
- * Code: Exposing the model to code helps it learn the syntax and patterns of
68
- programming languages, which improves its ability to generate code or
69
- understand code-related questions.
70
- * Mathematics: Training on mathematical text helps the model learn logical
71
- reasoning, and symbolic representation, and address mathematical queries.
72
-
73
- The combination of these diverse data sources is crucial for training a powerful
74
- language model that can handle a wide variety of different tasks and text
75
- formats.
76
-
77
- ### Data Preprocessing
78
-
79
- Here are the key data cleaning and filtering methods applied to the training
80
- data:
81
-
82
- * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was
83
- applied at multiple stages in the data preparation process to ensure the
84
- exclusion of harmful and illegal content
85
- * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and
86
- reliable, automated techniques were used to filter out certain personal
87
- information and other sensitive data from training sets.
88
- * Additional methods: Filtering based on content quality and safety in line with
89
- [our policies](https://storage.googleapis.com/gweb-uniblog-publish-prod/documents/2023_Google_AI_Principles_Progress_Update.pdf#page=11).
90
-
91
- ## Implementation Information
92
-
93
- Details about the model internals.
94
-
95
- ### Hardware
96
-
97
- Gemma was trained using the latest generation of
98
- [Tensor Processing Unit (TPU)](https://cloud.google.com/tpu/docs/intro-to-tpu) hardware (TPUv5e).
99
-
100
- Training large language models requires significant computational power. TPUs,
101
- designed specifically for matrix operations common in machine learning, offer
102
- several advantages in this domain:
103
-
104
- * Performance: TPUs are specifically designed to handle the massive computations
105
- involved in training LLMs. They can speed up training considerably compared to
106
- CPUs.
107
- * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing
108
- for the handling of large models and batch sizes during training. This can
109
- lead to better model quality.
110
- * Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for
111
- handling the growing complexity of large foundation models. You can distribute
112
- training across multiple TPU devices for faster and more efficient processing.
113
- * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective
114
- solution for training large models compared to CPU-based infrastructure,
115
- especially when considering the time and resources saved due to faster
116
- training.
117
- * These advantages are aligned with
118
- [Google's commitments to operate sustainably](https://sustainability.google/operating-sustainably/).
119
-
120
- ### Software
121
-
122
- Training was done using [JAX](https://github.com/google/jax) and [ML Pathways](https://blog.google/technology/ai/introducing-pathways-next-generation-ai-architecture).
123
-
124
- JAX allows researchers to take advantage of the latest generation of hardware,
125
- including TPUs, for faster and more efficient training of large models.
126
-
127
- ML Pathways is Google's latest effort to build artificially intelligent systems
128
- capable of generalizing across multiple tasks. This is especially suitable for
129
- [foundation models](https://ai.google/discover/foundation-models/), including large language models like
130
- these ones.
131
-
132
- Together, JAX and ML Pathways are used as described in the
133
- [paper about the Gemini family of models](https://arxiv.org/abs/2312.11805); "the 'single
134
- controller' programming model of Jax and Pathways allows a single Python
135
- process to orchestrate the entire training run, dramatically simplifying the
136
- development workflow."
137
-
138
- ## Evaluation
139
-
140
- Model evaluation metrics and results.
141
-
142
- ### Benchmark Results
143
-
144
- These models were evaluated against a large collection of different datasets and
145
- metrics to cover different aspects of text generation:
146
-
147
- | Benchmark | Metric | 2B Params | 7B Params |
148
- | ------------------------------ | ------------- | ----------- | --------- |
149
- | [MMLU](https://arxiv.org/abs/2009.03300) | 5-shot, top-1 | 42.3 | 64.3 |
150
- | [HellaSwag](https://arxiv.org/abs/1905.07830) | 0-shot |71.4 | 81.2 |
151
- | [PIQA](https://arxiv.org/abs/1911.11641) | 0-shot | 77.3 | 81.2 |
152
- | [SocialIQA](https://arxiv.org/abs/1904.09728) | 0-shot | 59.7 | 51.8 |
153
- | [BoolQ](https://arxiv.org/abs/1905.10044) | 0-shot | 69.4 | 83.2 |
154
- | [WinoGrande](https://arxiv.org/abs/1907.10641) | partial score | 65.4 | 72.3 |
155
- | [CommonsenseQA](https://arxiv.org/abs/1811.00937) | 7-shot | 65.3 | 71.3 |
156
- | [OpenBookQA](https://arxiv.org/abs/1809.02789) | | 47.8 | 52.8 |
157
- | [ARC-e](https://arxiv.org/abs/1911.01547) | | 73.2 | 81.5 |
158
- | [ARC-c](https://arxiv.org/abs/1911.01547) | | 42.1 | 53.2 |
159
- | [TriviaQA](https://arxiv.org/abs/1705.03551) | 5-shot | 53.2 | 63.4 |
160
- | [Natural Questions](https://github.com/google-research-datasets/natural-questions) | 5-shot | - | 23 |
161
- | [HumanEval](https://arxiv.org/abs/2107.03374) | pass@1 | 22.0 | 32.3 |
162
- | [MBPP](https://arxiv.org/abs/2108.07732) | 3-shot | 29.2 | 44.4 |
163
- | [GSM8K](https://arxiv.org/abs/2110.14168) | maj@1 | 17.7 | 46.4 |
164
- | [MATH](https://arxiv.org/abs/2108.07732) | 4-shot | 11.8 | 24.3 |
165
- | [AGIEval](https://arxiv.org/abs/2304.06364) | | 24.2 | 41.7 |
166
- | [BIG-Bench](https://arxiv.org/abs/2206.04615) | | 35.2 | 55.1 |
167
- | ------------------------------ | ------------- | ----------- | --------- |
168
- | **Average** | | **54.0** | **56.4** |
169
-
170
- ## Ethics and Safety
171
-
172
- Ethics and safety evaluation approach and results.
173
-
174
- ### Evaluation Approach
175
-
176
- Our evaluation methods include structured evaluations and internal red-teaming
177
- testing of relevant content policies. Red-teaming was conducted by a number of
178
- different teams, each with different goals and human evaluation metrics. These
179
- models were evaluated against a number of different categories relevant to
180
- ethics and safety, including:
181
-
182
- * Text-to-Text Content Safety: Human evaluation on prompts covering safety
183
- policies including child sexual abuse and exploitation, harassment, violence
184
- and gore, and hate speech.
185
- * Text-to-Text Representational Harms: Benchmark against relevant academic
186
- datasets such as [WinoBias](https://arxiv.org/abs/1804.06876) and [BBQ Dataset](https://arxiv.org/abs/2110.08193v2).
187
- * Memorization: Automated evaluation of memorization of training data, including
188
- the risk of personally identifiable information exposure.
189
- * Large-scale harm: Tests for "dangerous capabilities," such as chemical,
190
- biological, radiological, and nuclear (CBRN) risks.
191
-
192
- ### Evaluation Results
193
-
194
- The results of ethics and safety evaluations are within acceptable thresholds
195
- for meeting [internal policies](https://storage.googleapis.com/gweb-uniblog-publish-prod/documents/2023_Google_AI_Principles_Progress_Update.pdf#page=11) for categories such as child
196
- safety, content safety, representational harms, memorization, and large-scale harms.
197
- On top of robust internal evaluations, the results of well known safety
198
- benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA
199
- are shown here.
200
-
201
- | Benchmark | Metric | 2B Params | 7B Params |
202
- | ------------------------------ | ------------- | ----------- | --------- |
203
- | [RealToxicity](https://arxiv.org/abs/2009.11462) | average | 6.86 | 7.90 |
204
- | [BOLD](https://arxiv.org/abs/2101.11718) | | 45.57 | 49.08 |
205
- | [CrowS-Pairs](https://aclanthology.org/2020.emnlp-main.154/) | top-1 | 45.82 | 51.33 |
206
- | [BBQ Ambig](https://arxiv.org/abs/2110.08193v2) | 1-shot, top-1 | 62.58 | 92.54 |
207
- | [BBQ Disambig](https://arxiv.org/abs/2110.08193v2) | top-1 | 54.62 | 71.99 |
208
- | [Winogender](https://arxiv.org/abs/1804.09301) | top-1 | 51.25 | 54.17 |
209
- | [TruthfulQA](https://arxiv.org/abs/2109.07958) | | 44.84 | 31.81 |
210
- | [Winobias 1_2](https://arxiv.org/abs/1804.06876) | | 56.12 | 59.09 |
211
- | [Winobias 2_2](https://arxiv.org/abs/1804.06876) | | 91.10 | 92.23 |
212
- | [Toxigen](https://arxiv.org/abs/2203.09509) | | 29.77 | 39.59 |
213
- | ------------------------------ | ------------- | ----------- | --------- |
214
-
215
-
216
- ## Usage and Limitations
217
-
218
- These models have certain limitations that users should be aware of.
219
-
220
- ### Intended Usage
221
-
222
- Open Large Language Models (LLMs) have a wide range of applications across
223
- various industries and domains. The following list of potential uses is not
224
- comprehensive. The purpose of this list is to provide contextual information
225
- about the possible use-cases that the model creators considered as part of model
226
- training and development.
227
-
228
- * Content Creation and Communication
229
- * Text Generation: These models can be used to generate creative text formats
230
- such as poems, scripts, code, marketing copy, and email drafts.
231
- * Chatbots and Conversational AI: Power conversational interfaces for customer
232
- service, virtual assistants, or interactive applications.
233
- * Text Summarization: Generate concise summaries of a text corpus, research
234
- papers, or reports.
235
- * Research and Education
236
- * Natural Language Processing (NLP) Research: These models can serve as a
237
- foundation for researchers to experiment with NLP techniques, develop
238
- algorithms, and contribute to the advancement of the field.
239
- * Language Learning Tools: Support interactive language learning experiences,
240
- aiding in grammar correction or providing writing practice.
241
- * Knowledge Exploration: Assist researchers in exploring large bodies of text
242
- by generating summaries or answering questions about specific topics.
243
-
244
- ### Limitations
245
-
246
- * Training Data
247
- * The quality and diversity of the training data significantly influence the
248
- model's capabilities. Biases or gaps in the training data can lead to
249
- limitations in the model's responses.
250
- * The scope of the training dataset determines the subject areas the model can
251
- handle effectively.
252
- * Context and Task Complexity
253
- * LLMs are better at tasks that can be framed with clear prompts and
254
- instructions. Open-ended or highly complex tasks might be challenging.
255
- * A model's performance can be influenced by the amount of context provided
256
- (longer context generally leads to better outputs, up to a certain point).
257
- * Language Ambiguity and Nuance
258
- * Natural language is inherently complex. LLMs might struggle to grasp subtle
259
- nuances, sarcasm, or figurative language.
260
- * Factual Accuracy
261
- * LLMs generate responses based on information they learned from their
262
- training datasets, but they are not knowledge bases. They may generate
263
- incorrect or outdated factual statements.
264
- * Common Sense
265
- * LLMs rely on statistical patterns in language. They might lack the ability
266
- to apply common sense reasoning in certain situations.
267
-
268
- ### Ethical Considerations and Risks
269
-
270
- The development of large language models (LLMs) raises several ethical concerns.
271
- In creating an open model, we have carefully considered the following:
272
-
273
- * Bias and Fairness
274
- * LLMs trained on large-scale, real-world text data can reflect socio-cultural
275
- biases embedded in the training material. These models underwent careful
276
- scrutiny, input data pre-processing described and posterior evaluations
277
- reported in this card.
278
- * Misinformation and Misuse
279
- * LLMs can be misused to generate text that is false, misleading, or harmful.
280
- * Guidelines are provided for responsible use with the model, see the
281
- [Responsible Generative AI Toolkit](http://ai.google.dev/gemma/responsible).
282
- * Transparency and Accountability:
283
- * This model card summarizes details on the models' architecture,
284
- capabilities, limitations, and evaluation processes.
285
- * A responsibly developed open model offers the opportunity to share
286
- innovation by making LLM technology accessible to developers and researchers
287
- across the AI ecosystem.
288
-
289
- Risks identified and mitigations:
290
-
291
- * Perpetuation of biases: It's encouraged to perform continuous monitoring
292
- (using evaluation metrics, human review) and the exploration of de-biasing
293
- techniques during model training, fine-tuning, and other use cases.
294
- * Generation of harmful content: Mechanisms and guidelines for content safety
295
- are essential. Developers are encouraged to exercise caution and implement
296
- appropriate content safety safeguards based on their specific product policies
297
- and application use cases.
298
- * Misuse for malicious purposes: Technical limitations and developer and
299
- end-user education can help mitigate against malicious applications of LLMs.
300
- Educational resources and reporting mechanisms for users to flag misuse are
301
- provided. Prohibited uses of Gemma models are outlined in the
302
- [Gemma Prohibited Use Policy](https://ai.google.dev/gemma/prohibited_use_policy).
303
- * Privacy violations: Models were trained on data filtered for removal of PII
304
- (Personally Identifiable Information). Developers are encouraged to adhere to
305
- privacy regulations with privacy-preserving techniques.
306
-
307
- ### Benefits
308
-
309
- At the time of release, this family of models provides high-performance open
310
- large language model implementations designed from the ground up for Responsible
311
- AI development compared to similarly sized models.
312
-
313
- Using the benchmark evaluation metrics described in this document, these models
314
- have been shown to provide superior performance to other, comparably-sized open model
315
- alternatives.
 
10
 
11
  This model card corresponds to the GGUF builds of the 2B and 7B Instruct versions of the Gemma model.
12
 
 
 
 
 
 
 
 
 
13
  **Terms of Use**: [Terms](https://www.kaggle.com/models/google/gemma/license/consent)
14
 
 
 
 
 
 
 
15
  ### Description
16
 
17
  Gemma is a family of lightweight, state-of-the-art open models from Google,
18
  built from the same research and technology used to create the Gemini models.
 
 
 
 
 
 
 
19
 
20
  #### Model Usage
21
+ Since this is a GGUF file, it can be run locally using:
22
+ - Ollama
23
+ - Llama.cpp
24
+ - LM Studio
25
+ - And many more
26
+ - I have provided a [GemmaModelFile](https://huggingface.co/c2p-cmd/google_gemma_guff/blob/main/GemmaModelFile) that can be used with Ollama as follows:
27
+ - Download the model:
28
+ ```python
29
+ # Run in a shell first (not in Python): pip install huggingface_hub
30
+ from huggingface_hub import hf_hub_download
31
+
32
+ model_id="c2p-cmd/google_gemma_guff"
33
+ hf_hub_download(repo_id=model_id, local_dir="gemma_snapshot", local_dir_use_symlinks=False, filename="gemma_snapshot/gemma-2b-it.gguf")
34
+ ```
35
+ - Load the model file into Ollama:
36
+ ```shell
37
+ ollama create gemma -f GemmaModelFile
38
+ ```
39
+ - You can change the model name (`gemma` in the command above) to suit your needs
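
The `ollama create` step in this README takes the model name as its first argument, so picking a different name only changes that one token. The following is a minimal sketch of that idea, assuming Ollama is installed and `GemmaModelFile` sits in the current directory. The helper name `build_ollama_create_command` and the example name `gemma-2b-it` are hypothetical and not part of this repository; the helper only assembles the command and does not run it.

```python
import shlex


def build_ollama_create_command(model_name: str, modelfile: str = "GemmaModelFile") -> list[str]:
    """Return the argument list for `ollama create <model_name> -f <modelfile>`.

    Hypothetical helper for illustration; it builds the command but does not execute it.
    """
    if not model_name.strip():
        raise ValueError("model_name must not be empty")
    return ["ollama", "create", model_name, "-f", modelfile]


# Assumed example name; any name accepted by Ollama can be used instead.
command = build_ollama_create_command("gemma-2b-it")

# Print a copy-pasteable shell command; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```

Once the model has been created under the chosen name, `ollama run gemma-2b-it` should start an interactive session with it.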