ghost613 committed
Commit e6a3d56
1 Parent(s): c258bc7

Update README.md

Files changed (1)
  1. README.md +1 -197
README.md CHANGED
@@ -23,200 +23,4 @@ This model card corresponds to the 7B base version of the **Gemma-Ko** model.
  * [Original Google's Gemma-7B](https://huggingface.co/google/gemma-7b)
  * [Training Code @ Github: Gemma-EasyLM](https://github.com/Beomi/Gemma-EasyLM)

- **Terms of Use**: [Terms](https://www.kaggle.com/models/google/gemma/license/consent)
-
- **Citation**
-
- ```bibtex
- @misc {gemma_ko_7b,
-   author = { {Junbum Lee, Taekyoon Choi} },
-   title = { gemma-ko-7b },
-   year = 2024,
-   url = { https://huggingface.co/beomi/gemma-ko-7b },
-   doi = { 10.57967/hf/1859 },
-   publisher = { Hugging Face }
- }
- ```
-
- **Model Developers**: Junbum Lee (Beomi) & Taekyoon Choi (Taekyoon)
-
- ## Model Information
-
- Summary description and brief definition of inputs and outputs.
-
- ### Description
-
- Gemma is a family of lightweight, state-of-the-art open models from Google,
- built from the same research and technology used to create the Gemini models.
- They are text-to-text, decoder-only large language models, available in English,
- with open weights, pre-trained variants, and instruction-tuned variants. Gemma
- models are well-suited for a variety of text generation tasks, including
- question answering, summarization, and reasoning. Their relatively small size
- makes it possible to deploy them in environments with limited resources such as
- a laptop, desktop, or your own cloud infrastructure, democratizing access to
- state-of-the-art AI models and helping foster innovation for everyone.
-
- ### Usage
-
- Below we share some code snippets on how to get started quickly with running the model. First make sure to `pip install -U transformers`, then copy the snippet from the section that is relevant for your use case.
-
- #### Running the model on a CPU
-
- ```python
- from transformers import AutoTokenizer, AutoModelForCausalLM
-
- tokenizer = AutoTokenizer.from_pretrained("beomi/gemma-ko-7b")
- model = AutoModelForCausalLM.from_pretrained("beomi/gemma-ko-7b")
-
- input_text = "머신러닝과 딥러닝의 차이는"
- input_ids = tokenizer(input_text, return_tensors="pt")
-
- outputs = model.generate(**input_ids)
- print(tokenizer.decode(outputs[0]))
- ```
-
-
- #### Running the model on a single / multi GPU
-
- ```python
- # pip install accelerate
- from transformers import AutoTokenizer, AutoModelForCausalLM
-
- tokenizer = AutoTokenizer.from_pretrained("beomi/gemma-ko-7b")
- model = AutoModelForCausalLM.from_pretrained("beomi/gemma-ko-7b", device_map="auto")
-
- input_text = "머신러닝과 딥러닝의 차이는"
- input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
-
- outputs = model.generate(**input_ids)
- print(tokenizer.decode(outputs[0]))
- ```
-
- #### Other optimizations
-
- * _Flash Attention 2_
-
- First make sure to install `flash-attn` in your environment: `pip install flash-attn`
-
- ```diff
- model = AutoModelForCausalLM.from_pretrained(
-     "beomi/gemma-ko-7b",
-     torch_dtype=torch.float16,
- +   attn_implementation="flash_attention_2"
- ).to(0)
- ```
-
- ### Inputs and outputs
-
- * **Input:** Text string, such as a question, a prompt, or a document to be
-   summarized.
- * **Output:** Generated Korean/English-language text in response to the input, such
-   as an answer to a question, or a summary of a document.
-
- ## Implementation Information
-
- Details about the model internals.
-
- ### Software
-
- Training was done using [beomi/Gemma-EasyLM](https://github.com/Beomi/Gemma-EasyLM).
-
-
- ## Evaluation
-
- Model evaluation metrics and results.
-
- ### Benchmark Results
-
- TBD
-
- ## Usage and Limitations
-
- These models have certain limitations that users should be aware of.
-
- ### Intended Usage
-
- Open Large Language Models (LLMs) have a wide range of applications across
- various industries and domains. The following list of potential uses is not
- comprehensive. The purpose of this list is to provide contextual information
- about the possible use-cases that the model creators considered as part of model
- training and development.
-
- * Content Creation and Communication
-   * Text Generation: These models can be used to generate creative text formats
-     such as poems, scripts, code, marketing copy, and email drafts.
- * Research and Education
-   * Natural Language Processing (NLP) Research: These models can serve as a
-     foundation for researchers to experiment with NLP techniques, develop
-     algorithms, and contribute to the advancement of the field.
-   * Language Learning Tools: Support interactive language learning experiences,
-     aiding in grammar correction or providing writing practice.
-   * Knowledge Exploration: Assist researchers in exploring large bodies of text
-     by generating summaries or answering questions about specific topics.
-
- ### Limitations
-
- * Training Data
-   * The quality and diversity of the training data significantly influence the
-     model's capabilities. Biases or gaps in the training data can lead to
-     limitations in the model's responses.
-   * The scope of the training dataset determines the subject areas the model can
-     handle effectively.
- * Context and Task Complexity
-   * LLMs are better at tasks that can be framed with clear prompts and
-     instructions. Open-ended or highly complex tasks might be challenging.
-   * A model's performance can be influenced by the amount of context provided
-     (longer context generally leads to better outputs, up to a certain point).
- * Language Ambiguity and Nuance
-   * Natural language is inherently complex. LLMs might struggle to grasp subtle
-     nuances, sarcasm, or figurative language.
- * Factual Accuracy
-   * LLMs generate responses based on information they learned from their
-     training datasets, but they are not knowledge bases. They may generate
-     incorrect or outdated factual statements.
- * Common Sense
-   * LLMs rely on statistical patterns in language. They might lack the ability
-     to apply common sense reasoning in certain situations.
-
- ### Ethical Considerations and Risks
-
- The development of large language models (LLMs) raises several ethical concerns.
- In creating an open model, we have carefully considered the following:
-
- * Bias and Fairness
-   * LLMs trained on large-scale, real-world text data can reflect socio-cultural
-     biases embedded in the training material. These models underwent careful
-     scrutiny, with input data pre-processing described and posterior evaluations
-     reported in this card.
- * Misinformation and Misuse
-   * LLMs can be misused to generate text that is false, misleading, or harmful.
-   * Guidelines for responsible use are provided with the model; see the
-     [Responsible Generative AI Toolkit](http://ai.google.dev/gemma/responsible).
- * Transparency and Accountability
-   * This model card summarizes details on the models' architecture,
-     capabilities, limitations, and evaluation processes.
-   * A responsibly developed open model offers the opportunity to share
-     innovation by making LLM technology accessible to developers and researchers
-     across the AI ecosystem.
-
- Risks identified and mitigations:
-
- * Perpetuation of biases: Continuous monitoring (using evaluation metrics, human
-   review) and the exploration of de-biasing techniques are encouraged during
-   model training, fine-tuning, and other use cases.
- * Generation of harmful content: Mechanisms and guidelines for content safety
-   are essential. Developers are encouraged to exercise caution and implement
-   appropriate content safety safeguards based on their specific product policies
-   and application use cases.
- * Misuse for malicious purposes: Technical limitations and developer and
-   end-user education can help mitigate malicious applications of LLMs.
-   Educational resources and reporting mechanisms for users to flag misuse are
-   provided. Prohibited uses of Gemma models are outlined in the
-   [Gemma Prohibited Use Policy](https://ai.google.dev/gemma/prohibited_use_policy).
- * Privacy violations: Models were trained on data filtered for removal of PII
-   (Personally Identifiable Information). Developers are encouraged to adhere to
-   privacy regulations with privacy-preserving techniques.
-
- ## Acknowledgement
-
- The training was supported by the [TPU Research Cloud](https://sites.research.google/trc/) program.
+ **Terms of Use**: [Terms](https://www.kaggle.com/models/google/gemma/license/consent)