nicholasKluge committed
Commit b38dc02
1 Parent(s): 45c0ff2

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -33,7 +33,7 @@ co2_eq_emissions:
  geographical_location: Germany
  hardware_used: NVIDIA A100-SXM4-40GB
  ---
- # TeenyTinyLlama-162m
+ # TeenyTinyLlama-160m
 
  <img src="./logo-round.png" alt="A little llama wearing a mushroom hat and a monocle." height="200">
 
@@ -101,7 +101,7 @@ These are the main arguments used in the training of this model:
 
  ## Intended Uses
 
- The primary intended use of TeenyTinyLlama is to research the behavior, functionality, and limitations of large language models. Checkpoints saved during training are intended to provide a controlled setting for performing scientific experiments. You may also further fine-tune and adapt TeenyTinyLlama-162m for deployment, as long as your use is in accordance with the Apache 2.0 license. If you decide to use pre-trained TeenyTinyLlama-162 as a basis for your fine-tuned model, please conduct your own risk and bias assessment.
+ The primary intended use of TeenyTinyLlama is to research the behavior, functionality, and limitations of large language models. Checkpoints saved during training are intended to provide a controlled setting for performing scientific experiments. You may also further fine-tune and adapt TeenyTinyLlama-160m for deployment, as long as your use is in accordance with the Apache 2.0 license. If you decide to use pre-trained TeenyTinyLlama-160m as a basis for your fine-tuned model, please conduct your own risk and bias assessment.
 
  ## Basic usage
 
@@ -110,7 +110,7 @@ Using the `pipeline`:
  ```python
  from transformers import pipeline
 
- generator = pipeline("text-generation", model="nicholasKluge/Teeny-tiny-llama-162m")
+ generator = pipeline("text-generation", model="nicholasKluge/Teeny-tiny-llama-160m")
 
  completions = generator("Astronomia é a ciência", num_return_sequences=2, max_new_tokens=100)
 
@@ -125,8 +125,8 @@ from transformers import AutoTokenizer, AutoModelForCausalLM
  import torch
 
  # Load model and the tokenizer
- tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/Teeny-tiny-llama-162m", revision='main')
- model = AutoModelForCausalLM.from_pretrained("nicholasKluge/Teeny-tiny-llama-162m", revision='main')
+ tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/Teeny-tiny-llama-160m", revision='main')
+ model = AutoModelForCausalLM.from_pretrained("nicholasKluge/Teeny-tiny-llama-160m", revision='main')
 
  # Pass the model to your device
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
@@ -170,7 +170,7 @@ for i, completion in enumerate(completions):
 
  | Models | Average | [ARC](https://arxiv.org/abs/1803.05457) | [Hellaswag](https://arxiv.org/abs/1905.07830) | [MMLU](https://arxiv.org/abs/2009.03300) | [TruthfulQA](https://arxiv.org/abs/2109.07958) |
  |-------------------------------------------------------------------------------------|---------|-----------------------------------------|-----------------------------------------------|------------------------------------------|------------------------------------------------|
- | [TeenyTinyLlama-162m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-162m) | 31.16 | 26.15 | 29.29 | 28.11 | 41.12 |
+ | [TeenyTinyLlama-160m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m) | 31.16 | 26.15 | 29.29 | 28.11 | 41.12 |
  | [Pythia-160m](https://huggingface.co/EleutherAI/pythia-160m-deduped)* | 31.16 | 24.06 | 31.39 | 24.86 | 44.34 |
  | [OPT-125m](https://huggingface.co/facebook/opt-125m)* | 30.80 | 22.87 | 31.47 | 26.02 | 42.87 |
  | [Gpt2-portuguese-small](https://huggingface.co/pierreguillou/gpt2-small-portuguese) | 30.22 | 22.48 | 29.62 | 27.36 | 41.44 |
@@ -184,7 +184,7 @@ for i, completion in enumerate(completions):
 
  | Models | [IMDB](https://huggingface.co/datasets/christykoh/imdb_pt) | [FaQuAD-NLI](https://huggingface.co/datasets/ruanchaves/faquad-nli) | [HateBr](https://huggingface.co/datasets/ruanchaves/hatebr) | [Assin2](https://huggingface.co/datasets/assin2) | [AgNews](https://huggingface.co/datasets/maritaca-ai/ag_news_pt) |
  |--------------------------------------------------------------------------------------------|------------------------------------------------------------|---------------------------------------------------------------------|-------------------------------------------------------------|-------------------------------------------------|------------------------------------------------------------------|
- | [Teeny Tiny Llama 162m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-162m) | 91.14 | 90.00 | 90.71 | 85.78 | 94.05 |
+ | [Teeny Tiny Llama 160m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m) | 91.14 | 90.00 | 90.71 | 85.78 | 94.05 |
  | [Bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) | 92.22 | 93.07 | 91.28 | 87.45 | 94.19 |
  | [Bert-large-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) | 93.58 | 92.26 | 91.57 | 88.97 | 94.11 |
  | [Gpt2-small-portuguese](https://huggingface.co/pierreguillou/gpt2-small-portuguese) | 91.60 | 86.46 | 87.42 | 86.11 | 94.07 |
@@ -195,7 +195,7 @@ for i, completion in enumerate(completions):
 
  @misc{nicholas22llama,
  doi = {10.5281/zenodo.6989727},
- url = {https://huggingface.co/nicholasKluge/TeenyTinyLlama-162m},
+ url = {https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m},
  author = {Nicholas Kluge Corrêa},
  title = {TeenyTinyLlama},
  year = {2023},
@@ -211,4 +211,4 @@ This repository was built as part of the RAIES ([Rede de Inteligência Artificia
 
  ## License
 
- TeenyTinyLlama-162m is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
+ TeenyTinyLlama-160m is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
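For reference, the README's `pipeline` example reads end to end as follows once this change is applied. This is a minimal sketch assembled from the diff's context lines: the repo id comes from the `+` lines above, and the body of the final loop is an assumption, since the hunks only show the `for i, completion in enumerate(completions):` header.

```python
# Sketch of the README's pipeline usage after this commit (assembled from the
# diff's context lines; the print statement in the loop is assumed, not shown).
from transformers import pipeline

generator = pipeline("text-generation", model="nicholasKluge/Teeny-tiny-llama-160m")

# The prompt is Portuguese for "Astronomy is the science"
completions = generator("Astronomia é a ciência", num_return_sequences=2, max_new_tokens=100)

# The text-generation pipeline returns a list of dicts with "generated_text"
for i, completion in enumerate(completions):
    print(f"Completion {i}: {completion['generated_text']}")
```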