Commit 8b3ddab by nicholasKluge
Parent(s): fd8c72c
Update README.md
README.md
CHANGED
@@ -16,11 +16,11 @@ widget:
 - text: "Esqueceram de mim 2 é o pior filme da franquia inteira."
   example_title: Exemplo
 ---
-# TeenyTinyLlama-
+# TeenyTinyLlama-160m-IMDB
 
 TeenyTinyLlama is a series of small foundational models trained in Brazilian Portuguese.
 
-This repository contains a version of [TeenyTinyLlama-
+This repository contains a version of [TeenyTinyLlama-160m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m) (`TeenyTinyLlama-160m-IMDB`) fine-tuned on the the [IMDB dataset](https://huggingface.co/datasets/christykoh/imdb_pt).
 
 ## Details
 
@@ -38,7 +38,7 @@ from transformers import pipeline
 
 text = "Esqueceram de mim 2 é um dos melhores filmes de natal de todos os tempos."
 
-classifier = pipeline("text-classification", model="nicholasKluge/TeenyTinyLlama-
+classifier = pipeline("text-classification", model="nicholasKluge/TeenyTinyLlama-160m-IMDB")
 classifier(text)
 
 # >>> [{'label': 'POSITIVE', 'score': 0.9971244931221008}]
@@ -63,13 +63,13 @@ dataset = load_dataset("christykoh/imdb_pt")
 
 # Create a `ModelForSequenceClassification`
 model = AutoModelForSequenceClassification.from_pretrained(
-    "nicholasKluge/TeenyTinyLlama-
+    "nicholasKluge/TeenyTinyLlama-160m",
     num_labels=2,
     id2label={0: "NEGATIVE", 1: "POSITIVE"},
     label2id={"NEGATIVE": 0, "POSITIVE": 1}
 )
 
-tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/TeenyTinyLlama-
+tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/TeenyTinyLlama-160m")
 
 # Preprocess the dataset
 def preprocess_function(examples):
@@ -124,7 +124,7 @@ trainer.train()
 
 | Models | [IMDB](https://huggingface.co/datasets/christykoh/imdb_pt) |
 |--------------------------------------------------------------------------------------------|------------------------------------------------------------|
-| [Teeny Tiny Llama
+| [Teeny Tiny Llama 160m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m) | 91.14 |
 | [Bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) | 92.22 |
 | [Bert-large-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased)| 93.58 |
 | [Gpt2-small-portuguese](https://huggingface.co/pierreguillou/gpt2-small-portuguese) | 91.60 |
@@ -135,7 +135,7 @@ trainer.train()
 
 @misc{nicholas22llama,
   doi = {10.5281/zenodo.6989727},
-  url = {https://huggingface.co/nicholasKluge/TeenyTinyLlama-
+  url = {https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m},
   author = {Nicholas Kluge Corrêa},
   title = {TeenyTinyLlama},
   year = {2023},
@@ -151,4 +151,4 @@ This repository was built as part of the RAIES ([Rede de Inteligência Artificia
 
 ## License
 
-TeenyTinyLlama-
+TeenyTinyLlama-160m-IMDB is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
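The README snippet touched by this commit shows the classifier returning a result such as `[{'label': 'POSITIVE', 'score': ...}]`. As a rough illustration of how such a result relates to the `id2label` mapping passed during fine-tuning, the sketch below converts two raw logits into a label and score. It assumes the pipeline applies a softmax over the two classes, which is the usual default for a two-label classifier but is not confirmed by this commit. The function name `classify_from_logits` and the logit values are hypothetical and are not part of the repository.

```python
import math

# Label mapping copied from the fine-tuning snippet in the README.
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}


def classify_from_logits(logits):
    """Turn raw two-class logits into a pipeline-style result (hypothetical helper)."""
    # Numerically stable softmax: subtract the largest logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Pick the class index with the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return [{"label": ID2LABEL[best], "score": probs[best]}]


# Hypothetical logits, chosen only for illustration.
result = classify_from_logits([-2.0, 3.0])
print(result)
```

The score is the softmax probability of the winning class, so it always lies between 0.5 and 1.0 for a two-class model.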