nicholasKluge committed on
Commit 00d77a3
1 Parent(s): c0b28fa

Update README.md

Files changed (1):
  1. README.md +15 -15
README.md CHANGED
@@ -122,26 +122,26 @@ trainer.train()
 
 ## Fine-Tuning Comparisons
 
-| Models                                                                                      | [IMDB](https://huggingface.co/datasets/christykoh/imdb_pt) |
-|---------------------------------------------------------------------------------------------|------------------------------------------------------------|
-| [Bert-large-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) | 93.58                                                      |
-| [Teeny Tiny Llama 460m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m)           | 92.28                                                      |
-| [Bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased)  | 92.22                                                      |
-| [Gpt2-small-portuguese](https://huggingface.co/pierreguillou/gpt2-small-portuguese)         | 91.60                                                      |
-| [Teeny Tiny Llama 160m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m)           | 91.14                                                      |
+To further evaluate the downstream capabilities of our models, we employed a basic fine-tuning procedure for our TTL pair on a subset of tasks from the Poeta benchmark. For comparison, we applied the same procedure to both [BERTimbau](https://huggingface.co/neuralmind/bert-base-portuguese-cased) models, given that they are also LLMs trained from scratch in Brazilian Portuguese and fall in a similar size range to our models. We used these comparisons to assess whether our pre-training runs produced LLMs capable of good results ("good" here meaning "close to BERTimbau") when utilized for downstream applications.
+
+| Models          | IMDB      | FaQuAD-NLI | HateBr    | Assin2    | AgNews    | Average |
+|-----------------|-----------|------------|-----------|-----------|-----------|---------|
+| BERTimbau-large | **93.58** | 92.26      | 91.57     | **88.97** | 94.11     | 92.10   |
+| BERTimbau-small | 92.22     | **93.07**  | 91.28     | 87.45     | 94.19     | 91.64   |
+| **TTL-460m**    | 91.64     | 91.18      | **92.28** | 86.43     | **94.42** | 91.19   |
+| **TTL-160m**    | 91.14     | 90.00      | 90.71     | 85.78     | 94.05     | 90.34   |
+
+All reported results are the highest accuracy scores achieved on the respective task test sets after fine-tuning the models on the training sets. All fine-tuning runs used the same hyperparameters, and the code implementation can be found in the [model cards](https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m-HateBR) of our fine-tuned models.
 
 ## Cite as 🤗
 
 ```latex
 
-@misc{nicholas22llama,
-  doi = {10.5281/zenodo.6989727},
-  url = {https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m},
-  author = {Nicholas Kluge Corrêa},
-  title = {TeenyTinyLlama},
-  year = {2023},
-  publisher = {HuggingFace},
-  journal = {HuggingFace repository},
+@misc{correa24ttllama,
+  title = {TeenyTinyLlama: a pair of open-source tiny language models trained in Brazilian Portuguese},
+  author = {Corr{\^e}a, Nicholas Kluge and Falk, Sophia and Fatimah, Shiza and Sen, Aniket and De Oliveira, Nythamar},
+  journal = {arXiv},
+  year = {2024},
 }
 
 ```
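
As a quick sanity check on the fine-tuning table above, the Average column should equal the row mean of the five task scores. A minimal stdlib-only sketch (scores transcribed from the table; a 0.005 tolerance absorbs two-decimal rounding):

```python
# Per-task accuracy scores: IMDB, FaQuAD-NLI, HateBr, Assin2, AgNews.
scores = {
    "BERTimbau-large": [93.58, 92.26, 91.57, 88.97, 94.11],
    "BERTimbau-small": [92.22, 93.07, 91.28, 87.45, 94.19],
    "TTL-460m":        [91.64, 91.18, 92.28, 86.43, 94.42],
    "TTL-160m":        [91.14, 90.00, 90.71, 85.78, 94.05],
}

# "Average" column as reported in the table.
reported = {
    "BERTimbau-large": 92.10,
    "BERTimbau-small": 91.64,
    "TTL-460m":        91.19,
    "TTL-160m":        90.34,
}

for model, row in scores.items():
    mean = sum(row) / len(row)
    ok = abs(mean - reported[model]) < 0.005  # matches once rounded to 2 decimals
    print(f"{model}: computed {mean:.2f}, reported {reported[model]:.2f}, match={ok}")
```

Every row mean reproduces the reported average, so the table is internally consistent.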