jarodrigues committed on
Commit 7c17f88
1 Parent(s): 3c821e4

Update README.md

Files changed (1)
  1. README.md +25 -0
README.md CHANGED
@@ -128,6 +128,31 @@ You can use this model directly with a pipeline for masked language modeling:
 
  ```
 
+ The model can be used by fine-tuning it for a specific task:
+
+ ```python
+ >>> from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer
+ >>> from datasets import load_dataset
+
+
+ >>> model = AutoModelForSequenceClassification.from_pretrained("PORTULAN/albertina-pt-pt", num_labels=2)
+ >>> tokenizer = AutoTokenizer.from_pretrained("PORTULAN/albertina-pt-pt")
+ >>> dataset = load_dataset("PORTULAN/glueptpt", "rte")
+
+ >>> def tokenize_function(examples):
+ ...     return tokenizer(examples["text"], padding="max_length", truncation=True)
+ >>> tokenized_datasets = dataset.map(tokenize_function, batched=True)
+
+ >>> training_args = TrainingArguments(output_dir="albertina-pt-pt-rte", evaluation_strategy="epoch")
+ >>> trainer = Trainer(
+ ...     model=model,
+ ...     args=training_args,
+ ...     train_dataset=tokenized_datasets["train"],
+ ...     eval_dataset=tokenized_datasets["validation"],
+ ... )
+ >>> trainer.train()
+
+ ```
 
  # Citation
 