nicholasKluge committed · Commit 190f7f3 · 1 Parent(s): 43df3be · Update README.md
README.md CHANGED
@@ -57,7 +57,7 @@ This repository has the [source code](https://github.com/Nkluge-correa/Aira) use
 
 The ToxicityModelPT was trained as an auxiliary reward model for RLHF training (its logit outputs can be treated as penalizations/rewards). Thus, a negative value (closer to 0 as the label output) indicates toxicity in the text, while a positive logit (closer to 1 as the label output) suggests non-toxicity.
 
-Here's an example of how to use the
+Here's an example of how to use the ToxicityModelPT to score the toxicity of a text:
 
 ```python
 from transformers import AutoTokenizer, AutoModelForSequenceClassification
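For reference, here is a minimal sketch of how the snippet introduced by this commit might continue. The checkpoint identifier `nicholasKluge/ToxicityModelPT` and the single-logit (reward-style) output head are assumptions based on the README's description, not something confirmed by the truncated diff itself:

```python
# Sketch only: checkpoint name and single-logit head are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "nicholasKluge/ToxicityModelPT"  # assumed Hub repository name
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
model.eval()

# Tokenize a text and read out the raw logit as a toxicity score.
text = "Bom dia, tudo bem com você?"
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    # Assumes the classification head emits a single logit per input,
    # as a reward model typically does.
    score = model(**inputs).logits[0].item()

print(f"Toxicity score (reward): {score:.4f}")
```

Per the README text in the diff above, a negative score would act as a penalization during RLHF training, while a positive score acts as a reward (i.e., suggests non-toxic text).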