---
license: apache-2.0
datasets:
- AyoubChLin/CNN_News_Articles_2011-2022
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- news classification
widget:
- text: money in the pocket
- text: no one can win this cup in quatar..
---

# Fine-Tuned BART Model for Text Classification on CNN News Articles

This is a BART (Bidirectional and Auto-Regressive Transformers) model fine-tuned for text classification on CNN news articles. It was trained for one epoch on CNN news articles labeled by topic (the `AyoubChLin/CNN_News_Articles_2011-2022` dataset listed in the metadata), with a batch size of 32 and a learning rate of 6e-5.

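For readers who want to reproduce a similar run, the hyperparameters stated above can be collected in one place. This is only a sketch: the key names mirror parameters of `transformers.TrainingArguments`, which is an assumption about tooling, since this card does not say which training loop was used. The training-set size in the example is hypothetical.

```python
import math

# Values taken from the description above. The key names mirror
# transformers.TrainingArguments parameters (an assumption about tooling;
# the card does not name the training loop that was used).
hyperparameters = {
    "per_device_train_batch_size": 32,
    "learning_rate": 6e-5,
    "num_train_epochs": 1,
}

def optimizer_steps(num_examples, batch_size, epochs):
    """Optimizer steps on a single device without gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs

# Hypothetical training-set size, for illustration only
steps = optimizer_steps(
    10_000,
    hyperparameters["per_device_train_batch_size"],
    hyperparameters["num_train_epochs"],
)
print(steps)
```
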
## How to Use

### Install

The example below uses PyTorch tensors, so install `torch` alongside `transformers`:

```bash
pip install transformers torch
```

### Example Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "IT-community/BART_cnn_news_text_classification"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Tokenize input text
text = "This is an example CNN news article about politics."
inputs = tokenizer(text, padding=True, truncation=True, max_length=512, return_tensors="pt")

# Make prediction (no gradients are needed for inference)
with torch.no_grad():
    outputs = model(**inputs)
predicted_label = torch.argmax(outputs.logits, dim=-1).item()

# Print the class index and the label name stored in the model config
print(predicted_label, model.config.id2label[predicted_label])
```

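The prediction step takes the argmax of the logits. To also see how confident the model is, the logits can be turned into probabilities with a softmax. The sketch below is a minimal pure-Python illustration: the helper `top_k_labels` is hypothetical, and the logits and label names are made up for demonstration and are not this model's actual classes.

```python
import math

def top_k_labels(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs for one example."""
    # Subtract the max logit before exponentiating for numerical stability
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort class indices by descending probability
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]

# Hypothetical logits and label names, for illustration only
example_logits = [2.0, 0.5, -1.0, 0.1]
example_id2label = {0: "politics", 1: "sport", 2: "business", 3: "health"}

for label, prob in top_k_labels(example_logits, example_id2label, k=2):
    print(f"{label}: {prob:.3f}")
```

With the real model, the inputs would be `outputs.logits[0].tolist()` and `model.config.id2label`.
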
## Evaluation

The model achieved the following performance metrics on the test set:

- Accuracy: 0.9591836734693877
- F1-score: 0.958301875401112
- Recall: 0.9591836734693877
- Precision: 0.9579673040369542

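This card does not say how precision, recall, and F1 were averaged across classes. The reported recall is identical to the accuracy, which is consistent with support-weighted averaging (weighted recall always equals accuracy), so the sketch below assumes that mode. The function `weighted_metrics` and the toy labels are illustrative only and are not the evaluation code or data used for this model.

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        # Each class contributes in proportion to its share of true examples
        weight = count / n
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy labels for illustration only, not the model's test set
y_true = ["politics", "politics", "sport", "sport", "business"]
y_pred = ["politics", "sport", "sport", "sport", "business"]
print(weighted_metrics(y_true, y_pred))
```
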
## About Us

We are IT Community, a scientific club at Saad Dahleb Blida University, created in 2016 by students. We are interested in all IT fields.

This work was done by the IT Community club.

### Contributions

[Cherguelaine Ayoub](https://huggingface.co/AyoubChLin):

- Added preprocessing code for CNN news articles
- Improved model performance with additional fine-tuning on a larger dataset