	---
	license: apache-2.0
	datasets:
	- AyoubChLin/CNN_News_Articles_2011-2022
	language:
	- en
	metrics:
	- accuracy
	pipeline_tag: text-classification
	widget:
	- text: money in the pocket
	- text: no one can win this cup in Qatar.
	- text: Health is an essential aspect of our lives that affects us physically, mentally, and emotionally. Maintaining good health requires us to make healthy lifestyle choices, including eating a balanced diet, getting regular exercise, and getting enough sleep. These habits can help reduce the risk of developing chronic diseases such as diabetes, heart disease, and cancer.
	---
	## DistilBertForSequenceClassification on CNN News Dataset

	This repository contains a DistilBERT base model fine-tuned for sequence classification on the CNN News dataset. The model classifies news articles into one of six categories: business, entertainment, health, news, politics, and sport.

	The model was fine-tuned for four epochs, reaching a training loss of 0.012900 and a validation loss of 0.151663, with the following metrics:

	- accuracy: 0.9607394366197183
	- f1: 0.962072
	- precision: 0.961904
	- recall: 0.962324
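
For readers unfamiliar with these metrics, the following is a minimal sketch of how accuracy, precision, recall, and F1 can be computed for a multi-class classifier. It assumes macro averaging (an unweighted mean over classes); the averaging scheme used for the numbers above is not stated in this card. The function name `classification_metrics` and the toy labels are hypothetical and are not taken from the CNN News dataset.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def safe_div(a, b):
        # Return 0.0 instead of raising when the denominator is zero
        return a / b if b else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]
    n = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": safe_div(sum(precisions), n),
        "recall": safe_div(sum(recalls), n),
        "f1": safe_div(sum(f1s), n),
    }


# Toy labels for illustration only (an assumption, not real evaluation data)
y_true = ["sport", "politics", "health", "sport"]
y_pred = ["sport", "politics", "sport", "sport"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With macro averaging, every class contributes equally regardless of how many articles it contains, so a weak class pulls the averages down even when overall accuracy is high.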

	### Model Description

	- Developed by: [CHERGUELAINE Ayoub](https://www.linkedin.com/in/ayoub-cherguelaine/) & [BOUBEKRI Faycal](https://www.linkedin.com/in/faycal-boubekri-832848199/)
	- Shared by: Hugging Face
	- Model type: Text classification model (DistilBERT with a sequence classification head)
	- Language(s) (NLP): en
	- Fine-tuned from model: [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)


	### Usage

	You can use this model with the Hugging Face Transformers library to classify English news text into the six categories listed above.

	Here's an example of how to use this model for text classification in Python:

	``` python
	import torch
	from transformers import AutoTokenizer, DistilBertForSequenceClassification

	model_name = "AyoubChLin/distilbert_cnn_news"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = DistilBertForSequenceClassification.from_pretrained(model_name)

	text = "This is a news article about politics."
	inputs = tokenizer(text, padding=True, truncation=True, return_tensors="pt")

	# Run the forward pass without tracking gradients
	with torch.no_grad():
	    logits = model(**inputs).logits

	# Index of the highest-scoring class
	predicted_class_id = logits.argmax(dim=-1).item()

	# Map the index to a label name via the model config
	# (names may be generic, e.g. "LABEL_0", if the checkpoint does not define them)
	predicted_label = model.config.id2label[predicted_class_id]
	print(predicted_label)
	```
	In this example, we first load the tokenizer and the model using their respective `from_pretrained` methods. We then encode a news article with the tokenizer, pass the inputs through the model, and take the index of the highest logit with `argmax`. Finally, we map that index to its category name through the model's `id2label` mapping.
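
If you want a probability for every category rather than only the top prediction, you can apply a softmax to the logits and sort the results. The sketch below does this in plain Python so it runs without downloading the model. The helper names `softmax` and `rank_labels`, the label order in `id2label`, and the example logits are all assumptions for illustration; the real label order is defined by `model.config.id2label`.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(logits, id2label):
    """Return (label, probability) pairs sorted from most to least likely."""
    probs = softmax(logits)
    pairs = [(id2label[i], p) for i, p in enumerate(probs)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Hypothetical label order; check model.config.id2label for the real one
id2label = {
    0: "business",
    1: "entertainment",
    2: "health",
    3: "news",
    4: "politics",
    5: "sport",
}

# Made-up logits standing in for model(**inputs).logits[0].tolist()
example_logits = [0.1, -1.2, 0.3, 0.8, 4.5, -0.7]
ranking = rank_labels(example_logits, id2label)
for label, prob in ranking:
    print(f"{label}: {prob:.3f}")
```

To use this with the real model, pass `model(**inputs).logits[0].tolist()` and `model.config.id2label` in place of the made-up values.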

	### Contributors
	This model was fine-tuned by CHERGUELAINE Ayoub and BOUBEKRI Faycal.