---
title: News Summarizer and NER
emoji: 🏢
colorFrom: green
colorTo: indigo
sdk: streamlit
sdk_version: 1.29.0
app_file: app.py
pinned: false
license: mit
---

#### News Summarization and NER

News summarization uses "facebook/bart-base", fine-tuned with TensorFlow for summarization on the 
<a href="https://www.kaggle.com/datasets/gowrishankarp/newspaper-text-summarization-cnn-dailymail" target="_blank">CNN news articles</a> dataset.<br><br>
NER uses "microsoft/deberta-base", fine-tuned with TensorFlow for token classification (NER) on this 
<a href="https://www.kaggle.com/datasets/saurabhprajapat/named-entity-recognition" target="_blank">dataset</a>.<br>The fine-tuning dataset consists of annotated sentences.<br>
During inference, the input text is split into sentences using spaCy, and entities are identified in each sentence separately.<br>
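
The sentence-by-sentence inference flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: `naive_split`, `toy_tagger`, `ner_per_sentence`, the gazetteer, and its label names are all hypothetical stand-ins, so that the example runs without spaCy or the fine-tuned model installed.

```python
import re
from typing import Callable, Dict, List

Entity = Dict[str, str]


def naive_split(text: str) -> List[str]:
    """Stand-in sentence splitter (assumption): the app uses spaCy instead."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


# Hypothetical lookup table standing in for the fine-tuned DeBERTa model.
GAZETTEER = {"Alice": "per", "Bob": "per", "Paris": "geo"}


def toy_tagger(sentence: str) -> List[Entity]:
    """Stand-in tagger (assumption): labels words found in GAZETTEER."""
    return [
        {"word": word, "label": GAZETTEER[word]}
        for word in re.findall(r"\w+", sentence)
        if word in GAZETTEER
    ]


def ner_per_sentence(
    text: str,
    split_sentences: Callable[[str], List[str]],
    tag_sentence: Callable[[str], List[Entity]],
) -> List[dict]:
    """Split the text into sentences, then tag each sentence on its own."""
    return [
        {"sentence": sentence, "entities": tag_sentence(sentence)}
        for sentence in split_sentences(text)
    ]


if __name__ == "__main__":
    for row in ner_per_sentence(
        "Alice flew to Paris. She met Bob!", naive_split, toy_tagger
    ):
        print(row)
```

In a real setup, `split_sentences` could wrap a spaCy pipeline (for example, `spacy.blank("en")` with the `sentencizer` component added, reading sentences from `doc.sents`), and `tag_sentence` could wrap the fine-tuned token-classification model. Tagging one sentence at a time presumably keeps the inference inputs close to the annotated sentences the model was fine-tuned on.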

The notebook to fine-tune "facebook/bart-base" for news summarization can be found <a href="https://github.com/ksv-muralidhar/hugging_face_tf_fine_tuning/blob/main/bart_en_summarization.ipynb">here</a>.<br>
The notebook to fine-tune "microsoft/deberta-base" for NER can be found <a href="https://github.com/ksv-muralidhar/hugging_face_tf_fine_tuning/blob/main/ner_deberta.ipynb">here</a>.