
urdu_news_topic_modeling

This is a BERTopic model fitted on Urdu news text. BERTopic is a flexible, modular topic modeling framework that generates easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("shaistaDev7/urdu_news_topic_modeling")

topic_model.get_topic_info()
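
To assign topics to new documents rather than only inspect the fitted ones, a minimal sketch along the following lines may help. The helper name `assign_topics` is hypothetical and not part of BERTopic. The sketch also assumes the repository was saved together with a reference to its embedding model; if it was not, an `embedding_model` argument would probably have to be passed to `BERTopic.load`.

```python
from typing import List, Tuple

MODEL_ID = "shaistaDev7/urdu_news_topic_modeling"


def assign_topics(docs: List[str], model_id: str = MODEL_ID) -> List[Tuple[str, int]]:
    """Return (document, topic_id) pairs for unseen documents.

    Hypothetical helper: it wraps BERTopic.load and BERTopic.transform.
    """
    # Imported lazily so this module can be imported without bertopic installed.
    from bertopic import BERTopic

    # Downloads the model from the Hugging Face Hub on first use.
    topic_model = BERTopic.load(model_id)

    # transform() returns the predicted topic per document and the probabilities.
    topics, _probabilities = topic_model.transform(docs)
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]
```

Calling `assign_topics` with a list of Urdu news strings should return one topic ID per string, which can then be matched against the Topic ID column in the overview further down.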

Topic overview

  • Number of topics: 7
  • Number of training documents: 7991
Topic ID | Topic Keywords | Topic Frequency | Label
0 | پولیو - ڈاکٹر - مہم - بتایا - صحت | 1626 | 0_پولیو_ڈاکٹر_مہم_بتایا
1 | فلم - اداکارہ - اداکار - شادی - بالی | 1290 | 1_فلم_اداکارہ_اداکار_شادی
2 | عمران - خان - تحریک - حکومت - لیگ | 1263 | 2_عمران_خان_تحریک_حکومت
3 | روپے - مالی - ملین - سال - ڈالر | 1062 | 3_روپے_مالی_ملین_سال
4 | فون - صارفین - ویوو - موبائل - بک | 928 | 4_فون_صارفین_ویوو_موبائل
5 | ٹیم - میچ - کرکٹ - رنز - ٹورنامنٹ | 916 | 5_ٹیم_میچ_کرکٹ_رنز
6 | کورونا - وائرس - کیسز - مریض - ہزار | 906 | 6_کورونا_وائرس_کیسز_مریض
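
The frequencies in the table can be checked against the reported number of training documents. The sketch below copies the frequencies from the table, computes each topic's share, and shows how the labels appear to be formed: the topic ID followed by the first four keywords joined with underscores. The function names are hypothetical and only illustrate the arithmetic.

```python
from typing import Dict, List

# Topic ID -> frequency, copied from the topic overview table.
TOPIC_FREQUENCIES: Dict[int, int] = {
    0: 1626,
    1: 1290,
    2: 1263,
    3: 1062,
    4: 928,
    5: 916,
    6: 906,
}

# Number of training documents reported in the topic overview.
TRAINING_DOCUMENTS = 7991


def topic_shares(frequencies: Dict[int, int]) -> Dict[int, float]:
    """Return each topic's share of all assigned documents, in percent."""
    total = sum(frequencies.values())
    return {topic: 100.0 * count / total for topic, count in frequencies.items()}


def default_label(topic_id: int, keywords: List[str]) -> str:
    """Build a label in the pattern seen in the table: ID plus first four keywords."""
    return f"{topic_id}_" + "_".join(keywords[:4])


# The seven frequencies add up to the reported number of training documents.
assert sum(TOPIC_FREQUENCIES.values()) == TRAINING_DOCUMENTS
```

Because the seven frequencies sum to 7991, every training document is assigned to one of the listed topics and the table contains no separate outlier row.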

Training hyperparameters

  • calculate_probabilities: True
  • language: urdu
  • low_memory: True
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
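
As a rough illustration, the hyperparameters above map onto keyword arguments of the `BERTopic` constructor. The helper `build_topic_model` is hypothetical, and the sketch assumes a BERTopic release that accepts the zero-shot arguments. The card does not state the embedding model or the UMAP and HDBSCAN settings, so this would not necessarily reproduce the published topics.

```python
from typing import Any, Dict

# Values copied from the "Training hyperparameters" list.
HYPERPARAMS: Dict[str, Any] = {
    "calculate_probabilities": True,
    "language": "urdu",
    "low_memory": True,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_topic_model(**overrides: Any):
    """Create an unfitted BERTopic instance with the card's hyperparameters.

    Hypothetical helper; keyword overrides replace the listed values.
    """
    # Imported lazily so this module can be imported without bertopic installed.
    from bertopic import BERTopic

    params = {**HYPERPARAMS, **overrides}
    return BERTopic(**params)
```

A model built this way would still need `fit_transform` on a corpus of Urdu documents before it produces any topics.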

Framework versions

  • Numpy: 1.23.5
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.5
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.2.2
  • Transformers: 4.35.2
  • Numba: 0.58.1
  • Plotly: 5.15.0
  • Python: 3.10.12
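
Loading a saved model in an environment with different library versions can sometimes cause problems, so it may be useful to compare the local installation with the versions listed above. The sketch below uses only the standard library. The distribution names (for example `umap-learn` for UMAP) are assumptions about how the packages are published on PyPI, and `version_mismatches` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Tuple

# Distribution name (assumed PyPI name) -> version listed in the card.
CARD_VERSIONS: Dict[str, str] = {
    "numpy": "1.23.5",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.5",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.35.2",
    "numba": "0.58.1",
    "plotly": "5.15.0",
}


def version_mismatches(
    expected: Dict[str, str] = CARD_VERSIONS,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each differing package to (card version, installed version or None)."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for package, wanted in expected.items():
        try:
            installed: Optional[str] = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches
```

An empty result means the installed packages match the card exactly; a non-empty result is only a hint, since newer versions often load the model without trouble.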