parliament_topic_model

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("daniel-023/parliament_topic_model")

topic_model.get_topic_info()

Topic overview

  • Number of topics: 20 (including the outlier topic -1)
  • Number of training documents: 2005
Overview of all topics:

Topic ID | Topic Keywords | Topic Frequency | Label
-1 | minister - singapore - member - time - government | 16 | -1_minister_singapore_member_time
0 | education - teachers - schools - school - minister | 541 | 0_education_teachers_schools_school
1 | water - reclamation - land - development - minister | 210 | 1_water_reclamation_land_development
2 | million - singapore - government - finance - year | 202 | 2_million_singapore_government_finance
3 | service - police - national - minister - officers | 187 | 3_service_police_national_minister
4 | law - council - house - members - committee | 140 | 4_law_council_house_members
5 | singapore - identity - citizenship - minister - cards | 112 | 5_singapore_identity_citizenship_minister
6 | bus - buses - taxis - transport - taxi | 88 | 6_bus_buses_taxis_transport
7 | property - land - tax - board - flats | 81 | 7_property_land_tax_board
8 | farmers - prices - minister - price - production | 79 | 8_farmers_prices_minister_price
9 | singapore - people - countries - government - foreign | 70 | 9_singapore_people_countries_government
10 | culture - cultural - programmes - films - people | 49 | 10_culture_cultural_programmes_films
11 | abortion - abortions - family - medical - women | 48 | 11_abortion_abortions_family_medical
12 | fund - pension - citizenship - age - years | 38 | 12_fund_pension_citizenship_age
13 | airport - telephone - passengers - singapore - terminal | 37 | 13_airport_telephone_passengers_singapore
14 | sports - games - national - singapore - national sports | 29 | 14_sports_games_national_singapore
15 | drug - drugs - medicines - advertisements - medical | 24 | 15_drug_drugs_medicines_advertisements
16 | health - mosquitoes - mosquito - hawkers - rubbish | 20 | 16_health_mosquitoes_mosquito_hawkers
17 | brigade - sports - minister - station - firefighting | 17 | 17_brigade_sports_minister_station
18 | hawkers - market - hawker - stalls - markets | 17 | 18_hawkers_market_hawker_stalls
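
As a quick sanity check on the topic table, the per-topic frequencies can be tallied and compared with the stated number of training documents. The sketch below hard-codes the counts copied from the table; the variable names are illustrative, and it assumes that topic -1 is BERTopic's outlier bucket.

```python
# Topic frequencies copied from the overview table (topic ID -> document count).
topic_frequency = {
    -1: 16, 0: 541, 1: 210, 2: 202, 3: 187, 4: 140, 5: 112, 6: 88, 7: 81,
    8: 79, 9: 70, 10: 49, 11: 48, 12: 38, 13: 37, 14: 29, 15: 24, 16: 20,
    17: 17, 18: 17,
}

total_documents = sum(topic_frequency.values())

# Share of documents left unassigned (topic -1 is assumed to be the outlier topic).
outlier_share = topic_frequency[-1] / total_documents

# Topic with the most documents and its share of the corpus.
largest_topic = max(topic_frequency, key=topic_frequency.get)
largest_share = topic_frequency[largest_topic] / total_documents

print(f"topics: {len(topic_frequency)}, documents: {total_documents}")
print(f"outlier share: {outlier_share:.1%}")
print(f"largest topic: {largest_topic} ({largest_share:.1%})")
```

The counts sum to 2005, which matches the number of training documents listed for this model.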

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 20
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
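
For reference, the hyperparameters listed here correspond to keyword arguments of the BERTopic constructor. The following is a minimal sketch of how a comparable, untrained model could be configured. It is not the author's training script: the embedding model and the UMAP and HDBSCAN settings are not listed on this card, so they are left at BERTopic's defaults here, and `build_topic_model` and `docs` are hypothetical names.

```python
# Hyperparameters as listed on this card; everything else stays at BERTopic's defaults.
hyperparameters = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 20,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_topic_model():
    """Construct an untrained BERTopic instance with the settings above."""
    # Imported lazily so the dictionary can be inspected without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**hyperparameters)


# Hypothetical usage: `docs` would be a list of strings (for example, speeches).
# topic_model = build_topic_model()
# topics, probs = topic_model.fit_transform(docs)
```

Because the training corpus and sub-model settings are not part of this card, fitting such a model on other data would not be expected to reproduce the topics shown in the overview.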

Framework versions

  • Numpy: 1.26.4
  • HDBSCAN: 0.8.37
  • UMAP: 0.5.5
  • Pandas: 2.2.0
  • Scikit-Learn: 1.4.1.post1
  • Sentence-transformers: 2.4.0
  • Transformers: 4.43.3
  • Numba: 0.60.0
  • Plotly: 5.23.0
  • Python: 3.12.1
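
Loading a saved model can be sensitive to library versions, so it may help to compare the local environment with the versions listed here. The sketch below uses only the standard library; the PyPI distribution names (for example `umap-learn` for UMAP) and the helper `compare_versions` are assumptions of this example, not part of the model card.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed on this card, keyed by (assumed) PyPI distribution name.
card_versions = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.37",
    "umap-learn": "0.5.5",
    "pandas": "2.2.0",
    "scikit-learn": "1.4.1.post1",
    "sentence-transformers": "2.4.0",
    "transformers": "4.43.3",
    "numba": "0.60.0",
    "plotly": "5.23.0",
}


def compare_versions(expected):
    """Return {package: (expected, installed)} for every package that differs.

    `installed` is None when the package is not installed.
    """
    differences = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            differences[package] = (wanted, installed)
    return differences


for package, (wanted, installed) in compare_versions(card_versions).items():
    print(f"{package}: card lists {wanted}, found {installed or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load; it only flags where the environment differs from the one recorded at training time.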