xsum_6789_3000_1500_test

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

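from bertopic import BERTopic

# Download the saved model from the Hugging Face Hub and load it
topic_model = BERTopic.load("KingKazma/xsum_6789_3000_1500_test")

# Return one row per topic with its ID, document count, and label
topic_model.get_topic_info()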
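
Beyond listing topics, a loaded model can assign topics to unseen text. The sketch below is illustrative and not part of the original card: `assign_topics` is a hypothetical helper, and it assumes the loaded model carries a usable embedding model so that `transform` can embed new documents. It relies on the `Topic` and `Name` columns that `get_topic_info()` returns.

```python
def assign_topics(topic_model, docs):
    """Return (document, topic_id, topic_label) for each input document.

    `topic_model` is expected to behave like a fitted BERTopic instance:
    `transform(docs)` returns (topics, probabilities) and
    `get_topic_info()` returns a table with "Topic" and "Name" columns.
    """
    topics, _probabilities = topic_model.transform(docs)
    info = topic_model.get_topic_info()
    # Map each topic ID to its label, e.g. 0 -> "0_police_said_court_mr"
    labels = dict(zip(info["Topic"], info["Name"]))
    return [
        (doc, int(topic), labels.get(int(topic), "unknown"))
        for doc, topic in zip(docs, topics)
    ]

# Hypothetical usage with the model from this card:
#   from bertopic import BERTopic
#   topic_model = BERTopic.load("KingKazma/xsum_6789_3000_1500_test")
#   assign_topics(topic_model, ["The striker scored twice in the second half."])
```

A topic ID of -1 marks a document that the model treats as an outlier rather than a member of any topic.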

Topic overview

  • Number of topics: 27
  • Number of training documents: 1500
Overview of all topics:

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | said - people - would - also - one | 10 | -1_said_people_would_also |
| 0 | police - said - court - mr - found | 508 | 0_police_said_court_mr |
| 1 | mr - us - said - president - military | 144 | 1_mr_us_said_president |
| 2 | sport - team - world - race - champion | 136 | 2_sport_team_world_race |
| 3 | wales - vote - party - said - labour | 96 | 3_wales_vote_party_said |
| 4 | foul - win - right - box - half | 84 | 4_foul_win_right_box |
| 5 | care - nhs - tax - said - health | 62 | 5_care_nhs_tax_said |
| 6 | league - club - season - appearance - football | 50 | 6_league_club_season_appearance |
| 7 | wicket - cricket - england - ball - test | 36 | 7_wicket_cricket_england_ball |
| 8 | rate - share - bank - growth - price | 35 | 8_rate_share_bank_growth |
| 9 | rugby - england - wales - player - ospreys | 31 | 9_rugby_england_wales_player |
| 10 | school - teacher - education - child - council | 29 | 10_school_teacher_education_child |
| 11 | road - crash - police - collision - barrier | 27 | 11_road_crash_police_collision |
| 12 | fire - said - rescue - plane - injured | 27 | 12_fire_said_rescue_plane |
| 13 | music - radio - band - singer - show | 27 | 13_music_radio_band_singer |
| 14 | passenger - airport - railway - said - scotrail | 24 | 14_passenger_airport_railway_said |
| 15 | museum - painting - said - collection - royal | 23 | 15_museum_painting_said_collection |
| 16 | road - flooding - weather - beach - rain | 22 | 16_road_flooding_weather_beach |
| 17 | eu - trade - european - bank - deal | 19 | 17_eu_trade_european_bank |
| 18 | cell - cancer - ebola - disease - human | 18 | 18_cell_cancer_ebola_disease |
| 19 | temperature - dr - glacier - heat - researcher | 16 | 19_temperature_dr_glacier_heat |
| 20 | bitcoin - software - android - superfish - battery | 15 | 20_bitcoin_software_android_superfish |
| 21 | club - football - league - manager - rodgers | 14 | 21_club_football_league_manager |
| 22 | zwolle - pec - ajax - zidane - real | 13 | 22_zwolle_pec_ajax_zidane |
| 23 | film - best - actress - role - gillan | 12 | 23_film_best_actress_role |
| 24 | women - mexico - denmark - footed - romania | 12 | 24_women_mexico_denmark_footed |
| 25 | dairy - comedy - uk - export - food | 10 | 25_dairy_comedy_uk_export |
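
The frequencies are easier to compare as shares of the 1500 training documents. The following is a small sketch, not part of the original card: `topic_shares` is a hypothetical helper, and the counts are a few rows copied from the overview above.

```python
def topic_shares(counts, total=None):
    """Map each topic ID to the percentage of documents assigned to it."""
    total = sum(counts.values()) if total is None else total
    return {topic: round(100 * n / total, 1) for topic, n in counts.items()}

# Topic ID -> frequency, copied from the topic overview
counts = {-1: 10, 0: 508, 1: 144, 2: 136}
shares = topic_shares(counts, total=1500)

# With a loaded model, the same counts could be read from the
# "Topic" and "Count" columns of topic_model.get_topic_info().
```

Under these numbers topic 0 alone accounts for roughly a third of the corpus, while the outlier topic -1 holds fewer than 1% of the documents.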

Training hyperparameters

  • calculate_probabilities: True
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
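
These settings correspond to keyword arguments of the `BERTopic` constructor. The sketch below shows how a comparable model might be configured; `HYPERPARAMETERS` and `build_topic_model` are hypothetical names. The card does not list the embedding model or the UMAP and HDBSCAN settings, so this would not necessarily reproduce the published topics.

```python
# Values copied from the "Training hyperparameters" list
HYPERPARAMETERS = {
    "calculate_probabilities": True,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}

def build_topic_model():
    """Create an unfitted BERTopic model with the settings listed above."""
    # Imported lazily so the settings can be inspected without BERTopic installed
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMETERS)

# Hypothetical usage, where `docs` is a list of strings:
#   topic_model = build_topic_model()
#   topics, probabilities = topic_model.fit_transform(docs)
```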

Framework versions

  • Numpy: 1.22.4
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.3
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.2.2
  • Transformers: 4.31.0
  • Numba: 0.57.1
  • Plotly: 5.13.1
  • Python: 3.10.12
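
Loading a saved model can be sensitive to library versions, so it may help to compare the local environment against the list above. This is a minimal sketch using only the standard library: `version_mismatches` is a hypothetical helper, and the PyPI distribution names (for example `umap-learn` for UMAP and `scikit-learn` for Scikit-Learn) are my mapping, not stated on the card.

```python
from importlib.metadata import PackageNotFoundError, version

# PyPI distribution name -> version listed under "Framework versions"
EXPECTED = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.31.0",
    "numba": "0.57.1",
    "plotly": "5.13.1",
}

def version_mismatches(expected=EXPECTED, lookup=version):
    """Return {package: (expected, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = lookup(package)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches

# Hypothetical usage:
#   for package, (wanted, installed) in version_mismatches().items():
#       print(f"{package}: card lists {wanted}, found {installed}")
```

A mismatch does not mean the model will fail to load; it only flags where the environment differs from the one recorded at training time.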