transformers_amazon_reviews_topics

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

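from bertopic import BERTopic

# Download the model from the Hugging Face Hub and load it
topic_model = BERTopic.load("inesbattah/transformers_amazon_reviews_topics")

# Show one row per topic with its size and label
topic_model.get_topic_info()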
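
Beyond inspecting the topic table, a loaded model can also assign topics to new reviews. The following is a minimal sketch, not part of the original card: the helper names `assign_topics` and `load_and_assign` are hypothetical, while `transform` and `get_topic` are standard BERTopic methods. It assumes the embedding model used at training time can be resolved when the model is loaded from the Hub; if it cannot, one may need to be passed through the `embedding_model` argument of `BERTopic.load`.

```python
from typing import Any, List, Tuple


def assign_topics(topic_model: Any, docs: List[str], n_words: int = 5) -> List[Tuple[int, List[str]]]:
    """Return (topic_id, top keywords) for each document.

    `topic_model` is expected to behave like a fitted BERTopic instance,
    i.e. to provide `transform` and `get_topic`.
    """
    # transform returns the predicted topic IDs and (optionally) probabilities
    topics, _probs = topic_model.transform(docs)
    results = []
    for topic_id in topics:
        topic_id = int(topic_id)
        # get_topic returns a list of (word, score) pairs, or False for unknown IDs
        words = topic_model.get_topic(topic_id) or []
        results.append((topic_id, [word for word, _score in words[:n_words]]))
    return results


def load_and_assign(docs: List[str]) -> List[Tuple[int, List[str]]]:
    """Hypothetical convenience wrapper: load this card's model, then assign topics."""
    # Imported lazily so assign_topics can be reused without bertopic installed
    from bertopic import BERTopic

    topic_model = BERTopic.load("inesbattah/transformers_amazon_reviews_topics")
    return assign_topics(topic_model, docs)
```

Calling `load_and_assign(["The charger stopped working after two days."])` (an invented example review) would return one `(topic_id, keywords)` pair per input document. A topic ID of -1 means the document was treated as an outlier.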

Topic overview

  • Number of topics: 30
  • Number of training documents: 9000

Overview of all topics:

| Topic ID | Topic Keywords | Topic Frequency | Label |
|---|---|---|---|
| -1 | amazon - quality - product - cheap - seller | 10 | -1_amazon_quality_product_cheap |
| 0 | refund - ordered - order - delivered - return | 3105 | 0_refund_ordered_order_delivered |
| 1 | charging - charger - charge - iphone - headphones | 1556 | 1_charging_charger_charge_iphone |
| 2 | wear - shoe - shoes - zipper - fit | 655 | 2_wear_shoe_shoes_zipper |
| 3 | shampoo - conditioner - scent - flavor - hair | 635 | 3_shampoo_conditioner_scent_flavor |
| 4 | protector - protectors - screen - case - cases | 452 | 4_protector_protectors_screen_case |
| 5 | color - colors - colored - blue - black | 293 | 5_color_colors_colored_blue |
| 6 | bottle - leak - leaking - bottles - leaks | 234 | 6_bottle_leak_leaking_bottles |
| 7 | lights - light - bulbs - flashlight - led | 209 | 7_lights_light_bulbs_flashlight |
| 8 | dog - toy - dogs - puppy - chewed | 205 | 8_dog_toy_dogs_puppy |
| 9 | chairs - chair - assemble - screws - assembling | 192 | 9_chairs_chair_assemble_screws |
| 10 | cheap - cheaply - material - quality - cost | 181 | 10_cheap_cheaply_material_quality |
| 11 | book - books - chapters - chapter - author | 180 | 11_book_books_chapters_chapter |
| 12 | hose - faucet - pump - valve - leak | 167 | 12_hose_faucet_pump_valve |
| 13 | pan - pans - pancakes - griddle - cook | 127 | 13_pan_pans_pancakes_griddle |
| 14 | dvd - dvds - disc - discs - cd | 114 | 14_dvd_dvds_disc_discs |
| 15 | fit - fitting - didnt - galaxy - samsung | 109 | 15_fit_fitting_didnt_galaxy |
| 16 | razor - shave - razors - reviews - blades | 97 | 16_razor_shave_razors_reviews |
| 17 | cartridges - cartridge - ink - printer - printing | 97 | 17_cartridges_cartridge_ink_printer |
| 18 | watches - watch - clocks - clock - battery | 88 | 18_watches_watch_clocks_clock |
| 19 | remote - remotes - buttons - button - programmed | 78 | 19_remote_remotes_buttons_button |
| 20 | seeds - seed - planted - planting - germinated | 43 | 20_seeds_seed_planted_planting |
| 21 | thermometer - temperature - temperatureoff - temps - temp | 36 | 21_thermometer_temperature_temperatureoff_temps |
| 22 | instructions - directions - how - installation - cheap | 34 | 22_instructions_directions_how_installation |
| 23 | pistol - holster - gun - glock19 - glock | 29 | 23_pistol_holster_gun_glock19 |
| 24 | tire - tires - tube - bike - wheel | 20 | 24_tire_tires_tube_bike |
| 25 | snoring - snorkeling - snore - snorkel - snores | 17 | 25_snoring_snorkeling_snore_snorkel |
| 26 | rugs - carpets - carpet - rug - floors | 13 | 26_rugs_carpets_carpet_rug |
| 27 | waterproof - wet - swimming - bathing - raining | 12 | 27_waterproof_wet_swimming_bathing |
| 28 | fan - squealing - noise - fans - quiet | 12 | 28_fan_squealing_noise_fans |
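
The frequencies in the table above are heavily skewed: topic 0 alone covers 3105 of the 9000 training documents, and topic -1 (BERTopic's outlier bucket) holds only 10. As a small illustration, the sketch below computes each topic's share of the corpus from those counts. The frequencies are copied from the table; the function names `topic_shares` and `top_k_share` are hypothetical helpers, not part of BERTopic.

```python
from typing import Dict

# Topic frequencies copied from the topic overview table (topic ID -> documents).
TOPIC_FREQUENCY: Dict[int, int] = {
    -1: 10, 0: 3105, 1: 1556, 2: 655, 3: 635, 4: 452, 5: 293, 6: 234,
    7: 209, 8: 205, 9: 192, 10: 181, 11: 180, 12: 167, 13: 127, 14: 114,
    15: 109, 16: 97, 17: 97, 18: 88, 19: 78, 20: 43, 21: 36, 22: 34,
    23: 29, 24: 20, 25: 17, 26: 13, 27: 12, 28: 12,
}


def topic_shares(freq: Dict[int, int]) -> Dict[int, float]:
    """Return each topic's fraction of all documents."""
    total = sum(freq.values())
    return {topic: count / total for topic, count in freq.items()}


def top_k_share(freq: Dict[int, int], k: int) -> float:
    """Return the fraction of documents covered by the k largest topics."""
    total = sum(freq.values())
    largest = sorted(freq.values(), reverse=True)[:k]
    return sum(largest) / total


if __name__ == "__main__":
    shares = topic_shares(TOPIC_FREQUENCY)
    for topic, share in sorted(shares.items(), key=lambda item: -item[1])[:5]:
        print(f"topic {topic}: {share:.1%}")
    print(f"two largest topics together: {top_k_share(TOPIC_FREQUENCY, 2):.1%}")
```

The same counts are available programmatically from `topic_model.get_topic_info()`, so the hard-coded dictionary is only a convenience for reading the card offline.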

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 30
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
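
For readers who want to train a comparable model on their own data, the hyperparameters listed above map directly onto keyword arguments of the `BERTopic` constructor. The following is a sketch under assumptions rather than the author's training script: the card does not state which embedding, UMAP, or HDBSCAN settings were used, so BERTopic's defaults are assumed here, and `build_topic_model` is a hypothetical helper name.

```python
# Hyperparameters copied from the list above.
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 30,  # topics are reduced to this number after clustering
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_topic_model(**overrides):
    """Create an unfitted BERTopic model with the card's hyperparameters.

    Sub-models (embeddings, UMAP, HDBSCAN) are left at BERTopic's defaults,
    which is an assumption; the card does not document them.
    """
    # Imported lazily so the settings can be inspected without bertopic installed
    from bertopic import BERTopic

    params = {**HYPERPARAMETERS, **overrides}
    return BERTopic(**params)
```

A model built this way would then be fitted with `topics, probs = build_topic_model().fit_transform(docs)`, where `docs` is your own list of review texts. Because UMAP is stochastic and the training data is not included, the resulting topics will not match the table exactly.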

Framework versions

  • Numpy: 1.26.4
  • HDBSCAN: 0.8.39
  • UMAP: 0.5.7
  • Pandas: 2.2.2
  • Scikit-Learn: 1.5.2
  • Sentence-transformers: 3.2.1
  • Transformers: 4.44.2
  • Numba: 0.60.0
  • Plotly: 5.24.1
  • Python: 3.10.12
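
Loading a saved topic model can be sensitive to library versions, so it may help to compare the local environment against the versions listed above. The sketch below does this with the standard library's `importlib.metadata`. The PyPI distribution names (for example `umap-learn` for UMAP) are the usual ones for these libraries, and `find_mismatches` is a hypothetical helper; the Python version itself is not checked here.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above, keyed by PyPI distribution name.
PINNED = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.39",
    "umap-learn": "0.5.7",
    "pandas": "2.2.2",
    "scikit-learn": "1.5.2",
    "sentence-transformers": "3.2.1",
    "transformers": "4.44.2",
    "numba": "0.60.0",
    "plotly": "5.24.1",
}


def find_mismatches(pinned=PINNED, lookup=version):
    """Return {name: (expected, installed)} for packages that differ.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches().items():
        print(f"{name}: card lists {expected}, found {installed}")
```

A mismatch does not necessarily mean the model will fail to load; the check is only a starting point when debugging differences in behavior.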