metadata
tags:
  - bertopic
library_name: bertopic
pipeline_tag: text-classification

short-pubmed-bertopic

This is a BERTopic model. BERTopic is a flexible, modular topic modeling framework that generates easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("etanios/short-pubmed-bertopic")

topic_model.get_topic_info()
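
Beyond inspecting the topic table, a loaded model can assign topics to new text. The sketch below is a minimal illustration, not part of the original card: `describe_assignments` and `demo` are hypothetical helper names, and the example sentence is invented. It relies only on the standard BERTopic methods `transform` and `get_topic`, and it assumes the embedding model stored with the checkpoint can be loaded in your environment.

```python
from typing import List, Sequence, Tuple


def describe_assignments(
    topic_model, docs: Sequence[str], n_words: int = 5
) -> List[Tuple[str, int, List[str]]]:
    """Return (document, topic_id, top keywords) for each input document.

    `topic_model` is expected to be a loaded BERTopic instance.
    """
    # transform() returns the predicted topic IDs and (optionally) probabilities.
    topics, _ = topic_model.transform(list(docs))
    rows = []
    for doc, topic_id in zip(docs, topics):
        # get_topic() yields (word, score) pairs, or a falsy value if the
        # topic ID is unknown to the model.
        words = topic_model.get_topic(int(topic_id)) or []
        keywords = [word for word, _score in words[:n_words]]
        rows.append((doc, int(topic_id), keywords))
    return rows


def demo() -> None:
    """Hypothetical usage; downloads the model from the Hugging Face Hub."""
    from bertopic import BERTopic  # imported lazily so the helper stays light

    model = BERTopic.load("etanios/short-pubmed-bertopic")
    docs = ["Plasma renin activity and sodium excretion in hypertensive patients."]
    for _doc, topic_id, keywords in describe_assignments(model, docs):
        print(topic_id, keywords)
```

Calling `demo()` prints the assigned topic ID and its leading keywords. A topic ID of -1 means the document was treated as an outlier.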

Topic overview

  • Number of topics: 32 (including the outlier topic -1)
  • Number of training documents: 9999
| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | patients - activity - cells - acid - 10 | 53 | Medical research studies |
| 0 | health - children - care - medical - patient | 3897 | Healthcare and Mental Health |
| 1 | renal - pressure - renin - blood - sodium | 702 | Renal physiology and hypertension |
| 2 | progesterone - pregnancy - lh - testosterone - fetal | 516 | Reproductive physiology and endocrinology |
| 3 | ventricular - coronary - left - patients - myocardial | 404 | Cardiac physiology and anatomy |
| 4 | method - gas - liquid - chromatography - urine | 381 | Analytical techniques for chemical analysis |
| 5 | morphine - mg kg - kg - brain - mg | 296 | Effects of drugs on rat behavior |
| 6 | cells - tumor - mice - cell - lymphocytes | 281 | Tumor immunology and lymphocyte activation |
| 7 | hip - patients - joint - rheumatoid - knee | 278 | Hip and Knee Fractures in Rheumatoid Arthritis Patients |
| 8 | cells - electron - cell - axons - synaptic | 265 | Neuroanatomy and Synaptic Structures |
| 9 | retinal - eyes - vitreous - eye - cases | 237 | Ocular Surgery |
| 10 | patients - ulcer - cases - disease - crohn | 203 | Gastrointestinal surgery and complications |
| 11 | strains - resistant - antibiotic - gentamicin - antibiotics | 201 | Antibiotic resistance in bacterial strains |
| 12 | enzyme - molecular - acid - activity - phosphate | 201 | Enzymology and Molecular Biology |
| 13 | iron - hemoglobin - hb - erythrocytes - deficiency | 178 | Iron Deficiency Anemia Research |
| 14 | infection - virus - infected - vaccinated - cattle | 175 | Virus infection and vaccination studies |
| 15 | platelet - platelets - aggregation - fibrinogen - coagulation | 167 | Platelet function and coagulation |
| 16 | cancer - carcinoma - patients - tumor - cases | 158 | Cancer treatment and survival rates |
| 17 | lung - volume - oxygen - flow - ventilation | 152 | Pulmonary Function and Exercise |
| 18 | calcium - thyroid - t3 - parathyroid - t4 | 151 | Thyroid and Parathyroid Hormones |
| 19 | fatty - cholesterol - acid - liver - acids | 144 | Lipid metabolism in rats |
| 20 | diet - weight - egg - diets - fed | 117 | Poultry Nutrition and Physiology |
| 21 | stimulation - nerve - neurons - units - motor | 111 | Motor Neuron Discharge and Potentials |
| 22 | glucose - insulin - glucagon - levels - blood glucose | 102 | Insulin and glucose metabolism |
| 23 | hearing - ear - cochlear - auditory - noise | 99 | Hearing and Auditory Function |
| 24 | rna - dna - chromatin - proteins - polymerase | 95 | |
| 25 | carotid - artery - intracranial - aneurysms - subdural | 94 | Carotid Artery Aneurysms and Intracranial Hemorrhage |
| 26 | bone - mineral - fluoride - enamel - mineralization | 81 | Bone and Mineral Metabolism |
| 27 | temperature - degrees - heat - temperatures - cold | 75 | Effects of Temperature on Human Physiology |
| 28 | bile - acid - biliary - cholesterol - bile acid | 63 | Biliary Lipid Metabolism |
| 29 | visual - light - receptive - sensitivity - dark | 61 | Visual processing and sensitivity |
| 30 | lung - pulmonary - chest - pleural - patients | 61 | Respiratory system diseases and disorders |
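
The frequencies above are uneven: topic 0 alone accounts for 3897 of the 9999 training documents, roughly 39%. The following sketch shows that arithmetic using only the counts listed in this card. `TOPIC_FREQUENCIES` and `topic_shares` are illustrative names, not part of BERTopic.

```python
# Topic ID -> number of training documents, copied from the topic overview.
TOPIC_FREQUENCIES = {
    -1: 53, 0: 3897, 1: 702, 2: 516, 3: 404, 4: 381, 5: 296, 6: 281,
    7: 278, 8: 265, 9: 237, 10: 203, 11: 201, 12: 201, 13: 178, 14: 175,
    15: 167, 16: 158, 17: 152, 18: 151, 19: 144, 20: 117, 21: 111,
    22: 102, 23: 99, 24: 95, 25: 94, 26: 81, 27: 75, 28: 63, 29: 61,
    30: 61,
}


def topic_shares(frequencies):
    """Return each topic's fraction of all documents."""
    total = sum(frequencies.values())
    return {topic_id: count / total for topic_id, count in frequencies.items()}


shares = topic_shares(TOPIC_FREQUENCIES)
largest = max(shares, key=shares.get)
print(f"Topic {largest} holds {shares[largest]:.1%} of the documents")
print(f"Outlier topic -1 holds {shares[-1]:.1%} of the documents")
```

With a live model, the same counts should be available from the `Count` column of `topic_model.get_topic_info()`.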

Training hyperparameters

  • calculate_probabilities: False
  • language: None
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
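
The names listed above match keyword arguments of the `BERTopic` constructor, so a comparable untrained model could be configured as sketched below. This is an assumption-laden illustration rather than the author's training script: `build_topic_model` is a hypothetical helper, and the `embedding_model` argument is left to the caller because the card does not name the embedding model used.

```python
# Hyperparameters exactly as reported in this card.
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
}


def build_topic_model(embedding_model=None):
    """Create an unfitted BERTopic instance with the reported settings.

    `embedding_model` is an assumption: pass a sentence-transformers model
    name or object of your choice, since the card does not specify one.
    """
    from bertopic import BERTopic  # imported lazily; requires `pip install bertopic`

    return BERTopic(embedding_model=embedding_model, **HYPERPARAMETERS)
```

Fitting the result with `fit_transform(docs)` on your own corpus will not reproduce the topics in this card unless the documents, embedding model, and random seeds also match.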

Framework versions

  • Numpy: 1.23.5
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.4
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.2.2
  • Transformers: 4.33.2
  • Numba: 0.56.4
  • Plotly: 5.15.0
  • Python: 3.10.12
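
Loading a saved model can be sensitive to library versions, so it may help to compare your environment with the list above. The sketch below uses only the standard library. The distribution names in `PINNED` are the PyPI package names assumed to correspond to the entries in this card (for example `umap-learn` for UMAP), and `version_mismatches` is a hypothetical helper.

```python
from importlib import metadata

# PyPI distribution name -> version reported in this card.
PINNED = {
    "numpy": "1.23.5",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.4",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.33.2",
    "numba": "0.56.4",
    "plotly": "5.15.0",
}


def version_mismatches(pinned=PINNED):
    """Return {package: (expected, installed)} for every differing package.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, expected in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches
```

Calling `version_mismatches()` returns an empty dictionary when the environment matches the card. A mismatch does not necessarily mean the model will fail to load.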