

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.


To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("Thang203/china-only-mar11")
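
Once loaded, the model can be queried for the keywords behind each topic. The following is a minimal sketch, not part of the original card. It assumes BERTopic's `get_topic` method, which returns a list of (word, score) pairs for a known topic ID. The helper names `top_keywords` and `show_topics` are hypothetical and exist only for illustration.

```python
def top_keywords(topic_model, topic_id, n=5):
    """Return up to n keywords for topic_id, or [] if the topic is unknown."""
    words = topic_model.get_topic(topic_id)
    # get_topic is assumed to return a falsy value for an unknown topic ID.
    if not words:
        return []
    return [word for word, _score in words[:n]]


def show_topics(repo_id="Thang203/china-only-mar11", topic_ids=(0, 1, 7)):
    """Load the published model and print keywords for a few topics."""
    # Imported lazily; requires `pip install -U bertopic` and network access.
    from bertopic import BERTopic

    topic_model = BERTopic.load(repo_id)
    for topic_id in topic_ids:
        print(topic_id, top_keywords(topic_model, topic_id))
```

Calling `show_topics()` downloads the model from the Hub, so it is left uncalled here.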


Topic overview

  • Number of topics: 20
  • Number of training documents: 847
| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | language - llms - models - data - large | 21 | -1_language_llms_models_data |
| 0 | visual - image - multimodal - models - language | 205 | 0_visual_image_multimodal_models |
| 1 | embodied - driving - navigation - robot - robotic | 142 | 1_embodied_driving_navigation_robot |
| 2 | recommendation - user - recommendations - systems - behavior | 16 | 2_recommendation_user_recommendations_systems |
| 3 | agents - social - bots - interactions - ai agents | 16 | 3_agents_social_bots_interactions |
| 4 | rl - reinforcement learning - reinforcement - learning - policy | 15 | 4_rl_reinforcement learning_reinforcement_learning |
| 5 | molecular - design - property - prediction - gnns | 17 | 5_molecular_design_property_prediction |
| 6 | code - code generation - generation - software - programming | 11 | 6_code_code generation_generation_software |
| 7 | medical - knowledge - medical knowledge - llms - language | 73 | 7_medical_knowledge_medical knowledge_llms |
| 8 | extraction - information extraction - event - information - relation | 16 | 8_extraction_information extraction_event_information |
| 9 | safety - llms - robustness - instructions - assurance | 15 | 9_safety_llms_robustness_instructions |
| 10 | reasoning - prompting - cot - llms - chainofthought | 14 | 10_reasoning_prompting_cot_llms |
| 11 | knowledge - language - knowledge graph - web - kg | 52 | 11_knowledge_language_knowledge graph_web |
| 12 | question - answering - commonsense - question answering - knowledge | 17 | 12_question_answering_commonsense_question answering |
| 13 | models - language - model - training - language models | 18 | 13_models_language_model_training |
| 14 | dialogue - dialog - models - responses - model | 104 | 14_dialogue_dialog_models_responses |
| 15 | detection - fake - news - detectors - texts | 31 | 15_detection_fake_news_detectors |
| 16 | chatgpt - sentiment - evaluation - sentiment analysis - human | 16 | 16_chatgpt_sentiment_evaluation_sentiment analysis |
| 17 | chinese - evaluation - models - language - language models | 22 | 17_chinese_evaluation_models_language |
| 18 | translation - arabic - languages - language - models | 26 | 18_translation_arabic_languages_language |
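
As a quick sanity check, the frequencies in the topic table can be summed and compared against the reported document count. The sketch below is illustrative only: the numbers are copied by hand from the table, and the variable names are not part of the model. In BERTopic, topic -1 collects outlier documents that were not assigned to any cluster.

```python
# Topic frequencies copied from the topic table, keyed by topic ID.
topic_frequency = {
    -1: 21, 0: 205, 1: 142, 2: 16, 3: 16, 4: 15, 5: 17, 6: 11, 7: 73, 8: 16,
    9: 15, 10: 14, 11: 52, 12: 17, 13: 18, 14: 104, 15: 31, 16: 16, 17: 22,
    18: 26,
}

total_documents = sum(topic_frequency.values())
outlier_share = topic_frequency[-1] / total_documents
largest_topic = max(topic_frequency, key=topic_frequency.get)

print(f"topics: {len(topic_frequency)}")          # topics: 20
print(f"documents: {total_documents}")            # documents: 847
print(f"outlier share: {outlier_share:.1%}")      # outlier share: 2.5%
print(f"largest topic: {largest_topic}")          # largest topic: 0
```

The totals agree with the 20 topics and 847 training documents reported in the overview.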

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 20
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
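
The hyperparameters above correspond to keyword arguments of the `BERTopic` constructor. The sketch below shows how they might be passed when creating a new, unfitted model. It assumes a BERTopic release that accepts the zero-shot arguments, and the names `HYPERPARAMETERS` and `build_untrained_model` are hypothetical. The card does not list the embedding model, the UMAP or HDBSCAN settings, or the training documents, so this alone would not reproduce the published topics.

```python
# Hyperparameters as listed on the card, expressed as constructor arguments.
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 20,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_untrained_model():
    """Create a new, unfitted BERTopic instance with the listed settings."""
    # Imported lazily; requires `pip install -U bertopic`.
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMETERS)
```

Fitting would then be done with `build_untrained_model().fit_transform(docs)` on your own list of documents.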

Framework versions

  • Numpy: 1.25.2
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.5
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.6.1
  • Transformers: 4.38.2
  • Numba: 0.58.1
  • Plotly: 5.15.0
  • Python: 3.10.12
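
If loading the model fails or behaves unexpectedly, comparing the local environment against the versions above can help narrow down the cause. The following is a rough sketch using only the standard library. The mapping from the names on the card to PyPI distribution names (for example UMAP to `umap-learn`) is an assumption, and `find_version_mismatches` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the card, keyed by assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "numpy": "1.25.2",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.5",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.6.1",
    "transformers": "4.38.2",
    "numba": "0.58.1",
    "plotly": "5.15.0",
}


def find_version_mismatches(expected=EXPECTED_VERSIONS):
    """Return {package: (expected, installed)} for packages that differ.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, installed) in find_version_mismatches().items():
        print(f"{package}: expected {wanted}, found {installed}")
```

A mismatch is not necessarily a problem; newer versions of these libraries may still load the model correctly.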