china-only-mar11

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

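from bertopic import BERTopic

# Download the fitted model from the Hugging Face Hub and load it
topic_model = BERTopic.load("Thang203/china-only-mar11")

# Summarize every topic: ID, document count, and label
topic_model.get_topic_info()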
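
Once the model is loaded, individual topics can be inspected as well. The sketch below wraps BERTopic's `get_topic` method, which returns a list of `(word, weight)` pairs for a known topic ID and `False` otherwise. The helper name `top_keywords` and its `n` parameter are illustrative and not part of BERTopic.

```python
def top_keywords(topic_model, topic_id, n=5):
    """Return up to n keywords for a topic, or an empty list if the ID is unknown.

    `top_keywords` is a hypothetical helper; it only relies on
    `topic_model.get_topic(topic_id)`, which yields (word, weight) pairs
    for an existing topic and False for a missing one.
    """
    words = topic_model.get_topic(topic_id)
    if not words:
        return []
    return [word for word, _weight in words[:n]]


# Example, using the `topic_model` loaded above:
#   top_keywords(topic_model, 0)
```

Returning an empty list for an unknown ID keeps calling code free of special cases for the `False` return value.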

Topic overview

Number of topics: 20
Number of training documents: 847

Topic ID	Topic Keywords	Topic Frequency	Label
-1	language - llms - models - data - large	21	-1_language_llms_models_data
0	visual - image - multimodal - models - language	205	0_visual_image_multimodal_models
1	embodied - driving - navigation - robot - robotic	142	1_embodied_driving_navigation_robot
2	recommendation - user - recommendations - systems - behavior	16	2_recommendation_user_recommendations_systems
3	agents - social - bots - interactions - ai agents	16	3_agents_social_bots_interactions
4	rl - reinforcement learning - reinforcement - learning - policy	15	4_rl_reinforcement learning_reinforcement_learning
5	molecular - design - property - prediction - gnns	17	5_molecular_design_property_prediction
6	code - code generation - generation - software - programming	11	6_code_code generation_generation_software
7	medical - knowledge - medical knowledge - llms - language	73	7_medical_knowledge_medical knowledge_llms
8	extraction - information extraction - event - information - relation	16	8_extraction_information extraction_event_information
9	safety - llms - robustness - instructions - assurance	15	9_safety_llms_robustness_instructions
10	reasoning - prompting - cot - llms - chainofthought	14	10_reasoning_prompting_cot_llms
11	knowledge - language - knowledge graph - web - kg	52	11_knowledge_language_knowledge graph_web
12	question - answering - commonsense - question answering - knowledge	17	12_question_answering_commonsense_question answering
13	models - language - model - training - language models	18	13_models_language_model_training
14	dialogue - dialog - models - responses - model	104	14_dialogue_dialog_models_responses
15	detection - fake - news - detectors - texts	31	15_detection_fake_news_detectors
16	chatgpt - sentiment - evaluation - sentiment analysis - human	16	16_chatgpt_sentiment_evaluation_sentiment analysis
17	chinese - evaluation - models - language - language models	22	17_chinese_evaluation_models_language
18	translation - arabic - languages - language - models	26	18_translation_arabic_languages_language
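
The table can be cross-checked with a few lines of Python. The sketch below assumes, based only on the rows shown here, that each label is the topic ID followed by the first four keywords joined with underscores. It also confirms that the topic frequencies add up to the number of training documents and computes the share assigned to topic -1, which BERTopic uses for outliers. The names `build_label`, `ROWS`, and `FREQUENCIES` are illustrative.

```python
# Three rows copied from the table: (topic ID, keyword string, label).
ROWS = [
    (-1, "language - llms - models - data - large",
     "-1_language_llms_models_data"),
    (0, "visual - image - multimodal - models - language",
     "0_visual_image_multimodal_models"),
    (4, "rl - reinforcement learning - reinforcement - learning - policy",
     "4_rl_reinforcement learning_reinforcement_learning"),
]

# Topic Frequency column, in table order (topic -1 first, then 0 to 18).
FREQUENCIES = [21, 205, 142, 16, 16, 15, 17, 11, 73, 16,
               15, 14, 52, 17, 18, 104, 31, 16, 22, 26]


def build_label(topic_id, keyword_string, n_words=4):
    """Join the topic ID and the first n_words keywords with underscores.

    Assumption: this mirrors the pattern visible in the Label column.
    """
    keywords = [kw.strip() for kw in keyword_string.split(" - ")]
    return "_".join([str(topic_id)] + keywords[:n_words])


# Every sampled label matches the assumed pattern.
labels_match = all(build_label(tid, kws) == label for tid, kws, label in ROWS)

# Frequencies should total the 847 training documents.
total_documents = sum(FREQUENCIES)

# Fraction of documents left in the outlier topic (-1).
outlier_share = FREQUENCIES[0] / total_documents
```

With these numbers, roughly 2.5% of the training documents fall into the outlier topic.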

Training hyperparameters

calculate_probabilities: False
language: english
low_memory: False
min_topic_size: 10
n_gram_range: (1, 1)
nr_topics: 20
seed_topic_list: None
top_n_words: 10
verbose: True
zeroshot_min_similarity: 0.7
zeroshot_topic_list: None
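
These names correspond to keyword arguments of the `BERTopic` constructor. The sketch below shows how they could be passed when fitting a comparable model. It is not an exact reproduction: the card does not list the embedding, UMAP, or HDBSCAN sub-models, so BERTopic's defaults would be used instead. `HYPERPARAMS` and `build_model` are illustrative names.

```python
# Hyperparameters exactly as listed in this card.
HYPERPARAMS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 20,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_model():
    """Create an unfitted BERTopic model with the settings above.

    The import is deferred so the settings can be inspected without
    BERTopic installed. Sub-models are left at their defaults, which is
    an assumption and may differ from the original training setup.
    """
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMS)


# Example (requires `pip install -U bertopic` and a list of documents):
#   topic_model = build_model()
#   topics, probs = topic_model.fit_transform(docs)
```

Note that `nr_topics: 20` counts the outlier topic, which matches the 20 rows in the topic overview (topic -1 plus topics 0 to 18).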

Framework versions

Numpy: 1.25.2
HDBSCAN: 0.8.33
UMAP: 0.5.5
Pandas: 1.5.3
Scikit-Learn: 1.2.2
Sentence-transformers: 2.6.1
Transformers: 4.38.2
Numba: 0.58.1
Plotly: 5.15.0
Python: 3.10.12
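
If loading the model fails or gives unexpected results, comparing the local environment with the versions above is a reasonable first step. The sketch below uses only the standard library. The distribution names (for example `umap-learn` for UMAP and `scikit-learn` for Scikit-Learn) are the usual PyPI names and are an assumption, since the card lists display names only. `version_mismatches` and `python_matches` are illustrative helpers.

```python
import platform
from importlib.metadata import PackageNotFoundError, version

# Versions from this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "numpy": "1.25.2",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.5",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.6.1",
    "transformers": "4.38.2",
    "numba": "0.58.1",
    "plotly": "5.15.0",
}
EXPECTED_PYTHON = "3.10.12"


def version_mismatches(expected=EXPECTED):
    """Return {package: (expected, installed)} for each differing package.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


def python_matches(expected=EXPECTED_PYTHON):
    """Check whether the running interpreter has the listed Python version."""
    return platform.python_version() == expected


# Example:
#   for name, (wanted, found) in version_mismatches().items():
#       print(f"{name}: expected {wanted}, found {found}")
```

A mismatch does not necessarily mean the model will fail to load; it only flags where the environment differs from the one used for training.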