ISSR_Dark_Web_7Topics
This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
Usage
To use this model, please install BERTopic:
pip install -U bertopic
You can use the model as follows:
from bertopic import BERTopic
topic_model = BERTopic.load("D0men1c0/ISSR_Dark_Web_7Topics")
topic_model.get_topic_info()
You can make predictions as follows:
sentence = ['closed market']
topic, _ = topic_model.transform(sentence)
topic_model.get_topic_info(topic[0])
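To label several sentences at once, the predicted topic IDs can be mapped to the labels in the topic overview table. The following is a minimal sketch, not part of the published model: `TOPIC_LABELS` is a hand-copied dictionary of the labels from this card, and `label_predictions` is a hypothetical helper that only assumes the model object exposes `transform` as used above.

```python
# Labels copied by hand from the topic overview table in this card.
TOPIC_LABELS = {
    -1: "-1_anyone_new_help_free",  # outlier topic
    0: "Drug Vendor Europe",
    1: "Dream Vendor Nightmare",
    2: "Trusted Vendor Scams",
    3: "Vendor MDMA Review",
    4: "Drug Discussion",
    5: "Order Shipping & Tracking",
    6: "Financial Services and Products",
}


def label_predictions(topic_model, sentences):
    """Return (sentence, topic_id, label) for each input sentence.

    Hypothetical helper: `topic_model` is any object whose `transform`
    method returns a (topics, probabilities) pair, as BERTopic does.
    """
    sentences = list(sentences)
    topics, _ = topic_model.transform(sentences)
    return [
        (sentence, int(topic_id), TOPIC_LABELS.get(int(topic_id), "unknown"))
        for sentence, topic_id in zip(sentences, topics)
    ]
```

With the model loaded as above, `label_predictions(topic_model, ['closed market'])` would return one tuple per sentence. A topic ID of -1 means the sentence was treated as an outlier.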
Topic overview
- Number of topics: 8 (7 topics plus the outlier topic -1)
- Number of training documents: 65529
Topic ID | Topic Keywords | Topic Frequency | Label |
---|---|---|---|
-1 | anyone - new - help - free - please | 2823 | -1_anyone_new_help_free |
0 | weed - xanax - vendor - cocaine - mg | 27613 | Drug Vendor Europe |
1 | market - empire - dream - nightmare - vendor | 8645 | Dream Vendor Nightmare |
2 | vendor - scammer - scam - looking - scamming | 6236 | Trusted Vendor Scams |
3 | review - vendor review - vendor - review vendor - review review | 6907 | Vendor MDMA Review |
4 | mdma - lsd - get - looking - wsm | 4230 | Drug Discussion |
5 | order - package - shipping - delivery - pack | 6299 | Order Shipping & Tracking |
6 | bitcoin - card - wallet - btc - bank | 2776 | Financial Services and Products |
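The frequencies in the table add up to the 65529 training documents, so each topic's share of the corpus can be read off directly. The short sketch below recomputes those shares from values copied by hand from the table; the names `TOPIC_FREQUENCIES` and `topic_shares` are illustrative and not part of BERTopic.

```python
# Topic frequencies copied by hand from the table above.
TOPIC_FREQUENCIES = {
    -1: 2823,
    0: 27613,
    1: 8645,
    2: 6236,
    3: 6907,
    4: 4230,
    5: 6299,
    6: 2776,
}
TRAINING_DOCUMENTS = 65529


def topic_shares(frequencies):
    """Return each topic's fraction of all assigned documents."""
    total = sum(frequencies.values())
    return {topic_id: count / total for topic_id, count in frequencies.items()}


shares = topic_shares(TOPIC_FREQUENCIES)

# Print topics from largest to smallest share.
for topic_id, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"topic {topic_id:>2}: {share:6.1%}")
```

Topic 0 accounts for roughly 42% of the documents, and about 4% fall into the outlier topic -1.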
Training hyperparameters
- calculate_probabilities: False
- language: None
- low_memory: False
- min_topic_size: 10
- n_gram_range: (1, 2)
- nr_topics: None
- seed_topic_list: [['tor site', 'drug', 'cocaine', 'ketamine', 'weed', 'trafficking', 'scammer', 'market', 'vendor', 'bitcoin', 'mdma', 'coke', 'lsd', 'heroine', 'xanax', 'tor node', 'tor site', 'gun', 'weapon', 'hacking']]
- top_n_words: 10
- verbose: True
- zeroshot_min_similarity: 0.05
- zeroshot_topic_list: [['burglary', 'buy drugs', 'buy weapons', 'child abuse', 'check sale', 'corruption', 'counterfeit money', 'drugs', 'espionage', 'fake IDs', 'find vendor', 'fraud', 'gun', 'hacking', 'kidnapping', 'murder', 'organ trafficking', 'pedophilia', 'rape', 'scammer', 'sell drugs', 'terrorism', 'trafficking']]
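As a rough sketch of how these values map onto the `BERTopic` constructor, the scalar hyperparameters can be collected in a dictionary and passed as keyword arguments. This is an assumption-laden illustration rather than the author's training script: the card does not list the embedding model or the UMAP and HDBSCAN settings, so the result would not reproduce the published model. `build_topic_model` is a hypothetical helper, and the seed and zero-shot topic lists are left out for brevity; they would be supplied through `extra_kwargs`.

```python
# Scalar hyperparameters copied from the list above.
# seed_topic_list and zeroshot_topic_list are omitted here for brevity.
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 2),
    "nr_topics": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.05,
}


def build_topic_model(extra_kwargs=None):
    """Create an untrained BERTopic instance from the listed values.

    Hypothetical helper. `extra_kwargs` can carry the arguments this card
    does not fully specify (for example an embedding model, or the seed
    and zero-shot topic lists).
    """
    from bertopic import BERTopic  # imported lazily; requires `pip install bertopic`

    kwargs = dict(HYPERPARAMETERS)
    kwargs.update(extra_kwargs or {})
    return BERTopic(**kwargs)
```

Calling `build_topic_model()` returns an unfitted model; it still has to be trained with `fit` or `fit_transform` on a corpus.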
Framework versions
- Numpy: 1.26.4
- HDBSCAN: 0.8.36
- UMAP: 0.5.6
- Pandas: 2.2.1
- Scikit-Learn: 1.4.1.post1
- Sentence-transformers: 3.0.1
- Transformers: 4.39.3
- Numba: 0.60.0
- Plotly: 5.22.0
- Python: 3.12.2
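If loading the model fails or behaves unexpectedly, comparing the local environment with the versions listed above is a reasonable first check. The sketch below uses only the standard library; the mapping from the names in this card to pip distribution names (for example `umap-learn` for UMAP) is an assumption, and `version_mismatches` is a hypothetical helper.

```python
from importlib import metadata

# Versions from this card, keyed by the assumed pip distribution name.
EXPECTED_VERSIONS = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.36",
    "umap-learn": "0.5.6",
    "pandas": "2.2.1",
    "scikit-learn": "1.4.1.post1",
    "sentence-transformers": "3.0.1",
    "transformers": "4.39.3",
    "numba": "0.60.0",
    "plotly": "5.22.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected=EXPECTED_VERSIONS):
    """Map each differing package to (expected, installed-or-None)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Running `version_mismatches()` returns an empty dictionary when every listed package matches. A mismatch does not necessarily mean the model will fail to load.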