
ISSR_Dark_Web_7Topics

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("D0men1c0/ISSR_Dark_Web_7Topics")

topic_model.get_topic_info()

You can make predictions as follows:

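# transform() expects a list of documents and returns (topics, probabilities)
sentence = ['closed market']
topic, _ = topic_model.transform(sentence)
# Show the summary row for the topic assigned to the first document
topic_model.get_topic_info(topic[0])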
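
To classify several posts at once, the same `transform` call can be wrapped in a small helper. This is a minimal sketch: `predict_topics` is a hypothetical helper name, not part of BERTopic, and it assumes only that the loaded model exposes `transform(docs)` returning the topic IDs first, as in the snippet above.

```python
from typing import Any, List, Tuple


def predict_topics(topic_model: Any, docs: List[str]) -> List[Tuple[str, int]]:
    """Pair each document with the topic ID the model assigns to it.

    `predict_topics` is an illustrative helper, not a BERTopic function.
    """
    # transform() returns (topics, probabilities); only the topic IDs are used here.
    topics, _ = topic_model.transform(docs)
    return [(doc, int(topic_id)) for doc, topic_id in zip(docs, topics)]
```

With the model loaded as above, `predict_topics(topic_model, ['closed market'])` returns a list of `(document, topic ID)` pairs. A topic ID of -1 means the document was treated as an outlier.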

Topic overview

  • Number of topics: 8 (7 topics plus the outlier topic -1)
  • Number of training documents: 65529
| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | anyone - new - help - free - please | 2823 | -1_anyone_new_help_free |
| 0 | weed - xanax - vendor - cocaine - mg | 27613 | Drug Vendor Europe |
| 1 | market - empire - dream - nightmare - vendor | 8645 | Dream Vendor Nightmare |
| 2 | vendor - scammer - scam - looking - scamming | 6236 | Trusted Vendor Scams |
| 3 | review - vendor review - vendor - review vendor - review review | 6907 | Vendor MDMA Review |
| 4 | mdma - lsd - get - looking - wsm | 4230 | Drug Discussion |
| 5 | order - package - shipping - delivery - pack | 6299 | Order Shipping & Tracking |
| 6 | bitcoin - card - wallet - btc - bank | 2776 | Financial Services and Products |
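
The frequencies in the table above add up to the 65529 training documents, so each topic's share of the corpus can be read off directly. The sketch below only re-uses the counts from the table; the names `topic_frequency` and `topic_share` are illustrative and not part of BERTopic.

```python
# Topic frequencies copied from the table above (topic ID -> document count).
topic_frequency = {
    -1: 2823,
    0: 27613,
    1: 8645,
    2: 6236,
    3: 6907,
    4: 4230,
    5: 6299,
    6: 2776,
}

# Total number of documents across all topics, including the outlier topic -1.
total_documents = sum(topic_frequency.values())


def topic_share(topic_id: int) -> float:
    """Return the fraction of training documents assigned to `topic_id`."""
    return topic_frequency[topic_id] / total_documents


for topic_id, count in sorted(topic_frequency.items()):
    print(f"{topic_id:>2}: {count:>6} docs ({topic_share(topic_id):.1%})")
```

Topic 0 covers roughly 42% of the documents, while the outlier topic -1 accounts for about 4%.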

Training hyperparameters

  • calculate_probabilities: False
  • language: None
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 2)
  • nr_topics: None
  • seed_topic_list: [['tor site', 'drug', 'cocaine', 'ketamine', 'weed', 'trafficking', 'scammer', 'market', 'vendor', 'bitcoin', 'mdma', 'coke', 'lsd', 'heroine', 'xanax', 'tor node', 'tor site', 'gun', 'weapon', 'hacking']]
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.05
  • zeroshot_topic_list: [['burglary', 'buy drugs', 'buy weapons', 'child abuse', 'check sale', 'corruption', 'counterfeit money', 'drugs', 'espionage', 'fake IDs', 'find vendor', 'fraud', 'gun', 'hacking', 'kidnapping', 'murder', 'organ trafficking', 'pedophilia', 'rape', 'scammer', 'sell drugs', 'terrorism', 'trafficking']]
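
For reference, the values above can be collected as keyword arguments for the `BERTopic` constructor. This is a sketch under assumptions, not the author's training script: the card does not list the embedding model or the UMAP, HDBSCAN and vectorizer settings, so those would have to be supplied separately, and `build_untrained_model` is a hypothetical helper name. The lists are kept nested exactly as the card shows them; BERTopic's documentation shows `zeroshot_topic_list` as a flat list of strings, so the nesting may need flattening before fitting.

```python
# Seed words and zero-shot topics, copied verbatim from the list above.
seed_words = [
    'tor site', 'drug', 'cocaine', 'ketamine', 'weed', 'trafficking',
    'scammer', 'market', 'vendor', 'bitcoin', 'mdma', 'coke', 'lsd',
    'heroine', 'xanax', 'tor node', 'tor site', 'gun', 'weapon', 'hacking',
]
zeroshot_topics = [
    'burglary', 'buy drugs', 'buy weapons', 'child abuse', 'check sale',
    'corruption', 'counterfeit money', 'drugs', 'espionage', 'fake IDs',
    'find vendor', 'fraud', 'gun', 'hacking', 'kidnapping', 'murder',
    'organ trafficking', 'pedophilia', 'rape', 'scammer', 'sell drugs',
    'terrorism', 'trafficking',
]

# Hyperparameters as listed in the card, keyed by BERTopic constructor argument.
hyperparameters = {
    "calculate_probabilities": False,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 2),
    "nr_topics": None,
    "seed_topic_list": [seed_words],            # nested, as listed in the card
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.05,
    "zeroshot_topic_list": [zeroshot_topics],   # nested, as listed in the card
}


def build_untrained_model():
    """Create an unfitted BERTopic instance (hypothetical helper).

    The embedding model and sub-models are not listed in the card and are
    therefore not set here.
    """
    from bertopic import BERTopic  # imported lazily; requires `pip install bertopic`
    return BERTopic(**hyperparameters)
```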

Framework versions

  • Numpy: 1.26.4
  • HDBSCAN: 0.8.36
  • UMAP: 0.5.6
  • Pandas: 2.2.1
  • Scikit-Learn: 1.4.1.post1
  • Sentence-transformers: 3.0.1
  • Transformers: 4.39.3
  • Numba: 0.60.0
  • Plotly: 5.22.0
  • Python: 3.12.2
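
To compare a local environment against the versions listed above, the standard library's `importlib.metadata` can look up installed package versions. This is a minimal sketch: the distribution names (for example `umap-learn` for UMAP) are the usual PyPI names and are an assumption here, and `card_versions`, `installed_version` and `version_differences` are illustrative names.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Versions listed in the card, keyed by (assumed) PyPI distribution name.
card_versions = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.36",
    "umap-learn": "0.5.6",
    "pandas": "2.2.1",
    "scikit-learn": "1.4.1.post1",
    "sentence-transformers": "3.0.1",
    "transformers": "4.39.3",
    "numba": "0.60.0",
    "plotly": "5.22.0",
}


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_differences(
    expected: Dict[str, str],
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose installed version differs to (expected, installed)."""
    differences = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        if found != wanted:
            differences[name] = (wanted, found)
    return differences


for name, (wanted, found) in version_differences(card_versions).items():
    print(f"{name}: card lists {wanted}, installed {found}")
```

A differing version does not necessarily mean the model will fail to load; the check only points out where the environment departs from the one recorded in the card.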