# Crypto Project Classifier

This is a Hugging Face model built on **FacebookAI/roberta-large**, fine-tuned to classify Twitter accounts as **crypto projects** or **non-crypto entities**. The model takes a single sentence containing the author's name and bio as input, and outputs a probability score and a classification (`1` for a crypto project, `0` otherwise).

## Model Details

- **Language**: English
- **Base Model**: FacebookAI/roberta-large
- **Task**: Sequence Classification

### Example Input

```plaintext
hi i am {author_name}, i do this {twitter_bio}
```

## How to Use

Below is a sample Python script that uses the model for classification:

```python
import pandas as pd
import torch
from transformers import RobertaTokenizer, AutoModelForSequenceClassification

model_name = "yoursdevkalki/crypto_project_classifier"
tokenizer = RobertaTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

projects_df = pd.read_csv("projects.csv")
# Copy the slice so that adding columns below does not trigger a
# SettingWithCopyWarning on a view of projects_df.
test_df = projects_df.head(500).copy()

def process_row(description):
    """Return (probability in percent, 0/1 prediction) for one input text."""
    inputs = tokenizer(description, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    probability = torch.sigmoid(logits).numpy()[0][0]  # Convert to probability
    # Compute prediction
    prediction = 1 if probability >= 0.6 else 0
    return probability * 100, prediction

# Assumption: the "twitter_bio" column already holds text formatted as
# described under "Input Format".
test_df["prob"], test_df["prediction"] = zip(*test_df["twitter_bio"].apply(process_row))
print(test_df)
```

### Input Format

The input text should be structured as:

```plaintext
hi i am {author_name}, i do this {twitter_bio}
```

### Outputs

- **prob**: The model's estimated probability that the account is a crypto project, as a percentage (0–100).
- **prediction**: Classification result (`1` for crypto project, `0` for non-project). The sample script assigns `1` when the probability is at least 0.6.
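
The input template and the 0.6 decision threshold from the sample script can be tried out without downloading the model. The following is a minimal sketch under stated assumptions: `build_input` and `score_logit` are hypothetical helper names that are not part of the model repository, trimming whitespace from the name and bio is an assumption, and the logit passed in stands in for the single value the script reads from `outputs.logits`.

```python
import math
from typing import Tuple

# Template copied from the "Input Format" section of this README.
TEMPLATE = "hi i am {author_name}, i do this {twitter_bio}"


def build_input(author_name: str, twitter_bio: str) -> str:
    """Fill the documented template (hypothetical helper).

    Stripping surrounding whitespace is an assumption, not a documented
    preprocessing step of the model.
    """
    return TEMPLATE.format(
        author_name=author_name.strip(),
        twitter_bio=twitter_bio.strip(),
    )


def score_logit(logit: float, threshold: float = 0.6) -> Tuple[float, int]:
    """Map one raw logit to (probability in percent, 0/1 prediction).

    Mirrors the sigmoid-then-threshold step of the sample script; the
    default threshold of 0.6 is taken from that script.
    """
    # Numerically stable sigmoid: avoid exp() overflow for large |logit|.
    if logit >= 0:
        probability = 1.0 / (1.0 + math.exp(-logit))
    else:
        e = math.exp(logit)
        probability = e / (1.0 + e)
    prediction = 1 if probability >= threshold else 0
    return probability * 100, prediction


# Illustrative values only; the name and bio are made up.
text = build_input("Example DAO", "building a decentralized exchange")
prob, prediction = score_logit(2.0)
print(text)
print(f"{prob:.2f}% -> {prediction}")
```

Keeping the template in one helper makes it harder for inference inputs to drift from the documented format, and isolating the threshold makes it easy to trade precision against recall by passing a different `threshold`.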

## Dataset

The model was trained on a dataset of 40k samples:

- **20k Crypto Projects** (labeled as `1`)
- **20k Non-Crypto Entities** (labeled as `0`)

### Metrics Achieved

- **F1 Score**: >90%
- **Accuracy**: >90%

## Donations

If you find this model useful, consider supporting its development:

- **Solana Address**: `2oiBTZ3QvTbsns4babAW54PHcKzacYG3MXUcpAMp7LKV`
- **Ethereum Address**: `0x56a28F1Bd2CD4E2AAA386aeA1c30a24A2f854Ec4`

## Reach Out

- **Twitter**: [@yourdevkalki](https://x.com/yourdevkalki)
- **Email**: `yourdevkalki@gmail.com`