Phando/chemberta-v2-finetuned-uspto-50k-classification


Phando committed on Nov 12, 2023

Commit 7208295 • 1 Parent(s): 45e008d

Create README.md

Files changed (1)

README.md +27 -0

README.md ADDED

	@@ -0,0 +1,27 @@

+---
+datasets:
+- Phando/uspto-50k
+metrics:
+- accuracy
+pipeline_tag: text-classification
+tags:
+- chemistry
+---
+This [ChemBERTa-v2](https://huggingface.co/seyonec/ChemBERTa_zinc250k_v2_40k) checkpoint was fine-tuned on the [USPTO-50k](https://huggingface.co/datasets/Phando/uspto-50k) dataset for sequence classification.
+Specifically, the objective is to predict the reaction class label. The input is the canonicalized SMILES of either all reactants or all products, with individual molecules separated by ".".
+- Train/Test split: 0.99/0.01
+- Evaluation results:
+  - Accuracy: 87.11%
+  - Loss: 0.4272
+- Fine-tuning hyperparameters:
+  - seed = 233
+  - batch_size = 128
+  - num_epochs = 5 (training stopped early at epoch 4)
+  - learning_rate = 5e-4
+  - warmup_steps = 64
+  - weight_decay = 0.01
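
The input format described in the README can be sketched as follows. This is a minimal illustration, not the author's own inference code: `build_input` and `classify` are hypothetical helper names, the model ID is taken from the repository name, and the sketch assumes each SMILES string has already been canonicalized (for example with RDKit) before it is passed in.

```python
from typing import Iterable

# Repository name from this commit page; assumed to be loadable from the Hub.
MODEL_ID = "Phando/chemberta-v2-finetuned-uspto-50k-classification"


def build_input(smiles: Iterable[str]) -> str:
    """Join per-molecule SMILES into one "."-separated input string.

    Assumption: every SMILES is already canonicalized; this helper only
    trims whitespace and joins, it does not canonicalize anything.
    """
    parts = [s.strip() for s in smiles if s and s.strip()]
    if not parts:
        raise ValueError("at least one SMILES string is required")
    return ".".join(parts)


def classify(smiles: Iterable[str]):
    """Classify one side of a reaction (all reactants or all products)."""
    # Imported lazily so build_input can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=MODEL_ID)
    # Returns a list of dicts with "label" and "score" keys.
    return classifier(build_input(smiles))
```

Calling `classify(["CC(=O)O", "CCO"])` would pass the string `CC(=O)O.CCO` to the model. How the returned label names map to USPTO-50k reaction classes depends on the checkpoint's configuration, which is not stated in the README.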