Commit 7208295 by Phando
1 Parent(s): 45e008d

Create README.md

Files changed (1)
  1. README.md +27 -0
README.md ADDED
@@ -0,0 +1,27 @@
+ ---
+ datasets:
+ - Phando/uspto-50k
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
+ tags:
+ - chemistry
+ ---
+
+ This [ChemBERTa-v2](https://huggingface.co/seyonec/ChemBERTa_zinc250k_v2_40k) checkpoint was fine-tuned on the [USPTO-50k](https://huggingface.co/datasets/Phando/uspto-50k) dataset for sequence classification.
+
+ The objective is to predict the reaction class label. The input is the canonicalized SMILES of either all reactants or all products of a reaction, joined with ".".
+
+ - Train/Test split: 0.99/0.01
+
+ - Evaluation results:
+   - Accuracy: 87.11%
+   - Loss: 0.4272
+
+ - Fine-tuning hyperparameters:
+   - seed = 233
+   - batch-size = 128
+   - num_epochs = 5 (training stopped early at epoch 4)
+   - learning_rate = 5e-4
+   - warmup_steps = 64
+   - weight_decay = 0.01
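
The input format the README describes (all reactant SMILES or all product SMILES, joined with ".") could be built roughly as in the sketch below. This is not the author's preprocessing code. The helper name `build_model_input` is hypothetical, and the sketch assumes each reaction is available as a `reactants>agents>products` reaction SMILES string, which may not match how the dataset stores its columns. Canonicalization is deliberately left out; the README states that inputs are canonicalized, so a cheminformatics toolkit would presumably be applied to each molecule first.

```python
def build_model_input(reaction_smiles: str, side: str = "reactants") -> str:
    """Return the "."-joined SMILES for one side of a reaction SMILES string.

    Assumption: the reaction uses the "reactants>agents>products" layout
    (agents may be empty, as in "A.B>>C"). No canonicalization happens here;
    the molecules are assumed to be canonicalized already.
    """
    parts = reaction_smiles.strip().split(">")
    if len(parts) != 3:
        raise ValueError(
            f"expected 'reactants>agents>products', got {reaction_smiles!r}"
        )
    reactants, _agents, products = parts

    if side == "reactants":
        chosen = reactants
    elif side == "products":
        chosen = products
    else:
        raise ValueError("side must be 'reactants' or 'products'")

    # Drop empty fragments so stray dots do not yield empty molecules.
    molecules = [m for m in chosen.split(".") if m]
    return ".".join(molecules)


# Illustrative reaction (esterification), not taken from the dataset.
example = "CC(=O)O.OCC>>CC(=O)OCC"
reactant_input = build_model_input(example, side="reactants")
product_input = build_model_input(example, side="products")
```

Either string would then be passed to the tokenizer as a single sequence, since the README describes the task as sequence classification over one side of the reaction.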