---
license: mit
---

**Base model:** [westlake-repl/SaProt_650M_AF2](https://huggingface.co/westlake-repl/SaProt_650M_AF2)

**Task type:** protein-level classification

The numeric class labels correspond to the following subcellular locations:

0: Nucleus

1: Cytoplasm

2: Extracellular

3: Mitochondrion

4: Cell.membrane

5: Endoplasmic.reticulum

6: Plastid

7: Golgi.apparatus

8: Lysosome/Vacuole

9: Peroxisome
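
The mapping above can be kept as a small lookup table when decoding predictions. The following is a minimal sketch: `ID2LABEL`, `LABEL2ID`, and `decode_prediction` are hypothetical helper names, not part of the released model, and the sketch assumes the classifier returns one score per class in the label order listed above.

```python
# Label ids and names copied from the list above.
ID2LABEL = {
    0: "Nucleus",
    1: "Cytoplasm",
    2: "Extracellular",
    3: "Mitochondrion",
    4: "Cell.membrane",
    5: "Endoplasmic.reticulum",
    6: "Plastid",
    7: "Golgi.apparatus",
    8: "Lysosome/Vacuole",
    9: "Peroxisome",
}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def decode_prediction(logits):
    """Return the location name of the highest-scoring class.

    `logits` is assumed to be a sequence of 10 scores, one per label id.
    """
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} scores, got {len(logits)}")
    best = max(range(len(logits)), key=lambda i: logits[i])
    return ID2LABEL[best]
```

For example, a score vector whose largest entry sits at index 3 decodes to `Mitochondrion`.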

**Dataset:** [SaProtHub/Dataset-Subcellular_Localization-DeepLoc](https://huggingface.co/datasets/SaProtHub/Dataset-Subcellular_Localization-DeepLoc)

**Model input type:** SA (structure-aware) sequence

**Performance (on test set):** 85.75% accuracy

**LoRA config:**
- **r:** 16
- **lora_dropout:** 0
- **lora_alpha:** 32
- **target_modules:** ["query", "key", "value", "intermediate.dense", "output.dense"]
- **modules_to_save:** ["classifier"]
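
As a rough illustration, the values above could be expressed for the Hugging Face `peft` library as shown below. This is a sketch under assumptions: the card does not say which library or task type was used, so `build_lora_config`, the `peft` dependency, and `task_type="SEQ_CLS"` are guesses. `lora_scaling` only shows the standard LoRA scaling factor `lora_alpha / r`.

```python
# Values copied from the LoRA config list above.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.0,
    "target_modules": ["query", "key", "value", "intermediate.dense", "output.dense"],
    "modules_to_save": ["classifier"],
}


def lora_scaling(r, lora_alpha):
    """Standard LoRA scaling applied to the low-rank update."""
    return lora_alpha / r


def build_lora_config():
    """Build a peft LoraConfig from the values above.

    Assumption: `peft` is installed and the adapter was trained as a
    sequence classification task; neither is stated in this card.
    """
    from peft import LoraConfig  # imported lazily so the sketch loads without peft

    return LoraConfig(task_type="SEQ_CLS", **LORA_KWARGS)
```

With `r = 16` and `lora_alpha = 32`, the low-rank update is scaled by a factor of 2.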

**Training config:**

- **optimizer:**
  - **class:** AdamW
  - **betas:** (0.9, 0.98)
  - **weight_decay:** 0.01
- **learning rate:** 5e-4
- **epochs:** 100
- **batch size:** 64
- **precision:** 16-mixed
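
The training settings above can be gathered into one object for reuse. This is a minimal sketch, assuming PyTorch is the training framework: `TrainConfig` and `build_optimizer` are hypothetical names, and the `16-mixed` string is assumed to be a Lightning-style precision setting, which the card does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values copied from the training config list above.
    learning_rate: float = 5e-4
    betas: tuple = (0.9, 0.98)
    weight_decay: float = 0.01
    max_epochs: int = 100
    batch_size: int = 64
    precision: str = "16-mixed"

    def optimizer_kwargs(self):
        """Keyword arguments in the form torch.optim.AdamW expects."""
        return {
            "lr": self.learning_rate,
            "betas": self.betas,
            "weight_decay": self.weight_decay,
        }


def build_optimizer(model, cfg=TrainConfig()):
    """Create AdamW over the trainable parameters only.

    Assumption: `model` is a torch.nn.Module whose LoRA and classifier
    parameters are the only ones with requires_grad=True.
    """
    import torch  # imported lazily so the sketch loads without PyTorch

    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.AdamW(trainable, **cfg.optimizer_kwargs())
```

Keeping the optimizer arguments in a separate method makes it easy to check them against this card before starting a run.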