weiqipedia committed
Commit 46cf2da • 1 Parent(s): e1f1385
Update metrics in README.md
README.md CHANGED
@@ -65,13 +65,16 @@ For Natural Language Generation (NLG) tasks, we tested the model on Machine Tran

For Natural Language Reasoning (NLR) tasks, we tested the model on Natural Language Inference (NLI) using the IndoNLI lay dataset and on Causal Reasoning (Causal) using the XCOPA dataset. The metrics are accuracy for both tasks.

-| Model | … |
-| … |
-| … |
-| … |
-| … |
-| … |
-| … |
+| Model | QA (F1) | Sentiment (F1) | Toxicity (F1) | Eng>Indo (ChrF++) | Indo>Eng (ChrF++) | Summary (ROUGE-L) | NLI (Acc) | Causal (Acc) |
+|--------------------------------|---------|----------------|---------------|-------------------|-------------------|-------------------|-----------|--------------|
+| SEA-LION-7B-Instruct-Research | 24.86 | 76.13 | 24.45 | 52.50 | 46.82 | 15.44 | 33.20 | 23.80 |
+| SEA-LION-7B-Instruct | **68.41** | **91.45** | 17.98 | 57.48 | 58.04 | **17.54** | **53.10** | 60.80 |
+| SeaLLM 7B v1 | 30.96 | 56.29 | 22.60 | 62.23 | 41.55 | 14.03 | 26.50 | 56.60 |
+| SeaLLM 7B v2 | 44.40 | 80.13 | **55.24** | 64.01 | **63.28** | 17.31 | 43.60 | **82.00** |
+| Sailor-7B (Base) | 65.43 | 59.48 | 20.48 | **64.27** | 60.68 | 8.69 | 15.10 | 38.40 |
+| Llama 2 7B Chat | 11.12 | 52.32 | 0.00 | 44.09 | 57.58 | 9.24 | 0.00 | 0.00 |
+| Mistral 7B Instruct v0.1 | 38.85 | 74.38 | 20.83 | 30.60 | 51.43 | 15.63 | 28.60 | 50.80 |
+| GPT-4 | 73.60 | 74.14 | 63.96 | 69.38 | 67.53 | 18.71 | 83.20 | 96.00 |

## Technical Specifications
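For reference on how scores like those added in this diff are typically produced, the sketch below is a minimal, hypothetical scoring helper, not part of this commit or of the SEA-LION evaluation harness. It assumes `sacrebleu` for the ChrF++ translation columns and plain exact-match accuracy for the NLI and Causal columns; the prediction and reference lists would come from whatever inference loop is actually used.

```python
# Hypothetical sketch, not from this repository: scoring helpers for two of
# the metric families in the table above. `predictions`/`references` are plain
# Python lists; producing them (prompting, decoding, label parsing) is out of
# scope here and depends on the evaluation harness actually used.
from sacrebleu.metrics import CHRF


def accuracy(predictions, references):
    """Exact-match accuracy as a percentage (NLI / Causal columns)."""
    assert len(predictions) == len(references)
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


def chrf_plus_plus(hypotheses, references):
    """ChrF++ via sacrebleu (Eng>Indo / Indo>Eng columns)."""
    # word_order=2 adds word bigrams on top of character n-grams, i.e. chrF++.
    metric = CHRF(word_order=2)
    return metric.corpus_score(hypotheses, [references]).score


if __name__ == "__main__":
    # Toy usage: one of two labels correct -> 50.0; identical strings -> 100.0.
    print(accuracy(["entailment", "neutral"], ["entailment", "contradiction"]))
    print(chrf_plus_plus(["halo dunia"], ["halo dunia"]))
```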