---
language:
- ln
- sw
- yo
library_name: transformers
base_model:
- LLaMAX/LLaMAX3-8B-Alpaca
---

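# LLaMAX3-8B-Alpaca 4bit

<!-- Provide a quick summary of what the model is/does. -->

A 4-bit version of `LLaMAX/LLaMAX3-8B-Alpaca`. The attention and MLP projection layers are stored as `Linear4bit` modules; the token embedding and the output head are not quantized (see Model Architecture).
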
## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

- **Developed by:** LLaMAX/LLaMAX3-8B-Alpaca
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [More Information Needed]

### Model Architecture

```txt
LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(128256, 4096)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaSdpaAttention(
          (q_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear4bit(in_features=4096, out_features=1024, bias=False)
          (v_proj): Linear4bit(in_features=4096, out_features=1024, bias=False)
          (o_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear4bit(in_features=4096, out_features=14336, bias=False)
          (up_proj): Linear4bit(in_features=4096, out_features=14336, bias=False)
          (down_proj): Linear4bit(in_features=14336, out_features=4096, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head): Linear(in_features=4096, out_features=128256, bias=False)
)
```
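
The layer shapes in the printout above are enough to estimate how many weights sit in the `Linear4bit` modules and roughly how much memory they need. The sketch below is a back-of-the-envelope calculation only. It assumes exactly 4 bits per quantized weight and 16 bits per unquantized weight (neither figure is stated in this card), and it ignores quantization metadata and the small RMSNorm vectors, so real memory use will be somewhat higher.

```python
# Shapes copied from the architecture printout: (in_features, out_features).
ATTN_PROJ = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
}
MLP_PROJ = {
    "gate_proj": (4096, 14336),
    "up_proj": (4096, 14336),
    "down_proj": (14336, 4096),
}
NUM_LAYERS = 32
VOCAB, HIDDEN = 128256, 4096


def count_weights(shapes):
    """Number of weights in a group of bias-free linear layers."""
    return sum(n_in * n_out for n_in, n_out in shapes.values())


# Weights held in Linear4bit modules.
per_layer_4bit = count_weights(ATTN_PROJ) + count_weights(MLP_PROJ)
total_4bit = NUM_LAYERS * per_layer_4bit

# embed_tokens and lm_head are plain Embedding / Linear modules.
total_unquantized = 2 * VOCAB * HIDDEN

# Assumption: 4 bits per Linear4bit weight, 16 bits per other weight.
approx_bytes = total_4bit * 4 // 8 + total_unquantized * 2
approx_gib = approx_bytes / 2**30

print(f"Linear4bit weights per layer: {per_layer_4bit:,}")
print(f"Linear4bit weights in total:  {total_4bit:,}")
print(f"Unquantized weights:          {total_unquantized:,}")
print(f"Approximate weight memory:    {approx_gib:.2f} GiB")
```

Under these assumptions, about 7.0 billion of the roughly 8.0 billion weights are quantized, and the weights alone come to about 5.2 GiB.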

### 🔥 Excellent Translation Performance

LLaMAX3-8B-Alpaca achieves an average spBLEU score improvement of over **5 points** compared to the LLaMA3-8B-Alpaca model on the Flores-101 dataset.

| System | Size | en-X (COMET) | en-X (BLEU) | zh-X (COMET) | zh-X (BLEU) | de-X (COMET) | de-X (BLEU) | ne-X (COMET) | ne-X (BLEU) | ar-X (COMET) | ar-X (BLEU) | az-X (COMET) | az-X (BLEU) | ceb-X (COMET) | ceb-X (BLEU) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LLaMA3-8B-Alpaca | 8B | 67.97 | 17.23 | 64.65 | 10.14 | 64.67 | 13.62 | 62.95 | 7.96 | 63.45 | 11.27 | 60.61 | 6.98 | 55.26 | 8.52 |
| LLaMAX3-8B-Alpaca | 8B | 75.52 | 22.77 | 73.16 | 14.43 | 73.47 | 18.95 | 75.13 | 15.32 | 72.29 | 16.42 | 72.06 | 12.41 | 68.88 | 15.85 |

| System | Size | X-en (COMET) | X-en (BLEU) | X-zh (COMET) | X-zh (BLEU) | X-de (COMET) | X-de (BLEU) | X-ne (COMET) | X-ne (BLEU) | X-ar (COMET) | X-ar (BLEU) | X-az (COMET) | X-az (BLEU) | X-ceb (COMET) | X-ceb (BLEU) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LLaMA3-8B-Alpaca | 8B | 77.43 | 26.55 | 73.56 | 13.17 | 71.59 | 16.82 | 46.56 | 3.83 | 66.49 | 10.20 | 58.30 | 4.81 | 52.68 | 4.18 |
| LLaMAX3-8B-Alpaca | 8B | 81.28 | 31.85 | 78.34 | 16.46 | 76.23 | 20.64 | 65.83 | 14.16 | 75.84 | 15.45 | 70.61 | 9.32 | 63.35 | 12.66 |
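
The "over 5 points" figure can be checked against the BLEU columns of the two tables. The sketch below makes two assumptions that the card does not confirm: that the BLEU columns report the same spBLEU metric named in the sentence above, and that the average is a plain mean over the seven language columns in each table.

```python
from statistics import mean

# Column order in both tables: en, zh, de, ne, ar, az, ceb.
BASELINE = "LLaMA3-8B-Alpaca"
CANDIDATE = "LLaMAX3-8B-Alpaca"

# BLEU values copied from the first table (source language fixed, target X).
SRC_TO_X = {
    BASELINE: [17.23, 10.14, 13.62, 7.96, 11.27, 6.98, 8.52],
    CANDIDATE: [22.77, 14.43, 18.95, 15.32, 16.42, 12.41, 15.85],
}

# BLEU values copied from the second table (source X, target language fixed).
X_TO_TGT = {
    BASELINE: [26.55, 13.17, 16.82, 3.83, 10.20, 4.81, 4.18],
    CANDIDATE: [31.85, 16.46, 20.64, 14.16, 15.45, 9.32, 12.66],
}


def mean_gain(table):
    """Mean per-column BLEU difference, candidate minus baseline."""
    return mean(c - b for c, b in zip(table[CANDIDATE], table[BASELINE]))


print(f"Mean BLEU gain, first table:  {mean_gain(SRC_TO_X):.2f}")
print(f"Mean BLEU gain, second table: {mean_gain(X_TO_TGT):.2f}")
```

Both means come out a little under 6 points (roughly 5.8 and 5.9), which is consistent with the stated improvement. The largest single gains are in the Nepali and Cebuano columns.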

### Supported Languages

Afrikaans (af), Amharic (am), Arabic (ar), Armenian (hy), Assamese (as), Asturian (ast), Azerbaijani (az), Belarusian (be), Bengali (bn), Bosnian (bs), Bulgarian (bg), Burmese (my), Catalan (ca), Cebuano (ceb), Chinese Simplified (zho), Chinese Traditional (zho), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Filipino (tl), Finnish (fi), French (fr), Fulah (ff), Galician (gl), Ganda (lg), Georgian (ka), German (de), Greek (el), Gujarati (gu), Hausa (ha), Hebrew (he), Hindi (hi), Hungarian (hu), Icelandic (is), Igbo (ig), Indonesian (id), Irish (ga), Italian (it), Japanese (ja), Javanese (jv), Kabuverdianu (kea), Kamba (kam), Kannada (kn), Kazakh (kk), Khmer (km), Korean (ko), Kyrgyz (ky), Lao (lo), Latvian (lv), Lingala (ln), Lithuanian (lt), Luo (luo), Luxembourgish (lb), Macedonian (mk), Malay (ms), Malayalam (ml), Maltese (mt), Maori (mi), Marathi (mr), Mongolian (mn), Nepali (ne), Northern Sotho (ns), Norwegian (no), Nyanja (ny), Occitan (oc), Oriya (or), Oromo (om), Pashto (ps), Persian (fa), Polish (pl), Portuguese (pt), Punjabi (pa), Romanian (ro), Russian (ru), Serbian (sr), Shona (sn), Sindhi (sd), Slovak (sk), Slovenian (sl), Somali (so), Sorani Kurdish (ku), Spanish (es), Swahili (sw), Swedish (sv), Tajik (tg), Tamil (ta), Telugu (te), Thai (th), Turkish (tr), Ukrainian (uk), Umbundu (umb), Urdu (ur), Uzbek (uz), Vietnamese (vi), Welsh (cy), Wolof (wo), Xhosa (xh), Yoruba (yo), Zulu (zu)
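
For a language pair from the list above, a translation request might look like the following sketch. Several details are assumptions rather than facts from this card: the Alpaca-style prompt wording in `build_prompt`, the helper names `build_prompt` and `translate`, and the generation settings. The Hub id of this 4-bit repository is not given here, so the sketch loads the base model named in the metadata and quantizes it on load with `BitsAndBytesConfig`, which needs the `bitsandbytes` and `accelerate` packages and a CUDA GPU.

```python
def build_prompt(query: str, src_language: str, trg_language: str) -> str:
    """Alpaca-style translation prompt (assumed format, not stated in this card)."""
    instruction = (
        f"Translate the following sentences from {src_language} to {trg_language}."
    )
    return (
        "Below is an instruction that describes a task, paired with an input "
        "that provides further context. Write a response that appropriately "
        "completes the request.\n"
        f"### Instruction:\n{instruction}\n"
        f"### Input:\n{query}\n### Response:"
    )


def translate(
    query: str,
    src_language: str,
    trg_language: str,
    model_id: str = "LLaMAX/LLaMAX3-8B-Alpaca",  # base model from the card metadata
    max_new_tokens: int = 128,  # illustrative value, not a recommendation
) -> str:
    # Heavy imports stay local so build_prompt works without them installed.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    # Quantize the linear layers to 4 bits while loading.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",
    )

    prompt = build_prompt(query, src_language, trg_language)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the tokens generated after the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


# Example call (downloads the model weights):
# translate("Habari ya asubuhi", "Swahili", "English")
```

Keeping the prompt builder separate from the loading code makes it easy to inspect or change the prompt without touching the model.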