YoLo2000 committed on
Commit 9e9426a
1 Parent(s): 2d4228d

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -9,7 +9,7 @@ language:
 
 # TiLamb-7B (Tibetan Large Language Model Base)
 
-**TiLamb-7B** is a large language model base focused on Tibetan. It was developed on a 26.43 GB Tibetan corpus and incrementally pre-trained from the LLaMA2-7B model with the LoRA method. On top of LLaMA2 it extends the vocabulary, growing it from the original 32,000 entries to 61,221 by adding Tibetan tokens, and initializes the embedding and lm_head by mean expansion. For more information, please visit the [TiLamb-7B GitHub page](https://github.com/NLP-Learning/TiLamb).
+**TiLamb-7B** is a base model for Tibetan large language models. It was trained on 26.43 GB of Tibetan corpus and incrementally pre-trained with the LoRA method on top of LLaMA2-7B, the commercially usable large model released by Meta. On top of LLaMA2 it extends the vocabulary, growing it from the original 32,000 entries to 61,221 by adding Tibetan tokens, and initializes the embedding and lm_head of the original LLaMA2-7B model by mean expansion. For more information, please visit the [TiLamb-7B GitHub page](https://github.com/NLP-Learning/TiLamb).
 
 **Important notes**:
 - TiLamb-7B is a base model that has not undergone supervised fine-tuning and **has no conversational capability**.
@@ -23,7 +23,7 @@ language:
 
 # TiLamb-7B (Tibetan Large Language Model Base)
 
-**TiLamb-7B** is a large-scale language model base focused on the Tibetan language, developed using a 26.43 GB Tibetan corpus and incrementally pre-trained with the LoRA method based on the LLaMA2-7B model. This model expands the vocabulary from the original 32,000 entries to 61,221 by adding Tibetan tokens, and initializes the embedding and lm_head with mean expansion. For more information, please visit the [TiLamb-7B GitHub page](https://github.com/NLP-Learning/TiLamb).
+**TiLamb-7B** is a base model for the Tibetan language, trained on 26.43 GB of Tibetan corpora. It is based on Meta's commercially usable large model LLaMA2-7B and has been incrementally pre-trained using the LoRA method. The model enlarges the vocabulary from the original 32,000 entries to 61,221 by adding Tibetan tokens, and initializes the embedding and lm_head of the original LLaMA2-7B model through mean expansion. For more information, please visit the [TiLamb-7B GitHub page](https://github.com/NLP-Learning/TiLamb).
 
 **Important Notes**:
 - TiLamb-7B is a base model that has not undergone supervised fine-tuning and **lacks conversational capabilities**.
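
The "mean expansion" initialization and LoRA-based incremental pre-training described in the diff can be illustrated with a short sketch. The snippet below is a hypothetical reconstruction, not TiLamb's actual training code: the model path, LoRA rank and scaling, and the chosen target modules are assumptions based on common practice for LLaMA2-style vocabulary extension.

```python
# Hypothetical sketch: expand the vocabulary, initialize the new embedding and
# lm_head rows with the mean of the original rows ("mean expansion"), then
# attach LoRA adapters for incremental pre-training. Model path, sizes, and
# hyper-parameters are illustrative assumptions, not TiLamb's published setup.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

OLD_VOCAB = 32_000   # original LLaMA2 vocabulary size
NEW_VOCAB = 61_221   # size after adding Tibetan tokens

# Grow both the input embedding and the lm_head to the new vocabulary size.
model.resize_token_embeddings(NEW_VOCAB)

with torch.no_grad():
    emb = model.get_input_embeddings().weight    # shape: (NEW_VOCAB, hidden)
    head = model.get_output_embeddings().weight  # shape: (NEW_VOCAB, hidden)
    # "Mean expansion": every new row starts from the mean of the old rows.
    emb[OLD_VOCAB:] = emb[:OLD_VOCAB].mean(dim=0, keepdim=True)
    head[OLD_VOCAB:] = head[:OLD_VOCAB].mean(dim=0, keepdim=True)

# LoRA for incremental pre-training; the expanded embedding and lm_head are
# kept fully trainable via modules_to_save so the new Tibetan rows can learn.
lora_cfg = LoraConfig(
    r=8,                                   # assumed rank
    lora_alpha=16,                         # assumed scaling
    target_modules=["q_proj", "v_proj"],   # assumed target projections
    modules_to_save=["embed_tokens", "lm_head"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
```

After a step like this, the model would be trained on the Tibetan corpus as ordinary causal-LM pre-training; mean initialization simply gives the new Tibetan rows a reasonable starting point inside the distribution of the existing embeddings.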