deepnight-research committed
Commit 7888318
1 Parent(s): 681114a

Update README.md

  1. README.md +16 -7
README.md CHANGED
@@ -6,23 +6,32 @@ language:
  
  # deepnight-research/lil-c3po
  <div style="display: flex; justify-content: center; align-items: center;">
- <img src="./lil-c3po.jpg" style="width: 100%; max-width: 350px; height: auto;"/></div>
+ <img src="./lil-c3po.jpg" style="width: 100%; height: auto;"/></div>
  
  ## Model Details:
- lil-c3po is an open-source large language model (LLM) resulting from the linear merge of two distinct fine-tuned Mistral-7B models, internally referred to as c3-1 and c3-2. These models, developed in-house, bring together unique characteristics to enhance performance and utility.
+ lil-c3po is an open-source large language model (LLM) resulting from the linear merge of two distinct
+ fine-tuned Mistral-7B models, internally referred to as c3-1 and c3-2. These models, developed in-house,
+ bring together unique characteristics to enhance performance and utility.
  
  ## Model Architecture:
- lil-c3po inherits its architecture from the combined c3-1 and c3-2 models, incorporating features such as Grouped-Query Attention, Sliding-Window Attention, and Byte-fallback BPE tokenizer. This fusion aims to capitalize on the strengths of both models for improved language understanding and generation.
+ lil-c3po inherits its architecture from the combined c3-1 and c3-2 models,
+ incorporating features such as Grouped-Query Attention, Sliding-Window Attention, and Byte-fallback BPE tokenizer.
+ This fusion aims to capitalize on the strengths of both models for improved language understanding and generation.
  
  ## Training Details:
- - The first model, internally referred to as c3-1, is a 7B parameter Large Language Model fine-tuned on the Intel Gaudi 2 processor. It utilizes the Direct Preference Optimization (DPO) method, specifically tailored for Intel architecture, and is designed to excel in various language-related tasks.
- - The second model, denoted as c3-2, is an instruct fine-tuned version of Mistral-7B. Its architecture features improvements in instruct fine-tuning, contributing to enhanced language understanding in instructional contexts.
+ - The first model, internally referred to as c3-1, is a 7B parameter Large Language Model
+ fine-tuned on the Intel Gaudi 2 processor.
+ It utilizes the Direct Preference Optimization (DPO) method and is designed to excel in various language-related tasks.
+ - The second model, denoted as c3-2, is an instruct fine-tuned version of Mistral-7B.
+ Its architecture features improvements in instruct fine-tuning, contributing to enhanced language understanding in instructional contexts.
  
  ## License:
  lil-c3po is released under the MIT license, fostering open-source collaboration and innovation.
  
  ## Intended Use:
- This merged model is suitable for a broad range of language-related tasks, inheriting the capabilities of the fine-tuned c3-1 and c3-2 models. Users interested in language tasks can leverage lil-c3po's capabilities.
+ This merged model is suitable for a broad range of language-related tasks,
+ inheriting the capabilities of the fine-tuned c3-1 and c3-2 models. Users interested in language tasks can leverage lil-c3po's capabilities.
  
  ## Out-of-Scope Uses:
- While lil-c3po is versatile, it is important to note that, in most cases, fine-tuning may be necessary for specific tasks. Additionally, the model should not be used to intentionally create hostile or alienating environments for people.
+ While lil-c3po is versatile, it is important to note that, in most cases, fine-tuning may be necessary for specific tasks.
+ Additionally, the model should not be used to intentionally create hostile or alienating environments for people.
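
Note on the "linear merge" mentioned under Model Details: the README does not state the merge weights or the tooling used, so the following is only a rough sketch of the general idea, assuming a linear merge means a per-parameter weighted average of two checkpoints with identical parameter names and shapes. The helper `linear_merge`, the weight `alpha`, and the toy parameter names and values are hypothetical and are not taken from this repository.

```python
from typing import Dict, List

# A toy "state dict": parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def linear_merge(a: StateDict, b: StateDict, alpha: float = 0.5) -> StateDict:
    """Hypothetical helper: return alpha * a + (1 - alpha) * b per parameter."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if a.keys() != b.keys():
        raise ValueError("both models must expose the same parameter names")

    merged: StateDict = {}
    for name, weights_a in a.items():
        weights_b = b[name]
        if len(weights_a) != len(weights_b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise weighted average of the two parameter tensors.
        merged[name] = [
            alpha * x + (1.0 - alpha) * y for x, y in zip(weights_a, weights_b)
        ]
    return merged


# Made-up stand-ins for the two fine-tuned checkpoints (assumption, not real weights).
c3_1 = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
c3_2 = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = linear_merge(c3_1, c3_2, alpha=0.5)
print(merged)
```

With `alpha=0.5` both checkpoints contribute equally; moving `alpha` toward 1.0 weights the result toward the first checkpoint. Real merges operate on full tensors rather than Python lists, and the actual ratio used for lil-c3po is not given in the README.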