JRosenkranz committed on
Commit
ffc8645
1 Parent(s): b64dd1d

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -2,26 +2,26 @@
  license: apache-2.0
  ---

- ## Model Name: Granite-7b-base
+ **Model Name**: Granite-7b-base

- #### License: Apache-2.0
+ **License**: Apache-2.0

- #### Languages: Primarily English
+ **Languages**: Primarily English

- #### Architecture: The model architecture is a replica of Meta’s Llama2-7B base variant with MHA, trained with 1M batch size on 2T tokens.
+ **Architecture**: The model architecture is a replica of Meta’s Llama2-7B base variant with MHA, trained with 1M batch size on 2T tokens.

- #### Context Length: 4k tokens
+ **Context Length**: 4k tokens

- #### Tokenizer: Llama2
+ **Tokenizer**: Llama2

- #### Model Developers: IBM Research
+ **Model Developers**: IBM Research

  Representing IBM’s commitment to open source innovation, IBM has released granite-7b-base, a base pre-trained LLM from IBM’s Granite model series, under an apache-2.0 license for community and commercial use. Granite-7b-base was pre-trained from scratch on IBM-curated data as an open reference implementation of Meta’s Llama-2-7B. In a commitment to data transparency and fostering open innovation, the data sources, sampling proportions, and URLs for access are provided below.

- #### Pre-Training Data
+ **Pre-Training Data**

  The model was trained on 2T tokens, with sampling proportions designed to match the sampling distributions released in the Llama1 paper as closely as possible.

- #### Bias, Risks, and Limitations
+ **Bias, Risks, and Limitations**

  Granite-7b-base is a base model and has not undergone any safety alignment; it may therefore produce problematic outputs. In the absence of adequate safeguards and RLHF, there exists a risk of malicious utilization of these models for generating disinformation or harmful content. Caution is urged against complete reliance on a specific language model for crucial decisions or impactful information, as preventing these models from fabricating content is not straightforward. Additionally, it remains uncertain whether smaller models might exhibit increased susceptibility to hallucination in ungrounded generation scenarios due to their reduced sizes and memorization capacities. This aspect is currently an active area of research, and we anticipate more rigorous exploration, comprehension, and mitigations in this domain.