Commit 41e5b9e by nassersala (parent: 693e014): updates readme

Files changed:
- README.md +23 -3
- config.json +25 -0
- pytorch_model.bin +3 -0
README.md
CHANGED
@@ -1,3 +1,23 @@
# Small BLOOM Model for Functional Testing

## Description

I've reduced the size of [bloom](https://huggingface.co/bigscience/bloom) to roughly 0.5 GB.

This repository hosts a significantly smaller version of the BLOOM model, designed primarily for functional testing. It is an ideal choice for scenarios where computational efficiency and quick iterations matter, such as development and testing environments.
## Model Details

The original BLOOM model has been scaled down with the following changes:

- Number of layers (`n_layer`): reduced to 12 from the original 70.
- Hidden size (`hidden_size`): decreased to 512 from the original 14336.
- Number of attention heads (`n_head`): lowered to 8 from the original 112.
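To see how these reductions translate into checkpoint size, here is a rough back-of-the-envelope parameter count (a sketch that ignores biases, layer norms, and weight tying, so it only approximates the true total):

```python
# Rough size estimate for the reduced config above (approximation:
# biases, layer norms, and tied heads are ignored).
hidden_size = 512
n_layer = 12
vocab_size = 250880

embedding_params = vocab_size * hidden_size       # token embedding table
per_layer_params = 12 * hidden_size ** 2          # attention (~4h^2) + MLP (~8h^2)
total_params = embedding_params + n_layer * per_layer_params

size_gb = total_params * 4 / 1e9                  # float32 = 4 bytes per parameter
print(f"~{total_params / 1e6:.0f}M parameters, ~{size_gb:.2f} GB in float32")
```

The estimate lands in the same ballpark as the `pytorch_model.bin` checkpoint committed in this repo.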
## Intended Use

This model is suitable for functional testing and development purposes. It is not recommended for tasks that require high accuracy or complex language understanding and generation capabilities.
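For functional tests that should not depend on downloading weights, the same architecture can be instantiated locally from the reduced config (a hedged sketch assuming `transformers` and `torch` are installed; swap in `from_pretrained` on this repo to load the actual checkpoint):

```python
from transformers import BloomConfig, BloomForCausalLM

# Randomly-initialized model with the reduced architecture described above;
# use BloomForCausalLM.from_pretrained(...) instead to load the real weights.
config = BloomConfig(hidden_size=512, n_layer=12, n_head=8, vocab_size=250880)
model = BloomForCausalLM(config)

n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")
```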
## Disclaimer

Please note that due to the significant reductions in size and complexity, this model does not retain the full capabilities of the original BLOOM model. Expect limitations in accuracy and depth of language understanding.

Crafted by Nasser Ali Alzahrani (@nassersala)
config.json
ADDED
@@ -0,0 +1,25 @@
{
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
    "BloomForCausalLM"
  ],
  "attention_dropout": 0.0,
  "attention_softmax_in_fp32": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_dropout": 0.0,
  "hidden_size": 512,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "masked_softmax_fusion": true,
  "model_type": "bloom",
  "n_head": 8,
  "n_layer": 12,
  "pad_token_id": 3,
  "pretraining_tp": 4,
  "slow_but_exact": false,
  "torch_dtype": "float32",
  "transformers_version": "4.24.0",
  "use_cache": true,
  "vocab_size": 250880
}
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8f4c412dcfe7f4fbe3a659603e7e97ec7778681652ad7d94b4ba5ff416afd9dd
size 665171479
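The `pytorch_model.bin` entry above is a Git LFS pointer, not the weights themselves: the actual blob is fetched separately by its sha256 oid. A small sketch of parsing such a pointer (the pointer text below is copied from this commit):

```python
# Parse a git-lfs pointer file (spec v1: "key value" pairs, one per line).
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8f4c412dcfe7f4fbe3a659603e7e97ec7778681652ad7d94b4ba5ff416afd9dd
size 665171479
"""

fields = dict(line.split(" ", 1) for line in pointer_text.strip().splitlines())
algo, digest = fields["oid"].split(":", 1)   # hash algorithm and hex digest
size_bytes = int(fields["size"])             # size of the real blob in bytes
print(algo, f"{size_bytes / 1e6:.0f} MB")
```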