Wikidepia committed on
Commit
43e9a78
1 Parent(s): 34ca7b9

Initial model

Files changed (4)
  1. README.md +26 -0
  2. config.json +28 -0
  3. pytorch_model.bin +3 -0
  4. spiece.model +3 -0
README.md ADDED
@@ -0,0 +1,26 @@
+ ---
+ language:
+ - id
+ datasets:
+ - allenai/c4
+ ---
+ # Indonesian T5Base
+
+
+ A T5 (Text-to-Text Transfer Transformer) model pretrained on the Indonesian portion of mC4 with [extra filtering](https://github.com/Wikidepia/indonesian_datasets/tree/master/dump/mc4). The model is pretrained only and must be fine-tuned before it can be used for a specific task.
+
+ ## Pretraining Details
+
+ Trained for 1M steps following [`google/t5-v1_1-base`](https://huggingface.co/google/t5-v1_1-base).
+
+ ## Model Performance
+
+ TBD
+
+ ## Limitations and bias
+
+ Like other language models trained on a large-scale corpus, this model may produce biased, unethical, or harmful output because of biases in its training data. Assume that this can happen, and use the model only in applications where such output does not cause harm.
+
+ ## Acknowledgement
+
+ Thanks to TensorFlow Research Cloud for providing TPU v3-8s.
config.json ADDED
@@ -0,0 +1,28 @@
+ {
+   "_name_or_path": "/home/patrick/hugging_face/t5/t5-v1_1-base",
+   "architectures": [
+     "T5ForConditionalGeneration"
+   ],
+   "d_ff": 2048,
+   "d_kv": 64,
+   "d_model": 768,
+   "decoder_start_token_id": 0,
+   "dropout_rate": 0.1,
+   "eos_token_id": 1,
+   "feed_forward_proj": "gated-gelu",
+   "gradient_checkpointing": false,
+   "initializer_factor": 1.0,
+   "is_encoder_decoder": true,
+   "layer_norm_epsilon": 1e-06,
+   "model_type": "t5",
+   "num_decoder_layers": 12,
+   "num_heads": 12,
+   "num_layers": 12,
+   "output_past": true,
+   "pad_token_id": 0,
+   "relative_attention_num_buckets": 32,
+   "tie_word_embeddings": false,
+   "transformers_version": "4.8.1",
+   "use_cache": true,
+   "vocab_size": 32128
+ }
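As a rough cross-check of this configuration, the sketch below counts the weights implied by the size-related fields. It assumes the usual T5 v1.1 layout: bias-free linear layers, layer norms that hold only a scale vector, one relative-attention bias table per stack, and a gated feed-forward block with two input projections. The helper name `count_t5_params` is hypothetical and not part of any library, and the 4-bytes-per-weight figure assumes float32 storage, which the commit does not state.

```python
import json

# Size-related fields copied from the config.json added in this commit.
CONFIG_JSON = """
{
  "d_ff": 2048,
  "d_kv": 64,
  "d_model": 768,
  "feed_forward_proj": "gated-gelu",
  "num_decoder_layers": 12,
  "num_heads": 12,
  "num_layers": 12,
  "relative_attention_num_buckets": 32,
  "tie_word_embeddings": false,
  "vocab_size": 32128
}
"""


def count_t5_params(cfg):
    """Estimate the weight count of a T5-style encoder-decoder (hypothetical helper)."""
    d_model = cfg["d_model"]
    inner = cfg["num_heads"] * cfg["d_kv"]

    # q, k, v and o projections, assumed to have no bias terms.
    attn = 4 * d_model * inner
    # Gated feed-forward: two input projections plus one output projection.
    gated = cfg["feed_forward_proj"].startswith("gated")
    ff = (3 if gated else 2) * d_model * cfg["d_ff"]
    # Layer norm is assumed to hold a scale vector only.
    norm = d_model
    # One relative-attention bias table per stack (assumed to sit in the first layer).
    rel_bias = cfg["relative_attention_num_buckets"] * cfg["num_heads"]

    enc_layer = attn + norm + ff + norm
    dec_layer = attn + norm + attn + norm + ff + norm  # adds cross-attention
    encoder = cfg["num_layers"] * enc_layer + rel_bias + norm
    decoder = cfg["num_decoder_layers"] * dec_layer + rel_bias + norm

    embed = cfg["vocab_size"] * d_model
    lm_head = 0 if cfg["tie_word_embeddings"] else embed
    return embed + lm_head + encoder + decoder


cfg = json.loads(CONFIG_JSON)
n_params = count_t5_params(cfg)
print(n_params)      # 247577856
print(n_params * 4)  # 990311424 bytes, assuming float32 weights
```

Under these assumptions the estimate is about 123 KB below the 990434381-byte size recorded in the `pytorch_model.bin` pointer of this commit; the remainder would be consistent with serialization overhead.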
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7cc5d727bd7c1cd49d95373cd02883377c0166b00ed43c75f3463538d827e6f0
+ size 990434381
spiece.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:70d33eab49f262358f962dbf38433dec85c44b71cb05f4b0e23f439c45209218
+ size 776904
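The two binary files are stored as Git LFS pointers: three `key value` lines that give the spec version, the SHA-256 object ID, and the byte size of the real file. The following is a minimal sketch, not an official tool, of how a downloaded file could be checked against such a pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path in the usage note is an assumption.

```python
import hashlib
from pathlib import Path

# Pointer text for spiece.model, copied from this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:70d33eab49f262358f962dbf38433dec85c44b71cb05f4b0e23f439c45209218
size 776904
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer (hypothetical helper)."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the size and hash the pointer records."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in 1 MiB chunks so large checkpoints are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Assuming the tokenizer file has been downloaded to a local `spiece.model`, `matches_pointer("spiece.model", pointer)` would report whether it matches the committed object.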