Ayaka committed on
Commit
9f2e214
1 Parent(s): fa25c95

Update README.md

Files changed (1)
  1. README.md +15 -2
README.md CHANGED
@@ -18,12 +18,15 @@ co2_eq_emissions:
 
 This is the Cantonese model of BART base. It is obtained by a second-stage pre-training on the [LIHKG dataset](https://github.com/ayaka14732/lihkg-scraper) based on the [fnlp/bart-base-chinese](https://huggingface.co/fnlp/bart-base-chinese) model.
 
-**Note**: This model is not the final version and the training is still in progress. Besides, to avoid any copyright issues, please do not use this model for any purpose.
+This project is supported by Cloud TPUs from Google's [TPU Research Cloud](https://sites.research.google/trc/about/) (TRC).
+
+**Note**: To avoid any copyright issues, please do not use this model for any purpose.
 
 ## GitHub Links
 
+- Dataset: [ayaka14732/lihkg-scraper](https://github.com/ayaka14732/lihkg-scraper)
 - Tokeniser: [ayaka14732/bert-tokenizer-cantonese](https://github.com/ayaka14732/bert-tokenizer-cantonese)
-- Model: [ayaka14732/bart-base-jax](https://github.com/ayaka14732/bart-base-jax)
+- Model: [ayaka14732/bart-base-jax#cantonese-pretrain](https://github.com/ayaka14732/bart-base-jax/tree/cantonese-pretrain)
 
 ## Usage
 
@@ -38,3 +41,13 @@ print(output[0]['generated_text'].replace(' ', ''))
 ```
 
 **Note**: Please use the `BertTokenizer` for the model vocabulary. DO NOT use the original `BartTokenizer`.
+
+## Training Details
+
+- Optimiser: SGD 0.03 + Adaptive Gradient Clipping 0.1
+- Dataset: 172937863 sentences, pad or truncate to 64 tokens
+- Batch size: 640
+- Number of epochs: 7 epochs + 61440 steps
+- Time: 44.0 hours on Google Cloud TPU v4-16
+
+WandB link: [`1j7zs802`](https://wandb.ai/ayaka/bart-base-cantonese/runs/1j7zs802)
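
For reference, the usage note kept in the updated README requires loading the checkpoint with `BertTokenizer` rather than `BartTokenizer`. A minimal sketch of what that looks like with the `transformers` library follows; the model ID and the example input are assumptions and do not appear in the diff.

```python
# Minimal sketch: load the Cantonese BART checkpoint with BertTokenizer,
# as the README note requires. The model ID below is an assumption.
from transformers import BartForConditionalGeneration, BertTokenizer, Text2TextGenerationPipeline

model_id = 'Ayaka/bart-base-cantonese'  # hypothetical Hugging Face model ID
tokenizer = BertTokenizer.from_pretrained(model_id)
model = BartForConditionalGeneration.from_pretrained(model_id)

generator = Text2TextGenerationPipeline(model=model, tokenizer=tokenizer)
output = generator('今日天氣[MASK]好', max_length=50, do_sample=False)  # arbitrary example input

# The tokenizer emits space-separated characters, so strip the spaces for display,
# matching the `print(output[0]['generated_text'].replace(' ', ''))` line shown in the hunk header.
print(output[0]['generated_text'].replace(' ', ''))
```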
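
The "Optimiser: SGD 0.03 + Adaptive Gradient Clipping 0.1" line added in Training Details is terse; since the training code is JAX-based, one plausible reading is an `optax` chain like the sketch below. This is only an illustration of the stated hyperparameters, not the repository's actual optimiser code.

```python
# Sketch only: one way to express "SGD 0.03 + Adaptive Gradient Clipping 0.1" with optax.
# The real setup lives in ayaka14732/bart-base-jax (cantonese-pretrain branch) and may differ.
import optax

optimiser = optax.chain(
    optax.adaptive_grad_clip(0.1),   # AGC with clipping factor 0.1
    optax.sgd(learning_rate=0.03),   # plain SGD with learning rate 0.03
)
```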