Commit 3bd5708 by indiejoseph
1 parent: 1983b24

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -22,7 +22,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 # bart-base-cantonese
 
-This model was trained from scratch on an unknown dataset.
+This model is a continued pre-training version of [fnlp/bart-base-chinese](https://huggingface.co/fnlp/bart-base-chinese) on a filtered Cantonese Common Crawl dataset.
+
+The tokenizer extends the BERT tokenizer from fnlp/bart-base-chinese with 500 more Chinese characters commonly found in Cantonese.
 
 ## Model description
 
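
Note on the tokenizer change: the commit does not say how the 500 characters were added, so the following is only a minimal sketch of the general idea, assuming the vocabulary is a plain token-to-id mapping with contiguous ids. The function name `extend_vocab` and the toy vocabulary are hypothetical and are not taken from the repository. New characters are appended at the end so that existing ids, and therefore the pretrained embeddings they index, stay unchanged.

```python
def extend_vocab(base_vocab, new_chars):
    """Append characters missing from base_vocab, keeping existing ids stable.

    Assumes ids are contiguous (0..n-1), so len(vocab) is the next free id.
    """
    vocab = dict(base_vocab)  # copy so the caller's mapping is untouched
    added = []
    for ch in new_chars:
        if ch not in vocab:
            vocab[ch] = len(vocab)  # assign the next free id
            added.append(ch)
    return vocab, added


# Toy base vocabulary (hypothetical, not the real fnlp/bart-base-chinese vocab).
base_vocab = {"[PAD]": 0, "[UNK]": 1, "我": 2, "你": 3}
# Characters common in written Cantonese; "我" already exists and is skipped.
candidates = ["嘅", "咗", "我", "哋"]

vocab, added = extend_vocab(base_vocab, candidates)
print(added)
print(len(vocab))
```

With the Hugging Face `transformers` library, the corresponding steps would typically be `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`, so the embedding matrix gains rows for the new ids before continued pre-training. Whether this model was built that way is an assumption.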