dchaplinsky committed on
Commit
3fdd67d
1 Parent(s): eb31495

Update README.md

Files changed (1)
  1. README.md +15 -0
README.md CHANGED
@@ -3,8 +3,23 @@ license: mit
  tags:
  - feature-extraction
  library_name: generic
+ datasets:
+ - ubertext2.0
+ widget:
+ - text: "доброго вечора ми з україни"
  ---
 
+ _name_ is a set of pre-trained word vectors for the Ukrainian language, trained with fastText on the (as yet unreleased) UberText2.0 dataset, released by [lang-uk](https://lang.org.ua/en/). The model was trained using skipgram with dimension 300, a character n-gram range of 2-5, and 15 negative samples.
+
+ Our model increases accuracy by 6.3% over the [Facebook Ukrainian word vectors](https://fasttext.cc/docs/en/crawl-vectors.html) on the word analogy task. The Ukrainian word analogy dataset is available [here](https://github.com/lang-uk/vecs/).
+
+ Extrinsic evaluations were performed on two sequence labeling tasks: NER and POS tagging. The NER-UK dataset was released by lang-uk, and the Ukrainian (UD) corpus was developed by the non-profit organization Institute for Ukrainian.
+
+ Results:
+ 1) spaCy NER F-score 0.818
+ 2) POS Flair Accuracy 0.824
+ 3) POS spaCy Accuracy 0.911
+
  Usage
  ```
  import fasttext.util