---
datasets:
- armvectores/hy_wikipedia_2023
pipeline_tag: feature-extraction
language:
- hy
library_name: fasttext
---
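Training corpus: 414M tokens
1) 73M tokens from Armenian (hy) Wikipedia
2) 341M tokens from the arlis database
Vocabulary: 74951 unique words
Character n-grams: lengths 3 to 5
Context window size: 5
Embedding dimension: 300
Model: skipgram
Minimum word count: 150
Training: 100 epochs, initial learning rate 0.05
Training time: 26 hours on 20 Xeon Gold cores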
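The settings above can be sketched as a fastText training call. This is a minimal sketch, not the script that produced this model: the mapping of each setting to a `train_unsupervised` argument (for example, "minimum number of words" to `minCount`, and 20 cores to `thread=20`) is an assumption, and `corpus_path` is a hypothetical path to a plain-text training file.

```python
# Hyperparameters copied from the list above. The argument names are the
# ones accepted by fasttext.train_unsupervised; the mapping is an assumption.
TRAIN_PARAMS = {
    "model": "skipgram",  # skipgram architecture
    "dim": 300,           # embedding dimension
    "ws": 5,              # context window size
    "minn": 3,            # shortest character n-gram
    "maxn": 5,            # longest character n-gram
    "minCount": 150,      # drop words seen fewer than 150 times
    "epoch": 100,         # number of training epochs
    "lr": 0.05,           # initial learning rate
    "thread": 20,         # one thread per core (assumed)
}


def train(corpus_path, output_path="output.bin"):
    """Train a model on corpus_path (hypothetical) and save it to output_path."""
    import fasttext  # imported here so the settings can be inspected without it

    model = fasttext.train_unsupervised(input=corpus_path, **TRAIN_PARAMS)
    model.save_model(output_path)
    return model
```

Calling `train("corpus.txt")` with a hypothetical corpus file would write `output.bin`, the file name used in the usage section.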
How to use
1) Install fastText
```
pip install fasttext-wheel
```
2) Load the model in Python and query the nearest neighbors of a word
```
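import fasttext

# Load the binary model file (saved here as 'output.bin')
model = fasttext.load_model('output.bin')
# Print a list of (similarity, word) pairs for the query word
print(model.get_nearest_neighbors('զենքեր'))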
```
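Beyond nearest neighbors, the loaded model can also return the raw 300-dimensional vector for a word through `get_word_vector`. Because the model uses character n-grams, this also works for words outside the 74951-word vocabulary. The sketch below compares two words by cosine similarity; the helper names `cosine_similarity` and `word_similarity` are illustrative and not part of fastText.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def word_similarity(model, first, second):
    """Compare two words using the vectors returned by model.get_word_vector."""
    return cosine_similarity(
        model.get_word_vector(first),
        model.get_word_vector(second),
    )
```

With the model loaded as in step 2, `word_similarity(model, 'զենքեր', 'զենք')` would return a score between -1 and 1; the second word is only an example query.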