khulnasoft committed on
Commit 147d921
1 Parent(s): c7dddba

Update README.md

Files changed (1)
  1. README.md +43 -3
README.md CHANGED
@@ -1,3 +1,43 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+
+ # Bengali Word2Vec Model
+ This is a pre-trained word2vec model for the Bengali language.
+
+ This model is built for the [bengalinlp](https://github.com/banglawiki/bengalinlp) package.
+
+ ## Datasets
+ - [Bengali Wikipedia dumps](https://dumps.wikimedia.org/bnwiki/latest/)
+
+ ## Training details
+ - Word2Vec with embedding dimension = 100, `min_count=5`, `window=5`, `epochs=10`
+
+ ## Usage
+ - Install the toolkit: `pip install -U bengalinlp_toolkit`
+ - Generate a word vector using the pretrained model
+
+ ```py
+ from bengalinlp import BengaliWord2Vec
+
+ bwv = BengaliWord2Vec()
+ model_path = "bengali_word2vec.model"  # local path to the model file
+ word = 'গ্রাম'  # "village"
+ vector = bwv.generate_word_vector(model_path, word)
+ print(vector.shape)
+ print(vector)
+
+ ```
+
+ - Find the most similar words using the pretrained model
+
+ ```py
+ from bengalinlp import BengaliWord2Vec
+
+ bwv = BengaliWord2Vec()
+ model_path = "bengali_word2vec.model"  # local path to the model file
+ word = 'গ্রাম'  # "village"
+ similar = bwv.most_similar(model_path, word, topn=10)  # 10 nearest words
+ print(similar)
+
+ ```
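
Note (not part of the commit): the README does not say how `most_similar` ranks words. Word2Vec tooling commonly ranks neighbours by cosine similarity between word vectors, so the sketch below assumes that behaviour and illustrates it in plain Python. The toy 3-dimensional vectors, the vocabulary, and the helper names `cosine_similarity` and `most_similar` are invented for illustration. They are not the `bengalinlp` API and not values from the real 100-dimensional model.

```python
import math

# Toy 3-dimensional vectors standing in for the model's 100-dimensional
# embeddings. Words and values are made up for illustration only.
toy_vectors = {
    "গ্রাম": [0.9, 0.1, 0.0],  # village
    "শহর": [0.8, 0.3, 0.1],  # city
    "নদী": [0.1, 0.9, 0.2],  # river
    "বই": [0.0, 0.2, 0.9],  # book
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(vectors, word, topn=10):
    """Rank every other word by cosine similarity to `word`, best first."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word  # the query word itself is excluded
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Hypothetical usage with the toy vocabulary above.
ranking = most_similar(toy_vectors, "গ্রাম", topn=2)
print(ranking)
```

With these toy values, "শহর" ranks first because its vector points in nearly the same direction as the vector for "গ্রাম". Results from the real model depend on the Wikipedia training data and will differ.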