davidmezzetti committed
Commit 7fca0dc
1 Parent(s): 986d2dd

Update README

Files changed (1):
  1. README.md +4 -4
README.md CHANGED
@@ -53,7 +53,7 @@ See this [article](https://neuml.hashnode.dev/embeddings-in-the-cloud) for addit
 
 ## Evaluation Results
 
-Performance was evaluated using the [NDCG@10](https://en.wikipedia.org/wiki/Discounted_cumulative_gain) score with a [custom question-answer evaluation set](https://github.com/neuml/txtchat/tree/master/datasets/wikipedia). Results are shown below.
+Performance was evaluated using the [NDCG@10](https://en.wikipedia.org/wiki/Discounted_cumulative_gain) score with a [custom question-answer evaluation set](https://github.com/neuml/ragdata/tree/master/datasets/wikipedia). Results are shown below.
 
 | Model | NDCG@10 | MAP@10 |
 | ---------------------------------------------------------- | ---------- | --------- |
@@ -69,14 +69,14 @@ The following steps show how to build this index. These scripts are using the la
 
 - Install required build dependencies
 ```bash
-pip install txtchat mwparserfromhell datasets
+pip install ragdata mwparserfromhell
 ```
 
 - Download and build pageviews database
 ```bash
 mkdir -p pageviews/data
 wget -P pageviews/data https://dumps.wikimedia.org/other/pageview_complete/monthly/2024/2024-08/pageviews-202408-user.bz2
-python -m txtchat.data.wikipedia.views -p en.wikipedia -v pageviews
+python -m ragdata.wikipedia.views -p en.wikipedia -v pageviews
 ```
 
 - Build Wikipedia dataset
@@ -94,7 +94,7 @@ ds.save_to_disk(f"wikipedia-{date}")
 
 - Build txtai-wikipedia index
 ```bash
-python -m txtchat.data.wikipedia.index \
+python -m ragdata.wikipedia.index \
   -d wikipedia-20240901 \
   -o txtai-wikipedia \
   -v pageviews/pageviews.sqlite
```
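The README above reports NDCG@10 and MAP@10 scores. For reference, the following is a minimal sketch of how these ranking metrics are typically computed over a list of graded relevance judgments; it is an illustration of the standard formulas, not the repository's actual evaluation code, and the function names are hypothetical.

```python
import math

def ndcg_at_k(relevances, k=10):
    """NDCG@k: DCG of the ranked relevance scores divided by the ideal DCG."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))
    ideal = sorted(relevances, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

def map_at_k(relevances, k=10):
    """MAP@k treating any relevance > 0 as relevant (binary)."""
    hits, total = 0, 0.0
    for i, rel in enumerate(relevances[:k]):
        if rel > 0:
            hits += 1
            total += hits / (i + 1)  # precision at this hit position
    nrel = sum(1 for r in relevances if r > 0)
    return total / min(nrel, k) if nrel else 0.0
```

Both metrics would be averaged across all questions in the evaluation set to produce the per-model numbers shown in the table.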