Sakae Mizuki
committed on
Commit · 0a49ec5 · 1 Parent(s): 96177f0
feat: initial commit
- .gitattributes +1 -0
- README.md +41 -2
- baseline.ckpt +3 -0
- bert-large-cased_WSDEval-ALL.hdf5 +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+*.hdf5 filter=lfs diff=lfs merge=lfs -text
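The hunk above adds `*.hdf5` to the patterns handled by Git LFS. As a rough illustration of what that pattern covers, here is a minimal Python sketch that approximates slash-free gitattributes patterns with `fnmatch` applied to the file's basename. The helper name `tracked_by_lfs` is hypothetical, the pattern list contains only the four lines visible in this hunk, and `fnmatch` is an approximation of Git's own matcher, not a replacement for it.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Only the patterns visible in the hunk; the rest of .gitattributes is not shown here.
LFS_PATTERNS = ["*.zip", "*.zst", "*tfevents*", "*.hdf5"]


def tracked_by_lfs(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: match each slash-free pattern against the basename."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


print(tracked_by_lfs("bert-large-cased_WSDEval-ALL.hdf5"))  # True
print(tracked_by_lfs("README.md"))  # False
```

`baseline.ckpt` is also stored as an LFS pointer in this commit, so a pattern outside the displayed hunk presumably covers it; the sketch above would report `False` for it because that pattern is not in the list.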
README.md
CHANGED
@@ -1,3 +1,42 @@
-
-
+# Semantic Specialization for Knowledge-based Word Sense Disambiguation
+* This repository contains the trained model (projection heads) and sense/context embeddings used for training and evaluating the model.
+* If you want to learn how to use these files, please refer to the [semantic_specialization_for_wsd](https://github.com/s-mizuki-nlp/semantic_specialization_for_wsd) repository.
+
+## Trained Model (Projection Heads)
+* File: checkpoints/baseline/last.ckpt
+* This is one of the trained models used for reporting the main results (Table 2 in [Mizuki and Okazaki, EACL2023]).
+NOTE: Five runs were performed in total.
+* The main hyperparameters used for training are as follows:
+
+| Argument name | Value | Description |
+|----------------------------------------------------------------|----------------------------|------------------------------------------------------------------------------------|
+| max_epochs | 15 | Maximum number of training epochs |
+| cfg_similarity_class.temperature ($\beta^{-1}$) | 0.015625 (=1/64) | Temperature parameter for the contrastive loss |
+| batch_size ($N_B$) | 256 | Number of samples in each batch for the attract-repel and self-training objectives |
+| coef_max_pool_margin_loss ($\alpha$) | 0.2 | Coefficient for the self-training loss |
+| cfg_gloss_projection_head.n_layer | 2 | Number of FFNN layers for the projection heads |
+| cfg_gloss_projection_head.max_l2_norm_ratio ($\epsilon$) | 0.015 | Hyperparameter for the distance constraint integrated in the projection heads |
+
+## Sense/context embeddings
+* Directory: `data/bert_embeddings/`
+* Sense embeddings: `bert-large-cased_WordNet_Gloss_Corpus.hdf5`
+* Context embeddings for the self-training objective: `bert-large-cased_SemCor.hdf5`
+* Context embeddings for evaluating the WSD task: `bert-large-cased_WSDEval-ALL.hdf5`
+
+# Reference
+
+```
+@inproceedings{Mizuki:EACL2023,
+title = "Semantic Specialization for Knowledge-based Word Sense Disambiguation",
+author = "Mizuki, Sakae and Okazaki, Naoaki",
+booktitle = "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume",
+series = {EACL},
+month = may,
+year = "2023",
+address = "Dubrovnik, Croatia",
+publisher = "Association for Computational Linguistics",
+pages = "3449--3462",
+}
+```
+
 ---
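The README table gives the contrastive-loss temperature as $\beta^{-1} = 0.015625 = 1/64$, so similarities are effectively multiplied by $\beta = 64$. The following is a minimal sketch of how a temperature of this kind usually enters a softmax over similarity scores. It is a generic illustration, not the authors' loss: the function name `softmax_with_temperature` and the similarity values are made up for this example.

```python
import math

TEMPERATURE = 0.015625     # beta^{-1} from the table, i.e. 1/64
BETA = 1.0 / TEMPERATURE   # inverse temperature, 64.0


def softmax_with_temperature(similarities, temperature=TEMPERATURE):
    """Softmax over similarities divided by the temperature (numerically stabilised)."""
    scaled = [s / temperature for s in similarities]
    largest = max(scaled)
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical cosine similarities between one query and three candidates.
sims = [0.80, 0.75, 0.10]
probs = softmax_with_temperature(sims)
print(BETA)
print(probs)
```

With a temperature this small, a gap of 0.05 in similarity already concentrates most of the probability mass on the top candidate, which is the usual reason for choosing a low temperature in contrastive objectives.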
baseline.ckpt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d383629f69d5b4e6b199ccf9527ec61c7000bae92cf24272a75ea4d2f0ddfb70
+size 75634843
bert-large-cased_WSDEval-ALL.hdf5
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:766b7202f3d8d647ca40f8cf8e8167145d4a63343a8539b2de065e7600a41eff
+size 139880832
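Both added files are Git LFS pointers: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes of the real file. A downloaded copy can be checked against such a pointer. The sketch below uses only the Python standard library; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path in the usage note is an assumption.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and hash named in the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large files are never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer contents for bert-large-cased_WSDEval-ALL.hdf5, copied from the diff.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:766b7202f3d8d647ca40f8cf8e8167145d4a63343a8539b2de065e7600a41eff
size 139880832
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Assuming the file has been downloaded into the current directory, `matches_pointer("bert-large-cased_WSDEval-ALL.hdf5", pointer)` should return `True` for an intact copy.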