Semantic Specialization for Knowledge-based Word Sense Disambiguation
- This repository contains the trained model (projection heads) and sense/context embeddings used for training and evaluating the model.
- If you want to learn how to use these files, please refer to the semantic_specialization_for_wsd repository.
Trained Model (Projection Heads)
- File: checkpoints/baseline/last.ckpt
- This is one of the trained models used for reporting the main results (Table 2 in [Mizuki and Okazaki, EACL2023]).
NOTE: Five runs were performed in total.
- The main hyperparameters used for training are as follows:
| Argument name | Value | Description |
|---|---|---|
| max_epochs | 15 | Maximum number of training epochs |
| cfg_similarity_class.temperature ($\beta^{-1}$) | 0.015625 (=1/64) | Temperature parameter for the contrastive loss |
| batch_size ($N_B$) | 256 | Number of samples in each batch for the attract-repel and self-training objectives |
| coef_max_pool_margin_loss ($\alpha$) | 0.2 | Coefficient for the self-training loss |
| cfg_gloss_projection_head.n_layer | 2 | Number of FFNN layers for the projection heads |
| cfg_gloss_projection_head.max_l2_norm_ratio ($\epsilon$) | 0.015 | Hyperparameter for the distance constraint integrated in the projection heads |
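
The values in the table can be collected into a plain Python dictionary for reference. The sketch below is only an illustration: the helper `nest` is hypothetical, and reading the dotted argument names as nested configuration sections is an assumption, not something this repository states. It also checks that the temperature 0.015625 corresponds to an inverse temperature of $\beta = 64$.

```python
# Hyperparameters copied verbatim from the table above.
FLAT_HPARAMS = {
    "max_epochs": 15,
    "cfg_similarity_class.temperature": 0.015625,  # beta^{-1} = 1/64
    "batch_size": 256,
    "coef_max_pool_margin_loss": 0.2,
    "cfg_gloss_projection_head.n_layer": 2,
    "cfg_gloss_projection_head.max_l2_norm_ratio": 0.015,
}


def nest(flat):
    """Expand dotted keys into nested dictionaries.

    Hypothetical helper: it assumes that a dotted name such as
    "cfg_similarity_class.temperature" denotes a nested config section.
    """
    nested = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return nested


HPARAMS = nest(FLAT_HPARAMS)

# The table lists the temperature as beta^{-1}, so beta is its reciprocal.
temperature = HPARAMS["cfg_similarity_class"]["temperature"]
beta = 1.0 / temperature
print(beta)  # 64.0
```

How these arguments are actually passed to the training script is documented in the semantic_specialization_for_wsd repository, not here.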
Sense/context embeddings
- Directory: data/bert_embeddings/
- Sense embeddings: bert-large-cased_WordNet_Gloss_Corpus.hdf5
- Context embeddings for the self-training objective: bert-large-cased_SemCor.hdf5
- Context embeddings for evaluating the WSD task: bert-large-cased_WSDEval-ALL.hdf5
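
Before running the full pipeline, it can be useful to check what each embedding file contains. The following is a minimal sketch, assuming the third-party `h5py` package is installed and that the repository has been downloaded into the current working directory. The internal layout of the files is not described here, so the code only lists whatever datasets it finds. The names `embedding_path` and `list_datasets` are hypothetical helpers, not part of the authors' code.

```python
from pathlib import Path

# Directory and file names as listed above.
EMBEDDING_DIR = Path("data/bert_embeddings")
EMBEDDING_FILES = {
    "sense": "bert-large-cased_WordNet_Gloss_Corpus.hdf5",
    "context_self_training": "bert-large-cased_SemCor.hdf5",
    "context_evaluation": "bert-large-cased_WSDEval-ALL.hdf5",
}


def embedding_path(kind, root=EMBEDDING_DIR):
    """Return the path of one embedding file (hypothetical helper)."""
    return Path(root) / EMBEDDING_FILES[kind]


def list_datasets(path):
    """List (name, shape, dtype) for every dataset in an HDF5 file.

    Makes no assumption about the group or dataset names inside the file.
    """
    import h5py  # third-party; imported lazily so the path helpers work without it

    found = []

    def visit(name, obj):
        # visititems calls this for every group and dataset in the file.
        if isinstance(obj, h5py.Dataset):
            found.append((name, obj.shape, str(obj.dtype)))

    with h5py.File(path, "r") as handle:
        handle.visititems(visit)
    return found


if __name__ == "__main__":
    for kind in EMBEDDING_FILES:
        path = embedding_path(kind)
        if path.exists():
            for name, shape, dtype in list_datasets(path):
                print(kind, name, shape, dtype)
        else:
            print(kind, "not found at", path)
```

For actual training and evaluation, the loaders in the semantic_specialization_for_wsd repository should be used instead of this inspection script.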
Reference
@inproceedings{Mizuki:EACL2023,
title = "Semantic Specialization for Knowledge-based Word Sense Disambiguation",
author = "Mizuki, Sakae and Okazaki, Naoaki",
booktitle = "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume",
series = {EACL},
month = may,
year = "2023",
address = "Dubrovnik, Croatia",
publisher = "Association for Computational Linguistics",
pages = "3449--3462",
}
- An arXiv version is also available.