Semantic Specialization for Knowledge-based Word Sense Disambiguation

  • This repository contains the trained model (projection heads) and sense/context embeddings used for training and evaluating the model.
  • If you want to learn how to use these files, please refer to the semantic_specialization_for_wsd repository.

Trained Model (Projection Heads)

  • File: checkpoints/baseline/last.ckpt
  • This is one of the trained models used to report the main results (Table 2 in [Mizuki and Okazaki, EACL2023]).
    NOTE: Five runs were performed in total; this checkpoint comes from one of them.
  • The main hyperparameters used for training are as follows:
| Argument name | Value | Description |
|---|---|---|
| max_epochs | 15 | Maximum number of training epochs |
| cfg_similarity_class.temperature ($\beta^{-1}$) | 0.015625 (=1/64) | Temperature parameter for the contrastive loss |
| batch_size ($N_B$) | 256 | Number of samples in each batch for the attract-repel and self-training objectives |
| coef_max_pool_margin_loss ($\alpha$) | 0.2 | Coefficient for the self-training loss |
| cfg_gloss_projection_head.n_layer | 2 | Number of FFNN layers for the projection heads |
| cfg_gloss_projection_head.max_l2_norm_ratio ($\epsilon$) | 0.015 | Hyperparameter for the distance constraint integrated in the projection heads |
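As a rough illustration, the hyperparameters listed above can be collected into a small Python structure, together with a helper for peeking inside the checkpoint file. This is a minimal sketch and not the authors' code: the class name `TrainingHyperparameters`, its shortened field names, and the function `inspect_checkpoint` are hypothetical, and the helper assumes the `.ckpt` file is a PyTorch-loadable dictionary, which this page does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingHyperparameters:
    """Values copied from the hyperparameter list; field names are shortened (hypothetical)."""
    max_epochs: int = 15
    temperature: float = 1 / 64            # cfg_similarity_class.temperature (beta^{-1})
    batch_size: int = 256                  # N_B
    coef_max_pool_margin_loss: float = 0.2  # alpha
    n_layer: int = 2                       # cfg_gloss_projection_head.n_layer
    max_l2_norm_ratio: float = 0.015       # epsilon

    @property
    def beta(self) -> float:
        # The listed temperature is beta^{-1}, so beta is its reciprocal.
        return 1.0 / self.temperature


def inspect_checkpoint(path="checkpoints/baseline/last.ckpt"):
    """Return the top-level keys of the checkpoint.

    Assumption: the file is a dictionary saved with torch.save.
    weights_only=False unpickles arbitrary objects, so use it only on trusted files.
    """
    import torch  # third-party; imported lazily so the rest runs without it

    checkpoint = torch.load(path, map_location="cpu", weights_only=False)
    return sorted(checkpoint.keys())


hp = TrainingHyperparameters()
print(hp.temperature, hp.beta)
```

For the actual loading and evaluation code, the semantic_specialization_for_wsd repository remains the authoritative source.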

Sense/Context Embeddings

  • Directory: data/bert_embeddings/
  • Sense embeddings: bert-large-cased_WordNet_Gloss_Corpus.hdf5
  • Context embeddings for the self-training objective: bert-large-cased_SemCor.hdf5
  • Context embeddings for evaluating the WSD task: bert-large-cased_WSDEval-ALL.hdf5
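The file layout above can be sketched in Python as follows. The role labels (`sense`, `self_training_context`, `evaluation_context`) and both function names are hypothetical, chosen only for illustration. Because the internal structure of the HDF5 files is not described here, `list_datasets` makes no assumption about it and simply enumerates whatever datasets it finds, using the third-party `h5py` package.

```python
from pathlib import Path

EMBEDDING_DIR = Path("data/bert_embeddings")

# Hypothetical role labels mapped to the file names listed above.
EMBEDDING_FILES = {
    "sense": "bert-large-cased_WordNet_Gloss_Corpus.hdf5",
    "self_training_context": "bert-large-cased_SemCor.hdf5",
    "evaluation_context": "bert-large-cased_WSDEval-ALL.hdf5",
}


def embedding_path(role, root="."):
    """Build the path to an embedding file relative to a local copy of the repository."""
    return Path(root) / EMBEDDING_DIR / EMBEDDING_FILES[role]


def list_datasets(path):
    """Return (name, shape, dtype) for every dataset in an HDF5 file."""
    import h5py  # third-party; imported lazily so the path helpers work without it

    found = []

    def visit(name, obj):
        # visititems calls this for every group and dataset; keep datasets only.
        if isinstance(obj, h5py.Dataset):
            found.append((name, obj.shape, str(obj.dtype)))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return found


for role in EMBEDDING_FILES:
    print(role, embedding_path(role))
```

Calling `list_datasets(embedding_path("sense"))` on a downloaded copy is one way to check the stored shapes before using the loaders in the semantic_specialization_for_wsd repository.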

Reference

@inproceedings{Mizuki:EACL2023,
    title     = "Semantic Specialization for Knowledge-based Word Sense Disambiguation",
    author    = "Mizuki, Sakae and Okazaki, Naoaki",
    booktitle = "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume",
    series    = "EACL",
    month     = may,
    year      = "2023",
    address   = "Dubrovnik, Croatia",
    publisher = "Association for Computational Linguistics",
    pages     = "3449--3462",
}