|
Paper: [Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study](https://arxiv.org/abs/2212.10233) |
|
|
|
```

@article{wu2022pretrained,
  doi       = {10.48550/ARXIV.2212.10233},
  url       = {https://arxiv.org/abs/2212.10233},
  author    = {Wu, Di and Ahmad, Wasi Uddin and Chang, Kai-Wei},
  keywords  = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title     = {Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study},
  publisher = {arXiv},
  year      = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}

```
|
|
|
Pre-training Corpus: [RealNews](https://github.com/rowanz/grover/tree/master/realnews) |
|
|
|
Pre-training Details (a configuration sketch follows the list):
|
- Initialization: resume from bert-base-uncased
|
- Batch size: 512 |
|
- Total steps: 250k |
|
- Learning rate: 1e-4 |
|
- LR schedule: linear with 4k warmup steps |
|
- Masking ratio: 15% dynamic masking |
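

The hyperparameters above amount to standard continued masked language model pre-training. Below is a minimal sketch of that setup with Hugging Face Transformers; it is an illustration under assumptions, not the authors' code. The placeholder corpus, the `output_dir` name, and the per-device batch size / gradient accumulation split (64 × 8 = 512) are assumptions, and the actual RealNews loading and any multi-GPU details are elided.

```python
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")  # resume from BERT

# Stand-in for the RealNews corpus; replace with the actual news documents.
texts = ["A placeholder news article used only to make this sketch runnable."]
raw_dataset = Dataset.from_dict({"text": texts})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_dataset = raw_dataset.map(tokenize, batched=True, remove_columns=["text"])

# 15% dynamic masking: masked positions are re-sampled each time a batch is collated.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

args = TrainingArguments(
    output_dir="bert-realnews-mlm",   # illustrative name
    max_steps=250_000,                # total steps: 250k
    per_device_train_batch_size=64,   # assumed split; adjust for your hardware
    gradient_accumulation_steps=8,    # effective batch size 64 * 8 = 512
    learning_rate=1e-4,
    lr_scheduler_type="linear",       # linear decay after warmup
    warmup_steps=4_000,
    logging_steps=100,
    save_steps=10_000,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=collator,
)
trainer.train()
```

Note that the 15% dynamic masking comes directly from `DataCollatorForLanguageModeling`, which re-samples the masked positions at collation time rather than fixing them once during preprocessing.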