Paper: [Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study](https://arxiv.org/abs/2212.10233)

```
@article{wu2022pretrained,
  doi       = {10.48550/ARXIV.2212.10233},
  url       = {https://arxiv.org/abs/2212.10233},
  author    = {Wu, Di and Ahmad, Wasi Uddin and Chang, Kai-Wei},
  keywords  = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title     = {Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study},
  publisher = {arXiv},
  year      = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```

Pre-training Corpus: [RealNews](https://github.com/rowanz/grover/tree/master/realnews)

Pre-training Details:
- Resume from `bert-base-uncased`
- Batch size: 512
- Total steps: 250k
- Learning rate: 1e-4
- LR schedule: linear with 4k warmup steps
- Masking ratio: 15% dynamic masking
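
The sketch below is not the paper's released training code; it only illustrates how the hyperparameters above could be reproduced with Hugging Face Transformers. The data file `realnews.txt` and the output directory `bert-realnews-mlm` are placeholders, and the per-device batch size / gradient-accumulation split is an assumption chosen to reach the effective batch size of 512.

```python
# Hedged sketch: continued MLM pre-training from bert-base-uncased with the
# hyperparameters listed above. Paths and batch-size split are placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")  # resume from BERT

# Placeholder: a local plain-text dump of the RealNews corpus, one document per line.
raw = load_dataset("text", data_files={"train": "realnews.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_ds = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

# 15% dynamic masking: masks are re-sampled each time a batch is collated.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="bert-realnews-mlm",
    per_device_train_batch_size=64,   # assumption: 64 x 8 accumulation = 512 effective
    gradient_accumulation_steps=8,
    max_steps=250_000,
    learning_rate=1e-4,
    lr_scheduler_type="linear",
    warmup_steps=4_000,
    save_steps=10_000,
    logging_steps=100,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    data_collator=collator,
)
trainer.train()
```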