GPT Czech Poet: Generation of Czech Poetic Strophes with Language Models
Abstract
High-quality automated poetry generation systems are currently available only for a small subset of languages. We introduce a new model for generating poetry in the Czech language, based on fine-tuning a pre-trained Large Language Model. We demonstrate that guiding the generation process by explicitly specifying strophe parameters within the poem text strongly improves the effectiveness of the model. We also find that appropriate tokenization is crucial, showing that tokenization methods based on syllables or individual characters instead of subwords prove superior in generating poetic strophes. We further enhance the results by introducing Forced generation, which adds explicit specifications of meter and verse parameters at inference time based on the already generated text. We evaluate a range of setups, showing that our proposed approach achieves high accuracy in the rhyming and metric aspects of the formal quality of the generated poems.