metadata
license: cc-by-4.0
language:
- ko
tags:
- generation
Model Details
- Model Description: Speech style converter model based on gogamza/kobart-base-v2
- Developed by: Juhwan Lee, Jisu Kim, TakSung Heo, and Minsu Jeong
- Model Type: Text-generation
- Language: Korean
- License: CC-BY-4.0
Dataset
- Korean SmileStyle Dataset
- Randomly split train/valid dataset (9:1)
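The split code itself is not shown here, so the following is only a minimal sketch of a random 9:1 train/valid split. The function name `split_train_valid`, the `valid_ratio` parameter, and the seed value are assumptions made for illustration, not the authors' actual settings.

```python
import random


def split_train_valid(rows, valid_ratio=0.1, seed=42):
    """Shuffle rows and split them into (train, valid) lists.

    Hypothetical helper: the ratio matches the 9:1 split described above,
    but the seed and the shuffling method are assumptions.
    """
    rows = list(rows)                    # copy so the caller's data is untouched
    random.Random(seed).shuffle(rows)    # seeded shuffle for reproducibility
    n_valid = int(len(rows) * valid_ratio)
    return rows[n_valid:], rows[:n_valid]


# Toy data standing in for the dataset rows
train, valid = split_train_valid(range(100))
print(len(train), len(valid))
```

Fixing the seed keeps the validation set stable across runs, which matters when comparing BLEU scores between checkpoints.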
BLEU Score
- 25.35
Uses
This model converts the speech style of Korean text. Supported styles (English key: Korean style name):
- formal: 문어체
- informal: 구어체
- android: 안드로이드
- azae: 아재
- chat: 채팅
- choding: 초등학생
- emoticon: 이모티콘
- enfp: enfp
- gentle: 신사
- halbae: 할아버지
- halmae: 할머니
- joongding: 중학생
- king: 왕
- naruto: 나루토
- seonbi: 선비
- sosim: 소심한
- translator: 번역기
from transformers import AutoTokenizer, pipeline

model = "KoJLabs/bart-speech-style-converter"
tokenizer = AutoTokenizer.from_pretrained(model)
nlg_pipeline = pipeline('text2text-generation', model=model, tokenizer=tokenizer)

# Korean style names, in the same order as the list above
styles = ["문어체", "구어체", "안드로이드", "아재", "채팅", "초등학생", "이모티콘", "enfp", "신사", "할아버지", "할머니", "중학생", "왕", "나루토", "선비", "소심한", "번역기"]

for style in styles:
    # Prompt: "<style> 형식으로 변환:" ("convert to <style> format:") followed by the input text
    # Input text: "Today I ate dakbokkeumtang (spicy braised chicken). It was delicious."
    text = f"{style} 형식으로 변환:오늘은 닭볶음탕을 먹었다. 맛있었다."
    out = nlg_pipeline(text, max_length=100)
    print(style, out[0]['generated_text'])
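The example above builds each prompt as the Korean style name followed by `형식으로 변환:` and the input sentence. As a convenience, that pattern can be wrapped in a small helper that accepts the English keys from the style list. This is only a sketch: `STYLE_NAMES` and `build_prompt` are hypothetical names introduced here and are not part of the model or of any package.

```python
# Mapping from the English style keys listed under "Uses" to the Korean
# style names that the prompt expects.
STYLE_NAMES = {
    "formal": "문어체",
    "informal": "구어체",
    "android": "안드로이드",
    "azae": "아재",
    "chat": "채팅",
    "choding": "초등학생",
    "emoticon": "이모티콘",
    "enfp": "enfp",
    "gentle": "신사",
    "halbae": "할아버지",
    "halmae": "할머니",
    "joongding": "중학생",
    "king": "왕",
    "naruto": "나루토",
    "seonbi": "선비",
    "sosim": "소심한",
    "translator": "번역기",
}


def build_prompt(style_key: str, sentence: str) -> str:
    """Return the model input string for an English style key (hypothetical helper)."""
    if style_key not in STYLE_NAMES:
        raise ValueError(f"unknown style: {style_key!r}")
    # Same template as the loop in the usage example
    return f"{STYLE_NAMES[style_key]} 형식으로 변환:{sentence}"


if __name__ == "__main__":
    print(build_prompt("king", "오늘은 닭볶음탕을 먹었다. 맛있었다."))
```

The returned string can be passed directly to `nlg_pipeline(...)` in place of `text`.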
Model Source
https://github.com/KoJLabs/speech-style/tree/main
Speech style conversion package
You can also try Korean speech style conversion with the Python package KoTAN.