denis-berezutskiy-lad/lad_transcription_bert_ru_punctuator

Token Classification

denis-berezutskiy-lad committed on Nov 12, 2023

Commit c4c521d (1 parent: 041913d)

Update README layout.

Files changed (1)

README.md +3 -1

@@ -16,7 +16,9 @@ pipeline_tag: token-classification
 This is a punctuator/capitalizer model for Russian language, trained via NeMo scripts (https://github.com/NVIDIA/NeMo) on a dataset of continuous professional transcriptions (mostly legislative instances, and some OpenSubtitles as well) - see dataset https://huggingface.co/datasets/denis-berezutskiy-lad/ru_transcription_punctuation for details.
-Note that even though the model was prepaired using NeMo, the standard inference scripts of making result text don't work well with this model, because it has some advanced labels, which require custom handling. That's why a set of ipynb scripts was created (covers both the model training and inference as well as creating the above mentioned dataset): https://github.com/denis-berezutskiy-lad/transcription-bert-ru-punctuator-scripts/tree/main
+Note that even though the model was prepaired using NeMo, the standard inference scripts of making result text don't work well with this model, because it has some advanced labels, which require custom handling. That's why a set of ipynb scripts was created (covers both the model training and inference as well as creating the above mentioned dataset):
+https://github.com/denis-berezutskiy-lad/transcription-bert-ru-punctuator-scripts/tree/main
 The underlying base model is https://huggingface.co/DeepPavlov/rubert-base-cased-conversational
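
The README text above says the model's labels need custom handling, but this diff does not describe those labels. As a rough illustration only, the sketch below applies the standard NeMo two-character label convention (first character: punctuation mark to append, `O` for none; second character: `U` to capitalize, `O` to leave as is). The function name `apply_labels` and the sample words and labels are hypothetical, and this model's extended label set is assumed to differ, so the author's notebooks remain the reference for real inference.

```python
def apply_labels(words, labels):
    """Rebuild text from lowercase words and two-character labels.

    Assumed label layout (standard NeMo convention, NOT this model's
    extended label set):
      label[0] - punctuation mark to append after the word ('O' = none)
      label[1] - 'U' to capitalize the first letter, 'O' to leave it
    """
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")

    restored_words = []
    for word, label in zip(words, labels):
        punct, cap = label[0], label[1]
        if cap == "U":
            # Upper-case only the first character, keep the rest untouched.
            word = word[:1].upper() + word[1:]
        if punct != "O":
            word += punct
        restored_words.append(word)
    return " ".join(restored_words)


# Hypothetical input: unpunctuated lowercase words plus one label per word.
words = ["привет", "как", "дела"]
labels = [",U", "OO", "?O"]
restored = apply_labels(words, labels)
print(restored)  # Привет, как дела?
```

Any label that goes beyond this two-character scheme would need its own branch in the loop, which is presumably what the custom notebooks handle.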