---
language:
- "fr"
tags:
- t5
- french
- punctuation
license: apache-2.0
datasets:
- orange_sum
- mlsum
---

# Text Punctuator Based on Transformers Model T5

A T5 model fine-tuned for punctuation restoration. The model currently supports French only. Support for more languages will be added later using mT5.

Training datasets: the model was trained on two French datasets (around 500k records):

- [orange_sum](https://huggingface.co/datasets/orange_sum)
- [mlsum](https://huggingface.co/datasets/mlsum) (French text only)

More information will be added later.

---------------------------

## 🚀 Usage

**Below is a quick way to get up and running with the model.**

1. First, install the package.

```bash
pip install TextPunctuator
```

2. Sample Python code.

```python
from Punctuator import TextPunctuator

# Load the punctuation model on CPU; set use_gpu=True to run on a GPU.
punctuator = TextPunctuator(use_gpu=False)

# Unpunctuated French input text.
text = "Sur la base de ces échanges Blake Lemoine a donc jugé que le système avait atteint un niveau de conscience lui permettant d’être sensible Ce dernier a ensuite envoyé par email un rapport sur la sensibilité supposée de LaMDA à deux cents employés de Google Très vite les dirigeants de l’entreprise ont rejeté les allégations"

# Restore punctuation; French ('fr') is the only supported language for now.
text_punctuated = punctuator.punctuate(text, lang='fr')

print(text_punctuated)

# Outputs the following:
# Sur la base de ces échanges, Blake Lemoine a donc jugé que le système avait atteint un niveau de conscience lui permettant d’être sensible. Ce dernier a ensuite envoyé par email un rapport sur la sensibilité supposée de LaMDA à deux cents employés de Google. Très vite, les dirigeants de l’entreprise ont rejeté les allégations.
```

-----------------------------------------------

## ☕ Contact

Contact [Zakarya ROUZKI](mailto:zakaryarouzki@gmail.com) by email or on [LinkedIn](https://linkedin.com/in/rouzki).

-----------------------------------------------