Matthew Kutarna committed on
Commit
74f2c64
1 Parent(s): f42234c

Streamlit app development (#5)

* Fixed gitignore file

* Project architecture update

* Source code & tests initial work

* Repo clean up, file naming

* Streamlit app creation, testing

* Silero debugging, torch load issues

* Change silero usage to pip install, fixed zip archiving

* Updated README

* Fixed streamlit messages and updated requirements.txt

* Simplified app.py, added instructions.md and config.py

* Moved global variables to config, added voice selection step

* Updated Readme from huggingface spaces

* Fixed config imports, moved config to /src

* Separated epub_gen() from predict()

* Testing work, logging debugging

* Update .gitignore, cleaned up lib imports

* Removed pycache files

* Split write_audio from predict, fixed logging to app.log

* Implemented txt import, more pytest attempts

* HTML and PDF parsing functions implemented

* Added parsers to streamlit app, testing

* Added pdf and htm test files

* Fixed st.upload issues, tested file types

* Improved function and file naming, removed unneeded comments, improved app instructions.

* Improved file title handling, audio output clean up

* Unexpected character handling tests

* Voice selection preview created

* Test for preprocess, updating test files

* Epub testing updates

* Update Readme to remove HuggingFace Spaces config

* PDF reading function testing, updating

* PDF reading function completed, tested

* Fixed testing file directory

* Cleaned up notebooks and example test files

* Testing predict function, added test audio tensor

* Cleaned up init.py files

* Updated package versions in GitHub Actions workflow

* Updated package versions in GitHub Actions workflow, correctly

* Testing on read_pdf function

* Updated Readme and Instructions

* Updated Readme with demo screenshot, removed non-functional test

* Fixed Readme typos, linked screenshot

* Linting and misc repo updates

* Added function docstrings

* Module headers added.

* HTML reading WIP

* Testing assemble_zip updated, improved path handling

* Assemble zip, further test updating; tests succeed locally

* Assemble zip, further test updating; tests succeed locally, fixed typos

* Pytest files corrections, np warnings handled

* Further testing work, conditionals tested, testing running GitHub Actions locally.

* Fixed issues with path handling in output functions

* Solved 'not a dir' error: create dir automatically if it does not exist.

* Test for write_audio function completed.

* Testing for generate_audio function complete

* Test for predict function implemented, manually set seed for tests

* Formatting, removed whitespace

* Fixing test_predict, changing tolerance for difference

* Switched to torch.testing.assert_close function for test_predict (see the sketch after this list)

* Updates from PR comments; import style, assert style, README instructions, use pathlib instead of os

* Fixed hardcoding of paths, using pathlib paths defined in configs instead

* Testing file equality instead of multiple statements, formatting fixes, fixed load_model test
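The testing bullets above mention manually setting a seed and switching test_predict to torch.testing.assert_close. A minimal sketch of that pattern is below; the tensor shapes and the random stand-in for the model output are illustrative assumptions, not the repository's actual predict API.

```python
import torch

def test_predict_is_reproducible():
    # Manually set the seed so any stochastic step in inference repeats exactly.
    torch.manual_seed(1337)
    first = torch.rand(1, 24000)   # stand-in for an audio tensor from predict()

    torch.manual_seed(1337)
    second = torch.rand(1, 24000)  # regenerated with the same seed

    # assert_close compares within rtol/atol instead of requiring exact equality,
    # which suits floating-point audio tensors.
    torch.testing.assert_close(first, second, rtol=1e-5, atol=1e-6)
```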

Former-commit-id: 727b3975d12143bb8d05ad51c7c299a773784b6a

.coveragerc CHANGED
@@ -1,5 +1,5 @@
 
- # .coveragec for audiobook_gen
+ # .coveragerc for audiobook_gen
 
 [run]
 # data_file = put a coverage file name here!!!
.github/workflows/python-app.yml CHANGED
@@ -19,14 +19,14 @@ jobs:
 
     steps:
    - uses: actions/checkout@v3
-   - name: Set up Python 3.10
+   - name: Set up Python 3.9.12
      uses: actions/setup-python@v3
      with:
-       python-version: "3.10"
+       python-version: "3.9.12"
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
-       pip install flake8 pytest pytest-cov
+       pip install flake8 pytest==7.1.3 pytest-cov==3.0.0
        if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
    - name: Lint with flake8
      run: |
.gitignore CHANGED
@@ -8,7 +8,9 @@ token
 docs/
 conda/
 tmp/
- notesbooks/outputs/
+ notebooks/outputs/
+ tests/__pycache__/
+ tests/.pytest_cache
 
 tags
 *~
README.md CHANGED
@@ -1,4 +1,27 @@
 Audiobook Gen
 =============
 
- Audiobook Gen is a tool that allows the users to generate an audio file of text (e.g. audiobook), read in the voice of the user's choice. It will take in 3 inputs: the desired text for audio generation, as well as a pair of text / audio files for the desired voice.
+ ## Description
+ Audiobook Gen is a tool that allows the users to generate an audio file of text (e.g. audiobook), read in the voice of the user's choice. This tool is based on the Silero text-to-speech toolkit and uses Streamlit to deliver the application.
+
+ ## Demo
+ A demonstration of this tool is hosted at HuggingFace Spaces - see [Audiobook_Gen](https://huggingface.co/spaces/mkutarna/audiobook_gen).
+
+ ![Demo Screenshot](https://github.com/mkutarna/audiobook_gen/blob/appdev/resources/audiobook_gen.png "Screenshot")
+
+ #### Instructions
+ 1. Upload the book file to be converted.
+ 2. Select the desired voice for the audiobook.
+ 3. Click to run!
+
+ ## Dependencies
+ - silero
+ - streamlit
+ - ebooklib
+ - PyPDF2
+ - bs4
+ - nltk
+ - stqdm
+
+ ## License
+ See [LICENSE](https://github.com/mkutarna/audiobook_gen/blob/master/LICENSE)
app.py ADDED
@@ -0,0 +1,66 @@
+ import logging
+
+ import streamlit as st
+
+ from src import file_readers, predict, output, config
+
+ logging.basicConfig(filename='app.log',
+                     filemode='w',
+                     format='%(name)s - %(levelname)s - %(message)s',
+                     level=logging.INFO,
+                     force=True)
+
+ st.title('Audiobook Generation Tool')
+
+ text_file = open(config.INSTRUCTIONS, "r")
+ readme_text = text_file.read()
+ text_file.close()
+ st.markdown(readme_text)
+
+ st.header('1. Upload your document')
+ uploaded_file = st.file_uploader(
+     label="File types accepted: epub, txt, pdf)",
+     type=['epub', 'txt', 'pdf'])
+
+ model = predict.load_model()
+
+ st.header('2. Please select voice')
+ speaker = st.radio('Available voices:', config.SPEAKER_LIST.keys(), horizontal=True)
+
+ audio_path = config.resource_path / f'speaker_{config.SPEAKER_LIST.get(speaker)}.wav'
+ audio_file = open(audio_path, 'rb')
+ audio_bytes = audio_file.read()
+
+ st.audio(audio_bytes, format='audio/ogg')
+
+ st.header('3. Run the app to generate audio')
+ if st.button('Click to run!'):
+     file_ext = uploaded_file.type
+     file_title = uploaded_file.name
+     if file_ext == 'application/epub+zip':
+         text, file_title = file_readers.read_epub(uploaded_file)
+     elif file_ext == 'text/plain':
+         file = uploaded_file.read()
+         text = file_readers.preprocess_text(file)
+     elif file_ext == 'application/pdf':
+         text = file_readers.read_pdf(uploaded_file)
+     else:
+         st.warning('Invalid file type', icon="⚠️")
+     st.success('Reading file complete!')
+
+     with st.spinner('Generating audio...'):
+         output.generate_audio(text, file_title, model, config.SPEAKER_LIST.get(speaker))
+     st.success('Audio generation complete!')
+
+     with st.spinner('Building zip file...'):
+         zip_file = output.assemble_zip(file_title)
+         title_name = f'{file_title}.zip'
+     st.success('Zip file prepared!')
+
+     with open(zip_file, "rb") as fp:
+         btn = st.download_button(
+             label="Download Audiobook",
+             data=fp,
+             file_name=title_name,
+             mime="application/zip"
+         )
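Read as a pipeline, app.py above does three things: parse the upload with src.file_readers, synthesize audio with output.generate_audio, and bundle the results with output.assemble_zip. The headless sketch below strings the same calls together; the sample file path, the choice of speaker, and the assumption that read_epub accepts an ordinary file handle are illustrative only and are not guaranteed by the code in this diff.

```python
# Hypothetical headless use of the modules app.py wires into Streamlit above.
# Call signatures are inferred from app.py; treat this as a sketch.
from src import file_readers, predict, output, config

model = predict.load_model()
speaker = next(iter(config.SPEAKER_LIST))        # pick any configured voice
speaker_id = config.SPEAKER_LIST.get(speaker)

with open('book.epub', 'rb') as uploaded:        # assumed sample input file
    text, file_title = file_readers.read_epub(uploaded)

output.generate_audio(text, file_title, model, speaker_id)  # writes the audio sections
zip_path = output.assemble_zip(file_title)                  # bundles them into one archive
print(f'Audiobook archive written to {zip_path}')
```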
models/latest_silero_models.yml ADDED
@@ -0,0 +1,563 @@
1
+ # pre-trained STT models
2
+ stt_models:
3
+ en:
4
+ latest:
5
+ meta:
6
+ name: "en_v6"
7
+ sample: "https://models.silero.ai/examples/en_sample.wav"
8
+ labels: "https://models.silero.ai/models/en/en_v1_labels.json"
9
+ jit: "https://models.silero.ai/models/en/en_v6.jit"
10
+ onnx: "https://models.silero.ai/models/en/en_v5.onnx"
11
+ jit_q: "https://models.silero.ai/models/en/en_v6_q.jit"
12
+ jit_xlarge: "https://models.silero.ai/models/en/en_v6_xlarge.jit"
13
+ onnx_xlarge: "https://models.silero.ai/models/en/en_v6_xlarge.onnx"
14
+ v6:
15
+ meta:
16
+ name: "en_v6"
17
+ sample: "https://models.silero.ai/examples/en_sample.wav"
18
+ labels: "https://models.silero.ai/models/en/en_v1_labels.json"
19
+ jit: "https://models.silero.ai/models/en/en_v6.jit"
20
+ onnx: "https://models.silero.ai/models/en/en_v5.onnx"
21
+ jit_q: "https://models.silero.ai/models/en/en_v6_q.jit"
22
+ jit_xlarge: "https://models.silero.ai/models/en/en_v6_xlarge.jit"
23
+ onnx_xlarge: "https://models.silero.ai/models/en/en_v6_xlarge.onnx"
24
+ v5:
25
+ meta:
26
+ name: "en_v5"
27
+ sample: "https://models.silero.ai/examples/en_sample.wav"
28
+ labels: "https://models.silero.ai/models/en/en_v1_labels.json"
29
+ jit: "https://models.silero.ai/models/en/en_v5.jit"
30
+ onnx: "https://models.silero.ai/models/en/en_v5.onnx"
31
+ onnx_q: "https://models.silero.ai/models/en/en_v5_q.onnx"
32
+ jit_q: "https://models.silero.ai/models/en/en_v5_q.jit"
33
+ jit_xlarge: "https://models.silero.ai/models/en/en_v5_xlarge.jit"
34
+ onnx_xlarge: "https://models.silero.ai/models/en/en_v5_xlarge.onnx"
35
+ v4_0:
36
+ meta:
37
+ name: "en_v4_0"
38
+ sample: "https://models.silero.ai/examples/en_sample.wav"
39
+ labels: "https://models.silero.ai/models/en/en_v1_labels.json"
40
+ jit_large: "https://models.silero.ai/models/en/en_v4_0_jit_large.model"
41
+ onnx_large: "https://models.silero.ai/models/en/en_v4_0_large.onnx"
42
+ v3:
43
+ meta:
44
+ name: "en_v3"
45
+ sample: "https://models.silero.ai/examples/en_sample.wav"
46
+ labels: "https://models.silero.ai/models/en/en_v1_labels.json"
47
+ jit: "https://models.silero.ai/models/en/en_v3_jit.model"
48
+ onnx: "https://models.silero.ai/models/en/en_v3.onnx"
49
+ jit_q: "https://models.silero.ai/models/en/en_v3_jit_q.model"
50
+ jit_skip: "https://models.silero.ai/models/en/en_v3_jit_skips.model"
51
+ jit_large: "https://models.silero.ai/models/en/en_v3_jit_large.model"
52
+ onnx_large: "https://models.silero.ai/models/en/en_v3_large.onnx"
53
+ jit_xsmall: "https://models.silero.ai/models/en/en_v3_jit_xsmall.model"
54
+ jit_q_xsmall: "https://models.silero.ai/models/en/en_v3_jit_q_xsmall.model"
55
+ onnx_xsmall: "https://models.silero.ai/models/en/en_v3_xsmall.onnx"
56
+ v2:
57
+ meta:
58
+ name: "en_v2"
59
+ sample: "https://models.silero.ai/examples/en_sample.wav"
60
+ labels: "https://models.silero.ai/models/en/en_v1_labels.json"
61
+ jit: "https://models.silero.ai/models/en/en_v2_jit.model"
62
+ onnx: "https://models.silero.ai/models/en/en_v2.onnx"
63
+ tf: "https://models.silero.ai/models/en/en_v2_tf.tar.gz"
64
+ v1:
65
+ meta:
66
+ name: "en_v1"
67
+ sample: "https://models.silero.ai/examples/en_sample.wav"
68
+ labels: "https://models.silero.ai/models/en/en_v1_labels.json"
69
+ jit: "https://models.silero.ai/models/en/en_v1_jit.model"
70
+ onnx: "https://models.silero.ai/models/en/en_v1.onnx"
71
+ tf: "https://models.silero.ai/models/en/en_v1_tf.tar.gz"
72
+ de:
73
+ latest:
74
+ meta:
75
+ name: "de_v1"
76
+ sample: "https://models.silero.ai/examples/de_sample.wav"
77
+ labels: "https://models.silero.ai/models/de/de_v1_labels.json"
78
+ jit: "https://models.silero.ai/models/de/de_v1_jit.model"
79
+ onnx: "https://models.silero.ai/models/de/de_v1.onnx"
80
+ tf: "https://models.silero.ai/models/de/de_v1_tf.tar.gz"
81
+ v1:
82
+ meta:
83
+ name: "de_v1"
84
+ sample: "https://models.silero.ai/examples/de_sample.wav"
85
+ labels: "https://models.silero.ai/models/de/de_v1_labels.json"
86
+ jit_large: "https://models.silero.ai/models/de/de_v1_jit.model"
87
+ onnx: "https://models.silero.ai/models/de/de_v1.onnx"
88
+ tf: "https://models.silero.ai/models/de/de_v1_tf.tar.gz"
89
+ v3:
90
+ meta:
91
+ name: "de_v3"
92
+ sample: "https://models.silero.ai/examples/de_sample.wav"
93
+ labels: "https://models.silero.ai/models/de/de_v1_labels.json"
94
+ jit_large: "https://models.silero.ai/models/de/de_v3_large.jit"
95
+ v4:
96
+ meta:
97
+ name: "de_v4"
98
+ sample: "https://models.silero.ai/examples/de_sample.wav"
99
+ labels: "https://models.silero.ai/models/de/de_v1_labels.json"
100
+ jit_large: "https://models.silero.ai/models/de/de_v4_large.jit"
101
+ onnx_large: "https://models.silero.ai/models/de/de_v4_large.onnx"
102
+ es:
103
+ latest:
104
+ meta:
105
+ name: "es_v1"
106
+ sample: "https://models.silero.ai/examples/es_sample.wav"
107
+ labels: "https://models.silero.ai/models/es/es_v1_labels.json"
108
+ jit: "https://models.silero.ai/models/es/es_v1_jit.model"
109
+ onnx: "https://models.silero.ai/models/es/es_v1.onnx"
110
+ tf: "https://models.silero.ai/models/es/es_v1_tf.tar.gz"
111
+ ua:
112
+ latest:
113
+ meta:
114
+ name: "ua_v3"
115
+ sample: "https://models.silero.ai/examples/ua_sample.wav"
116
+ credits:
117
+ datasets:
118
+ speech-recognition-uk: https://github.com/egorsmkv/speech-recognition-uk
119
+ labels: "https://models.silero.ai/models/ua/ua_v1_labels.json"
120
+ jit: "https://models.silero.ai/models/ua/ua_v3_jit.model"
121
+ jit_q: "https://models.silero.ai/models/ua/ua_v3_jit_q.model"
122
+ onnx: "https://models.silero.ai/models/ua/ua_v3.onnx"
123
+ v3:
124
+ meta:
125
+ name: "ua_v3"
126
+ sample: "https://models.silero.ai/examples/ua_sample.wav"
127
+ credits:
128
+ datasets:
129
+ speech-recognition-uk: https://github.com/egorsmkv/speech-recognition-uk
130
+ labels: "https://models.silero.ai/models/ua/ua_v1_labels.json"
131
+ jit: "https://models.silero.ai/models/ua/ua_v3_jit.model"
132
+ jit_q: "https://models.silero.ai/models/ua/ua_v3_jit_q.model"
133
+ onnx: "https://models.silero.ai/models/ua/ua_v3.onnx"
134
+ v1:
135
+ meta:
136
+ name: "ua_v1"
137
+ sample: "https://models.silero.ai/examples/ua_sample.wav"
138
+ credits:
139
+ datasets:
140
+ speech-recognition-uk: https://github.com/egorsmkv/speech-recognition-uk
141
+ labels: "https://models.silero.ai/models/ua/ua_v1_labels.json"
142
+ jit: "https://models.silero.ai/models/ua/ua_v1_jit.model"
143
+ jit_q: "https://models.silero.ai/models/ua/ua_v1_jit_q.model"
144
+ tts_models:
145
+ ru:
146
+ v3_1_ru:
147
+ latest:
148
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
149
+ package: 'https://models.silero.ai/models/tts/ru/v3_1_ru.pt'
150
+ sample_rate: [8000, 24000, 48000]
151
+ ru_v3:
152
+ latest:
153
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
154
+ package: 'https://models.silero.ai/models/tts/ru/ru_v3.pt'
155
+ sample_rate: [8000, 24000, 48000]
156
+ aidar_v2:
157
+ latest:
158
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
159
+ package: 'https://models.silero.ai/models/tts/ru/v2_aidar.pt'
160
+ sample_rate: [8000, 16000]
161
+ aidar_8khz:
162
+ latest:
163
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
164
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
165
+ jit: 'https://models.silero.ai/models/tts/ru/v1_aidar_8000.jit'
166
+ sample_rate: 8000
167
+ v1:
168
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
169
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
170
+ jit: 'https://models.silero.ai/models/tts/ru/v1_aidar_8000.jit'
171
+ sample_rate: 8000
172
+ aidar_16khz:
173
+ latest:
174
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
175
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
176
+ jit: 'https://models.silero.ai/models/tts/ru/v1_aidar_16000.jit'
177
+ sample_rate: 16000
178
+ v1:
179
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
180
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
181
+ jit: 'https://models.silero.ai/models/tts/ru/v1_aidar_16000.jit'
182
+ sample_rate: 16000
183
+ baya_v2:
184
+ latest:
185
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
186
+ package: 'https://models.silero.ai/models/tts/ru/v2_baya.pt'
187
+ sample_rate: [8000, 16000]
188
+ baya_8khz:
189
+ latest:
190
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
191
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
192
+ jit: 'https://models.silero.ai/models/tts/ru/v1_baya_8000.jit'
193
+ sample_rate: 8000
194
+ v1:
195
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
196
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
197
+ jit: 'https://models.silero.ai/models/tts/ru/v1_baya_8000.jit'
198
+ sample_rate: 8000
199
+ baya_16khz:
200
+ latest:
201
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
202
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
203
+ jit: 'https://models.silero.ai/models/tts/ru/v1_baya_16000.jit'
204
+ sample_rate: 16000
205
+ v1:
206
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
207
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
208
+ jit: 'https://models.silero.ai/models/tts/ru/v1_baya_16000.jit'
209
+ sample_rate: 16000
210
+ irina_v2:
211
+ latest:
212
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
213
+ package: 'https://models.silero.ai/models/tts/ru/v2_irina.pt'
214
+ sample_rate: [8000, 16000]
215
+ irina_8khz:
216
+ latest:
217
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
218
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
219
+ jit: 'https://models.silero.ai/models/tts/ru/v1_irina_8000.jit'
220
+ sample_rate: 8000
221
+ v1:
222
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
223
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
224
+ jit: 'https://models.silero.ai/models/tts/ru/v1_irina_8000.jit'
225
+ sample_rate: 8000
226
+ irina_16khz:
227
+ latest:
228
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
229
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
230
+ jit: 'https://models.silero.ai/models/tts/ru/v1_irina_16000.jit'
231
+ sample_rate: 16000
232
+ v1:
233
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
234
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
235
+ jit: 'https://models.silero.ai/models/tts/ru/v1_irina_16000.jit'
236
+ sample_rate: 16000
237
+ kseniya_v2:
238
+ latest:
239
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
240
+ package: 'https://models.silero.ai/models/tts/ru/v2_kseniya.pt'
241
+ sample_rate: [8000, 16000]
242
+ kseniya_8khz:
243
+ latest:
244
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
245
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
246
+ jit: 'https://models.silero.ai/models/tts/ru/v1_kseniya_8000.jit'
247
+ sample_rate: 8000
248
+ v1:
249
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
250
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
251
+ jit: 'https://models.silero.ai/models/tts/ru/v1_kseniya_8000.jit'
252
+ sample_rate: 8000
253
+ kseniya_16khz:
254
+ latest:
255
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
256
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
257
+ jit: 'https://models.silero.ai/models/tts/ru/v1_kseniya_16000.jit'
258
+ sample_rate: 16000
259
+ v1:
260
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
261
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
262
+ jit: 'https://models.silero.ai/models/tts/ru/v1_kseniya_16000.jit'
263
+ sample_rate: 16000
264
+ natasha_v2:
265
+ latest:
266
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
267
+ package: 'https://models.silero.ai/models/tts/ru/v2_natasha.pt'
268
+ sample_rate: [8000, 16000]
269
+ natasha_8khz:
270
+ latest:
271
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
272
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
273
+ jit: 'https://models.silero.ai/models/tts/ru/v1_natasha_8000.jit'
274
+ sample_rate: 8000
275
+ v1:
276
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
277
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
278
+ jit: 'https://models.silero.ai/models/tts/ru/v1_natasha_8000.jit'
279
+ sample_rate: 8000
280
+ natasha_16khz:
281
+ latest:
282
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
283
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
284
+ jit: 'https://models.silero.ai/models/tts/ru/v1_natasha_16000.jit'
285
+ sample_rate: 16000
286
+ v1:
287
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
288
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
289
+ jit: 'https://models.silero.ai/models/tts/ru/v1_natasha_16000.jit'
290
+ sample_rate: 16000
291
+ ruslan_v2:
292
+ latest:
293
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
294
+ package: 'https://models.silero.ai/models/tts/ru/v2_ruslan.pt'
295
+ sample_rate: [8000, 16000]
296
+ ruslan_8khz:
297
+ latest:
298
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
299
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
300
+ jit: 'https://models.silero.ai/models/tts/ru/v1_ruslan_8000.jit'
301
+ sample_rate: 8000
302
+ v1:
303
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
304
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
305
+ jit: 'https://models.silero.ai/models/tts/ru/v1_ruslan_8000.jit'
306
+ sample_rate: 8000
307
+ ruslan_16khz:
308
+ latest:
309
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
310
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
311
+ jit: 'https://models.silero.ai/models/tts/ru/v1_ruslan_16000.jit'
312
+ sample_rate: 16000
313
+ v1:
314
+ tokenset: '_~абвгдеёжзийклмнопрстуфхцчшщъыьэюя +.,!?…:;–'
315
+ example: 'В н+едрах т+ундры в+ыдры в г+етрах т+ырят в в+ёдра +ядра к+едров.'
316
+ jit: 'https://models.silero.ai/models/tts/ru/v1_ruslan_16000.jit'
317
+ sample_rate: 16000
318
+ en:
319
+ v3_en:
320
+ latest:
321
+ example: 'Can you can a canned can into an un-canned can like a canner can can a canned can into an un-canned can?'
322
+ package: 'https://models.silero.ai/models/tts/en/v3_en.pt'
323
+ sample_rate: [8000, 24000, 48000]
324
+ v3_en_indic:
325
+ latest:
326
+ example: 'Can you can a canned can into an un-canned can like a canner can can a canned can into an un-canned can?'
327
+ package: 'https://models.silero.ai/models/tts/en/v3_en_indic.pt'
328
+ sample_rate: [8000, 24000, 48000]
329
+ lj_v2:
330
+ latest:
331
+ example: 'Can you can a canned can into an un-canned can like a canner can can a canned can into an un-canned can?'
332
+ package: 'https://models.silero.ai/models/tts/en/v2_lj.pt'
333
+ sample_rate: [8000, 16000]
334
+ lj_8khz:
335
+ latest:
336
+ tokenset: '_~abcdefghijklmnopqrstuvwxyz .,!?…:;–'
337
+ example: 'Can you can a canned can into an un-canned can like a canner can can a canned can into an un-canned can?'
338
+ jit: 'https://models.silero.ai/models/tts/en/v1_lj_8000.jit'
339
+ sample_rate: 8000
340
+ v1:
341
+ tokenset: '_~abcdefghijklmnopqrstuvwxyz .,!?…:;–'
342
+ example: 'Can you can a canned can into an un-canned can like a canner can can a canned can into an un-canned can?'
343
+ jit: 'https://models.silero.ai/models/tts/en/v1_lj_8000.jit'
344
+ sample_rate: 8000
345
+ lj_16khz:
346
+ latest:
347
+ tokenset: '_~abcdefghijklmnopqrstuvwxyz .,!?…:;–'
348
+ example: 'Can you can a canned can into an un-canned can like a canner can can a canned can into an un-canned can?'
349
+ jit: 'https://models.silero.ai/models/tts/en/v1_lj_16000.jit'
350
+ sample_rate: 16000
351
+ v1:
352
+ tokenset: '_~abcdefghijklmnopqrstuvwxyz .,!?…:;–'
353
+ example: 'Can you can a canned can into an un-canned can like a canner can can a canned can into an un-canned can?'
354
+ jit: 'https://models.silero.ai/models/tts/en/v1_lj_16000.jit'
355
+ sample_rate: 16000
356
+ de:
357
+ v3_de:
358
+ latest:
359
+ example: 'Fischers Fritze fischt frische Fische, Frische Fische fischt Fischers Fritze.'
360
+ package: 'https://models.silero.ai/models/tts/de/v3_de.pt'
361
+ sample_rate: [8000, 24000, 48000]
362
+ thorsten_v2:
363
+ latest:
364
+ example: 'Fischers Fritze fischt frische Fische, Frische Fische fischt Fischers Fritze.'
365
+ package: 'https://models.silero.ai/models/tts/de/v2_thorsten.pt'
366
+ sample_rate: [8000, 16000]
367
+ thorsten_8khz:
368
+ latest:
369
+ tokenset: '_~abcdefghijklmnopqrstuvwxyzäöüß .,!?…:;–'
370
+ example: 'Fischers Fritze fischt frische Fische, Frische Fische fischt Fischers Fritze.'
371
+ jit: 'https://models.silero.ai/models/tts/de/v1_thorsten_8000.jit'
372
+ sample_rate: 8000
373
+ v1:
374
+ tokenset: '_~abcdefghijklmnopqrstuvwxyzäöüß .,!?…:;–'
375
+ example: 'Fischers Fritze fischt frische Fische, Frische Fische fischt Fischers Fritze.'
376
+ jit: 'https://models.silero.ai/models/tts/de/v1_thorsten_8000.jit'
377
+ sample_rate: 8000
378
+ thorsten_16khz:
379
+ latest:
380
+ tokenset: '_~abcdefghijklmnopqrstuvwxyzäöüß .,!?…:;–'
381
+ example: 'Fischers Fritze fischt frische Fische, Frische Fische fischt Fischers Fritze.'
382
+ jit: 'https://models.silero.ai/models/tts/de/v1_thorsten_16000.jit'
383
+ sample_rate: 16000
384
+ v1:
385
+ tokenset: '_~abcdefghijklmnopqrstuvwxyzäöüß .,!?…:;–'
386
+ example: 'Fischers Fritze fischt frische Fische, Frische Fische fischt Fischers Fritze.'
387
+ jit: 'https://models.silero.ai/models/tts/de/v1_thorsten_16000.jit'
388
+ sample_rate: 16000
389
+ es:
390
+ v3_es:
391
+ latest:
392
+ example: 'Hoy ya es ayer y ayer ya es hoy, ya llegó el día, y hoy es hoy.'
393
+ package: 'https://models.silero.ai/models/tts/es/v3_es.pt'
394
+ sample_rate: [8000, 24000, 48000]
395
+ tux_v2:
396
+ latest:
397
+ example: 'Hoy ya es ayer y ayer ya es hoy, ya llegó el día, y hoy es hoy.'
398
+ package: 'https://models.silero.ai/models/tts/es/v2_tux.pt'
399
+ sample_rate: [8000, 16000]
400
+ tux_8khz:
401
+ latest:
402
+ tokenset: '_~abcdefghijklmnopqrstuvwxyzáéíñóú .,!?…:;–¡¿'
403
+ example: 'Hoy ya es ayer y ayer ya es hoy, ya llegó el día, y hoy es hoy.'
404
+ jit: 'https://models.silero.ai/models/tts/es/v1_tux_8000.jit'
405
+ sample_rate: 8000
406
+ v1:
407
+ tokenset: '_~abcdefghijklmnopqrstuvwxyzáéíñóú .,!?…:;–¡¿'
408
+ example: 'Hoy ya es ayer y ayer ya es hoy, ya llegó el día, y hoy es hoy.'
409
+ jit: 'https://models.silero.ai/models/tts/es/v1_tux_8000.jit'
410
+ sample_rate: 8000
411
+ tux_16khz:
412
+ latest:
413
+ tokenset: '_~abcdefghijklmnopqrstuvwxyzáéíñóú .,!?…:;–¡¿'
414
+ example: 'Hoy ya es ayer y ayer ya es hoy, ya llegó el día, y hoy es hoy.'
415
+ jit: 'https://models.silero.ai/models/tts/es/v1_tux_16000.jit'
416
+ sample_rate: 16000
417
+ v1:
418
+ tokenset: '_~abcdefghijklmnopqrstuvwxyzáéíñóú .,!?…:;–¡¿'
419
+ example: 'Hoy ya es ayer y ayer ya es hoy, ya llegó el día, y hoy es hoy.'
420
+ jit: 'https://models.silero.ai/models/tts/es/v1_tux_16000.jit'
421
+ sample_rate: 16000
422
+ fr:
423
+ v3_fr:
424
+ latest:
425
+ example: 'Je suis ce que je suis, et si je suis ce que je suis, qu’est ce que je suis.'
426
+ package: 'https://models.silero.ai/models/tts/fr/v3_fr.pt'
427
+ sample_rate: [8000, 24000, 48000]
428
+ gilles_v2:
429
+ latest:
430
+ example: 'Je suis ce que je suis, et si je suis ce que je suis, qu’est ce que je suis.'
431
+ package: 'https://models.silero.ai/models/tts/fr/v2_gilles.pt'
432
+ sample_rate: [8000, 16000]
433
+ gilles_8khz:
434
+ latest:
435
+ tokenset: '_~abcdefghijklmnopqrstuvwxyzéàèùâêîôûç .,!?…:;–'
436
+ example: 'Je suis ce que je suis, et si je suis ce que je suis, qu’est ce que je suis.'
437
+ jit: 'https://models.silero.ai/models/tts/fr/v1_gilles_8000.jit'
438
+ sample_rate: 8000
439
+ v1:
440
+ tokenset: '_~abcdefghijklmnopqrstuvwxyzéàèùâêîôûç .,!?…:;–'
441
+ example: 'Je suis ce que je suis, et si je suis ce que je suis, qu’est ce que je suis.'
442
+ jit: 'https://models.silero.ai/models/tts/fr/v1_gilles_8000.jit'
443
+ sample_rate: 8000
444
+ gilles_16khz:
445
+ latest:
446
+ tokenset: '_~abcdefghijklmnopqrstuvwxyzéàèùâêîôûç .,!?…:;–'
447
+ example: 'Je suis ce que je suis, et si je suis ce que je suis, qu’est ce que je suis.'
448
+ jit: 'https://models.silero.ai/models/tts/fr/v1_gilles_16000.jit'
449
+ sample_rate: 16000
450
+ v1:
451
+ tokenset: '_~abcdefghijklmnopqrstuvwxyzéàèùâêîôûç .,!?…:;–'
452
+ example: 'Je suis ce que je suis, et si je suis ce que je suis, qu’est ce que je suis.'
453
+ jit: 'https://models.silero.ai/models/tts/fr/v1_gilles_16000.jit'
454
+ sample_rate: 16000
455
+ ba:
456
+ aigul_v2:
457
+ latest:
458
+ example: 'Салауат Юлаевтың тормошо һәм яҙмышы хаҡындағы документтарҙың һәм шиғри әҫәрҙәренең бик аҙ өлөшө генә һаҡланған.'
459
+ package: 'https://models.silero.ai/models/tts/ba/v2_aigul.pt'
460
+ sample_rate: [8000, 16000]
461
+ language_name: 'bashkir'
462
+ xal:
463
+ v3_xal:
464
+ latest:
465
+ example: 'Һорвн, дөрвн күн ирәд, һазань чиңгнв. Байн Цецн хаана һорвн көвүн күүндҗәнә.'
466
+ package: 'https://models.silero.ai/models/tts/xal/v3_xal.pt'
467
+ sample_rate: [8000, 24000, 48000]
468
+ erdni_v2:
469
+ latest:
470
+ example: 'Һорвн, дөрвн күн ирәд, һазань чиңгнв. Байн Цецн хаана һорвн көвүн күүндҗәнә.'
471
+ package: 'https://models.silero.ai/models/tts/xal/v2_erdni.pt'
472
+ sample_rate: [8000, 16000]
473
+ language_name: 'kalmyk'
474
+ tt:
475
+ v3_tt:
476
+ latest:
477
+ example: 'Исәнмесез, саумысез, нишләп кәҗәгезне саумыйсыз, әтәчегез күкәй салган, нишләп чыгып алмыйсыз.'
478
+ package: 'https://models.silero.ai/models/tts/tt/v3_tt.pt'
479
+ sample_rate: [8000, 24000, 48000]
480
+ dilyara_v2:
481
+ latest:
482
+ example: 'Ис+әнмесез, с+аумысез, нишл+әп кәҗәгезн+е с+аумыйсыз, әтәчег+ез күк+әй салг+ан, нишл+әп чыг+ып +алмыйсыз.'
483
+ package: 'https://models.silero.ai/models/tts/tt/v2_dilyara.pt'
484
+ sample_rate: [8000, 16000]
485
+ language_name: 'tatar'
486
+ uz:
487
+ v3_uz:
488
+ latest:
489
+ example: 'Tanishganimdan xursandman.'
490
+ package: 'https://models.silero.ai/models/tts/uz/v3_uz.pt'
491
+ sample_rate: [8000, 24000, 48000]
492
+ dilnavoz_v2:
493
+ latest:
494
+ example: 'Tanishganimdan xursandman.'
495
+ package: 'https://models.silero.ai/models/tts/uz/v2_dilnavoz.pt'
496
+ sample_rate: [8000, 16000]
497
+ language_name: 'uzbek'
498
+ ua:
499
+ v3_ua:
500
+ latest:
501
+ example: 'К+отики - пухн+асті жив+отики.'
502
+ package: 'https://models.silero.ai/models/tts/ua/v3_ua.pt'
503
+ sample_rate: [8000, 24000, 48000]
504
+ mykyta_v2:
505
+ latest:
506
+ example: 'К+отики - пухн+асті жив+отики.'
507
+ package: 'https://models.silero.ai/models/tts/ua/v22_mykyta_48k.pt'
508
+ sample_rate: [8000, 24000, 48000]
509
+ language_name: 'ukrainian'
510
+ indic:
511
+ v3_indic:
512
+ latest:
513
+ example: 'prasidda kabīra adhyētā, puruṣōttama agravāla kā yaha śōdha ālēkha, usa rāmānaṁda kī khōja karatā hai'
514
+ package: 'https://models.silero.ai/models/tts/indic/v3_indic.pt'
515
+ sample_rate: [8000, 24000, 48000]
516
+ multi:
517
+ multi_v2:
518
+ latest:
519
+ package: 'https://models.silero.ai/models/tts/multi/v2_multi.pt'
520
+ sample_rate: [8000, 16000]
521
+ speakers:
522
+ aidar:
523
+ lang: 'ru'
524
+ example: 'Съ+ешьте ещ+ё +этих м+ягких франц+узских б+улочек, д+а в+ыпейте ч+аю.'
525
+ baya:
526
+ lang: 'ru'
527
+ example: 'Съ+ешьте ещ+ё +этих м+ягких франц+узских б+улочек, д+а в+ыпейте ч+аю.'
528
+ kseniya:
529
+ lang: 'ru'
530
+ example: 'Съ+ешьте ещ+ё +этих м+ягких франц+узских б+улочек, д+а в+ыпейте ч+аю.'
531
+ irina:
532
+ lang: 'ru'
533
+ example: 'Съ+ешьте ещ+ё +этих м+ягких франц+узских б+улочек, д+а в+ыпейте ч+аю.'
534
+ ruslan:
535
+ lang: 'ru'
536
+ example: 'Съ+ешьте ещ+ё +этих м+ягких франц+узских б+улочек, д+а в+ыпейте ч+аю.'
537
+ natasha:
538
+ lang: 'ru'
539
+ example: 'Съ+ешьте ещ+ё +этих м+ягких франц+узских б+улочек, д+а в+ыпейте ч+аю.'
540
+ thorsten:
541
+ lang: 'de'
542
+ example: 'Fischers Fritze fischt frische Fische, Frische Fische fischt Fischers Fritze.'
543
+ tux:
544
+ lang: 'es'
545
+ example: 'Hoy ya es ayer y ayer ya es hoy, ya llegó el día, y hoy es hoy.'
546
+ gilles:
547
+ lang: 'fr'
548
+ example: 'Je suis ce que je suis, et si je suis ce que je suis, qu’est ce que je suis.'
549
+ lj:
550
+ lang: 'en'
551
+ example: 'Can you can a canned can into an un-canned can like a canner can can a canned can into an un-canned can?'
552
+ dilyara:
553
+ lang: 'tt'
554
+ example: 'Пес+и пес+и песик+әй, борыннар+ы бәләк+әй.'
555
+ te_models:
556
+ latest:
557
+ package: "https://models.silero.ai/te_models/v2_4lang_q.pt"
558
+ languages: ['en', 'de', 'ru', 'es']
559
+ punct: '.,-!?—'
560
+ v2:
561
+ package: "https://models.silero.ai/te_models/v2_4lang_q.pt"
562
+ languages: ['en', 'de', 'ru', 'es']
563
+ punct: '.,-!?—'
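latest_silero_models.yml mirrors the Silero model registry; the tts_models entries above (for example v3_en with sample_rate [8000, 24000, 48000]) are the packages the toolkit downloads at load time. A rough sketch of pulling one of those packages outside this app is below; the torch.hub entry point and the apply_tts call follow Silero's published example usage and should be treated as assumptions here rather than part of this commit.

```python
import torch

# Fetch the English v3 TTS package listed in the YAML above via torch.hub.
model, example_text = torch.hub.load(
    repo_or_dir='snakers4/silero-models',
    model='silero_tts',
    language='en',
    speaker='v3_en')
model.to(torch.device('cpu'))

# Synthesize one sentence; 24000 Hz is one of the sample rates the v3_en entry lists.
audio = model.apply_tts(
    text='Can you can a canned can into an un-canned can?',
    speaker='en_0',
    sample_rate=24000)
```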
notebooks/1232-h.htm ADDED
The diff for this file is too large to render. See raw diff
 
notebooks/audiobook_gen_silero.ipynb CHANGED
@@ -45,7 +45,7 @@
45
  },
46
  {
47
  "cell_type": "code",
48
- "execution_count": 1,
49
  "metadata": {},
50
  "outputs": [],
51
  "source": [
@@ -79,11 +79,11 @@
79
  },
80
  {
81
  "cell_type": "code",
82
- "execution_count": 2,
83
  "metadata": {},
84
  "outputs": [],
85
  "source": [
86
- "max_char_len = 150\n",
87
  "sample_rate = 24000"
88
  ]
89
  },
@@ -122,11 +122,11 @@
122
  },
123
  {
124
  "cell_type": "code",
125
- "execution_count": 4,
126
  "metadata": {},
127
  "outputs": [],
128
  "source": [
129
- "ebook_path = 'pg174.epub'"
130
  ]
131
  },
132
  {
@@ -144,7 +144,7 @@
144
  },
145
  {
146
  "cell_type": "code",
147
- "execution_count": 5,
148
  "metadata": {},
149
  "outputs": [],
150
  "source": [
@@ -198,24 +198,9 @@
198
  },
199
  {
200
  "cell_type": "code",
201
- "execution_count": 6,
202
  "metadata": {},
203
- "outputs": [
204
- {
205
- "data": {
206
- "application/vnd.jupyter.widget-view+json": {
207
- "model_id": "7c7a0d27b2984cac933f97c68905d393",
208
- "version_major": 2,
209
- "version_minor": 0
210
- },
211
- "text/plain": [
212
- " 0%| | 0/28 [00:00<?, ?it/s]"
213
- ]
214
- },
215
- "metadata": {},
216
- "output_type": "display_data"
217
- }
218
- ],
219
  "source": [
220
  "ebook, title = read_ebook(ebook_path)"
221
  ]
@@ -229,24 +214,24 @@
229
  },
230
  {
231
  "cell_type": "code",
232
- "execution_count": 7,
233
  "metadata": {},
234
- "outputs": [
235
- {
236
- "name": "stdout",
237
- "output_type": "stream",
238
- "text": [
239
- "Title of ebook (path name):the_picture_of_dorian_gray\n",
240
- "First paragraph (truncated for display): \n",
241
- " ['CHAPTER I.', 'The studio was filled with the rich odour of roses, and when the light summer wind stirred amidst the trees of the garden, there came through the open', 'door the heavy scent of the lilac, or the more delicate perfume of the pink-flowering thorn.', 'From the corner of the divan of Persian saddle-bags on which he was lying, smoking, as was his custom, innumerable cigarettes, Lord Henry Wotton could', 'just catch the gleam of the honey-sweet and honey-coloured blossoms of a laburnum, whose tremulous branches seemed hardly able to bear the burden of a']\n"
242
- ]
243
- }
244
- ],
245
  "source": [
246
- "print(f'Title of ebook (path name):{title}')\n",
 
247
  "print(f'First paragraph (truncated for display): \\n {ebook[2][0:5]}')"
248
  ]
249
  },
 
 
 
 
 
 
 
 
 
250
  {
251
  "cell_type": "markdown",
252
  "metadata": {},
@@ -260,357 +245,13 @@
260
  },
261
  {
262
  "cell_type": "code",
263
- "execution_count": 8,
264
  "metadata": {},
265
- "outputs": [
266
- {
267
- "data": {
268
- "application/vnd.jupyter.widget-view+json": {
269
- "model_id": "4dd296c9abb941d6817b8d5c075b0c7c",
270
- "version_major": 2,
271
- "version_minor": 0
272
- },
273
- "text/plain": [
274
- " 0%| | 0/23 [00:00<?, ?it/s]"
275
- ]
276
- },
277
- "metadata": {},
278
- "output_type": "display_data"
279
- },
280
- {
281
- "data": {
282
- "application/vnd.jupyter.widget-view+json": {
283
- "model_id": "a7e3d37537f9495b93a092bd2125bb15",
284
- "version_major": 2,
285
- "version_minor": 0
286
- },
287
- "text/plain": [
288
- " 0%| | 0/38 [00:00<?, ?it/s]"
289
- ]
290
- },
291
- "metadata": {},
292
- "output_type": "display_data"
293
- },
294
- {
295
- "data": {
296
- "application/vnd.jupyter.widget-view+json": {
297
- "model_id": "a16db529b19d4e86b79e056106cfa5c1",
298
- "version_major": 2,
299
- "version_minor": 0
300
- },
301
- "text/plain": [
302
- " 0%| | 0/36 [00:00<?, ?it/s]"
303
- ]
304
- },
305
- "metadata": {},
306
- "output_type": "display_data"
307
- },
308
- {
309
- "data": {
310
- "application/vnd.jupyter.widget-view+json": {
311
- "model_id": "324fac7d6d7d44a9b38f7fc9cddb7abb",
312
- "version_major": 2,
313
- "version_minor": 0
314
- },
315
- "text/plain": [
316
- " 0%| | 0/383 [00:00<?, ?it/s]"
317
- ]
318
- },
319
- "metadata": {},
320
- "output_type": "display_data"
321
- },
322
- {
323
- "data": {
324
- "application/vnd.jupyter.widget-view+json": {
325
- "model_id": "8b5e5cfc28da4e1d996a39d0b9254c57",
326
- "version_major": 2,
327
- "version_minor": 0
328
- },
329
- "text/plain": [
330
- " 0%| | 0/517 [00:00<?, ?it/s]"
331
- ]
332
- },
333
- "metadata": {},
334
- "output_type": "display_data"
335
- },
336
- {
337
- "data": {
338
- "application/vnd.jupyter.widget-view+json": {
339
- "model_id": "95408c5358d64cff8cadf82d3b34d18e",
340
- "version_major": 2,
341
- "version_minor": 0
342
- },
343
- "text/plain": [
344
- " 0%| | 0/385 [00:00<?, ?it/s]"
345
- ]
346
- },
347
- "metadata": {},
348
- "output_type": "display_data"
349
- },
350
- {
351
- "data": {
352
- "application/vnd.jupyter.widget-view+json": {
353
- "model_id": "e3322b1b54da4c949c5ad708044c84e3",
354
- "version_major": 2,
355
- "version_minor": 0
356
- },
357
- "text/plain": [
358
- " 0%| | 0/491 [00:00<?, ?it/s]"
359
- ]
360
- },
361
- "metadata": {},
362
- "output_type": "display_data"
363
- },
364
- {
365
- "data": {
366
- "application/vnd.jupyter.widget-view+json": {
367
- "model_id": "41f7e5fda2f24e079be210224b36ff63",
368
- "version_major": 2,
369
- "version_minor": 0
370
- },
371
- "text/plain": [
372
- " 0%| | 0/440 [00:00<?, ?it/s]"
373
- ]
374
- },
375
- "metadata": {},
376
- "output_type": "display_data"
377
- },
378
- {
379
- "data": {
380
- "application/vnd.jupyter.widget-view+json": {
381
- "model_id": "007f22618ee140058eff80b29e86501e",
382
- "version_major": 2,
383
- "version_minor": 0
384
- },
385
- "text/plain": [
386
- " 0%| | 0/254 [00:00<?, ?it/s]"
387
- ]
388
- },
389
- "metadata": {},
390
- "output_type": "display_data"
391
- },
392
- {
393
- "data": {
394
- "application/vnd.jupyter.widget-view+json": {
395
- "model_id": "84c45b18ed994291b28ee259f2610019",
396
- "version_major": 2,
397
- "version_minor": 0
398
- },
399
- "text/plain": [
400
- " 0%| | 0/419 [00:00<?, ?it/s]"
401
- ]
402
- },
403
- "metadata": {},
404
- "output_type": "display_data"
405
- },
406
- {
407
- "data": {
408
- "application/vnd.jupyter.widget-view+json": {
409
- "model_id": "d9f08d28db034576a6c8d3a1ef9c7e83",
410
- "version_major": 2,
411
- "version_minor": 0
412
- },
413
- "text/plain": [
414
- " 0%| | 0/463 [00:00<?, ?it/s]"
415
- ]
416
- },
417
- "metadata": {},
418
- "output_type": "display_data"
419
- },
420
- {
421
- "data": {
422
- "application/vnd.jupyter.widget-view+json": {
423
- "model_id": "72e658ae2a2c4c76967aa6e8f8fc5cf5",
424
- "version_major": 2,
425
- "version_minor": 0
426
- },
427
- "text/plain": [
428
- " 0%| | 0/361 [00:00<?, ?it/s]"
429
- ]
430
- },
431
- "metadata": {},
432
- "output_type": "display_data"
433
- },
434
- {
435
- "data": {
436
- "application/vnd.jupyter.widget-view+json": {
437
- "model_id": "a9c88220cfda402a9dbfd5cf3d8f0f46",
438
- "version_major": 2,
439
- "version_minor": 0
440
- },
441
- "text/plain": [
442
- " 0%| | 0/253 [00:00<?, ?it/s]"
443
- ]
444
- },
445
- "metadata": {},
446
- "output_type": "display_data"
447
- },
448
- {
449
- "data": {
450
- "application/vnd.jupyter.widget-view+json": {
451
- "model_id": "229cd86f85d1458887b0a80c758c8dcc",
452
- "version_major": 2,
453
- "version_minor": 0
454
- },
455
- "text/plain": [
456
- " 0%| | 0/401 [00:00<?, ?it/s]"
457
- ]
458
- },
459
- "metadata": {},
460
- "output_type": "display_data"
461
- },
462
- {
463
- "data": {
464
- "application/vnd.jupyter.widget-view+json": {
465
- "model_id": "b7d361cc2287451d886bae67da5151a9",
466
- "version_major": 2,
467
- "version_minor": 0
468
- },
469
- "text/plain": [
470
- " 0%| | 0/256 [00:00<?, ?it/s]"
471
- ]
472
- },
473
- "metadata": {},
474
- "output_type": "display_data"
475
- },
476
- {
477
- "data": {
478
- "application/vnd.jupyter.widget-view+json": {
479
- "model_id": "c93a785804ea461398046c2cae64db00",
480
- "version_major": 2,
481
- "version_minor": 0
482
- },
483
- "text/plain": [
484
- " 0%| | 0/233 [00:00<?, ?it/s]"
485
- ]
486
- },
487
- "metadata": {},
488
- "output_type": "display_data"
489
- },
490
- {
491
- "data": {
492
- "application/vnd.jupyter.widget-view+json": {
493
- "model_id": "269e5f35fd064d2888e62a3f6f34bdf0",
494
- "version_major": 2,
495
- "version_minor": 0
496
- },
497
- "text/plain": [
498
- " 0%| | 0/405 [00:00<?, ?it/s]"
499
- ]
500
- },
501
- "metadata": {},
502
- "output_type": "display_data"
503
- },
504
- {
505
- "data": {
506
- "application/vnd.jupyter.widget-view+json": {
507
- "model_id": "69469c02db574342a4bb434cf80f422b",
508
- "version_major": 2,
509
- "version_minor": 0
510
- },
511
- "text/plain": [
512
- " 0%| | 0/279 [00:00<?, ?it/s]"
513
- ]
514
- },
515
- "metadata": {},
516
- "output_type": "display_data"
517
- },
518
- {
519
- "data": {
520
- "application/vnd.jupyter.widget-view+json": {
521
- "model_id": "6979dd4c8479420198686d5a44c00887",
522
- "version_major": 2,
523
- "version_minor": 0
524
- },
525
- "text/plain": [
526
- " 0%| | 0/275 [00:00<?, ?it/s]"
527
- ]
528
- },
529
- "metadata": {},
530
- "output_type": "display_data"
531
- },
532
- {
533
- "data": {
534
- "application/vnd.jupyter.widget-view+json": {
535
- "model_id": "356e20b91da44f93a0d7ad220fb00e79",
536
- "version_major": 2,
537
- "version_minor": 0
538
- },
539
- "text/plain": [
540
- " 0%| | 0/216 [00:00<?, ?it/s]"
541
- ]
542
- },
543
- "metadata": {},
544
- "output_type": "display_data"
545
- },
546
- {
547
- "data": {
548
- "application/vnd.jupyter.widget-view+json": {
549
- "model_id": "1d73c481c8ef45d1add550bdbf278775",
550
- "version_major": 2,
551
- "version_minor": 0
552
- },
553
- "text/plain": [
554
- " 0%| | 0/323 [00:00<?, ?it/s]"
555
- ]
556
- },
557
- "metadata": {},
558
- "output_type": "display_data"
559
- },
560
- {
561
- "data": {
562
- "application/vnd.jupyter.widget-view+json": {
563
- "model_id": "ae4e72fab47c4f60bb097a7dc5bca43e",
564
- "version_major": 2,
565
- "version_minor": 0
566
- },
567
- "text/plain": [
568
- " 0%| | 0/352 [00:00<?, ?it/s]"
569
- ]
570
- },
571
- "metadata": {},
572
- "output_type": "display_data"
573
- },
574
- {
575
- "data": {
576
- "application/vnd.jupyter.widget-view+json": {
577
- "model_id": "dd1a2bed58474658977fcb6d7fa06ab1",
578
- "version_major": 2,
579
- "version_minor": 0
580
- },
581
- "text/plain": [
582
- " 0%| | 0/374 [00:00<?, ?it/s]"
583
- ]
584
- },
585
- "metadata": {},
586
- "output_type": "display_data"
587
- },
588
- {
589
- "data": {
590
- "application/vnd.jupyter.widget-view+json": {
591
- "model_id": "c5deadfd7a9a4861868632f754c8bbc9",
592
- "version_major": 2,
593
- "version_minor": 0
594
- },
595
- "text/plain": [
596
- "0it [00:00, ?it/s]"
597
- ]
598
- },
599
- "metadata": {},
600
- "output_type": "display_data"
601
- },
602
- {
603
- "name": "stdout",
604
- "output_type": "stream",
605
- "text": [
606
- "Chapter chapter022 is empty.\n"
607
- ]
608
- }
609
- ],
610
  "source": [
611
- "os.mkdir(f'outputs/{title}')\n",
612
  "\n",
613
- "for chapter in tqdm(ebook):\n",
614
  " chapter_index = f'chapter{ebook.index(chapter):03}'\n",
615
  " audio_list = []\n",
616
  " for sentence in tqdm(chapter):\n",
@@ -626,7 +267,7 @@
626
  "\n",
627
  " if len(audio_list) > 0:\n",
628
  " audio_file = torch.cat(audio_list).reshape(1, -1)\n",
629
- " torchaudio.save(sample_path, audio_file, sample_rate)\n",
630
  " else:\n",
631
  " print(f'Chapter {chapter_index} is empty.')"
632
  ]
@@ -672,7 +313,7 @@
672
  ],
673
  "metadata": {
674
  "kernelspec": {
675
- "display_name": "Python 3 (ipykernel)",
676
  "language": "python",
677
  "name": "python3"
678
  },
@@ -686,7 +327,7 @@
686
  "name": "python",
687
  "nbconvert_exporter": "python",
688
  "pygments_lexer": "ipython3",
689
- "version": "3.9.12"
690
  }
691
  },
692
  "nbformat": 4,
 
45
  },
46
  {
47
  "cell_type": "code",
48
+ "execution_count": null,
49
  "metadata": {},
50
  "outputs": [],
51
  "source": [
 
79
  },
80
  {
81
  "cell_type": "code",
82
+ "execution_count": null,
83
  "metadata": {},
84
  "outputs": [],
85
  "source": [
86
+ "max_char_len = 140\n",
87
  "sample_rate = 24000"
88
  ]
89
  },
 
122
  },
123
  {
124
  "cell_type": "code",
125
+ "execution_count": null,
126
  "metadata": {},
127
  "outputs": [],
128
  "source": [
129
+ "ebook_path = 'test.epub'"
130
  ]
131
  },
132
  {
 
144
  },
145
  {
146
  "cell_type": "code",
147
+ "execution_count": null,
148
  "metadata": {},
149
  "outputs": [],
150
  "source": [
 
198
  },
199
  {
200
  "cell_type": "code",
201
+ "execution_count": null,
202
  "metadata": {},
203
+ "outputs": [],
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
204
  "source": [
205
  "ebook, title = read_ebook(ebook_path)"
206
  ]
 
214
  },
215
  {
216
  "cell_type": "code",
217
+ "execution_count": null,
218
  "metadata": {},
219
+ "outputs": [],
 
 
 
 
 
 
 
 
 
 
220
  "source": [
221
+ "print(f'Title of ebook (path name):{title}\\n')\n",
222
+ "print(f'First line of the ebook:{ebook[0][0]}\\n')\n",
223
  "print(f'First paragraph (truncated for display): \\n {ebook[2][0:5]}')"
224
  ]
225
  },
226
+ {
227
+ "cell_type": "code",
228
+ "execution_count": null,
229
+ "metadata": {},
230
+ "outputs": [],
231
+ "source": [
232
+ "ebook[0][0]"
233
+ ]
234
+ },
235
  {
236
  "cell_type": "markdown",
237
  "metadata": {},
 
245
  },
246
  {
247
  "cell_type": "code",
248
+ "execution_count": null,
249
  "metadata": {},
250
+ "outputs": [],
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
251
  "source": [
252
+ "#os.mkdir(f'outputs/{title}')\n",
253
  "\n",
254
+ "for chapter in tqdm(ebook[0:3]):\n",
255
  " chapter_index = f'chapter{ebook.index(chapter):03}'\n",
256
  " audio_list = []\n",
257
  " for sentence in tqdm(chapter):\n",
 
267
  "\n",
268
  " if len(audio_list) > 0:\n",
269
  " audio_file = torch.cat(audio_list).reshape(1, -1)\n",
270
+ "# torchaudio.save(sample_path, audio_file, sample_rate)\n",
271
  " else:\n",
272
  " print(f'Chapter {chapter_index} is empty.')"
273
  ]
 
313
  ],
314
  "metadata": {
315
  "kernelspec": {
316
+ "display_name": "Python 3",
317
  "language": "python",
318
  "name": "python3"
319
  },
 
327
  "name": "python",
328
  "nbconvert_exporter": "python",
329
  "pygments_lexer": "ipython3",
330
+ "version": "3.8.10"
331
  }
332
  },
333
  "nbformat": 4,
notebooks/parser_function_html.ipynb ADDED
@@ -0,0 +1,389 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "code",
5
+ "execution_count": 1,
6
+ "id": "27a75ece",
7
+ "metadata": {},
8
+ "outputs": [],
9
+ "source": [
10
+ "import nltk"
11
+ ]
12
+ },
13
+ {
14
+ "cell_type": "code",
15
+ "execution_count": null,
16
+ "id": "5292a160",
17
+ "metadata": {},
18
+ "outputs": [],
19
+ "source": [
20
+ "import re\n",
21
+ "import numpy as np\n",
22
+ "\n",
23
+ "from bs4 import BeautifulSoup\n",
24
+ "from nltk import tokenize, download\n",
25
+ "from textwrap import TextWrapper"
26
+ ]
27
+ },
28
+ {
29
+ "cell_type": "code",
30
+ "execution_count": null,
31
+ "id": "68609a77",
32
+ "metadata": {},
33
+ "outputs": [],
34
+ "source": [
35
+ "# file_path = '1232-h.htm'\n",
36
+ "file_path = 'test.htm'"
37
+ ]
38
+ },
39
+ {
40
+ "cell_type": "code",
41
+ "execution_count": null,
42
+ "id": "5c526c9b",
43
+ "metadata": {},
44
+ "outputs": [],
45
+ "source": [
46
+ "download('punkt', quiet=True)\n",
47
+ "wrapper = TextWrapper(140, fix_sentence_endings=True)"
48
+ ]
49
+ },
50
+ {
51
+ "cell_type": "code",
52
+ "execution_count": null,
53
+ "id": "d4732304",
54
+ "metadata": {},
55
+ "outputs": [],
56
+ "source": [
57
+ "def preprocess(file):\n",
58
+ " input_text = BeautifulSoup(file, \"html.parser\").text\n",
59
+ " text_list = []\n",
60
+ " for paragraph in input_text.split('\\n'):\n",
61
+ " paragraph = paragraph.replace('—', '-')\n",
62
+ " paragraph = paragraph.replace(' .', '')\n",
63
+ " paragraph = re.sub(r'[^\\x00-\\x7f]', \"\", paragraph)\n",
64
+ " paragraph = re.sub(r'x0f', \" \", paragraph)\n",
65
+ " sentences = tokenize.sent_tokenize(paragraph)\n",
66
+ "\n",
67
+ " sentence_list = []\n",
68
+ " for sentence in sentences:\n",
69
+ " if not re.search('[a-zA-Z]', sentence):\n",
70
+ " sentence = ''\n",
71
+ " wrapped_sentences = wrapper.wrap(sentence)\n",
72
+ " sentence_list.append(wrapped_sentences)\n",
73
+ " trunc_sentences = [phrase for sublist in sentence_list for phrase in sublist]\n",
74
+ " text_list.append(trunc_sentences)\n",
75
+ " text_list = [text for sentences in text_list for text in sentences]\n",
76
+ " return text_list"
77
+ ]
78
+ },
79
+ {
80
+ "cell_type": "code",
81
+ "execution_count": null,
82
+ "id": "3045665a",
83
+ "metadata": {},
84
+ "outputs": [],
85
+ "source": [
86
+ "def read_html(file):\n",
87
+ " corpus = preprocess(file)\n",
88
+ " return corpus"
89
+ ]
90
+ },
91
+ {
92
+ "cell_type": "code",
93
+ "execution_count": null,
94
+ "id": "e18be118",
95
+ "metadata": {},
96
+ "outputs": [],
97
+ "source": [
98
+ "with open(file_path, 'r') as f:\n",
99
+ " ebook_upload = f.read()\n",
100
+ "corpus = read_html(ebook_upload)"
101
+ ]
102
+ },
103
+ {
104
+ "cell_type": "code",
105
+ "execution_count": null,
106
+ "id": "ece1c7d3",
107
+ "metadata": {},
108
+ "outputs": [],
109
+ "source": [
110
+ "np.shape(corpus)"
111
+ ]
112
+ },
113
+ {
114
+ "cell_type": "code",
115
+ "execution_count": null,
116
+ "id": "dc7e4010",
117
+ "metadata": {},
118
+ "outputs": [],
119
+ "source": [
120
+ "corpus[0][2]"
121
+ ]
122
+ },
123
+ {
124
+ "cell_type": "code",
125
+ "execution_count": null,
126
+ "id": "6cb47a2d",
127
+ "metadata": {},
128
+ "outputs": [],
129
+ "source": [
130
+ "corpus"
131
+ ]
132
+ },
133
+ {
134
+ "cell_type": "code",
135
+ "execution_count": null,
136
+ "id": "d11031c7",
137
+ "metadata": {},
138
+ "outputs": [],
139
+ "source": [
140
+ "assert title == \"1232-h\"\n",
141
+ "assert np.shape(corpus) == (1, 5476)\n",
142
+ "assert corpus[0][0] == 'The Project Gutenberg eBook of The Prince, by Nicolo Machiavelli'\n",
143
+ "assert corpus[0][2] == 'This eBook is for the use of anyone anywhere in the United States and'"
144
+ ]
145
+ },
146
+ {
147
+ "cell_type": "code",
148
+ "execution_count": null,
149
+ "id": "0c57eec6",
150
+ "metadata": {},
151
+ "outputs": [],
152
+ "source": []
153
+ },
154
+ {
155
+ "cell_type": "code",
156
+ "execution_count": 2,
157
+ "id": "af281267",
158
+ "metadata": {},
159
+ "outputs": [],
160
+ "source": [
161
+ "import re\n",
162
+ "\n",
163
+ "from bs4 import BeautifulSoup\n",
164
+ "from nltk import tokenize, download\n",
165
+ "from textwrap import TextWrapper\n",
166
+ "from stqdm import stqdm"
167
+ ]
168
+ },
169
+ {
170
+ "cell_type": "code",
171
+ "execution_count": 6,
172
+ "id": "676ce437",
173
+ "metadata": {},
174
+ "outputs": [],
175
+ "source": [
176
+ "download('punkt', quiet=True)\n",
177
+ "wrapper = TextWrapper(140, fix_sentence_endings=True)\n",
178
+ "file_path = 'test.txt'"
179
+ ]
180
+ },
181
+ {
182
+ "cell_type": "code",
183
+ "execution_count": 7,
184
+ "id": "4d278f8e",
185
+ "metadata": {},
186
+ "outputs": [],
187
+ "source": [
188
+ "def preprocess_text(file):\n",
189
+ " input_text = BeautifulSoup(file, \"html.parser\").text\n",
190
+ " text_list = []\n",
191
+ " for paragraph in input_text.split('\\n'):\n",
192
+ " paragraph = paragraph.replace('—', '-')\n",
193
+ " paragraph = paragraph.replace(' .', '')\n",
194
+ " paragraph = re.sub(r'[^\\x00-\\x7f]', \"\", paragraph)\n",
195
+ " paragraph = re.sub(r'x0f', \" \", paragraph)\n",
196
+ " sentences = tokenize.sent_tokenize(paragraph)\n",
197
+ "\n",
198
+ " sentence_list = []\n",
199
+ " for sentence in sentences:\n",
200
+ " if not re.search('[a-zA-Z]', sentence):\n",
201
+ " sentence = ''\n",
202
+ " wrapped_sentences = wrapper.wrap(sentence)\n",
203
+ " sentence_list.append(wrapped_sentences)\n",
204
+ " trunc_sentences = [phrase for sublist in sentence_list for phrase in sublist]\n",
205
+ " text_list.append(trunc_sentences)\n",
206
+ " text_list = [text for sentences in text_list for text in sentences]\n",
207
+ " return text_list"
208
+ ]
209
+ },
210
+ {
211
+ "cell_type": "code",
212
+ "execution_count": 8,
213
+ "id": "f67e0184",
214
+ "metadata": {},
215
+ "outputs": [],
216
+ "source": [
217
+ "with open(file_path, 'r') as uploaded_file:\n",
218
+ " file = uploaded_file.read()\n",
219
+ " text = preprocess_text(file)"
220
+ ]
221
+ },
222
+ {
223
+ "cell_type": "code",
224
+ "execution_count": 10,
225
+ "id": "0bd67797",
226
+ "metadata": {},
227
+ "outputs": [
228
+ {
229
+ "data": {
230
+ "text/plain": [
231
+ "'Testing Text File \\n\\nWith generated random Lorem Ipsum and other unexpected characters!\\n\\n<a href=\"https://github.com/mkutarna/audiobook_gen/\">Link to generator repo!</a>\\n\\n此行是对非英语字符的测试\\n\\nLorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Proin fermentum leo vel orci porta non pulvinar. Pretium lectus quam id leo in vitae turpis massa sed. Donec ac odio tempor orci dapibus. Feugiat in ante metus dictum at tempor. Elementum tempus egestas sed sed risus. Adipiscing commodo elit at imperdiet dui accumsan sit. Placerat orci nulla pellentesque dignissim enim. Posuere lorem ipsum dolor sit. Id ornare arcu odio ut sem. Purus faucibus ornare suspendisse sed nisi lacus sed. Ac turpis egestas sed tempus urna et pharetra pharetra massa. Morbi quis commodo odio aenean. Malesuada proin libero nunc consequat interdum. Ut placerat orci nulla pellentesque dignissim enim sit. Elit at imperdiet dui accumsan sit amet.\\n\\nBuilt to test various characters and other possible inputs to the silero model.\\n\\nHere are some Chinese characters: 此行是对非英语字符的测试.\\n\\nThere are 24 letters in the Greek alphabet. The vowels: are α, ε, η, ι, ο, ω, υ. All the rest are consonants.\\n\\nWe can also test for mathematical symbols: ∫, ∇, ∞, δ, ε, X̄, %, √ ,a, ±, ÷, +, = ,-.\\n\\nFinally, here are some emoticons: ☺️🙂😊😀😁☹️🙁😞😟😣😖😨😧😦😱😫😩.'"
232
+ ]
233
+ },
234
+ "execution_count": 10,
235
+ "metadata": {},
236
+ "output_type": "execute_result"
237
+ }
238
+ ],
239
+ "source": [
240
+ "file"
241
+ ]
242
+ },
243
+ {
244
+ "cell_type": "code",
245
+ "execution_count": 9,
246
+ "id": "064aa16b",
247
+ "metadata": {},
248
+ "outputs": [
249
+ {
250
+ "data": {
251
+ "text/plain": [
252
+ "['Testing Text File',\n",
253
+ " 'With generated random Lorem Ipsum and other unexpected characters!',\n",
254
+ " 'Link to generator repo!',\n",
255
+ " 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.',\n",
256
+ " 'Proin fermentum leo vel orci porta non pulvinar.',\n",
257
+ " 'Pretium lectus quam id leo in vitae turpis massa sed.',\n",
258
+ " 'Donec ac odio tempor orci dapibus.',\n",
259
+ " 'Feugiat in ante metus dictum at tempor.',\n",
260
+ " 'Elementum tempus egestas sed sed risus.',\n",
261
+ " 'Adipiscing commodo elit at imperdiet dui accumsan sit.',\n",
262
+ " 'Placerat orci nulla pellentesque dignissim enim.',\n",
263
+ " 'Posuere lorem ipsum dolor sit.',\n",
264
+ " 'Id ornare arcu odio ut sem.',\n",
265
+ " 'Purus faucibus ornare suspendisse sed nisi lacus sed.',\n",
266
+ " 'Ac turpis egestas sed tempus urna et pharetra pharetra massa.',\n",
267
+ " 'Morbi quis commodo odio aenean.',\n",
268
+ " 'Malesuada proin libero nunc consequat interdum.',\n",
269
+ " 'Ut placerat orci nulla pellentesque dignissim enim sit.',\n",
270
+ " 'Elit at imperdiet dui accumsan sit amet.',\n",
271
+ " 'Built to test various characters and other possible inputs to the silero model.',\n",
272
+ " 'Here are some Chinese characters: .',\n",
273
+ " 'There are 24 letters in the Greek alphabet.',\n",
274
+ " 'The vowels: are , , , , , , .',\n",
275
+ " 'All the rest are consonants.',\n",
276
+ " 'We can also test for mathematical symbols: , , , , , X, %, ,a, , , +, = ,-.',\n",
277
+ " 'Finally, here are some emoticons: .']"
278
+ ]
279
+ },
280
+ "execution_count": 9,
281
+ "metadata": {},
282
+ "output_type": "execute_result"
283
+ }
284
+ ],
285
+ "source": [
286
+ "text"
287
+ ]
288
+ },
289
+ {
290
+ "cell_type": "code",
291
+ "execution_count": 22,
292
+ "id": "3e8e7965",
293
+ "metadata": {},
294
+ "outputs": [],
295
+ "source": [
296
+ "with open('test_processed.txt', 'w') as output_file:\n",
297
+ " for line in text:\n",
298
+ " output_file.write(line)\n",
299
+ " output_file.write('\\n')"
300
+ ]
301
+ },
302
+ {
303
+ "cell_type": "code",
304
+ "execution_count": 26,
305
+ "id": "2aa4c8ff",
306
+ "metadata": {},
307
+ "outputs": [],
308
+ "source": [
309
+ "with open('test_processed.txt', 'r') as process_file:\n",
310
+ " out_file = [line.strip() for line in process_file.readlines()]"
311
+ ]
312
+ },
313
+ {
314
+ "cell_type": "code",
315
+ "execution_count": 27,
316
+ "id": "c483fb65",
317
+ "metadata": {},
318
+ "outputs": [
319
+ {
320
+ "data": {
321
+ "text/plain": [
322
+ "['Testing Text File',\n",
323
+ " 'With generated random Lorem Ipsum and other unexpected characters!',\n",
324
+ " 'Link to generator repo!',\n",
325
+ " 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.',\n",
326
+ " 'Proin fermentum leo vel orci porta non pulvinar.',\n",
327
+ " 'Pretium lectus quam id leo in vitae turpis massa sed.',\n",
328
+ " 'Donec ac odio tempor orci dapibus.',\n",
329
+ " 'Feugiat in ante metus dictum at tempor.',\n",
330
+ " 'Elementum tempus egestas sed sed risus.',\n",
331
+ " 'Adipiscing commodo elit at imperdiet dui accumsan sit.',\n",
332
+ " 'Placerat orci nulla pellentesque dignissim enim.',\n",
333
+ " 'Posuere lorem ipsum dolor sit.',\n",
334
+ " 'Id ornare arcu odio ut sem.',\n",
335
+ " 'Purus faucibus ornare suspendisse sed nisi lacus sed.',\n",
336
+ " 'Ac turpis egestas sed tempus urna et pharetra pharetra massa.',\n",
337
+ " 'Morbi quis commodo odio aenean.',\n",
338
+ " 'Malesuada proin libero nunc consequat interdum.',\n",
339
+ " 'Ut placerat orci nulla pellentesque dignissim enim sit.',\n",
340
+ " 'Elit at imperdiet dui accumsan sit amet.',\n",
341
+ " 'Built to test various characters and other possible inputs to the silero model.',\n",
342
+ " 'Here are some Chinese characters: .',\n",
343
+ " 'There are 24 letters in the Greek alphabet.',\n",
344
+ " 'The vowels: are , , , , , , .',\n",
345
+ " 'All the rest are consonants.',\n",
346
+ " 'We can also test for mathematical symbols: , , , , , X, %, ,a, , , +, = ,-.',\n",
347
+ " 'Finally, here are some emoticons: .']"
348
+ ]
349
+ },
350
+ "execution_count": 27,
351
+ "metadata": {},
352
+ "output_type": "execute_result"
353
+ }
354
+ ],
355
+ "source": [
356
+ "out_file"
357
+ ]
358
+ },
359
+ {
360
+ "cell_type": "code",
361
+ "execution_count": null,
362
+ "id": "65646961",
363
+ "metadata": {},
364
+ "outputs": [],
365
+ "source": []
366
+ }
367
+ ],
368
+ "metadata": {
369
+ "kernelspec": {
370
+ "display_name": "Python 3",
371
+ "language": "python",
372
+ "name": "python3"
373
+ },
374
+ "language_info": {
375
+ "codemirror_mode": {
376
+ "name": "ipython",
377
+ "version": 3
378
+ },
379
+ "file_extension": ".py",
380
+ "mimetype": "text/x-python",
381
+ "name": "python",
382
+ "nbconvert_exporter": "python",
383
+ "pygments_lexer": "ipython3",
384
+ "version": "3.8.10"
385
+ }
386
+ },
387
+ "nbformat": 4,
388
+ "nbformat_minor": 5
389
+ }
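Note: the notebook above tokenizes each paragraph into sentences and wraps them to 140 characters before any audio is generated. A minimal sketch of that wrapping step, assuming only nltk and its 'punkt' tokenizer data are available (the same 140-character limit later becomes MAX_CHAR_LEN in src/config.py):

from textwrap import TextWrapper
from nltk import tokenize, download

download('punkt', quiet=True)
wrapper = TextWrapper(140, fix_sentence_endings=True)

paragraph = ("Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod "
             "tempor incididunt ut labore et dolore magna aliqua. Proin fermentum leo.")

# Tokenize into sentences, then wrap each sentence into phrases of at most 140 characters.
phrases = [phrase
           for sentence in tokenize.sent_tokenize(paragraph)
           for phrase in wrapper.wrap(sentence)]
assert all(len(phrase) <= 140 for phrase in phrases)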
notebooks/{pg174.epub → test.epub} RENAMED
Binary files a/notebooks/pg174.epub and b/notebooks/test.epub differ
 
notebooks/test.htm ADDED
@@ -0,0 +1,118 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
2
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3
+ <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
4
+ <head>
5
+ <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
6
+ <meta http-equiv="Content-Style-Type" content="text/css" />
7
+ <title>Lorem Ipsum</title>
8
+
9
+ <style type="text/css">
10
+
11
+ body { margin-left: 20%;
12
+ margin-right: 20%;
13
+ text-align: justify; }
14
+
15
+ h1, h2, h3, h4, h5 {text-align: center; font-style: normal; font-weight:
16
+ normal; line-height: 1.5; margin-top: .5em; margin-bottom: .5em;}
17
+
18
+ h1 {font-size: 300%;
19
+ margin-top: 0.6em;
20
+ margin-bottom: 0.6em;
21
+ letter-spacing: 0.12em;
22
+ word-spacing: 0.2em;
23
+ text-indent: 0em;}
24
+ h2 {font-size: 150%; margin-top: 2em; margin-bottom: 1em;}
25
+ h3 {font-size: 130%; margin-top: 1em;}
26
+ h4 {font-size: 120%;}
27
+ h5 {font-size: 110%;}
28
+
29
+ .no-break {page-break-before: avoid;} /* for epubs */
30
+
31
+ div.chapter {page-break-before: always; margin-top: 4em;}
32
+
33
+ hr {width: 80%; margin-top: 2em; margin-bottom: 2em;}
34
+
35
+ p {text-indent: 1em;
36
+ margin-top: 0.25em;
37
+ margin-bottom: 0.25em; }
38
+
39
+ .p2 {margin-top: 2em;}
40
+
41
+ p.poem {text-indent: 0%;
42
+ margin-left: 10%;
43
+ font-size: 90%;
44
+ margin-top: 1em;
45
+ margin-bottom: 1em; }
46
+
47
+ p.letter {text-indent: 0%;
48
+ margin-left: 10%;
49
+ margin-right: 10%;
50
+ margin-top: 1em;
51
+ margin-bottom: 1em; }
52
+
53
+ p.noindent {text-indent: 0% }
54
+
55
+ p.center {text-align: center;
56
+ text-indent: 0em;
57
+ margin-top: 1em;
58
+ margin-bottom: 1em; }
59
+
60
+ p.footnote {font-size: 90%;
61
+ text-indent: 0%;
62
+ margin-left: 10%;
63
+ margin-right: 10%;
64
+ margin-top: 1em;
65
+ margin-bottom: 1em; }
66
+
67
+ sup { vertical-align: top; font-size: 0.6em; }
68
+
69
+ a:link {color:blue; text-decoration:none}
70
+ a:visited {color:blue; text-decoration:none}
71
+ a:hover {color:red}
72
+
73
+ </style>
74
+
75
+ </head>
76
+
77
+ <body>
78
+
79
+ <div style='display:block; margin:1em 0'>
80
+ This eBook is a generated Lorem Ipsum for the purposes of testing the Audiobook Gen app.
81
+ </div>
82
+ <div style='display:block; margin:1em 0'>Language: English</div>
83
+ <div style='display:block; margin:1em 0'>Character set encoding: UTF-8</div>
84
+
85
+
86
+ <p class="letter">
87
+ <i>
88
+ Diam vel quam elementum pulvinar etiam non quam. At tellus at urna condimentum mattis. Nisi scelerisque eu ultrices vitae auctor eu augue ut. Integer malesuada nunc vel risus commodo viverra maecenas accumsan. Ornare suspendisse sed nisi lacus. Sapien faucibus et molestie ac feugiat sed lectus. Quam elementum pulvinar etiam non. Elementum integer enim neque volutpat ac tincidunt. Justo laoreet sit amet cursus sit. Amet venenatis urna cursus eget nunc scelerisque viverra mauris. Cras semper auctor neque vitae tempus quam pellentesque nec nam. Fermentum iaculis eu non diam phasellus vestibulum lorem sed. Non pulvinar neque laoreet suspendisse interdum consectetur libero. Nec tincidunt praesent semper feugiat nibh sed. Sed id semper risus in hendrerit gravida rutrum. Suspendisse in est ante in nibh. Dui ut ornare lectus sit amet est placerat in.
89
+ </i>
90
+ </p>
91
+
92
+ </div><!--end chapter-->
93
+
94
+ <div class="chapter">
95
+
96
+ <h2><a name="pref01"></a>A NEW LOREM</h2>
97
+
98
+ <p>
99
+ Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Proin fermentum leo vel orci porta non pulvinar. Pretium lectus quam id leo in vitae turpis massa sed. Donec ac odio tempor orci dapibus. Feugiat in ante metus dictum at tempor. Elementum tempus egestas sed sed risus. Adipiscing commodo elit at imperdiet dui accumsan sit. Placerat orci nulla pellentesque dignissim enim. Posuere lorem ipsum dolor sit. Id ornare arcu odio ut sem. Purus faucibus ornare suspendisse sed nisi lacus sed. Ac turpis egestas sed tempus urna et pharetra pharetra massa. Morbi quis commodo odio aenean. Malesuada proin libero nunc consequat interdum. Ut placerat orci nulla pellentesque dignissim enim sit. Elit at imperdiet dui accumsan sit amet.
100
+ </p>
101
+
102
+ <p>
103
+ Nunc sed id semper risus in hendrerit gravida rutrum quisque. Augue interdum velit euismod in pellentesque. Elementum curabitur vitae nunc sed velit dignissim sodales ut eu. Mi in nulla posuere sollicitudin aliquam ultrices sagittis orci a. Quisque sagittis purus sit amet volutpat consequat mauris. Risus in hendrerit gravida rutrum. Quis vel eros donec ac odio. Eget nunc lobortis mattis aliquam faucibus. Lobortis scelerisque fermentum dui faucibus. Est velit egestas dui id ornare arcu odio. Sed ullamcorper morbi tincidunt ornare massa eget egestas purus. Nisi porta lorem mollis aliquam ut porttitor leo a. Ut morbi tincidunt augue interdum velit. Egestas diam in arcu cursus euismod. Tortor id aliquet lectus proin nibh nisl condimentum id venenatis. Lectus sit amet est placerat in egestas erat imperdiet sed. Amet tellus cras adipiscing enim eu turpis egestas pretium. Et leo duis ut diam quam.
104
+ </p>
105
+
106
+ </div><!--end chapter-->
107
+
108
+ <div class="chapter">
109
+
110
+ <h2><a name="pref02"></a>IPSUM STRIKES BACK</h2>
111
+
112
+ <p>
113
+ Egestas diam in arcu cursus euismod quis. Leo in vitae turpis massa sed elementum tempus egestas. Amet nulla facilisi morbi tempus iaculis urna id volutpat. Parturient montes nascetur ridiculus mus. Erat pellentesque adipiscing commodo elit at imperdiet. Egestas congue quisque egestas diam in arcu cursus. Diam ut venenatis tellus in metus. Ullamcorper eget nulla facilisi etiam. Blandit turpis cursus in hac habitasse platea dictumst quisque. Cursus euismod quis viverra nibh cras pulvinar. Neque viverra justo nec ultrices. Dui ut ornare lectus sit. Mauris ultrices eros in cursus turpis massa tincidunt. Lobortis elementum nibh tellus molestie nunc non blandit massa enim. Ullamcorper morbi tincidunt ornare massa eget egestas purus viverra.
114
+ </p>
115
+
116
+ <p>
117
+ Mauris in aliquam sem fringilla ut morbi. Nunc sed blandit libero volutpat. Amet venenatis urna cursus eget nunc scelerisque. Sagittis nisl rhoncus mattis rhoncus urna neque. Felis eget nunc lobortis mattis aliquam faucibus purus in massa. Fringilla ut morbi tincidunt augue interdum. Nibh mauris cursus mattis molestie a iaculis at erat. Lacus sed turpis tincidunt id aliquet risus feugiat in. Nulla facilisi etiam dignissim diam quis enim lobortis. Vitae congue eu consequat ac felis donec et. Scelerisque viverra mauris in aliquam sem fringilla ut morbi tincidunt. Blandit volutpat maecenas volutpat blandit aliquam. Ultrices tincidunt arcu non sodales neque sodales ut etiam. Sollicitudin aliquam ultrices sagittis orci a scelerisque. Id cursus metus aliquam eleifend mi. Magna eget est lorem ipsum dolor sit amet consectetur. Eleifend mi in nulla posuere sollicitudin aliquam ultrices. Neque sodales ut etiam sit amet. Enim neque volutpat ac tincidunt vitae semper quis lectus nulla.
118
+ </p>
data/testfile.txt → outputs/.gitkeep RENAMED
File without changes
pytest.ini CHANGED
@@ -1,4 +1,4 @@
1
  # pytest.ini
2
  [pytest]
3
- testpaths =
4
- tests
 
1
  # pytest.ini
2
  [pytest]
3
+ pythonpath = . src
4
+ testpaths = tests
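Note: the pythonpath entry above is what lets the test modules below import the application code without installing it as a package. A sketch of the import pattern the tests rely on, assuming pytest is run from the repository root:

# With `pythonpath = . src` in pytest.ini, the repository root and src/ are
# added to sys.path before test collection, so a test module can simply do:
from src import config, file_readers, output, predict
# Shared helpers such as tests/test_config.py resolve because pytest also
# prepends each test file's own directory during collection.
import test_config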
requirements.txt ADDED
@@ -0,0 +1,7 @@
 
 
 
 
 
 
 
 
1
+ silero
2
+ streamlit
3
+ ebooklib
4
+ PyPDF2
5
+ bs4
6
+ nltk
7
+ stqdm
resources/audiobook_gen.png ADDED
resources/instructions.md ADDED
@@ -0,0 +1,13 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ This tool generates custom-voiced audiobook files from an imported ebook file. Please upload an ebook to begin the conversion process. Output files will be downloaded as a .zip archive.
2
+
3
+ ### Instructions
4
+ 1. Upload the book file to be converted.
5
+ 2. Select the desired voice for the audiobook.
6
+ 3. Click to run!
7
+
8
+
9
+ ### Notes
10
+ - Currently, only epub, txt, and pdf files are accepted for import.
11
+ - Max input file size: 200 MB
12
+ - Audiobook generation can take up to 1 hour, depending on the size of the file.
13
+ - Generation time also depends on the compute available to the app.
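Note: each accepted format listed above maps onto a reader added later in this diff (read_epub and read_pdf in src/file_readers.py, with plain text going straight through preprocess_text). A hypothetical dispatch helper, not part of the repository, sketching how an upload might be routed by suffix:

from pathlib import Path

from src import file_readers


def read_any(path):
    """Hypothetical helper: route an input file to the matching reader by suffix."""
    suffix = Path(path).suffix.lower()
    if suffix == '.epub':
        return file_readers.read_epub(path)             # returns (corpus, title)
    if suffix == '.pdf':
        return file_readers.read_pdf(path), Path(path).stem
    with open(path, 'r') as f:                          # .txt and other plain text
        return [file_readers.preprocess_text(f.read())], Path(path).stem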
resources/speaker_en_0.wav ADDED
Binary file (629 kB). View file
 
resources/speaker_en_110.wav ADDED
Binary file (580 kB). View file
 
resources/speaker_en_29.wav ADDED
Binary file (546 kB). View file
 
resources/speaker_en_41.wav ADDED
Binary file (574 kB). View file
 
src/__inti__.py DELETED
File without changes
src/config.py ADDED
@@ -0,0 +1,23 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Notes
3
+ -----
4
+ This module contains the configuration entries for audiobook_gen.
5
+ """
6
+
7
+ from pathlib import Path
8
+
9
+ output_path = Path("outputs")
10
+ resource_path = Path("resources")
11
+ INSTRUCTIONS = Path("resources/instructions.md")
12
+
13
+ DEVICE = 'cpu'
14
+ LANGUAGE = 'en'
15
+ MAX_CHAR_LEN = 140
16
+ MODEL_ID = 'v3_en'
17
+ SAMPLE_RATE = 24000
18
+ SPEAKER_LIST = {
19
+ 'Voice 1 (Female)': 'en_0',
20
+ 'Voice 2 (Male)': 'en_29',
21
+ 'Voice 3 (Female)': 'en_41',
22
+ 'Voice 4 (Male)': 'en_110'
23
+ }
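Note: a small usage sketch for these entries, using nothing beyond what config.py defines; the UI label chosen by the user maps to a Silero speaker id, and all output files are built under output_path:

from src import config

speaker = config.SPEAKER_LIST['Voice 2 (Male)']            # -> 'en_29'
sample_path = config.output_path / 'my_book_part000.wav'   # outputs/my_book_part000.wav
print(speaker, sample_path, config.SAMPLE_RATE)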
src/file_readers.py ADDED
@@ -0,0 +1,120 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Notes
3
+ -----
4
+ This module contains the functions for audiobook_gen that read in the
5
+ file formats that require more parsing than plain text (pdf, html, epub),
6
+ as well as the preprocessing function for all input files.
7
+ """
8
+ import re
9
+
10
+ from bs4 import BeautifulSoup
11
+ from nltk import tokenize, download
12
+ from textwrap import TextWrapper
13
+ from stqdm import stqdm
14
+
15
+ from src import config
16
+
17
+ download('punkt', quiet=True)
18
+ wrapper = TextWrapper(config.MAX_CHAR_LEN, fix_sentence_endings=True)
19
+
20
+
21
+ def preprocess_text(file):
22
+ """
23
+ Preprocesses and tokenizes a section of text from the corpus:
24
+ 1. Removes residual HTML tags
25
+ 2. Handles unsupported characters
26
+ 3. Tokenizes text and confirms max token size
27
+
28
+ Parameters
29
+ ----------
30
+ file : file_like
31
+ string or bytes,
32
+ section of corpus to be pre-processed and tokenized
33
+
34
+ Returns
35
+ -------
36
+ text_list : array_like
37
+ list of strings,
38
+ body of tokenized text from which audio is generated
39
+
40
+ """
41
+ input_text = BeautifulSoup(file, "html.parser").text
42
+ text_list = []
43
+ for paragraph in input_text.split('\n'):
44
+ paragraph = paragraph.replace('—', '-')
45
+ paragraph = paragraph.replace(' .', '')
46
+ paragraph = re.sub(r'[^\x00-\x7f]', "", paragraph)
47
+ paragraph = re.sub(r'x0f', " ", paragraph)
48
+ sentences = tokenize.sent_tokenize(paragraph)
49
+
50
+ sentence_list = []
51
+ for sentence in sentences:
52
+ if not re.search('[a-zA-Z]', sentence):
53
+ sentence = ''
54
+ wrapped_sentences = wrapper.wrap(sentence)
55
+ sentence_list.append(wrapped_sentences)
56
+ trunc_sentences = [phrase for sublist in sentence_list for phrase in sublist]
57
+ text_list.append(trunc_sentences)
58
+ text_list = [text for sentences in text_list for text in sentences]
59
+ return text_list
60
+
61
+
62
+ def read_pdf(file):
63
+ """
64
+ Invokes PyPDF2 PdfReader to extract main body text from PDF file_like input,
65
+ and preprocesses text section by section.
66
+
67
+ Parameters
68
+ ----------
69
+ file : file_like
70
+ PDF file input to be parsed and preprocessed
71
+
72
+ Returns
73
+ -------
74
+ corpus : array_like
75
+ list of list of strings,
76
+ body of tokenized text from which audio is generated
77
+
78
+ """
79
+ from PyPDF2 import PdfReader
80
+
81
+ reader = PdfReader(file)
82
+ corpus = []
83
+ for item in stqdm(list(reader.pages), desc="Pages in pdf:"):
84
+ text_list = preprocess_text(item.extract_text())
85
+ corpus.append(text_list)
86
+ return corpus
87
+
88
+
89
+ def read_epub(file):
90
+ """
91
+ Invokes ebooklib read_epub to extract main body text from epub file_like input,
92
+ and preprocesses text section by section.
93
+
94
+ Parameters
95
+ ----------
96
+ file : file_like
97
+ EPUB file input to be parsed and preprocessed
98
+
99
+ Returns
100
+ -------
101
+ corpus : array_like
102
+ list of list of strings,
103
+ body of tokenized text from which audio is generated
104
+
105
+ file_title : str
106
+ title of document, used to name output files
107
+
108
+ """
109
+ import ebooklib
110
+ from ebooklib import epub
111
+
112
+ book = epub.read_epub(file)
113
+ file_title = book.get_metadata('DC', 'title')[0][0]
114
+ file_title = file_title.lower().replace(' ', '_')
115
+ corpus = []
116
+ for item in stqdm(list(book.get_items()), desc="Chapters in ebook:"):
117
+ if item.get_type() == ebooklib.ITEM_DOCUMENT:
118
+ text_list = preprocess_text(item.get_content())
119
+ corpus.append(text_list)
120
+ return corpus, file_title
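Note: a hedged usage sketch for the readers above, assuming the fixture files added further down in this diff (tests/data/test.epub and tests/data/test.pdf) are present; read_epub returns the tokenized corpus plus a slugified title, while read_pdf returns only the corpus:

from src import file_readers

# EPUB: one list of wrapped sentences per document item, plus a title slug.
corpus, title = file_readers.read_epub('tests/data/test.epub')
print(title, len(corpus), corpus[0][0])

# PDF: one list of wrapped sentences per page.
pdf_corpus = file_readers.read_pdf('tests/data/test.pdf')
print(len(pdf_corpus), pdf_corpus[0][0])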
src/output.py ADDED
@@ -0,0 +1,74 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Notes
3
+ -----
4
+ This module contains the functions for audiobook_gen that take the generated audio tensors and write them to audio files,
5
+ as well as assembling the final zip archive for user download.
6
+ """
7
+ import logging
8
+
9
+ from src import config
10
+
11
+
12
+ def write_audio(audio_list, sample_path):
13
+ """
14
+ Invokes torchaudio to save generated audio tensors to a file.
15
+
16
+ Parameters
17
+ ----------
18
+ audio_list : list of torch.Tensor
19
+ list of pytorch tensors containing the generated audio
20
+
21
+ sample_path : str
22
+ file name and path for outputting tensor to audio file
23
+
24
+ Returns
25
+ -------
26
+ None
27
+
28
+ """
29
+ import torch
30
+ import torchaudio
31
+ from src import config as cf
32
+
33
+ if not config.output_path.exists():
34
+ config.output_path.mkdir()
35
+
36
+ if len(audio_list) > 0:
37
+ audio_file = torch.cat(audio_list).reshape(1, -1)
38
+ torchaudio.save(sample_path, audio_file, cf.SAMPLE_RATE)
39
+ logging.info(f'Audio generated at: {sample_path}')
40
+ else:
41
+ logging.info(f'Audio at: {sample_path} is empty.')
42
+
43
+
44
+ def assemble_zip(title):
45
+ """
46
+ Creates a zip file and inserts all .wav files in the output directory,
47
+ and returns the name / path of the zip file.
48
+
49
+ Parameters
50
+ ----------
51
+ title : str
52
+ title of document, used to name the zip archive
53
+
54
+ Returns
55
+ -------
56
+ zip_name : str
57
+ name and path of the generated zip archive
58
+
59
+ """
60
+ import zipfile
61
+ from stqdm import stqdm
62
+
63
+ if not config.output_path.exists():
64
+ config.output_path.mkdir()
65
+
66
+ zip_name = config.output_path / f'{title}.zip'
67
+
68
+ with zipfile.ZipFile(zip_name, mode="w") as archive:
69
+ for file_path in stqdm(config.output_path.iterdir()):
70
+ if file_path.suffix == '.wav':
71
+ archive.write(file_path, arcname=file_path.name)
72
+ file_path.unlink()
73
+
74
+ return zip_name
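Note: a minimal sketch of the output path, assuming a list of audio tensors is already in hand (for example the tests/data/test_audio.pt fixture used by the tests below); write_audio concatenates the tensors into one waveform and saves it, and assemble_zip then sweeps every .wav in outputs/ into a single archive and removes the originals:

import torch

from src import output, config

audio_list = torch.load('tests/data/test_audio.pt')    # list of generated audio tensors
sample_path = config.output_path / 'demo_part000.wav'

output.write_audio(audio_list, sample_path)   # writes outputs/demo_part000.wav
zip_path = output.assemble_zip('demo')        # bundles outputs/*.wav into outputs/demo.zip
print(zip_path)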
src/predict.py ADDED
@@ -0,0 +1,110 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Notes
3
+ -----
4
+ This module contains the functions for audiobook_gen that handle text-to-speech generation.
5
+ The functions take in the preprocessed text and invoke the Silero package to generate audio tensors.
6
+ """
7
+ import logging
8
+
9
+ import torch
10
+ from stqdm import stqdm
11
+
12
+ from src import output, config
13
+
14
+
15
+ def load_model():
16
+ """
17
+ Loads the Silero package containing the model information
18
+ for the language and speaker set in config.py
19
+ and moves it to the configured device.
20
+
21
+ Parameters
22
+ ----------
23
+ None
24
+
25
+ Returns
26
+ -------
27
+ model : torch.package
28
+
29
+ """
30
+ from silero import silero_tts
31
+
32
+ model, _ = silero_tts(language=config.LANGUAGE, speaker=config.MODEL_ID)
33
+ model.to(config.DEVICE)
34
+ return model
35
+
36
+
37
+ def generate_audio(corpus, title, model, speaker):
38
+ """
39
+ For each section within the corpus, calls predict() function to generate audio tensors
40
+ and then calls write_audio() to output the tensors to audio files.
41
+
42
+ Parameters
43
+ ----------
44
+ corpus : array_like
45
+ list of list of strings,
46
+ body of tokenized text from which audio is generated
47
+
48
+ title : str
49
+ title of document, used to name output files
50
+
51
+ model : torch.package
52
+ torch package containing model for language and speaker specified
53
+
54
+ speaker : str
55
+ identifier of selected speaker for audio generation
56
+
57
+ Returns
58
+ -------
59
+ None
60
+
61
+ """
62
+ for section in stqdm(corpus, desc="Sections in document:"):
63
+ section_index = f'part{corpus.index(section):03}'
64
+ audio_list, sample_path = predict(section, section_index, title, model, speaker)
65
+ output.write_audio(audio_list, sample_path)
66
+
67
+
68
+ def predict(text_section, section_index, title, model, speaker):
69
+ """
70
+ Applies the Silero TTS engine to each sentence within the corpus section,
71
+ appending it to the output tensor array, and creates file path for output.
72
+
73
+ Parameters
74
+ ----------
75
+ text_section : array_like
76
+ list of strings,
77
+ body of tokenized text from which audio is generated
78
+
79
+ section_index : str
80
+ index of current section within corpus
81
+
82
+ title : str
83
+ title of document, used to name output files
84
+
85
+ model : torch.package
86
+ torch package containing model for language and speaker specified
87
+
88
+ speaker : str
89
+ identifier of selected speaker for audio generation
90
+
91
+ Returns
92
+ -------
93
+ audio_list : list of torch.Tensor
94
+ list of pytorch tensors containing the generated audio
95
+
96
+ sample_path : str
97
+ file name and path for outputting tensor to audio file
98
+
99
+ """
100
+ audio_list = []
101
+ for sentence in stqdm(text_section, desc="Sentences in section:"):
102
+ audio = model.apply_tts(text=sentence, speaker=speaker, sample_rate=config.SAMPLE_RATE)
103
+ if len(audio) > 0 and isinstance(audio, torch.Tensor):
104
+ audio_list.append(audio)
105
+ logging.info(f'Tensor generated for sentence: \n {sentence}')
106
+ else:
107
+ logging.info(f'Tensor for sentence is not valid: \n {sentence}')
108
+
109
+ sample_path = config.output_path / f'{title}_{section_index}.wav'
110
+ return audio_list, sample_path
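Note: taken together, file_readers, predict, and output support an end-to-end run; a hedged sketch using only functions defined in this diff (assuming the Silero model can be downloaded, and that the stqdm progress bars degrade gracefully outside Streamlit):

from src import file_readers, predict, output, config

# 1. Parse and tokenize the source document.
corpus, title = file_readers.read_epub('tests/data/test.epub')

# 2. Load the Silero model and pick a speaker from the configured list.
model = predict.load_model()
speaker = config.SPEAKER_LIST['Voice 1 (Female)']   # 'en_0'

# 3. Generate one .wav per section, then bundle everything for download.
predict.generate_audio(corpus, title, model, speaker)
zip_path = output.assemble_zip(title)
print(f'Audiobook archive written to {zip_path}')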
tests/__pycache__/test_dummy.cpython-39-pytest-7.1.2.pyc DELETED
Binary file (661 Bytes)
 
tests/data/test.epub ADDED
Binary file (90.4 kB). View file
 
tests/data/test.htm ADDED
@@ -0,0 +1,118 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
2
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3
+ <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
4
+ <head>
5
+ <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
6
+ <meta http-equiv="Content-Style-Type" content="text/css" />
7
+ <title>Lorem Ipsum</title>
8
+
9
+ <style type="text/css">
10
+
11
+ body { margin-left: 20%;
12
+ margin-right: 20%;
13
+ text-align: justify; }
14
+
15
+ h1, h2, h3, h4, h5 {text-align: center; font-style: normal; font-weight:
16
+ normal; line-height: 1.5; margin-top: .5em; margin-bottom: .5em;}
17
+
18
+ h1 {font-size: 300%;
19
+ margin-top: 0.6em;
20
+ margin-bottom: 0.6em;
21
+ letter-spacing: 0.12em;
22
+ word-spacing: 0.2em;
23
+ text-indent: 0em;}
24
+ h2 {font-size: 150%; margin-top: 2em; margin-bottom: 1em;}
25
+ h3 {font-size: 130%; margin-top: 1em;}
26
+ h4 {font-size: 120%;}
27
+ h5 {font-size: 110%;}
28
+
29
+ .no-break {page-break-before: avoid;} /* for epubs */
30
+
31
+ div.chapter {page-break-before: always; margin-top: 4em;}
32
+
33
+ hr {width: 80%; margin-top: 2em; margin-bottom: 2em;}
34
+
35
+ p {text-indent: 1em;
36
+ margin-top: 0.25em;
37
+ margin-bottom: 0.25em; }
38
+
39
+ .p2 {margin-top: 2em;}
40
+
41
+ p.poem {text-indent: 0%;
42
+ margin-left: 10%;
43
+ font-size: 90%;
44
+ margin-top: 1em;
45
+ margin-bottom: 1em; }
46
+
47
+ p.letter {text-indent: 0%;
48
+ margin-left: 10%;
49
+ margin-right: 10%;
50
+ margin-top: 1em;
51
+ margin-bottom: 1em; }
52
+
53
+ p.noindent {text-indent: 0% }
54
+
55
+ p.center {text-align: center;
56
+ text-indent: 0em;
57
+ margin-top: 1em;
58
+ margin-bottom: 1em; }
59
+
60
+ p.footnote {font-size: 90%;
61
+ text-indent: 0%;
62
+ margin-left: 10%;
63
+ margin-right: 10%;
64
+ margin-top: 1em;
65
+ margin-bottom: 1em; }
66
+
67
+ sup { vertical-align: top; font-size: 0.6em; }
68
+
69
+ a:link {color:blue; text-decoration:none}
70
+ a:visited {color:blue; text-decoration:none}
71
+ a:hover {color:red}
72
+
73
+ </style>
74
+
75
+ </head>
76
+
77
+ <body>
78
+
79
+ <div style='display:block; margin:1em 0'>
80
+ This eBook is a generated Lorem Ipsum for the purposes of testing the Audiobook Gen app.
81
+ </div>
82
+ <div style='display:block; margin:1em 0'>Language: English</div>
83
+ <div style='display:block; margin:1em 0'>Character set encoding: UTF-8</div>
84
+
85
+
86
+ <p class="letter">
87
+ <i>
88
+ Diam vel quam elementum pulvinar etiam non quam. At tellus at urna condimentum mattis. Nisi scelerisque eu ultrices vitae auctor eu augue ut. Integer malesuada nunc vel risus commodo viverra maecenas accumsan. Ornare suspendisse sed nisi lacus. Sapien faucibus et molestie ac feugiat sed lectus. Quam elementum pulvinar etiam non. Elementum integer enim neque volutpat ac tincidunt. Justo laoreet sit amet cursus sit. Amet venenatis urna cursus eget nunc scelerisque viverra mauris. Cras semper auctor neque vitae tempus quam pellentesque nec nam. Fermentum iaculis eu non diam phasellus vestibulum lorem sed. Non pulvinar neque laoreet suspendisse interdum consectetur libero. Nec tincidunt praesent semper feugiat nibh sed. Sed id semper risus in hendrerit gravida rutrum. Suspendisse in est ante in nibh. Dui ut ornare lectus sit amet est placerat in.
89
+ </i>
90
+ </p>
91
+
92
+ </div><!--end chapter-->
93
+
94
+ <div class="chapter">
95
+
96
+ <h2><a name="pref01"></a>A NEW LOREM</h2>
97
+
98
+ <p>
99
+ Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Proin fermentum leo vel orci porta non pulvinar. Pretium lectus quam id leo in vitae turpis massa sed. Donec ac odio tempor orci dapibus. Feugiat in ante metus dictum at tempor. Elementum tempus egestas sed sed risus. Adipiscing commodo elit at imperdiet dui accumsan sit. Placerat orci nulla pellentesque dignissim enim. Posuere lorem ipsum dolor sit. Id ornare arcu odio ut sem. Purus faucibus ornare suspendisse sed nisi lacus sed. Ac turpis egestas sed tempus urna et pharetra pharetra massa. Morbi quis commodo odio aenean. Malesuada proin libero nunc consequat interdum. Ut placerat orci nulla pellentesque dignissim enim sit. Elit at imperdiet dui accumsan sit amet.
100
+ </p>
101
+
102
+ <p>
103
+ Nunc sed id semper risus in hendrerit gravida rutrum quisque. Augue interdum velit euismod in pellentesque. Elementum curabitur vitae nunc sed velit dignissim sodales ut eu. Mi in nulla posuere sollicitudin aliquam ultrices sagittis orci a. Quisque sagittis purus sit amet volutpat consequat mauris. Risus in hendrerit gravida rutrum. Quis vel eros donec ac odio. Eget nunc lobortis mattis aliquam faucibus. Lobortis scelerisque fermentum dui faucibus. Est velit egestas dui id ornare arcu odio. Sed ullamcorper morbi tincidunt ornare massa eget egestas purus. Nisi porta lorem mollis aliquam ut porttitor leo a. Ut morbi tincidunt augue interdum velit. Egestas diam in arcu cursus euismod. Tortor id aliquet lectus proin nibh nisl condimentum id venenatis. Lectus sit amet est placerat in egestas erat imperdiet sed. Amet tellus cras adipiscing enim eu turpis egestas pretium. Et leo duis ut diam quam.
104
+ </p>
105
+
106
+ </div><!--end chapter-->
107
+
108
+ <div class="chapter">
109
+
110
+ <h2><a name="pref02"></a>IPSUM STRIKES BACK</h2>
111
+
112
+ <p>
113
+ Egestas diam in arcu cursus euismod quis. Leo in vitae turpis massa sed elementum tempus egestas. Amet nulla facilisi morbi tempus iaculis urna id volutpat. Parturient montes nascetur ridiculus mus. Erat pellentesque adipiscing commodo elit at imperdiet. Egestas congue quisque egestas diam in arcu cursus. Diam ut venenatis tellus in metus. Ullamcorper eget nulla facilisi etiam. Blandit turpis cursus in hac habitasse platea dictumst quisque. Cursus euismod quis viverra nibh cras pulvinar. Neque viverra justo nec ultrices. Dui ut ornare lectus sit. Mauris ultrices eros in cursus turpis massa tincidunt. Lobortis elementum nibh tellus molestie nunc non blandit massa enim. Ullamcorper morbi tincidunt ornare massa eget egestas purus viverra.
114
+ </p>
115
+
116
+ <p>
117
+ Mauris in aliquam sem fringilla ut morbi. Nunc sed blandit libero volutpat. Amet venenatis urna cursus eget nunc scelerisque. Sagittis nisl rhoncus mattis rhoncus urna neque. Felis eget nunc lobortis mattis aliquam faucibus purus in massa. Fringilla ut morbi tincidunt augue interdum. Nibh mauris cursus mattis molestie a iaculis at erat. Lacus sed turpis tincidunt id aliquet risus feugiat in. Nulla facilisi etiam dignissim diam quis enim lobortis. Vitae congue eu consequat ac felis donec et. Scelerisque viverra mauris in aliquam sem fringilla ut morbi tincidunt. Blandit volutpat maecenas volutpat blandit aliquam. Ultrices tincidunt arcu non sodales neque sodales ut etiam. Sollicitudin aliquam ultrices sagittis orci a scelerisque. Id cursus metus aliquam eleifend mi. Magna eget est lorem ipsum dolor sit amet consectetur. Eleifend mi in nulla posuere sollicitudin aliquam ultrices. Neque sodales ut etiam sit amet. Enim neque volutpat ac tincidunt vitae semper quis lectus nulla.
118
+ </p>
tests/data/test.pdf ADDED
Binary file (99.9 kB). View file
 
tests/data/test.txt ADDED
@@ -0,0 +1,19 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ Testing Text File
2
+
3
+ With generated random Lorem Ipsum and other unexpected characters!
4
+
5
+ <a href="https://github.com/mkutarna/audiobook_gen/">Link to generator repo!</a>
6
+
7
+ 此行是对非英语字符的测试
8
+
9
+ Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Proin fermentum leo vel orci porta non pulvinar. Pretium lectus quam id leo in vitae turpis massa sed. Donec ac odio tempor orci dapibus. Feugiat in ante metus dictum at tempor. Elementum tempus egestas sed sed risus. Adipiscing commodo elit at imperdiet dui accumsan sit. Placerat orci nulla pellentesque dignissim enim. Posuere lorem ipsum dolor sit. Id ornare arcu odio ut sem. Purus faucibus ornare suspendisse sed nisi lacus sed. Ac turpis egestas sed tempus urna et pharetra pharetra massa. Morbi quis commodo odio aenean. Malesuada proin libero nunc consequat interdum. Ut placerat orci nulla pellentesque dignissim enim sit. Elit at imperdiet dui accumsan sit amet.
10
+
11
+ Built to test various characters and other possible inputs to the silero model.
12
+
13
+ Here are some Chinese characters: 此行是对非英语字符的测试.
14
+
15
+ There are 24 letters in the Greek alphabet. The vowels: are α, ε, η, ι, ο, ω, υ. All the rest are consonants.
16
+
17
+ We can also test for mathematical symbols: ∫, ∇, ∞, δ, ε, X̄, %, √ ,a, ±, ÷, +, = ,-.
18
+
19
+ Finally, here are some emoticons: ☺️🙂😊😀😁☹️🙁😞😟😣😖😨😧😦😱😫😩.
tests/data/test_audio.pt ADDED
Binary file (594 kB). View file
 
tests/data/test_predict.pt.REMOVED.git-id ADDED
@@ -0,0 +1 @@
 
 
1
+ 84cf0cd8d8bede5ff60d18475d71e26543d5d7ad
tests/data/test_processed.txt ADDED
@@ -0,0 +1,26 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ Testing Text File
2
+ With generated random Lorem Ipsum and other unexpected characters!
3
+ Link to generator repo!
4
+ Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
5
+ Proin fermentum leo vel orci porta non pulvinar.
6
+ Pretium lectus quam id leo in vitae turpis massa sed.
7
+ Donec ac odio tempor orci dapibus.
8
+ Feugiat in ante metus dictum at tempor.
9
+ Elementum tempus egestas sed sed risus.
10
+ Adipiscing commodo elit at imperdiet dui accumsan sit.
11
+ Placerat orci nulla pellentesque dignissim enim.
12
+ Posuere lorem ipsum dolor sit.
13
+ Id ornare arcu odio ut sem.
14
+ Purus faucibus ornare suspendisse sed nisi lacus sed.
15
+ Ac turpis egestas sed tempus urna et pharetra pharetra massa.
16
+ Morbi quis commodo odio aenean.
17
+ Malesuada proin libero nunc consequat interdum.
18
+ Ut placerat orci nulla pellentesque dignissim enim sit.
19
+ Elit at imperdiet dui accumsan sit amet.
20
+ Built to test various characters and other possible inputs to the silero model.
21
+ Here are some Chinese characters: .
22
+ There are 24 letters in the Greek alphabet.
23
+ The vowels: are , , , , , , .
24
+ All the rest are consonants.
25
+ We can also test for mathematical symbols: , , , , , X, %, ,a, , , +, = ,-.
26
+ Finally, here are some emoticons: .
tests/test_config.py ADDED
@@ -0,0 +1,9 @@
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Notes
3
+ -----
4
+ This module contains the configuration entries for audiobook_gen tests.
5
+ """
6
+
7
+ from pathlib import Path
8
+
9
+ data_path = Path("tests/data")
tests/test_dummy.py DELETED
@@ -1,2 +0,0 @@
1
- def test_dummy():
2
- assert 1 == 1
 
 
 
tests/test_file_readers.py ADDED
@@ -0,0 +1,46 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import pytest
2
+ import numpy as np
3
+
4
+ from src import file_readers
5
+ import test_config
6
+
7
+
8
+ def test_preprocess_text():
9
+ """
10
+ Tests preprocess_text function by comparing its output
11
+ against the reference processed corpus file.
12
+ """
13
+ test_path = test_config.data_path / "test.txt"
14
+ processed_path = test_config.data_path / "test_processed.txt"
15
+ with open(test_path, 'r') as file:
16
+ test_corpus = file_readers.preprocess_text(file)
17
+ with open(processed_path, 'r') as process_file:
18
+ processed_corpus = [line.strip() for line in process_file.readlines()]
19
+
20
+ assert processed_corpus == test_corpus
21
+
22
+
23
+ def test_read_pdf():
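+ """
+ Tests read_pdf function by asserting the shape of the parsed corpus
+ and spot-checking the first line of two sections.
+ """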
24
+ pdf_path = test_config.data_path / "test.pdf"
25
+ corpus = np.array(file_readers.read_pdf(pdf_path), dtype=object)
26
+
27
+ assert np.shape(corpus) == (4, )
28
+ assert np.shape(corpus[0]) == (3, )
29
+ assert corpus[0][0] == 'Lorem Ipsum'
30
+ assert corpus[2][0] == 'Preface'
31
+
32
+
33
+ def test_read_epub():
34
+ """
35
+ Tests read_epub function by asserting title,
36
+ shape of corpus, and correct line reading.
37
+ """
38
+ ebook_path = test_config.data_path / "test.epub"
39
+ corpus, title = file_readers.read_epub(ebook_path)
40
+ corpus_arr = np.array(corpus, dtype=object)
41
+
42
+ assert title == "the_picture_of_dorian_gray"
43
+ assert np.shape(corpus_arr) == (6,)
44
+ assert np.shape(corpus_arr[0]) == (39,)
45
+ assert corpus[0][0] == 'The Project Gutenberg eBook of The Picture of Dorian Gray, by Oscar Wilde'
46
+ assert corpus[2][0] == 'CHAPTER I.'
tests/test_output.py ADDED
@@ -0,0 +1,50 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import pytest
2
+
3
+ from src import output, config
4
+ import test_config
5
+
6
+
7
+ def test_write_audio():
8
+ """
9
+ Tests write_audio function, which takes a list of audio tensors and a file path and writes the audio to a file.
10
+ """
11
+ import torch
12
+
13
+ test_path = test_config.data_path / 'test_audio.wav'
14
+ audio_path = test_config.data_path / 'test_audio.pt'
15
+ audio_list = torch.load(audio_path)
16
+
17
+ output.write_audio(audio_list, test_path)
18
+
19
+ assert test_path.is_file()
20
+ assert test_path.stat().st_size == 592858
21
+
22
+ test_path.unlink()
23
+
24
+
25
+ def test_assemble_zip():
26
+ """
27
+ Tests assemble_zip function, which collects all the audio files from the output directory,
28
+ and bundles them into a zip archive.
29
+ """
30
+ from shutil import copy2
31
+
32
+ if not config.output_path.exists():
33
+ config.output_path.mkdir()
34
+
35
+ title = "speaker_samples"
36
+ zip_path = config.output_path / 'speaker_samples.zip'
37
+ wav1_path = config.output_path / 'speaker_en_0.wav'
38
+ wav2_path = config.output_path / 'speaker_en_110.wav'
39
+
40
+ for file_path in config.resource_path.iterdir():
41
+ if file_path.suffix == '.wav':
42
+ copy2(file_path, config.output_path)
43
+
44
+ _ = output.assemble_zip(title)
45
+
46
+ assert zip_path.is_file()
47
+ assert not wav1_path.is_file()
48
+ assert not wav2_path.is_file()
49
+
50
+ zip_path.unlink()
tests/test_predict.py ADDED
@@ -0,0 +1,63 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import pytest
2
+ import torch
3
+ import numpy as np
4
+
5
+ from src import predict, file_readers, config
6
+ import test_config
7
+
8
+
9
+ def test_load_model():
10
+ """
11
+ Tests load_model function, which loads the silero TTS model.
12
+ """
13
+ model = predict.load_model()
14
+
15
+ assert model.speakers[0] == 'en_0'
16
+ assert np.shape(model.speakers) == (119,)
17
+
18
+
19
+ def test_generate_audio():
20
+ """
21
+ Tests generate_audio function, which takes the TTS model and file input,
22
+ and uses the predict and write_audio functions to output the audio files.
23
+ """
24
+ ebook_path = test_config.data_path / "test.epub"
25
+ wav1_path = config.output_path / 'the_picture_of_dorian_gray_part000.wav'
26
+ wav2_path = config.output_path / 'the_picture_of_dorian_gray_part001.wav'
27
+ wav3_path = config.output_path / 'the_picture_of_dorian_gray_part002.wav'
28
+ corpus, title = file_readers.read_epub(ebook_path)
29
+
30
+ model = predict.load_model()
31
+ speaker = 'en_110'
32
+ predict.generate_audio(corpus[0:2], title, model, speaker)
33
+
34
+ assert wav1_path.is_file()
35
+ assert wav2_path.is_file()
36
+ assert not wav3_path.is_file()
37
+
38
+ wav1_path.unlink()
39
+ wav2_path.unlink()
40
+
41
+
42
+ def test_predict():
43
+ """
44
+ Tests predict function, which generates an audio tensor for each sentence in the text section,
45
+ and appends them together along with a generated file path for output.
46
+ """
47
+ seed = 1337
48
+ torch.manual_seed(seed)
49
+ torch.cuda.manual_seed(seed)
50
+ model = predict.load_model()
51
+
52
+ tensor_path = test_config.data_path / "test_predict.pt"
53
+ test_tensor = torch.load(tensor_path)
54
+
55
+ ebook_path = test_config.data_path / "test.epub"
56
+ corpus, title = file_readers.read_epub(ebook_path)
57
+ section_index = 'part001'
58
+ speaker = 'en_110'
59
+
60
+ audio_list, _ = predict.predict(corpus[1], section_index, title, model, speaker)
61
+ audio_tensor = torch.cat(audio_list).reshape(1, -1)
62
+
63
+ torch.testing.assert_close(audio_tensor, test_tensor, atol=1e-3, rtol=0.2)
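Note: the final assertion compares the freshly generated audio against a stored reference with both an absolute and a relative tolerance; torch.testing.assert_close treats two elements as close when |actual - expected| <= atol + rtol * |expected|. A tiny illustration of that rule with the same tolerances, unrelated to the audio data itself:

import torch

expected = torch.tensor([1.000, 0.010])
actual = torch.tensor([1.150, 0.011])     # differences of 0.15 and 0.001

# Passes because each difference is within atol + rtol * |expected|:
#   0.150 <= 1e-3 + 0.2 * 1.000
#   0.001 <= 1e-3 + 0.2 * 0.010
torch.testing.assert_close(actual, expected, atol=1e-3, rtol=0.2)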