alessandro trinca tornidor committed
Commit 7aaf29c · 1 Parent(s): 023235e

test: update python backend test coverage, remove some unused classes and functions (see README.md)

README.md CHANGED
@@ -1,31 +1,56 @@
1
 
2
- # AI Pronunciation Trainer
3
- This tool uses AI to evaluate your pronunciation so you can improve it and be understood more clearly. You can go straight test the tool at https://aipronunciationtr.com (please use the chrome browser for desktop and have some patience for it to "warm-up" :) ).
 
4
 
5
  ![](images/MainScreen.jpg)
6
 
7
- ## Installation
 
8
  To run the program locally, you need to install the requirements and run the main python file:
9
- ```
 
10
  pip install -r requirements.txt
11
  python webApp.py
12
  ```
13
- You'll also need ffmpeg, which you can download from here https://ffmpeg.org/download.html. On Windows, it may be needed to add the ffmpeg "bin" folder to your PATH environment variable. On Mac, you can also just run "brew install ffmpeg".
 
14
 
15
  You should be able to run it locally without any major issues as long as you’re using a recent python 3.X version.
16
 
17
  ## Online version
18
- For the people who don’t feel comfortable running code or just want to have a quick way to use the tool, I hosted an online version of it at https://aipronunciationtr.com. It should work well in desktop-chrome, any other browser is not officially supported, although most of the functionality should work fine.
19
-
20
- Please be aware that the usage is limited by day (I’m still not rich ;)). If, for some reason, you would like to avoid the daily usage limit, just enter in contact and we see what we can do.
 
21
 
22
  ## Motivation
23
 
24
- Often, when we want to improve our pronunciation, it is very difficult to self-assess how good we’re speaking. Asking a native, or language instructor, to constantly correct us is either impractical, due to monetary constrains, or annoying due to simply being too boring for this other person. Additionally, they may often say “it sounds good” after your 10th try to not discourage you, even though you may still have some mistakes in your pronunciation.
 
 
25
 
26
- The AI pronunciation trainer is a way to provide objective feedback on how well your pronunciation is in an automatic and scalable fashion, so the only limit to your improvement is your own dedication.
27
 
28
- This project originated from a small program that I did to improve my own pronunciation. When I finished it, I believed it could be a useful tool also for other people trying to be better understood, so I decided to make a simple, more user-friendly version of it.
29
 
30
- ## Disclaimer
31
  This is a simple project that I made in my free time with the goal to be useful to some people. It is not perfect, thus be aware that some small bugs may be present. In case you find something is not working, all feedback is welcome, and issues may be addressed depending on their severity.
 
1
 
2
+ # AI Pronunciation Trainer
3
+
4
+ This tool uses AI to evaluate your pronunciation so you can improve it and be understood more clearly. You can go straight test the tool at <https://aipronunciationtr.com> (please use the chrome browser for desktop and have some patience for it to "warm-up" :) ).
5
 
6
  ![](images/MainScreen.jpg)
7
 
8
+ ## Installation
9
+
10
  To run the program locally, you need to install the requirements and run the main python file:
11
+
12
+ ```bash
13
  pip install -r requirements.txt
14
  python webApp.py
15
  ```
16
+
17
+ You'll also need ffmpeg, which you can download from here <https://ffmpeg.org/download.html>. On Windows, it may be needed to add the ffmpeg "bin" folder to your PATH environment variable. On Mac, you can also just run "brew install ffmpeg".
18
 
19
  You should be able to run it locally without any major issues as long as you’re using a recent python 3.X version.
20
 
21
+ ## Changes on [trincadev's](https://github.com/trincadev/) [repository](https://github.com/trincadev/ai-pronunciation-trainer)
22
+
23
+ I upgraded the frontend (jquery@3.7.1, bootstrap@5.3.3) and backend (pytorch==1.13.1, numpy<2.0.0) libraries where possible. For example, this project currently doesn't work with pytorch > 2.0.0, so it is locked to pytorch == 1.13.1.
24
+
25
+ ### Unused classes and functions now removed
26
+
27
+ - `aip_trainer.lambdas.lambdaTTS.*`
28
+ - `aip_trainer.models.models.getTTSModel()`
29
+ - `aip_trainer.models.models.getTranslationModel()`
30
+ - `aip_trainer.models.AllModels.NeuralTTS`
31
+ - `aip_trainer.models.AllModels.NeuralTranslator`
32
+
33
+ ### TODO
34
+
35
+ - add e2e tests with playwright
36
+ - move from pytorch to onnxruntime
37
+ - refactor frontend with something more modern (e.g. vuejs)
38
+ - refactor css style with tailwindcss
39
+
40
  ## Online version
41
+
42
+ For the people who don’t feel comfortable running code or just want to have a quick way to use the tool, I hosted an online version of it at <https://aipronunciationtr.com>. It should work well in desktop-chrome, any other browser is not officially supported, although most of the functionality should work fine.
43
+
44
+ Please be aware that the usage is limited by day (I’m still not rich ;)). If, for some reason, you would like to avoid the daily usage limit, just enter in contact and we see what we can do.
45
 
46
  ## Motivation
47
 
48
+ Often, when we want to improve our pronunciation, it is very difficult to self-assess how good we’re speaking. Asking a native, or language instructor, to constantly correct us is either impractical, due to monetary constrains, or annoying due to simply being too boring for this other person. Additionally, they may often say “it sounds good” after your 10th try to not discourage you, even though you may still have some mistakes in your pronunciation.
49
+
50
+ The AI pronunciation trainer is a way to provide objective feedback on how well your pronunciation is in an automatic and scalable fashion, so the only limit to your improvement is your own dedication.
51
 
52
+ This project originated from a small program that I did to improve my own pronunciation. When I finished it, I believed it could be a useful tool also for other people trying to be better understood, so I decided to make a simple, more user-friendly version of it.
53
 
54
+ ## Disclaimer
55
 
 
56
  This is a simple project that I made in my free time with the goal to be useful to some people. It is not perfect, thus be aware that some small bugs may be present. In case you find something is not working, all feedback is welcome, and issues may be addressed depending on their severity.
aip_trainer/WordMetrics.py CHANGED
@@ -3,35 +3,6 @@ import numpy as np
3
  from aip_trainer import app_logger
4
 
5
 
6
- # ref from https://gitlab.com/-/snippets/1948157
7
- # For some variants, look here https://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#Python
8
-
9
-
10
- # Pure python
11
- def edit_distance_python2(a, b):
12
- # This version is commutative, so as an optimization we force |a|>=|b|
13
- if len(a) < len(b):
14
- return edit_distance_python(b, a)
15
- if len(b) == 0: # Can deal with empty sequences faster
16
- return len(a)
17
- # Only two rows are really needed: the one currently filled in, and the previous
18
- distances = [
19
- [i for i in range(len(b) + 1)],
20
- [0 for _ in range(len(b) + 1)]
21
- ]
22
- # We can prefill the first row:
23
- costs = [0 for _ in range(3)]
24
- for i, a_token in enumerate(a, start=1):
25
- distances[1][0] += 1 # Deals with the first column.
26
- for j, b_token in enumerate(b, start=1):
27
- costs[0] = distances[1][j-1] + 1
28
- costs[1] = distances[0][j] + 1
29
- costs[2] = distances[0][j-1] + (0 if a_token == b_token else 1)
30
- distances[1][j] = min(costs)
31
- # Move to the next row:
32
- distances[0][:] = distances[1][:]
33
- return distances[1][len(b)]
34
-
35
  # https://stackabuse.com/levenshtein-distance-and-text-similarity-in-python/
36
  def edit_distance_python(seq1, seq2):
37
  size_x = len(seq1) + 1
 
3
  from aip_trainer import app_logger
4
 
5
 
6
  # https://stackabuse.com/levenshtein-distance-and-text-similarity-in-python/
7
  def edit_distance_python(seq1, seq2):
8
  size_x = len(seq1) + 1
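
For reference, the removed `edit_distance_python2` used the two-row variant of the Levenshtein recurrence: only the previous and the current row of the distance table are kept in memory. The sketch below illustrates that technique on its own. The name `two_row_edit_distance` is hypothetical and is not part of `aip_trainer`; the retained `edit_distance_python` may be implemented differently.

```python
def two_row_edit_distance(a, b):
    """Levenshtein distance between two sequences, keeping only two rows."""
    # The distance is symmetric, so make `b` the shorter sequence:
    # each row then has len(b) + 1 entries.
    if len(a) < len(b):
        a, b = b, a
    if len(b) == 0:
        return len(a)

    # Row 0: distance from the empty prefix of `a` to every prefix of `b`.
    previous = list(range(len(b) + 1))
    for i, a_token in enumerate(a, start=1):
        # First column: distance from a[:i] to the empty prefix of `b`.
        current = [i]
        for j, b_token in enumerate(b, start=1):
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            substitution = previous[j - 1] + (0 if a_token == b_token else 1)
            current.append(min(insertion, deletion, substitution))
        previous = current
    return previous[-1]
```

For example, `two_row_edit_distance("kitten", "sitting")` returns `3`. The function also accepts lists, so it can compare sequences of phonemes or words as well as strings.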
aip_trainer/lambdas/lambdaGetSample.py CHANGED
@@ -1,6 +1,5 @@
1
  import json
2
  import pickle
3
- import random
4
  from pathlib import Path
5
 
6
  import epitran
@@ -89,21 +88,20 @@ def getSentenceCategory(sentence) -> int:
89
  return category + 1
90
 
91
 
92
- if __name__ == "__main__":
93
- import pandas as pd
94
- with open(sample_folder / 'data_de_en_2.pickle', 'rb') as handle:
95
- df = pickle.load(handle)
 
 
 
96
  pass
97
- df["de_category"] = df["de_sentence"].apply(getSentenceCategory)
98
  print("de_category added")
99
- df["en_category"] = df["en_sentence"].apply(getSentenceCategory)
100
  print("en_category added")
101
- df_json = df.to_json()
102
- with open(sample_folder / 'data_de_en_with_categories.json', 'w') as dst:
103
  dst.write(df_json)
104
  print("data_de_en_with_categories.json written")
105
- with open(sample_folder / 'data_de_en_with_categories.json', 'r') as src:
106
- jj = json.load(src)
107
- print("jj:", jj)
108
- df2 = pd.read_json(json.dumps(jj))
109
- print(df2)
 
1
  import json
2
  import pickle
 
3
  from pathlib import Path
4
 
5
  import epitran
 
88
  return category + 1
89
 
90
 
+ def get_pickle2json_dataframe(
+         custom_pickle_filename_no_ext: Path | str = 'data_de_en_2',
+         custom_folder: Path = sample_folder
+ ):
+     custom_folder = Path(custom_folder)
+     with open(custom_folder / f'{custom_pickle_filename_no_ext}.pickle', 'rb') as handle:
+         df2 = pickle.load(handle)
          pass
+     df2["de_category"] = df2["de_sentence"].apply(getSentenceCategory)
      print("de_category added")
+     df2["en_category"] = df2["en_sentence"].apply(getSentenceCategory)
      print("en_category added")
+     df_json = df2.to_json()
+     with open(custom_folder / f'{custom_pickle_filename_no_ext}.json', 'w') as dst:
          dst.write(df_json)
      print("data_de_en_with_categories.json written")
+
 
 
 
 
aip_trainer/lambdas/lambdaTTS.py DELETED
@@ -1,48 +0,0 @@
1
-
2
- import base64
3
- import json
4
- import tempfile
5
-
6
- import soundfile as sf
7
-
8
- from aip_trainer import app_logger
9
- from aip_trainer.models.models import getTTSModel
10
- from aip_trainer.models.AIModels import NeuralTTS
11
-
12
-
13
- sampling_rate = 16000
14
- model_de = getTTSModel('de')
15
- model_TTS_lambda = NeuralTTS(model_de, sampling_rate)
16
-
17
-
18
- def lambda_handler(event, context):
19
-
20
- body = json.loads(event['body'])
21
-
22
- text_string = body['value']
23
-
24
- linear_factor = 0.2
25
- audio = model_TTS_lambda.getAudioFromSentence(
26
- text_string).detach().numpy()*linear_factor
27
- with tempfile.TemporaryFile(prefix="temp_sound_", suffix=".wav") as f1:
28
- app_logger.info(f"Saving temp audio to {f1.name}...")
29
- # random_file_name = utilsFileIO.generateRandomString(20) + '.wav'
30
- # sf.write('./'+random_file_name, audio, 16000)
31
-
32
- sf.write(f1.name, audio, sampling_rate)
33
- with open(f1.name, "rb") as f:
34
- audio_byte_array = f.read()
35
- # os.remove(random_file_name)
36
- return {
37
- 'statusCode': 200,
38
- 'headers': {
39
- 'Access-Control-Allow-Headers': '*',
40
- 'Access-Control-Allow-Origin': 'http://127.0.0.1:3000/',
41
- 'Access-Control-Allow-Methods': 'OPTIONS,POST,GET'
42
- },
43
- 'body': json.dumps(
44
- {
45
- "wavBase64": str(base64.b64encode(audio_byte_array))[2:-1],
46
- },
47
- )
48
- }
aip_trainer/models/AIModels.py CHANGED
@@ -34,33 +34,3 @@ class NeuralASR(ModelInterfaces.IASRModel):
34
 
35
  self.audio_transcript, self.word_locations_in_samples = self.decoder(
36
  nn_output[0, :, :].detach(), audio_length_in_samples, word_align=True)
37
-
38
-
39
- class NeuralTTS(ModelInterfaces.ITextToSpeechModel):
40
- def __init__(self, model: torch.nn.Module, sampling_rate: int) -> None:
41
- super().__init__()
42
- self.model = model
43
- self.sampling_rate = sampling_rate
44
-
45
- def getAudioFromSentence(self, sentence: str) -> np.array:
46
- with torch.inference_mode():
47
- audio_transcript = self.model.apply_tts(texts=[sentence],
48
- sample_rate=self.sampling_rate)[0]
49
-
50
- return audio_transcript
51
-
52
-
53
- class NeuralTranslator(ModelInterfaces.ITranslationModel):
54
- def __init__(self, model: torch.nn.Module, tokenizer) -> None:
55
- super().__init__()
56
- self.model = model
57
- self.tokenizer = tokenizer
58
-
59
- def translateSentence(self, sentence: str) -> str:
60
- """Get the transcripts of the process audio"""
61
- tokenized_text = self.tokenizer(sentence, return_tensors='pt')
62
- translation = self.model.generate(**tokenized_text)
63
- translated_text = self.tokenizer.batch_decode(
64
- translation, skip_special_tokens=True)[0]
65
-
66
- return translated_text
 
34
 
35
  self.audio_transcript, self.word_locations_in_samples = self.decoder(
36
  nn_output[0, :, :].detach(), audio_length_in_samples, word_align=True)
aip_trainer/models/models.py CHANGED
@@ -1,10 +1,12 @@
 
 
1
  import torch
2
  import torch.nn as nn
3
 
4
- import pickle
5
 
 
 
6
 
7
- def getASRModel(language: str) -> nn.Module:
8
 
9
  if language == 'de':
10
 
@@ -18,51 +20,7 @@ def getASRModel(language: str) -> nn.Module:
18
  model='silero_stt',
19
  language='en',
20
  device=torch.device('cpu'))
21
- elif language == 'fr':
22
- model, decoder, utils = torch.hub.load(repo_or_dir='snakers4/silero-models',
23
- model='silero_stt',
24
- language='fr',
25
- device=torch.device('cpu'))
26
-
27
- return (model, decoder)
28
-
29
-
30
- def getTTSModel(language: str) -> nn.Module:
31
-
32
- if language == 'de':
33
-
34
- speaker = 'thorsten_v2' # 16 kHz
35
- model, _ = torch.hub.load(repo_or_dir='snakers4/silero-models',
36
- model='silero_tts',
37
- language=language,
38
- speaker=speaker)
39
-
40
- elif language == 'en':
41
- speaker = 'lj_16khz' # 16 kHz
42
- model = torch.hub.load(repo_or_dir='snakers4/silero-models',
43
- model='silero_tts',
44
- language=language,
45
- speaker=speaker)
46
- else:
47
- raise ValueError('Language not implemented')
48
-
49
- return model
50
-
51
-
52
- def getTranslationModel(language: str) -> nn.Module:
53
- from transformers import AutoTokenizer
54
- from transformers import AutoModelForSeq2SeqLM
55
- if language == 'de':
56
- model = AutoModelForSeq2SeqLM.from_pretrained(
57
- "Helsinki-NLP/opus-mt-de-en")
58
- tokenizer = AutoTokenizer.from_pretrained(
59
- "Helsinki-NLP/opus-mt-de-en")
60
- # Cache models to avoid Hugging face processing
61
- with open('translation_model_de.pickle', 'wb') as handle:
62
- pickle.dump(model, handle)
63
- with open('translation_tokenizer_de.pickle', 'wb') as handle:
64
- pickle.dump(tokenizer, handle)
65
  else:
66
- raise ValueError('Language not implemented')
67
 
68
- return model, tokenizer
 
1
+ from typing import Any
2
+
3
  import torch
4
  import torch.nn as nn
5
 
 
6
 
7
+ # second returned type here is the custom class src.silero.utils.Decoder from snakers4/silero-models
8
+ def getASRModel(language: str) -> tuple[nn.Module, Any]:
9
 
 
10
 
11
  if language == 'de':
12
 
 
20
  model='silero_stt',
21
  language='en',
22
  device=torch.device('cpu'))
23
  else:
24
+ raise NotImplementedError("currently works only for 'de' and 'en' languages, not for '{}'.".format(language))
25
 
26
+ return model, decoder
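
The new `else` branch makes `getASRModel` fail fast for any language other than `'de'` and `'en'`. The sketch below shows that pattern without depending on torch. Everything in it is an assumption made for illustration: `get_asr_model_sketch`, `SUPPORTED_LANGUAGES` and `load_stub` are hypothetical names, and `load_stub` merely stands in for the `torch.hub.load` call in the real function.

```python
from typing import Any, Callable

# Assumption: mirrors the two branches kept by this commit.
SUPPORTED_LANGUAGES = ("de", "en")


def get_asr_model_sketch(
        language: str,
        loader: Callable[[str], tuple[Any, Any]]
) -> tuple[Any, Any]:
    """Return (model, decoder) for a supported language, or raise early."""
    if language not in SUPPORTED_LANGUAGES:
        raise NotImplementedError(
            f"currently works only for 'de' and 'en' languages, not for '{language}'."
        )
    model, decoder = loader(language)
    return model, decoder


def load_stub(language: str) -> tuple[str, str]:
    """Hypothetical loader: returns placeholder strings instead of real models."""
    return f"model-{language}", f"decoder-{language}"
```

Calling `get_asr_model_sketch("fr", load_stub)` raises `NotImplementedError` before any model is loaded, so callers can catch one well-defined exception type for unsupported languages.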
aip_trainer/lambdas/data_de_en_2.pickle → tests/test_data_de_en_2.pickle RENAMED
File without changes
tests/test_data_de_en_2_expected.json ADDED
The diff for this file is too large to render. See raw diff
 
tests/test_dataset.py CHANGED
@@ -2,7 +2,7 @@ import json
2
  import unittest
3
 
4
  from aip_trainer.lambdas import lambdaGetSample
5
- from tests import test_logger
6
 
7
 
8
  def helper_category(category: int, threshold_min: int, threshold_max: int, n: int = 1000):
@@ -32,6 +32,18 @@ class TestDataset(unittest.TestCase):
32
  def test_hard_sentences(self):
33
  helper_category(3, 20, 10000)
34
 
35
 
36
  if __name__ == '__main__':
37
  unittest.main()
 
2
  import unittest
3
 
4
  from aip_trainer.lambdas import lambdaGetSample
5
+ from tests import test_logger, TEST_ROOT_FOLDER
6
 
7
 
8
  def helper_category(category: int, threshold_min: int, threshold_max: int, n: int = 1000):
 
32
  def test_hard_sentences(self):
33
  helper_category(3, 20, 10000)
34
 
+     def test_get_pickle2json_dataframe(self):
+         import os
+
+         custom_filename = 'test_data_de_en_2'
+         lambdaGetSample.get_pickle2json_dataframe(custom_filename, TEST_ROOT_FOLDER)
+         with open(TEST_ROOT_FOLDER / f'{custom_filename}.json', 'r') as src1:
+             with open(TEST_ROOT_FOLDER / f'{custom_filename}_expected.json', 'r') as src2:
+                 json1 = json.load(src1)
+                 json2 = json.load(src2)
+                 assert json1 == json2
+         os.remove(TEST_ROOT_FOLDER / f'{custom_filename}.json')
+
47
 
48
  if __name__ == '__main__':
49
  unittest.main()
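
In `test_get_pickle2json_dataframe`, `os.remove` runs only after the comparison succeeds, so a failing assertion may leave the generated JSON file in the tests folder. One possible alternative, sketched here with the standard library only, registers the cleanup with `addCleanup` so it runs whether the test passes or fails. `write_json_output`, `TestOutputCleanup`, `run_sketch` and the sample payload are hypothetical and only stand in for the real function and fixture data.

```python
import json
import tempfile
import unittest
from pathlib import Path


def write_json_output(folder: Path, name: str, payload: dict) -> Path:
    """Hypothetical stand-in for a function that writes '<name>.json' into folder."""
    output_path = folder / f"{name}.json"
    output_path.write_text(json.dumps(payload))
    return output_path


class TestOutputCleanup(unittest.TestCase):
    def test_output_matches_expected(self):
        tmp_dir = tempfile.TemporaryDirectory()
        # Registered cleanups run even when an assertion below fails.
        self.addCleanup(tmp_dir.cleanup)
        folder = Path(tmp_dir.name)

        # Made-up payload, used only to exercise the round trip.
        expected = {"de_category": {"0": 1}, "en_category": {"0": 2}}
        output_path = write_json_output(folder, "sample", expected)
        with open(output_path, "r") as src:
            self.assertEqual(json.load(src), expected)


def run_sketch() -> unittest.TestResult:
    """Run the sketch test case and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestOutputCleanup)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Applying the same idea to the real test would require the pickle to be readable from the temporary folder, for example by copying it there first, because `get_pickle2json_dataframe` reads and writes in the same `custom_folder`.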
webApp.py CHANGED
@@ -7,7 +7,6 @@ from flask_cors import CORS
7
 
8
  from aip_trainer.lambdas import lambdaGetSample
9
  from aip_trainer.lambdas import lambdaSpeechToScore
10
- from aip_trainer.lambdas import lambdaTTS
11
 
12
 
13
  app = Flask(__name__, template_folder="static")
@@ -22,40 +21,6 @@ def main():
22
  return render_template('main.html')
23
 
24
 
25
- @app.route(rootPath+'/getAllSamples')
26
- def getDataDeEnAll():
27
- import pickle
28
- from pathlib import Path
29
- sample_folder = Path(PROJECT_ROOT_FOLDER / "aip_trainer" / "lambdas")
30
- with open(sample_folder / 'data_de_en_2.pickle', 'rb') as handle:
31
- df = pickle.load(handle)
32
- j = df.to_json()
33
- return Response(j, mimetype='application/json')
34
-
35
-
36
- @app.route(rootPath+'/getSampleSearch', methods=['POST'])
37
- def getDataDeEnSearch():
38
- import pickle
39
- from pathlib import Path
40
- sample_folder = Path(PROJECT_ROOT_FOLDER / "aip_trainer" / "lambdas")
41
- with open(sample_folder / 'data_de_en_2.pickle', 'rb') as handle:
42
- event = request.get_json(force=True)
43
- df = pickle.load(handle)
44
- lang = event.get('language')
45
- filter_key = event.get('search')
46
- df_by_language = df[f"{lang}_sentence"]
47
- filter_obj = df_by_language.str.contains(filter_key)
48
- filtered = df_by_language[filter_obj]
49
- j = filtered.to_json()
50
- return Response(j, mimetype='application/json')
51
-
52
-
53
- @app.route(rootPath+'/getAudioFromText', methods=['POST'])
54
- def getAudioFromText():
55
- event = {'body': json.dumps(request.get_json(force=True))}
56
- return lambdaTTS.lambda_handler(event, [])
57
-
58
-
59
  @app.route(rootPath+'/getSample', methods=['POST'])
60
  def getNext():
61
  event = {'body': json.dumps(request.get_json(force=True))}
 
7
 
8
  from aip_trainer.lambdas import lambdaGetSample
9
  from aip_trainer.lambdas import lambdaSpeechToScore
 
10
 
11
 
12
  app = Flask(__name__, template_folder="static")
 
21
  return render_template('main.html')
22
 
23
 
24
  @app.route(rootPath+'/getSample', methods=['POST'])
25
  def getNext():
26
  event = {'body': json.dumps(request.get_json(force=True))}