---
tags:
- audio
- text-to-speech
- onnx
inference: false
language: en
datasets:
- CSTR-Edinburgh/vctk
license: apache-2.0
library_name: txtai
---

# ESPnet VITS Text-to-Speech (TTS) Model for ONNX

[espnet/kan-bayashi_vctk_vits](https://huggingface.co/espnet/kan-bayashi_vctk_tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space_train.total_count.ave) exported to ONNX. This model is an ONNX export using the [espnet_onnx](https://github.com/espnet/espnet_onnx) library.

## Usage with txtai

[txtai](https://github.com/neuml/txtai) has a built-in Text to Speech (TTS) pipeline that makes using this model easy.

_Note: the following example requires txtai >= 7.5_

```python
import soundfile as sf

from txtai.pipeline import TextToSpeech

# Build pipeline
tts = TextToSpeech("NeuML/vctk-vits-onnx")

# Generate speech with speaker id
speech, rate = tts("Say something here", speaker=15)

# Write to file
sf.write("out.wav", speech, rate)
```

## Usage with ONNX

This model can also be run directly with ONNX, provided the input text is tokenized. Tokenization can be done with [ttstokenizer](https://github.com/neuml/ttstokenizer).

Note that the txtai pipeline has additional functionality, such as batching large inputs together, that would need to be duplicated with this method.

```python
import numpy as np
import onnxruntime
import soundfile as sf
import yaml

from ttstokenizer import TTSTokenizer

# This example assumes the files have been downloaded locally
with open("vctk-vits-onnx/config.yaml", "r", encoding="utf-8") as f:
    config = yaml.safe_load(f)

# Create model
model = onnxruntime.InferenceSession(
    "vctk-vits-onnx/model.onnx",
    providers=["CPUExecutionProvider"]
)

# Create tokenizer
tokenizer = TTSTokenizer(config["token"]["list"])

# Tokenize inputs
inputs = tokenizer("Say something here")

# Generate speech
outputs = model.run(None, {"text": inputs, "sids": np.array([15])})

# Write to file
sf.write("out.wav", outputs[0], 22050)
```

## How to export

More information on how to export ESPnet models to ONNX can be [found here](https://github.com/espnet/espnet_onnx#text2speech-inference).

## Speaker reference

The [CSTR VCTK Corpus](https://datashare.ed.ac.uk/handle/10283/3443) includes speech data uttered by native speakers of English with various accents.

When using this model, set a `speaker` id (the `sids` input when running the ONNX model directly) from the reference table below. The `ref` column corresponds to the id in the VCTK dataset.

| SPEAKER | REF | AGE | GENDER | ACCENTS | REGION |
|----------:|-----:|------:|:---------|:---------------|:-----------------|
| 1 | 225 | 23 | F | English | Southern England |
| 2 | 226 | 22 | M | English | Surrey |
| 3 | 227 | 38 | M | English | Cumbria |
| 4 | 228 | 22 | F | English | Southern England |
| 5 | 229 | 23 | F | English | Southern England |
| 6 | 230 | 22 | F | English | Stockton-on-tees |
| 7 | 231 | 23 | F | English | Southern England |
| 8 | 232 | 23 | M | English | Southern England |
| 9 | 233 | 23 | F | English | Staffordshire |
| 10 | 234 | 22 | F | Scottish | West Dumfries |
| 11 | 236 | 23 | F | English | Manchester |
| 12 | 237 | 22 | M | Scottish | Fife |
| 13 | 238 | 22 | F | Northern Irish | Belfast |
| 14 | 239 | 22 | F | English | SW England |
| 15 | 240 | 21 | F | English | Southern England |
| 16 | 241 | 21 | M | Scottish | Perth |
| 17 | 243 | 22 | M | English | London |
| 18 | 244 | 22 | F | English | Manchester |
| 19 | 245 | 25 | M | Irish | Dublin |
| 20 | 246 | 22 | M | Scottish | Selkirk |
| 21 | 247 | 22 | M | Scottish | Argyll |
| 22 | 248 | 23 | F | Indian | |
| 23 | 249 | 22 | F | Scottish | Aberdeen |
| 24 | 250 | 22 | F | English | SE England |
| 25 | 251 | 26 | M | Indian | |
| 26 | 252 | 22 | M | Scottish | Edinburgh |
| 27 | 253 | 22 | F | Welsh | Cardiff |
| 28 | 254 | 21 | M | English | Surrey |
| 29 | 255 | 19 | M | Scottish | Galloway |
| 30 | 256 | 24 | M | English | Birmingham |
| 31 | 257 | 24 | F | English | Southern England |
| 32 | 258 | 22 | M | English | Southern England |
| 33 | 259 | 23 | M | English | Nottingham |
| 34 | 260 | 21 | M | Scottish | Orkney |
| 35 | 261 | 26 | F | Northern Irish | Belfast |
| 36 | 262 | 23 | F | Scottish | Edinburgh |
| 37 | 263 | 22 | M | Scottish | Aberdeen |
| 38 | 264 | 23 | F | Scottish | West Lothian |
| 39 | 265 | 23 | F | Scottish | Ross |
| 40 | 266 | 22 | F | Irish | Athlone |
| 41 | 267 | 23 | F | English | Yorkshire |
| 42 | 268 | 23 | F | English | Southern England |
| 43 | 269 | 20 | F | English | Newcastle |
| 44 | 270 | 21 | M | English | Yorkshire |
| 45 | 271 | 19 | M | Scottish | Fife |
| 46 | 272 | 23 | M | Scottish | Edinburgh |
| 47 | 273 | 23 | M | English | Suffolk |
| 48 | 274 | 22 | M | English | Essex |
| 49 | 275 | 23 | M | Scottish | Midlothian |
| 50 | 276 | 24 | F | English | Oxford |
| 51 | 277 | 23 | F | English | NE England |
| 52 | 278 | 22 | M | English | Cheshire |
| 53 | 279 | 23 | M | English | Leicester |
| 54 | 280 | | | Unknown | |
| 55 | 281 | 29 | M | Scottish | Edinburgh |
| 56 | 282 | 23 | F | English | Newcastle |
| 57 | 283 | 24 | F | Irish | Cork |
| 58 | 284 | 20 | M | Scottish | Fife |
| 59 | 285 | 21 | M | Scottish | Edinburgh |
| 60 | 286 | 23 | M | English | Newcastle |
| 61 | 287 | 23 | M | English | York |
| 62 | 288 | 22 | F | Irish | Dublin |
| 63 | 292 | 23 | M | Northern Irish | Belfast |
| 64 | 293 | 22 | F | Northern Irish | Belfast |
| 65 | 294 | 33 | F | American | San Francisco |
| 66 | 295 | 23 | F | Irish | Dublin |
| 67 | 297 | 20 | F | American | New York |
| 68 | 298 | 19 | M | Irish | Tipperary |
| 69 | 299 | 25 | F | American | California |
| 70 | 300 | 23 | F | American | California |
| 71 | 301 | 23 | F | American | North Carolina |
| 72 | 302 | 20 | M | Canadian | Montreal |
| 73 | 303 | 24 | F | Canadian | Toronto |
| 74 | 304 | 22 | M | Northern Irish | Belfast |
| 75 | 305 | 19 | F | American | Philadelphia |
| 76 | 306 | 21 | F | American | New York |
| 77 | 307 | 23 | F | Canadian | Ontario |
| 78 | 308 | 18 | F | American | Alabama |
| 79 | 310 | 21 | F | American | Tennessee |
| 80 | 311 | 21 | M | American | Iowa |
| 81 | 312 | 19 | F | Canadian | Hamilton |
| 82 | 313 | 24 | F | Irish | County Down |
| 83 | 314 | 26 | F | South African | Cape Town |
| 84 | 316 | 20 | M | Canadian | Alberta |
| 85 | 317 | 23 | F | Canadian | Hamilton |
| 86 | 318 | 32 | F | American | Napa |
| 87 | 323 | 19 | F | South African | Pretoria |
| 88 | 326 | 26 | M | Australian | Sydney |
| 89 | 329 | 23 | F | American | |
| 90 | 330 | 26 | F | American | |
| 91 | 333 | 19 | F | American | Indiana |
| 92 | 334 | 18 | M | American | Chicago |
| 93 | 335 | 25 | F | New Zealand | English |
| 94 | 336 | 18 | F | South African | Johannesburg |
| 95 | 339 | 21 | F | American | Pennsylvania |
| 96 | 340 | 18 | F | Irish | Dublin |
| 97 | 341 | 26 | F | American | Ohio |
| 98 | 343 | 27 | F | Canadian | Alberta |
| 99 | 345 | 22 | M | American | Florida |
| 100 | 347 | 26 | M | South African | Johannesburg |
| 101 | 351 | 21 | F | Northern Irish | Derry |
| 102 | 360 | 19 | M | American | New Jersey |
| 103 | 361 | 19 | F | American | New Jersey |
| 104 | 362 | 29 | F | American | |
| 105 | 363 | 22 | M | Canadian | Toronto |
| 106 | 364 | 23 | M | Irish | Donegal |
| 107 | 374 | 28 | M | Australian | English |
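
Since the model's speaker ids (1 to 107) differ from the VCTK `ref` ids, a small lookup can help when choosing a voice programmatically. The following is a minimal sketch, not part of txtai or ttstokenizer: `Speaker`, `SPEAKERS`, `speaker_for_ref` and `speakers_by_accent` are hypothetical names, and the list holds only five rows copied from the table above.

```python
from typing import List, NamedTuple, Optional


class Speaker(NamedTuple):
    """One row of the speaker reference table (hypothetical helper type)."""
    speaker: int  # id passed to the model
    ref: int      # id in the VCTK dataset
    gender: str
    accent: str


# Excerpt of the reference table; add the remaining rows as needed
SPEAKERS = [
    Speaker(1, 225, "F", "English"),
    Speaker(15, 240, "F", "English"),
    Speaker(19, 245, "M", "Irish"),
    Speaker(65, 294, "F", "American"),
    Speaker(88, 326, "M", "Australian"),
]


def speaker_for_ref(ref: int) -> Optional[int]:
    """Return the model speaker id for a VCTK ref id, or None if not listed."""
    for row in SPEAKERS:
        if row.ref == ref:
            return row.speaker
    return None


def speakers_by_accent(accent: str, gender: Optional[str] = None) -> List[int]:
    """Return model speaker ids matching an accent and, optionally, a gender."""
    return [
        row.speaker
        for row in SPEAKERS
        if row.accent == accent and (gender is None or row.gender == gender)
    ]
```

The returned id can then be passed as `speaker` to the txtai pipeline or wrapped in a NumPy array for the `sids` input, as in the examples above.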