Space: unilight/sheet-demo
unilight committed on Oct 3, 2024

Commit d2ddbfd (1 parent: 48ce4d1)

change input type

Files changed (1): app.py +8 -3

@@ -128,8 +128,13 @@ def predict(model_name, wav_file):
 with gr.Blocks(title="S3PRL-VC: Any-to-one voice conversion demo on VCC2020") as demo:
     gr.Markdown(
         """
-        # S3PRL-VC: Any-to-one voice conversion demo on VCC2020
-        ### [[Paper (ICASSP2023)]](https://arxiv.org/abs/2110.06280) [[Paper(JSTSP)]](https://arxiv.org/abs/2207.04356) [[Code]](https://github.com/unilight/s3prl-vc)
         **S3PRL-VC** is a voice conversion (VC) toolkit for benchmarking self-supervised speech representations (S3Rs). The term **any-to-one** means that the system can convert from any unseen speaker to a pre-defined speaker given in training.
         In this demo, you can record your voice, and the model will convert your voice to one of the four pre-defined speakers. These four speakers come from the **voice conversion challenge (VCC) 2020**. You can listen to the samples to get a sense of what these speakers sound like.
         The **RTF** of the system is around **1.5~2.5**, i.e. if you recorded a 5 second long audio, it will take 5 * (1.5~2.5) = 7.5~12.5 seconds to generate the output.
@@ -139,7 +144,7 @@ with gr.Blocks(title="S3PRL-VC: Any-to-one voice conversion demo on VCC2020") as
     with gr.Row():
         with gr.Column():
             gr.Markdown("## Record your speech here!")
-            input_wav = gr.Audio(label="Input speech", sources='microphone', type='filepath')
             gr.Markdown("## Select a model!")
             model_name = gr.Radio(label="Model", choices=list(model_paths.keys()))

 with gr.Blocks(title="S3PRL-VC: Any-to-one voice conversion demo on VCC2020") as demo:
     gr.Markdown(
         """
+        # Demo for SHEET: Speech Human Evaluation Estimation Toolkit
+        ### [Paper (To be uploaded)] [[Code]](https://github.com/unilight/sheet)
+        **SHEET** is a subjective speech quality assessment (SSQA) toolkit designed to conduct SSQA research. It was specifically designed to interact with MOS-Bench, a collection of datasets to benchmark SSQA models.
+        In this demo, we provide interactive models
         **S3PRL-VC** is a voice conversion (VC) toolkit for benchmarking self-supervised speech representations (S3Rs). The term **any-to-one** means that the system can convert from any unseen speaker to a pre-defined speaker given in training.
         In this demo, you can record your voice, and the model will convert your voice to one of the four pre-defined speakers. These four speakers come from the **voice conversion challenge (VCC) 2020**. You can listen to the samples to get a sense of what these speakers sound like.
         The **RTF** of the system is around **1.5~2.5**, i.e. if you recorded a 5 second long audio, it will take 5 * (1.5~2.5) = 7.5~12.5 seconds to generate the output.
     with gr.Row():
         with gr.Column():
             gr.Markdown("## Record your speech here!")
+            input_wav = gr.Audio(label="Input speech", type='filepath')
             gr.Markdown("## Select a model!")
             model_name = gr.Radio(label="Model", choices=list(model_paths.keys()))
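
The RTF note kept in the demo description is a simple multiplication: processing time is the clip length times the real-time factor. A minimal sketch of that arithmetic follows. The helper name `estimate_latency` and its parameter names are hypothetical and are not part of app.py; the 1.5 and 2.5 bounds are the ones quoted in the description.

```python
def estimate_latency(audio_seconds, rtf_low=1.5, rtf_high=2.5):
    """Return the (min, max) expected processing time in seconds.

    Hypothetical helper, not part of app.py. The default RTF bounds
    are the 1.5~2.5 range quoted in the demo description.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    # Processing time = clip duration * real-time factor.
    return audio_seconds * rtf_low, audio_seconds * rtf_high


low, high = estimate_latency(5.0)
print(f"{low:.1f}~{high:.1f} seconds")  # prints "7.5~12.5 seconds"
```

An RTF above 1 means the system runs slower than real time, so the wait always exceeds the length of the recording.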