admin committed on
Commit eeb05df · 1 Parent(s): 1e7bc0f

Update README.md

Files changed (1)
  1. README.md +24 -5
README.md CHANGED
@@ -14,6 +14,9 @@ tags:

  The design of the chest-falsetto voice discrimination model aims to effectively differentiate between real and synthetic voices in audio samples, with four specific categories including male chest, male falsetto, female chest, and female falsetto voices. The model's training is based on a backbone network from the computer vision (CV) domain, which involves transforming audio data into spectrograms and fine-tuning to enhance the network's accuracy in recognizing different voice categories. During training, a dataset containing both real and synthetic voice samples is utilized to ensure the model adequately learns and captures features relevant to male and female chest and falsetto voices. Through this approach, the model can finely classify different genders and chest/falsetto voices, providing a reliable solution for accurate voice discrimination in audio. This model holds broad potential applications in fields such as speech processing and music production, offering an efficient and precise tool for audio analysis and processing. Its training and fine-tuning strategies based on computer vision principles highlight the model's adaptability and robustness across different domains, providing beneficial examples for further research and application.

+ ## Demo
+ <https://huggingface.co/spaces/ccmusic-database/chest-falsetto>
+
  ## Usage
  ```python
  from modelscope import snapshot_download
@@ -40,20 +43,36 @@ A demo result of SqueezeNet fine-tuning:
  <table id="pianos">
  <tr>
  <th>Loss curve</th>
- <td><img src="./loss.jpg"></td>
+ <td><img src="https://www.modelscope.cn/api/v1/models/ccmusic-database/chest_falsetto/repo?Revision=master&FilePath=.%2Fsqueezenet1_1_cqt%2Floss.jpg&View=true"></td>
  </tr>
  <tr>
  <th>Training and validation accuracy</th>
- <td><img src="./acc.jpg"></td>
+ <td><img src="https://www.modelscope.cn/api/v1/models/ccmusic-database/chest_falsetto/repo?Revision=master&FilePath=.%2Fsqueezenet1_1_cqt%2Facc.jpg&View=true"></td>
  </tr>
  <tr>
  <th>Confusion matrix</th>
- <td><img src="./mat.jpg"></td>
+ <td><img src="https://www.modelscope.cn/api/v1/models/ccmusic-database/chest_falsetto/repo?Revision=master&FilePath=.%2Fsqueezenet1_1_cqt%2Fmat.jpg&View=true"></td>
  </tr>
  </table>

+ ## Dataset
+ <https://huggingface.co/datasets/ccmusic-database/chest_falsetto>
+
  ## Mirror
  <https://www.modelscope.cn/models/ccmusic-database/chest_falsetto>

- ## Reference
- [1] <https://github.com/monetjoe/ccmusic_eval>
+ ## Evaluation
+ <https://github.com/monetjoe/ccmusic_eval>
+
+ ## Cite
+ ```bibtex
+ @dataset{zhaorui_liu_2021_5676893,
+ author = {Monan Zhou, Shenyang Xu, Zhaorui Liu, Zhaowen Wang, Feng Yu, Wei Li and Baoqiang Han},
+ title = {CCMusic: an Open and Diverse Database for Chinese and General Music Information Retrieval Research},
+ month = {mar},
+ year = {2024},
+ publisher = {HuggingFace},
+ version = {1.2},
+ url = {https://huggingface.co/ccmusic-database}
+ }
+ ```
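
The README paragraph in this diff describes the pipeline only in words: audio is turned into a spectrogram, and a CV backbone is fine-tuned to predict one of four voice categories. As a rough illustration of the first step, here is a minimal sketch that assumes a plain short-time Fourier transform. The image URLs in this diff point to a `squeezenet1_1_cqt` folder, which suggests the real model uses a CQT front end instead, so treat this as a stand-in. The function name `stft_magnitude`, the frame and hop sizes, and the label strings in `CLASSES` are hypothetical and not taken from the repository.

```python
import cmath
import math

# Hypothetical label names for the four categories listed in the README;
# a fine-tuned backbone would output one score per entry.
CLASSES = ["male_chest", "male_falsetto", "female_chest", "female_falsetto"]


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return a magnitude spectrogram as a list of frames.

    Each frame holds frame_len // 2 + 1 magnitudes (DC up to Nyquist).
    A naive O(n^2) DFT keeps the sketch dependency-free; real code would
    use an FFT or CQT routine from an audio library.
    """
    # A Hann window reduces spectral leakage at the frame edges.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
        for n in range(frame_len)
    ]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


if __name__ == "__main__":
    sample_rate = 8000
    # A short 1 kHz test tone standing in for a recorded voice.
    tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(1024)]
    spec = stft_magnitude(tone)
    # Index of the strongest frequency bin in the first frame.
    peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
    print(len(spec), len(spec[0]), peak_bin)
```

The resulting 2-D array (time frames by frequency bins) is what would be rendered as an image and passed to the backbone. With an 8 kHz sample rate and 64-sample frames, each bin spans 125 Hz, so the strongest bin for the test tone corresponds to 1 kHz.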