ccmusic-database/GZ_IsoTech

admin committed 18 days ago

Commit 5afd5b6 (1 parent: 59b6383)

upd md

Files changed (1)

README.md +23 -25

@@ -2,64 +2,62 @@
 license: mit
 ---
-# Intro 简介
 The Guzheng Performance Technique Recognition Model is trained on the GZ_IsoTech Dataset, which consists of 2,824 audio clips that showcase various Guzheng playing techniques. Of these, 2,328 clips are from a virtual sound library, and 496 clips are performed by a highly skilled professional Guzheng artist, covering the full tonal range inherent to the Guzheng instrument. The audio clips are categorized into eight different playing techniques based on the unique performance practices of the Guzheng: Vibrato (chanyin), Slide-up (shanghuayin), Slide-down (xiahuayin), Return Slide (huihuayin), Glissando (guazou, huazhi, etc.), Thumb Plucking (yaozhi), Harmonics (fanyin), and Plucking Techniques (gou, da, mo, tuo, etc.). The model utilizes feature extraction, time-domain and frequency-domain analysis, and pattern recognition to accurately identify these distinct Guzheng playing techniques. The application of this model provides strong support for the automatic recognition, digital analysis, and educational research of Guzheng performance techniques, promoting the preservation and innovation of Guzheng art.
-古筝演奏技法识别模型是基于古筝演奏技法数据集训练的，该数据集包含2,824个音频片段，展示了各种古筝演奏技巧的特征。数据集中的2,328个音频片段来自虚拟声音库，496个片段由一位技艺高超的专业古筝艺术家演奏，涵盖了古筝乐器固有的全面音调范围。这些音频片段根据古筝特有的演奏技巧被划分为八个类别：颤音（chanyin）、上滑音（shanghuayin）、下滑音（xiahuayin）、回滑音（huihuayin）、刮奏（guazou, huazhi等）、摇指（yaozhi）、泛音（fanyin）以及拨弦技巧（gou, da, mo, tuo等）。该模型通过对这些音频片段进行特征提取、时域与频域分析、以及模式识别，能够准确识别出不同古筝演奏技巧。该模型的应用能够为古筝演奏技巧的自动识别、数字化分析与教学研究提供有力支持，推动古筝艺术的传承与创新。
-## Demo 在线演示
 <https://huggingface.co/spaces/ccmusic-database/GZ_IsoTech>
-## Usage 使用
 ```python
 from modelscope import snapshot_download
 model_dir = snapshot_download("ccmusic-database/GZ_IsoTech")
 ```
-## Maintenance 维护
 ```bash
 git clone git@hf.co:ccmusic-database/GZ_IsoTech
 cd GZ_IsoTech
 ```
-## Results 训练结果
-|      Backbone      | Size(M) |                 Mel                  |     CQT     |   Chroma    |
-| :----------------: | :-----: | :----------------------------------: | :---------: | :---------: |
-|      vit_l_16      |  304.3  | [**_0.855_**](#best-result-最佳结果) | **_0.824_** | **_0.770_** |
-|      maxvit_t      |  30.9   |                0.763                 |    0.776    |    0.642    |
-|                    |         |                                      |             |             |
-|  resnext101_64x4d  |  83.5   |                0.713                 |    0.765    |    0.639    |
-|     resnet101      |  44.5   |                0.731                 |    0.798    | **_0.719_** |
-|    regnet_y_8gf    |  39.4   |                0.804                 | **_0.807_** |    0.716    |
-| shufflenet_v2_x2_0 |   7.4   |                0.702                 |    0.799    |    0.665    |
-| mobilenet_v3_large |   5.5   |             **_0.806_**              |    0.798    |    0.657    |
-### Best result 最佳结果
 <table>
     <tr>
         <th>Loss curve</th>
-        <td><img src="https://www.modelscope.cn/api/v1/models/ccmusic-database/GZ_IsoTech/repo?Revision=master&FilePath=.%2Fvit_l_16_mel_2024-12-06_08-28-13%2Floss.jpg&View=true"></td>
     </tr>
     <tr>
         <th>Training and validation accuracy</th>
-        <td><img src="https://www.modelscope.cn/api/v1/models/ccmusic-database/GZ_IsoTech/repo?Revision=master&FilePath=.%2Fvit_l_16_mel_2024-12-06_08-28-13%2Facc.jpg&View=true"></td>
     </tr>
     <tr>
         <th>Confusion matrix</th>
-        <td><img src="https://www.modelscope.cn/api/v1/models/ccmusic-database/GZ_IsoTech/repo?Revision=master&FilePath=.%2Fvit_l_16_mel_2024-12-06_08-28-13%2Fmat.jpg&View=true"></td>
     </tr>
 </table>
-## Dataset 数据集
 <https://huggingface.co/datasets/ccmusic-database/GZ_IsoTech>
-## Mirror 镜像
 <https://www.modelscope.cn/models/ccmusic-database/GZ_IsoTech>
-## Evaluation 校验
 <https://github.com/monetjoe/ccmusic_eval>
-## Cite 引用
 ```bibtex
 @dataset{zhaorui_liu_2021_5676893,
   author       = {Monan Zhou, Shenyang Xu, Zhaorui Liu, Zhaowen Wang, Feng Yu, Wei Li and Baoqiang Han},

 license: mit
 ---
+# Intro
 The Guzheng Performance Technique Recognition Model is trained on the GZ_IsoTech Dataset, which consists of 2,824 audio clips that showcase various Guzheng playing techniques. Of these, 2,328 clips are from a virtual sound library, and 496 clips are performed by a highly skilled professional Guzheng artist, covering the full tonal range inherent to the Guzheng instrument. The audio clips are categorized into eight different playing techniques based on the unique performance practices of the Guzheng: Vibrato (chanyin), Slide-up (shanghuayin), Slide-down (xiahuayin), Return Slide (huihuayin), Glissando (guazou, huazhi, etc.), Thumb Plucking (yaozhi), Harmonics (fanyin), and Plucking Techniques (gou, da, mo, tuo, etc.). The model utilizes feature extraction, time-domain and frequency-domain analysis, and pattern recognition to accurately identify these distinct Guzheng playing techniques. The application of this model provides strong support for the automatic recognition, digital analysis, and educational research of Guzheng performance techniques, promoting the preservation and innovation of Guzheng art.
+## Demo
 <https://huggingface.co/spaces/ccmusic-database/GZ_IsoTech>
+## Usage
 ```python
 from modelscope import snapshot_download
 model_dir = snapshot_download("ccmusic-database/GZ_IsoTech")
 ```
+## Maintenance
 ```bash
 git clone git@hf.co:ccmusic-database/GZ_IsoTech
 cd GZ_IsoTech
 ```
+## Results
+|      Backbone      | Size(M) |             Mel             |     CQT     |   Chroma    |
+| :----------------: | :-----: | :-------------------------: | :---------: | :---------: |
+|      vit_l_16      |  304.3  | [**_0.855_**](#best-result) | **_0.824_** | **_0.770_** |
+|      maxvit_t      |  30.9   |            0.763            |    0.776    |    0.642    |
+|                    |         |                             |             |             |
+|  resnext101_64x4d  |  83.5   |            0.713            |    0.765    |    0.639    |
+|     resnet101      |  44.5   |            0.731            |    0.798    | **_0.719_** |
+|    regnet_y_8gf    |  39.4   |            0.804            | **_0.807_** |    0.716    |
+| shufflenet_v2_x2_0 |   7.4   |            0.702            |    0.799    |    0.665    |
+| mobilenet_v3_large |   5.5   |         **_0.806_**         |    0.798    |    0.657    |
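
The bold entries in the table above appear to mark the top score in each feature column, taken separately for the rows above and below the blank separator row. The sketch below checks that reading with the values copied from the table. The block names `upper` and `lower` and both helper functions are hypothetical, and the README does not state which metric the scores represent.

```python
# Scores copied from the Results table. "upper" and "lower" are
# hypothetical labels for the rows above and below the blank separator row.
RESULTS = {
    "upper": {
        "vit_l_16": {"Mel": 0.855, "CQT": 0.824, "Chroma": 0.770},
        "maxvit_t": {"Mel": 0.763, "CQT": 0.776, "Chroma": 0.642},
    },
    "lower": {
        "resnext101_64x4d": {"Mel": 0.713, "CQT": 0.765, "Chroma": 0.639},
        "resnet101": {"Mel": 0.731, "CQT": 0.798, "Chroma": 0.719},
        "regnet_y_8gf": {"Mel": 0.804, "CQT": 0.807, "Chroma": 0.716},
        "shufflenet_v2_x2_0": {"Mel": 0.702, "CQT": 0.799, "Chroma": 0.665},
        "mobilenet_v3_large": {"Mel": 0.806, "CQT": 0.798, "Chroma": 0.657},
    },
}


def best_per_feature(block):
    """Return {feature: (backbone, score)} holding the top score per column."""
    features = list(next(iter(block.values())))
    return {
        feat: max(
            ((name, scores[feat]) for name, scores in block.items()),
            key=lambda pair: pair[1],
        )
        for feat in features
    }


def overall_best(results):
    """Return (backbone, feature, score) for the single highest score."""
    candidates = [
        (name, feat, score)
        for block in results.values()
        for name, scores in block.items()
        for feat, score in scores.items()
    ]
    return max(candidates, key=lambda item: item[2])


if __name__ == "__main__":
    for label, block in RESULTS.items():
        print(label, best_per_feature(block))
    print("overall:", overall_best(RESULTS))
```

Under this reading, the overall maximum is the `vit_l_16` Mel entry, which is the cell that links to the best-result section.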
+### Best result
 <table>
     <tr>
         <th>Loss curve</th>
+        <td><img src="https://www.modelscope.cn/models/ccmusic-database/GZ_IsoTech/resolve/master/vit_l_16_mel_2024-12-06_08-28-13/loss.jpg"></td>
     </tr>
     <tr>
         <th>Training and validation accuracy</th>
+        <td><img src="https://www.modelscope.cn/models/ccmusic-database/GZ_IsoTech/resolve/master/vit_l_16_mel_2024-12-06_08-28-13/acc.jpg"></td>
     </tr>
     <tr>
         <th>Confusion matrix</th>
+        <td><img src="https://www.modelscope.cn/models/ccmusic-database/GZ_IsoTech/resolve/master/vit_l_16_mel_2024-12-06_08-28-13/mat.jpg"></td>
     </tr>
 </table>
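
The three image URLs above point into a run directory named `vit_l_16_mel_2024-12-06_08-28-13`, which looks like a backbone name, a feature name, and a timestamp joined by underscores. The following is a minimal sketch for splitting such a name, assuming other run directories in the repository follow the same pattern. That pattern is inferred from this single example, and `parse_run_dir` is a hypothetical helper, not part of the repository.

```python
import re
from datetime import datetime

# Assumption: run directories are named
#   <backbone>_<feature>_<YYYY-MM-DD_HH-MM-SS>
# with the feature being one of the three column names in the Results table.
RUN_DIR_PATTERN = re.compile(
    r"^(?P<backbone>.+)_(?P<feature>mel|cqt|chroma)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})$"
)


def parse_run_dir(name):
    """Split a run directory name into its parts, or return None if it does not match."""
    match = RUN_DIR_PATTERN.match(name)
    if match is None:
        return None
    return {
        "backbone": match.group("backbone"),
        "feature": match.group("feature"),
        "timestamp": datetime.strptime(match.group("stamp"), "%Y-%m-%d_%H-%M-%S"),
    }


if __name__ == "__main__":
    print(parse_run_dir("vit_l_16_mel_2024-12-06_08-28-13"))
```

If the naming assumption holds, the same helper could be applied to the subdirectory names under the `model_dir` returned by `snapshot_download` in the usage snippet.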
+## Dataset
 <https://huggingface.co/datasets/ccmusic-database/GZ_IsoTech>
+## Mirror
 <https://www.modelscope.cn/models/ccmusic-database/GZ_IsoTech>
+## Evaluation
 <https://github.com/monetjoe/ccmusic_eval>
+## Cite
 ```bibtex
 @dataset{zhaorui_liu_2021_5676893,
   author       = {Monan Zhou, Shenyang Xu, Zhaorui Liu, Zhaowen Wang, Feng Yu, Wei Li and Baoqiang Han},