Audio Classification
Commit ca7b1fc by FBAGSTM · verified · 1 Parent(s): 40c05cc

Update README.md
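
All 8 changed lines in this commit make the same edit: each model link gains a `tree/main/` segment directly after the repository root. A minimal Python sketch of that rewrite follows. The function name `add_branch_segment` and the two constants are hypothetical and are not part of the repository; the sketch also assumes every link points at the `main` branch, as the added lines do.

```python
# Hypothetical helper illustrating the link rewrite made by this commit.
REPO_ROOT = "https://github.com/STMicroelectronics/stm32ai-modelzoo/"
BRANCH_SEGMENT = "tree/main/"  # assumption: all links target the main branch


def add_branch_segment(url: str) -> str:
    """Insert 'tree/main/' after the repository root if it is missing."""
    if not url.startswith(REPO_ROOT):
        return url  # not a link into this repository, leave untouched
    rest = url[len(REPO_ROOT):]
    if rest.startswith(("tree/", "blob/")):
        return url  # already carries a branch segment
    return REPO_ROOT + BRANCH_SEGMENT + rest


if __name__ == "__main__":
    old = (
        REPO_ROOT
        + "audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/"
        + "esc10/miniresnet_1stacks_64x50_tl/miniresnet_1stacks_64x50_tl_int8.tflite"
    )
    print(add_branch_segment(old))
```

The check for an existing `tree/` or `blob/` prefix makes the rewrite safe to run twice over the same README.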

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -65,8 +65,8 @@ It outputs embedding vectors of size 2048 for the 2 stacks version, and 3548 for
 
  | Model | Format | Resolution | Series | Activation RAM (KiB) | Runtime RAM (KiB)| Weights Flash (KiB) | Code Flash (KiB) | Total RAM (KiB) | Total Flash (KiB)| STM32Cube.AI version |
  |-------------------|--------|------------|---------|----------------|-------------|---------------|------------|-------------|-------------|-----------------------|
- | [MiniResNet 1stack ](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_1stacks_64x50_tl/miniresnet_1stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 59.89 | 5.38 | 123.6 | 56.9 | 65.27 | 180.5 | 10.0.0 |
- | [MiniResNet 2stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_2stacks_64x50_tl/miniresnet_2stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 59.89 | 8.37 | 431.1 | 63.69 | 68.26 | 494.9 | 10.0.0 |
+ | [MiniResNet 1stack ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_1stacks_64x50_tl/miniresnet_1stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 59.89 | 5.38 | 123.6 | 56.9 | 65.27 | 180.5 | 10.0.0 |
+ | [MiniResNet 2stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_2stacks_64x50_tl/miniresnet_2stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 59.89 | 8.37 | 431.1 | 63.69 | 68.26 | 494.9 | 10.0.0 |
 
 
  ### Reference inference time based on ESC-10 dataset
@@ -74,8 +74,8 @@ It outputs embedding vectors of size 2048 for the 2 stacks version, and 3548 for
 
  | Model | Format | Resolution | Board | Execution Engine | Frequency | Inference time (ms) | STM32Cube.AI version |
  |-------------------|--------|------------|------------------|------------------|-------------|-----------------|-----------------------|
- | [MiniResNet 1stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_1stacks_64x50_tl/miniresnet_1stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 1 CPU | 160 MHz | 92.25 | 10.0.0 |
- | [MiniResNet 2stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_2stacks_64x50_tl/miniresnet_2stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 1 CPU | 160 MHz | 142.69 | 10.0.0 |
+ | [MiniResNet 1stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_1stacks_64x50_tl/miniresnet_1stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 1 CPU | 160 MHz | 92.25 | 10.0.0 |
+ | [MiniResNet 2stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_2stacks_64x50_tl/miniresnet_2stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 1 CPU | 160 MHz | 142.69 | 10.0.0 |
 
 
  ### Accuracy with ESC-10 dataset
@@ -86,10 +86,10 @@ The reason this metric is used instead of patch-level accuracy is because patch-
 
  | Model | Format | Resolution | Clip-level Accuracy |
  |-------|--------|------------|----------------|
- | [MiniResNet 1stack ](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_1stacks_64x50_tl/miniresnet_1stacks_64x50_tl.h5) | float32 | 64x50x1 | 89.9% |
- | [MiniResNet 1stack ](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_1stacks_64x50_tl/miniresnet_1stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | 88.9% |
- | [MiniResNet 2stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_2stacks_64x50_tl/miniresnet_2stacks_64x50_tl.h5) | float32 | 64x50x1 | 92.4% |
- | [MiniResNet 2stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_2stacks_64x50_tl/miniresnet_2stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | 93.6% |
+ | [MiniResNet 1stack ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_1stacks_64x50_tl/miniresnet_1stacks_64x50_tl.h5) | float32 | 64x50x1 | 89.9% |
+ | [MiniResNet 1stack ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_1stacks_64x50_tl/miniresnet_1stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | 88.9% |
+ | [MiniResNet 2stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_2stacks_64x50_tl/miniresnet_2stacks_64x50_tl.h5) | float32 | 64x50x1 | 92.4% |
+ | [MiniResNet 2stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_2stacks_64x50_tl/miniresnet_2stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | 93.6% |
 
  ## Retraining and Integration in a simple example:
 