Audio Classification
speechbrain
PyTorch
English
Emotion
Diarization
wavlm
yingzhi committed on
Commit
845a974
1 Parent(s): a2ce4ef

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -34,12 +34,12 @@ For a better experience, we encourage you to learn more about [SpeechBrain](http
 
 | Release | EDER(%) |
 |:-------------:|:--------------:|
-| 19-10-21 | 29.7 (Avg: 30.2) |
+| 05-07-23 | 29.7 (Avg: 30.2) |
 
 
 ## Pipeline description
 
-This system is composed of an wavlm model. It is a combination of convolutional and residual blocks. The task aimes to predict the correct emotion composants and their boundaries within an utterance. For now, the model was trained with audios that contain only 1 non-neutral emotion event.
+This system is composed of a wavlm encoder and a downstream frame-wise classifier. The task aims to predict the correct emotion components and their boundaries within an utterance. For now, the model was trained with audio files that contain only one non-neutral emotion event.
 
 The system is trained with recordings sampled at 16kHz (single channel).
 The code will automatically normalize your audio (i.e., resampling + mono channel selection) when calling *diarize_file* if needed.
@@ -70,14 +70,14 @@ classifier = foreign_class(
 diary = classifier.diarize_file("speechbrain/emotion-diarization-wavlm-large/example.wav")
 print(diary)
 ```
-The output will contain a dictionary of emotion composants and their boundaries.
+The output will contain a dictionary of emotion components and their boundaries.
 
 ### Inference on GPU
 To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
 
 ### Training
 The model was trained with SpeechBrain (aa018540).
-To train it from scratch follows these steps:
+To train it from scratch, follow these steps:
 1. Clone SpeechBrain:
 ```bash
 git clone https://github.com/speechbrain/speechbrain/
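The updated README says `diarize_file` returns a dictionary of emotion components and their boundaries. As a minimal sketch of post-processing such a result, the helper below extracts the non-neutral segments; the exact segment keys (`start`, `end`, `emotion`) and the one-letter emotion codes are assumptions for illustration, not confirmed by this diff:

```python
# Sketch of consuming the diary returned by classifier.diarize_file().
# ASSUMPTION: the diary maps each input path to a list of segments with
# "start"/"end" boundaries in seconds and a one-letter "emotion" code
# (e.g. "n" for neutral); this shape is illustrative, not confirmed here.

def non_neutral_spans(diary):
    """Return (emotion, start, end) tuples for every non-neutral segment."""
    spans = []
    for path, segments in diary.items():
        for seg in segments:
            if seg["emotion"] != "n":  # keep only non-neutral components
                spans.append((seg["emotion"], seg["start"], seg["end"]))
    return spans

# Hypothetical diary shaped like the README's described output
# (values are made up for the example):
diary = {
    "example.wav": [
        {"start": 0.0, "end": 1.9, "emotion": "n"},
        {"start": 1.9, "end": 4.5, "emotion": "h"},
    ]
}
print(non_neutral_spans(diary))  # [('h', 1.9, 4.5)]
```

Since the model card notes each training utterance contains only one non-neutral event, a diary for a single file would typically yield at most one such span.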