yangwang825 committed
Commit 07f32ca · 1 Parent(s): 11ad04f
Create README.md

README.md ADDED
@@ -0,0 +1,21 @@

# Speaker Identification with ECAPA-TDNN embeddings on VoxCeleb

This repository provides a pretrained ECAPA-TDNN model built with SpeechBrain. The model performs speaker identification and can also be used to extract speaker embeddings. It is trained on the VoxCeleb2 development set only.

# Pipeline description

The system is built around an ECAPA-TDNN encoder, a stack of convolutional and residual blocks. Frame-level features are aggregated into a single speaker embedding with attentive statistical pooling, and the network is trained with the Additive Margin Softmax loss.
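
To give an intuition for the pooling step, here is a minimal, self-contained sketch of attentive statistical pooling in plain PyTorch. It is an illustrative approximation, not SpeechBrain's implementation, and the channel and attention dimensions below are assumptions:

```python
import torch
import torch.nn as nn

class AttentiveStatisticsPooling(nn.Module):
    """Toy attentive statistics pooling: weight each frame, then pool mean and std."""

    def __init__(self, channels: int, attention_dim: int = 128):
        super().__init__()
        # Small attention network that scores every time frame
        self.attention = nn.Sequential(
            nn.Conv1d(channels, attention_dim, kernel_size=1),
            nn.Tanh(),
            nn.Conv1d(attention_dim, channels, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time) frame-level features from the encoder
        alpha = torch.softmax(self.attention(x), dim=2)      # attention weights over time
        mean = torch.sum(alpha * x, dim=2)                   # weighted mean
        var = torch.sum(alpha * x ** 2, dim=2) - mean ** 2   # weighted variance
        std = torch.sqrt(var.clamp(min=1e-9))                # weighted standard deviation
        return torch.cat([mean, std], dim=1)                 # (batch, 2 * channels)

# Example: pool 2 utterances of 200 frames with 1536-dimensional frame features (assumed sizes)
pooling = AttentiveStatisticsPooling(channels=1536)
pooled = pooling(torch.randn(2, 1536, 200))  # shape: (2, 3072)
```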

# Compute the speaker embeddings

The system is trained on single-channel recordings sampled at 16 kHz, so make sure your input audio matches this format.

```python
import torchaudio
from speechbrain.pretrained import EncoderClassifier

# Load the pretrained ECAPA-TDNN encoder from the Hugging Face Hub
classifier = EncoderClassifier.from_hparams(source="yangwang825/ecapa-tdnn-vox2")

# Load a 16 kHz, single-channel waveform and compute its speaker embedding
signal, fs = torchaudio.load('spk1_snt1.wav')
embeddings = classifier.encode_batch(signal)
```
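
If your recordings are not already 16 kHz mono, one possible way to convert them on the fly with torchaudio is sketched below (the file name is just an example, not part of this repository):

```python
import torchaudio
import torchaudio.functional as F
from speechbrain.pretrained import EncoderClassifier

classifier = EncoderClassifier.from_hparams(source="yangwang825/ecapa-tdnn-vox2")

# Load an arbitrary recording (example file name)
signal, fs = torchaudio.load('my_recording.wav')

# Downmix to mono and resample to the 16 kHz rate the model expects
signal = signal.mean(dim=0, keepdim=True)
signal = F.resample(signal, orig_freq=fs, new_freq=16000)

embeddings = classifier.encode_batch(signal)
```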

You can find our training results (models, logs, etc.) here.