Voice Activity Detection · Transformers · PyTorch · TensorBoard · Safetensors · pyannet · speaker-diarization · speaker-segmentation · Generated from Trainer · pyannote · pyannote.audio · pyannote-audio-model · audio · voice · speech · speaker · speaker-change-detection · overlapped-speech-detection · resegmentation · Inference Endpoints
kamilakesbi committed
Commit 2c619f0 • Parent(s): 720e5d3

Upload folder using huggingface_hub

Files changed:
- README.md +14 -57
- config.yaml +21 -0
- pytorch_model.bin +3 -0
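The commit message says the repository was pushed with huggingface_hub's folder upload. A minimal sketch of what that push might have looked like; the repo id and local folder path below are assumptions inferred from the model name, not values recorded in the commit:

```python
from huggingface_hub import upload_folder

# Hypothetical repo id and local path; the actual values are not
# recorded in the commit itself.
upload_folder(
    repo_id="diarizers-community/speaker-segmentation-fine-tuned-callhome-jpn",
    folder_path="./speaker-segmentation-fine-tuned-callhome-jpn",
    commit_message="Upload folder using huggingface_hub",
)
```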
README.md
CHANGED
@@ -1,69 +1,26 @@
 ---
 license: mit
-base_model: pyannote/segmentation-3.0
 tags:
 - speaker-diarization
 - speaker-segmentation
 - generated_from_trainer
+- pyannote
+- pyannote.audio
+- pyannote-audio-model
+- audio
+- voice
+- speech
+- speaker
+- speaker-change-detection
+- voice-activity-detection
+- overlapped-speech-detection
+- resegmentation
+base_model: pyannote/segmentation-3.0
 datasets:
 - diarizers-community/callhome
+licence: mit
 model-index:
 - name: speaker-segmentation-fine-tuned-callhome-jpn
   results: []
 ---
-
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
-# speaker-segmentation-fine-tuned-callhome-jpn
-
-This model is a fine-tuned version of [pyannote/segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0) on the diarizers-community/callhome jpn dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.7433
-- Der: 0.2234
-- False Alarm: 0.0478
-- Missed Detection: 0.1328
-- Confusion: 0.0428
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
-
-### Training hyperparameters
-
-The following hyperparameters were used during training:
-- learning_rate: 0.001
-- train_batch_size: 32
-- eval_batch_size: 32
-- seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: cosine
-- num_epochs: 5.0
-
-### Training results
-
-| Training Loss | Epoch | Step | Validation Loss | Der | False Alarm | Missed Detection | Confusion |
-|:-------------:|:-----:|:----:|:---------------:|:------:|:-----------:|:----------------:|:---------:|
-| 0.5771 | 1.0 | 328 | 0.7534 | 0.2321 | 0.0564 | 0.1261 | 0.0496 |
-| 0.5388 | 2.0 | 656 | 0.7503 | 0.2261 | 0.0485 | 0.1347 | 0.0429 |
-| 0.5061 | 3.0 | 984 | 0.7486 | 0.2248 | 0.0475 | 0.1350 | 0.0423 |
-| 0.4883 | 4.0 | 1312 | 0.7374 | 0.2227 | 0.0492 | 0.1315 | 0.0421 |
-| 0.493 | 5.0 | 1640 | 0.7433 | 0.2234 | 0.0478 | 0.1328 | 0.0428 |
-
-### Framework versions
-
-- Transformers 4.40.0
-- Pytorch 2.2.2+cu121
-- Datasets 2.18.0
-- Tokenizers 0.19.1
+This is the model card of a pyannote model that has been pushed on the Hub. This model card has been automatically generated.
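The removed card documents a fine-tuned pyannote/segmentation-3.0 checkpoint. As a hedged sketch, this is how such a segmentation checkpoint is typically loaded and used as the segmentation component of a pyannote.audio diarization pipeline; the repo id is assumed from the model name, and the embedding model and hyperparameter values are illustrative defaults, not taken from this commit:

```python
from pyannote.audio import Model
from pyannote.audio.pipelines import SpeakerDiarization

# Assumed repo id based on the model name in the card;
# may require an HF access token for gated dependencies.
model = Model.from_pretrained(
    "diarizers-community/speaker-segmentation-fine-tuned-callhome-jpn"
)

# Plug the fine-tuned checkpoint in as the segmentation model.
pipeline = SpeakerDiarization(
    segmentation=model,
    embedding="speechbrain/spkrec-ecapa-voxceleb",  # common choice; not specified here
)
pipeline.instantiate({
    "segmentation": {"min_duration_off": 0.0},
    "clustering": {
        "method": "centroid",
        "min_cluster_size": 15,
        "threshold": 0.7,  # illustrative value
    },
})

diarization = pipeline("conversation.wav")  # hypothetical input file
```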
config.yaml
ADDED
@@ -0,0 +1,21 @@
+model:
+  _target_: pyannote.audio.models.segmentation.PyanNet.PyanNet
+  linear:
+    hidden_size: 128
+    num_layers: 2
+  lstm:
+    batch_first: true
+    bidirectional: true
+    dropout: 0.0
+    hidden_size: 128
+    monolithic: true
+    num_layers: 4
+  num_channels: 1
+  sample_rate: 16000
+  sincnet:
+    sample_rate: 16000
+    stride: 10
+task:
+  duration: 10.0
+  max_speakers_per_chunk: 3
+  max_speakers_per_frame: 2
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cf667e302cb3ad72316803868e2cf007d35d506e4ac6daafdd527dfd69f3fa72
+size 5912144
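The blob committed here is a Git LFS pointer, not the weights themselves; the actual ~5.9 MB checkpoint is resolved when the file is downloaded. A minimal sketch of fetching and inspecting it, with the repo id again assumed from the model name:

```python
import torch
from huggingface_hub import hf_hub_download

# Assumed repo id; adjust to wherever this checkpoint actually lives.
path = hf_hub_download(
    repo_id="diarizers-community/speaker-segmentation-fine-tuned-callhome-jpn",
    filename="pytorch_model.bin",
)

# Expect a plain PyTorch dict (a state dict or a wrapper with metadata).
checkpoint = torch.load(path, map_location="cpu")
if isinstance(checkpoint, dict):
    print(list(checkpoint.keys())[:10])  # inspect top-level keys
```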