---
license: apache-2.0
language:
- en
library_name: fairseq
pipeline_tag: automatic-speech-recognition
inference: false
---


<br>
<br>

# ARMHuBERT Model Card

This repo contains the models from our paper [**Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation**](https://arxiv.org/abs/2305.11685), INTERSPEECH 2023.


## Model details

**Model type:**
ARMHuBERT is an open-source speech SSL model distilled from HuBERT-Base via attention map reusing and masking distillation.
We also provide model checkpoints for MaskHuBERT (trained without attention map reusing) and ARMwavLM (distilled from a WavLM-Base teacher).

- Attention Map Reusing: reuses the attention map of a previous Transformer layer, removing the key and query parameters from the layers that reuse it.
- Masking Distillation: distills masked and unmasked frames separately, with a dedicated loss term for each (see the sketch after this list).
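
Below is a minimal PyTorch sketch of both ideas, assuming a single-head layer and an L1 distillation loss. All class, function, and variable names here are illustrative, not the repo's actual API; see the GitHub repository for the real implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttnReuseLayer(nn.Module):
    """Single-head sketch of attention map reusing: this layer keeps no
    key/query projections and instead consumes the attention weights
    computed by an earlier layer."""
    def __init__(self, dim: int, ffn_dim: int):
        super().__init__()
        self.v_proj = nn.Linear(dim, dim)    # value projection is kept
        self.out_proj = nn.Linear(dim, dim)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, ffn_dim), nn.GELU(), nn.Linear(ffn_dim, dim)
        )

    def forward(self, x: torch.Tensor, attn: torch.Tensor) -> torch.Tensor:
        # attn: (batch, seq, seq) attention weights reused from a previous layer
        out = self.out_proj(attn @ self.v_proj(x))
        x = self.norm1(x + out)
        return self.norm2(x + self.ffn(x))

def masking_distill_loss(student, teacher, mask, w_masked=1.0, w_unmasked=1.0):
    """Masking distillation sketch: masked and unmasked frames each get
    their own loss term. `mask` is a boolean tensor marking masked frames."""
    loss_masked = F.l1_loss(student[mask], teacher[mask])
    loss_unmasked = F.l1_loss(student[~mask], teacher[~mask])
    return w_masked * loss_masked + w_unmasked * loss_unmasked
```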

**License:**
Apache 2.0 License

**Where to send questions or comments about the model:**
https://github.com/sungnyun/ARMHuBERT/issues


## Training dataset
Pretraining data: [LibriSpeech](https://www.openslr.org/12)
- ``[ModelName]-100h.ckpt``: train-clean-100
- ``[ModelName]-960h.ckpt``: train-clean-100 + train-clean-360 + train-other-500
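
Assuming the released ``.ckpt`` files use standard PyTorch serialization, their contents can be inspected directly with ``torch.load``; the file name below is illustrative:

```python
import torch

# Illustrative sketch: inspect a downloaded checkpoint.
# Replace the path with the checkpoint file you downloaded from this repo.
ckpt = torch.load("ARMHuBERT-960h.ckpt", map_location="cpu")
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))  # e.g., model weights and training metadata
```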


<br>

More details are available in our GitHub repository: https://github.com/sungnyun/ARMHuBERT.