[The model is still training; we will be releasing the latest checkpoints soon.]
SpeechLLM is a multi-modal LLM trained to predict the metadata of the speaker's turn in a conversation. The speechllm-2B model is based on a HubertX acoustic encoder and a TinyLlama LLM. The model predicts the following:
1. **SpeechActivity** : whether the audio signal contains speech (True/False)
2. **Transcript** : ASR transcript of the audio
3. **Gender** of the speaker (Female/Male)
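
As a sketch of how downstream code might consume these predictions, the snippet below parses the three fields listed above into typed Python values. It assumes the model returns its metadata as a JSON object keyed by the field names above; the `parse_turn_metadata` helper is illustrative, not part of the released API.

```python
import json


def parse_turn_metadata(raw: str) -> dict:
    """Parse a model metadata prediction into typed fields.

    Assumes (hypothetically) that the model emits a JSON object
    with the keys SpeechActivity, Transcript, and Gender.
    """
    meta = json.loads(raw)
    return {
        "speech_activity": bool(meta.get("SpeechActivity", False)),
        "transcript": meta.get("Transcript", ""),
        "gender": meta.get("Gender"),
    }


# Example with a hypothetical model output string:
raw = '{"SpeechActivity": true, "Transcript": "good morning", "Gender": "Female"}'
print(parse_turn_metadata(raw))
```

If the released checkpoints emit a different output format, only the `json.loads` step needs to change; the field mapping stays the same.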