Update README.md
README.md
CHANGED
@@ -57,6 +57,14 @@ model-index:

 # SpeechLLM

+SpeechLLM is a multi-modal LLM trained to predict the metadata of the speaker's turn in a conversation. The SpeechLLM model is based on the HubertX acoustic encoder and the TinyLlama LLM. The model predicts the following:
+1. Speech Activity
+2. ASR Transcript
+3. Gender of the speaker
+4. Age of the speaker
+5. Accent of the speaker
+6. Emotion of the speaker
+
 ## Usage
 ```python
 # Load model directly from huggingface