Rename WhisperBot to WhisperFusion

- README.md +11 -12
- README.qmd +7 -7
- docker/Dockerfile +4 -4
- docker/build.sh +3 -3
- docker/publish.sh +2 -2
- docker/scripts/{run-whisperbot.sh → run-whisperfusion.sh} +1 -1
- docker/scripts/{setup-whisperbot.sh → setup-whisperfusion.sh} +2 -2
- docker/scripts/setup.sh +1 -1
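Every hunk below is a mechanical substitution of the project name (plus two `git mv` script renames), so a sweep like this can be scripted rather than edited by hand. A minimal sketch, demonstrated on scratch files standing in for the repo (in a real checkout you would drive it with `git grep -l` and follow up with `git mv` for the renamed scripts):

```shell
#!/bin/bash -e
# Scratch demo of the rename sweep: two files that mention the old name.
workdir=$(mktemp -d)
printf '# WhisperBot\ncd WhisperBot\n' > "$workdir/README.md"
printf 'docker push ghcr.io/collabora/whisperbot:latest\n' > "$workdir/publish.sh"

# Rewrite every file containing either casing of the old name.
# sed is case-sensitive, so the CamelCase and lowercase forms are
# handled by separate expressions.
grep -rl -i 'whisperbot' "$workdir" | while read -r f; do
  sed -i -e 's/WhisperBot/WhisperFusion/g' \
         -e 's/whisperbot/whisperfusion/g' "$f"
done

cat "$workdir/README.md" "$workdir/publish.sh"
```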
README.md
CHANGED
@@ -1,8 +1,8 @@
-# WhisperBot
+# WhisperFusion
 
 
-Welcome to WhisperBot. WhisperBot builds upon the capabilities of the
-[WhisperLive](https://github.com/collabora/WhisperLive) and
+Welcome to WhisperFusion. WhisperFusion builds upon the capabilities of
+the [WhisperLive](https://github.com/collabora/WhisperLive) and
 [WhisperSpeech](https://github.com/collabora/WhisperSpeech) by
 integrating Mistral, a Large Language Model (LLM), on top of the
 real-time speech-to-text pipeline. WhisperLive relies on OpenAI Whisper,
@@ -149,17 +149,17 @@ cp -r phi-2 "$dest"
 cp -r "$phi_path" "$dest/phi-orig-model"
 ```
 
-## Build WhisperBot
+## Build WhisperFusion
 
 > [!NOTE]
 >
-> These steps are included in `docker/scripts/setup-whisperbot.sh`
+> These steps are included in `docker/scripts/setup-whisperfusion.sh`
 
 Clone this repo and install requirements
 
 ``` bash
-[ -d "WhisperBot" ] || git clone https://github.com/collabora/WhisperBot.git
-cd WhisperBot
+[ -d "WhisperFusion" ] || git clone https://github.com/collabora/WhisperFusion.git
+cd WhisperFusion
 apt update
 apt install ffmpeg portaudio19-dev -y
 ```
@@ -174,7 +174,6 @@ Install all the other dependencies normally
 
 ``` bash
 pip install -r requirements.txt
-pip install openai-whisper whisperspeech soundfile
 ```
 
 force update huggingface_hub (tokenizers 0.14.1 spuriously require and
@@ -191,7 +190,7 @@ curl -L -o /root/.cache/whisper-live/silero_vad.onnx https://github.com/snakers4
 python -c 'from transformers.utils.hub import move_cache; move_cache()'
 ```
 
-### Run WhisperBot with Whisper and Mistral/Phi-2
+### Run WhisperFusion with Whisper and Mistral/Phi-2
 
 Take the folder path for Whisper TensorRT model, folder_path and
 tokenizer_path for Mistral/Phi-2 TensorRT from the build phase. If a
@@ -200,11 +199,11 @@ huggingface repo name as the tokenizer path.
 
 > [!NOTE]
 >
-> These steps are included in `docker/scripts/run-whisperbot.sh`
+> These steps are included in `docker/scripts/run-whisperfusion.sh`
 
 ``` bash
 test -f /etc/shinit_v2 && source /etc/shinit_v2
-cd WhisperBot
+cd WhisperFusion
 if [ "$1" != "mistral" ]; then
   exec python3 main.py --phi \
     --whisper_tensorrt_path /root/whisper_small_en \
@@ -222,7 +221,7 @@ fi
 execute `run_client.py`
 
 ``` bash
-cd WhisperBot
+cd WhisperFusion
 pip install -r requirements.txt
 python3 run_client.py
 ```
README.qmd
CHANGED
@@ -27,9 +27,9 @@ These steps are included in `{fname}`
 if code: print("```")
 ```
 
-# WhisperBot
+# WhisperFusion
 
-Welcome to WhisperBot. WhisperBot builds upon the capabilities of the [WhisperLive](https://github.com/collabora/WhisperLive) and [WhisperSpeech](https://github.com/collabora/WhisperSpeech) by integrating Mistral, a Large Language Model (LLM), on top of the real-time speech-to-text pipeline. WhisperLive relies on OpenAI Whisper, a powerful automatic speech recognition (ASR) system. Both Mistral and Whisper are optimized to run efficiently as TensorRT engines, maximizing performance and real-time processing capabilities.
+Welcome to WhisperFusion. WhisperFusion builds upon the capabilities of the [WhisperLive](https://github.com/collabora/WhisperLive) and [WhisperSpeech](https://github.com/collabora/WhisperSpeech) by integrating Mistral, a Large Language Model (LLM), on top of the real-time speech-to-text pipeline. WhisperLive relies on OpenAI Whisper, a powerful automatic speech recognition (ASR) system. Both Mistral and Whisper are optimized to run efficiently as TensorRT engines, maximizing performance and real-time processing capabilities.
 
 ## Features
 - **Real-Time Speech-to-Text**: Utilizes OpenAI WhisperLive to convert spoken language into text in real-time.
@@ -60,23 +60,23 @@ include_file('docker/scripts/build-mistral.sh')
 include_file('docker/scripts/build-phi-2.sh')
 ```
 
-## Build WhisperBot
+## Build WhisperFusion
 
 ```{python}
-include_file('docker/scripts/setup-whisperbot.sh')
+include_file('docker/scripts/setup-whisperfusion.sh')
 ```
 
-### Run WhisperBot with Whisper and Mistral/Phi-2
+### Run WhisperFusion with Whisper and Mistral/Phi-2
 
 Take the folder path for Whisper TensorRT model, folder_path and tokenizer_path for Mistral/Phi-2 TensorRT from the build phase. If a huggingface model is used to build mistral/phi-2 then just use the huggingface repo name as the tokenizer path.
 
 ```{python}
-include_file('docker/scripts/run-whisperbot.sh')
+include_file('docker/scripts/run-whisperfusion.sh')
 ```
 
 - On the client side clone the repo, install the requirements and execute `run_client.py`
 ```bash
-cd WhisperBot
+cd WhisperFusion
 pip install -r requirements.txt
 python3 run_client.py
 ```
docker/Dockerfile
CHANGED
@@ -1,8 +1,8 @@
-FROM ghcr.io/collabora/whisperbot-base:latest as base
+FROM ghcr.io/collabora/whisperfusion-base:latest as base
 
 WORKDIR /root
-COPY scripts/setup-whisperbot.sh scripts/run-whisperbot.sh scratch-space/models /root/
-RUN ./setup-whisperbot.sh
+COPY scripts/setup-whisperfusion.sh scripts/run-whisperfusion.sh scratch-space/models /root/
+RUN ./setup-whisperfusion.sh
 
-CMD ./run-whisperbot.sh
+CMD ./run-whisperfusion.sh
 
docker/build.sh
CHANGED
@@ -4,11 +4,11 @@
 
 (
   cd base-image &&
-  docker build $ARGS -t ghcr.io/collabora/whisperbot-base:latest .
+  docker build $ARGS -t ghcr.io/collabora/whisperfusion-base:latest .
 )
 
 mkdir -p scratch-space
 cp -r scripts/build-* scratch-space
-docker run --gpus all --shm-size 64G -v "$PWD"/scratch-space:/root/scratch-space -w /root/scratch-space -it ghcr.io/collabora/whisperbot-base:latest ./build-models.sh
+docker run --gpus all --shm-size 64G -v "$PWD"/scratch-space:/root/scratch-space -w /root/scratch-space -it ghcr.io/collabora/whisperfusion-base:latest ./build-models.sh
 
-docker build $ARGS -t ghcr.io/collabora/whisperbot:latest .
+docker build $ARGS -t ghcr.io/collabora/whisperfusion:latest .
docker/publish.sh
CHANGED
@@ -1,4 +1,4 @@
 #!/bin/bash -e
 
-docker push ghcr.io/collabora/whisperbot-base:latest
-docker push ghcr.io/collabora/whisperbot:latest
+docker push ghcr.io/collabora/whisperfusion-base:latest
+docker push ghcr.io/collabora/whisperfusion:latest
docker/scripts/{run-whisperbot.sh → run-whisperfusion.sh}
RENAMED
@@ -2,7 +2,7 @@
 
 test -f /etc/shinit_v2 && source /etc/shinit_v2
 
-cd WhisperBot
+cd WhisperFusion
 if [ "$1" != "mistral" ]; then
   exec python3 main.py --phi \
     --whisper_tensorrt_path /root/whisper_small_en \
docker/scripts/{setup-whisperbot.sh → setup-whisperfusion.sh}
RENAMED
@@ -1,9 +1,9 @@
 #!/bin/bash -e
 
 ## Clone this repo and install requirements
-[ -d "WhisperBot" ] || git clone https://github.com/collabora/WhisperBot.git
+[ -d "WhisperFusion" ] || git clone https://github.com/collabora/WhisperFusion.git
 
-cd WhisperBot
+cd WhisperFusion
 apt update
 apt install ffmpeg portaudio19-dev -y
 
docker/scripts/setup.sh
CHANGED
@@ -3,4 +3,4 @@
 ./setup-whisper.sh
 #./setup-mistral.sh
 ./setup-phi-2.sh
-./setup-whisperbot.sh
+./setup-whisperfusion.sh
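A rename touching this many files is easy to leave half-done, so a check for stale references is worth running after the sweep. A sketch, demonstrated on a scratch tree standing in for the renamed checkout (in the real repo you would point it at the repo root):

```shell
#!/bin/bash -e
# Build a small scratch tree standing in for the renamed checkout.
tree=$(mktemp -d)
printf 'cd WhisperFusion\n' > "$tree/run-whisperfusion.sh"
printf 'FROM ghcr.io/collabora/whisperfusion-base:latest as base\n' > "$tree/Dockerfile"

# grep -r exits non-zero when nothing matches, which is the pass
# condition here: no file should mention the old name in any casing.
if grep -r -i -n 'whisperbot' "$tree"; then
  echo 'stale WhisperBot references remain' >&2
  status=1
else
  echo 'rename looks complete'
  status=0
fi
```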