# Start Agent
## Requirements
- GPU memory: at least 8 GB (with quantization); 16 GB or more is recommended.
- Disk usage: 10GB
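A quick pre-flight check of these requirements might look like the sketch below. The helper names `gpu_tier` and `has_enough_disk` are hypothetical, and the thresholds are simply the figures listed above (treated here as GiB, which is an assumption). Querying the real GPU memory, for example via `torch.cuda.get_device_properties(0).total_memory`, is left out so that the snippet runs without PyTorch installed.

```python
import shutil

GIB = 1024 ** 3  # assumption: the requirement figures are read as GiB


def gpu_tier(total_memory_gib: float) -> str:
    """Classify a GPU by the memory figures listed in the requirements."""
    if total_memory_gib >= 16:
        return "recommended"
    if total_memory_gib >= 8:
        return "minimum (quantized)"
    return "insufficient"


def has_enough_disk(path: str = ".", required_gib: float = 10) -> bool:
    """Return True if the filesystem holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= required_gib * GIB


if __name__ == "__main__":
    # 12 is an illustrative value, not a measured one.
    print("GPU tier for a 12 GiB card:", gpu_tier(12))
    print("Enough free disk space here:", has_enough_disk("."))
```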
## Download Model
Download the model with:
```bash
huggingface-cli download fishaudio/fish-agent-v0.1-3b --local-dir checkpoints/fish-agent-v0.1-3b
```
This places the model files in the `checkpoints` folder.
You also need the fish-speech model, which you can download by following the instructions in [inference](inference.md).
The `checkpoints` folder should then contain two subfolders: `checkpoints/fish-speech-1.4` and `checkpoints/fish-agent-v0.1-3b`.
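To confirm that both folders are in place before launching, a small check along these lines may help. This is only a sketch: the function name `missing_checkpoints` is hypothetical, and it only tests that the two directories exist, not that the model files inside them are complete.

```python
from pathlib import Path

# Folder names taken from the download steps above.
EXPECTED = ("fish-speech-1.4", "fish-agent-v0.1-3b")


def missing_checkpoints(root="checkpoints"):
    """Return the expected checkpoint folders that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoint folders:", ", ".join(missing))
    else:
        print("Both checkpoint folders found.")
```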
## Prepare the Environment
If you already have Fish-speech installed, you only need to install one additional package:
```bash
pip install cachetools
```
!!! note
    Use a Python version below 3.12 if you want to use `--compile`.
If you don't have Fish-speech installed, set up your environment with the commands below:
```bash
sudo apt-get install portaudio19-dev
pip install -e .[stable]
```
## Launch the Agent Demo
To start fish-agent, run the command below from the repository root:
```bash
python -m tools.api --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ --mode agent --compile
```
The `--compile` flag is only supported on Python < 3.12; it greatly speeds up token generation.
Note that compilation does not happen at startup; it is triggered the first time the model is used.
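Because `--compile` depends on the Python version, one option is to decide at launch time whether to pass it. The sketch below builds the same command shown above and adds `--compile` only on Python < 3.12. The helper names `compile_supported` and `build_api_command` are hypothetical and not part of fish-speech.

```python
import subprocess
import sys


def compile_supported(version_info=sys.version_info) -> bool:
    """`--compile` is documented as working only on Python < 3.12."""
    return tuple(version_info[:2]) < (3, 12)


def build_api_command(use_compile: bool) -> list[str]:
    """Assemble the API launch command, optionally with `--compile`."""
    cmd = [
        sys.executable, "-m", "tools.api",
        "--llama-checkpoint-path", "checkpoints/fish-agent-v0.1-3b/",
        "--mode", "agent",
    ]
    if use_compile:
        cmd.append("--compile")
    return cmd


if __name__ == "__main__":
    # Run from the repository root so that `tools.api` can be imported.
    subprocess.run(build_api_command(compile_supported()), check=True)
```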
Then open another terminal and use the command:
```bash
python -m tools.e2e_webui
```
This starts a Gradio WebUI on the local machine.
The first time you use the model, it will spend a short while compiling (if `--compile` is enabled), so please be patient.
## Gradio WebUI
<p align="center">
<img src="../assets/figs/agent_gradio.png" width="75%">
</p>
Have a good time!
## Performance
In our tests, a 4060 laptop can barely run the model, reaching only about 8 tokens/s. A 4090 reaches around 95 tokens/s with `--compile`, which is the setup we recommend.
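To put those rates in perspective, the sketch below converts them into wall-clock time for a response of a given length. The 500-token length is an arbitrary illustrative figure, not a measurement from our tests, and the function name `seconds_for` is hypothetical.

```python
# Throughput figures quoted above, in tokens per second.
RATES = {"4060 laptop": 8.0, "4090 (compiled)": 95.0}


def seconds_for(tokens: int, tokens_per_second: float) -> float:
    """Time needed to generate `tokens` at a steady generation rate."""
    return tokens / tokens_per_second


if __name__ == "__main__":
    reply_tokens = 500  # assumed response length, for illustration only
    for gpu, rate in RATES.items():
        print(f"{gpu}: {seconds_for(reply_tokens, rate):.1f} s")
```

At these rates, the hypothetical 500-token response takes about 62.5 s on the 4060 laptop and about 5.3 s on the 4090, a difference of roughly 12x.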
# About Agent
The demo is an early alpha version: inference speed still needs to be optimized, and there are many bugs waiting to be fixed. If you find a bug or want to fix one, we would be very happy to receive an issue or a pull request.