# Start Agent

## Requirements

- GPU memory: at least 8 GB (with quantization); 16 GB or more is recommended.
- Disk usage: 10 GB
## Download Model

You can get the model by:

```bash
huggingface-cli download fishaudio/fish-agent-v0.1-3b --local-dir checkpoints/fish-agent-v0.1-3b
```
This places the weights in the `checkpoints` folder. You also need the fish-speech model, which you can download following the instructions in [inference](inference.md).

After this there will be two folders in `checkpoints`: `checkpoints/fish-speech-1.4` and `checkpoints/fish-agent-v0.1-3b`.
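As a quick sanity check, you can list the folder to confirm both models are in place (paths assume the default download locations above):

```bash
# Both model folders should be present after the downloads above
ls checkpoints/
# fish-agent-v0.1-3b  fish-speech-1.4
```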
## Environment Prepare

If you already have a Fish Speech environment, you can use it directly after installing one extra dependency:

```bash
pip install cachetools
```
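To confirm the dependency landed in the active environment, a quick import check works:

```bash
# Verify cachetools is importable in the current environment
python -c "import cachetools; print(cachetools.__version__)"
```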
!!! note
    Please use a Python version below 3.12 if you want to use compilation.

If you don't have an environment yet, use the commands below to build one:
```bash
sudo apt-get install portaudio19-dev
pip install -e .[stable]
```
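If you are starting from scratch, one way to satisfy the Python < 3.12 requirement is a fresh conda environment (a sketch; the `fish-agent` name is arbitrary and any virtualenv tool works):

```bash
# Hypothetical fresh environment pinned below Python 3.12 so --compile works later
conda create -n fish-agent python=3.10 -y
conda activate fish-agent
sudo apt-get install portaudio19-dev  # audio dependency, as above
pip install -e .[stable]
```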
## Launch the Agent Demo

To launch fish-agent, use the command below in the main folder:

```bash
python -m tools.api --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ --mode agent --compile
```
The `--compile` flag is only supported on Python < 3.12 and will greatly speed up token generation. Keep in mind that compilation does not happen immediately at startup.
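If you are on Python 3.12 or newer, the same launch should work without the flag, just with slower generation (a sketch based on the command above):

```bash
# Same launch without compilation, for Python >= 3.12
python -m tools.api --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ --mode agent
```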
Then open another terminal and use the command:

```bash
python -m tools.e2e_webui
```

This will create a Gradio WebUI on the device.
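By default a Gradio app listens on http://localhost:7860. If you need to reach the demo from another machine, Gradio's standard environment variables can override the bind address and port (generic Gradio behaviour, not fish-speech-specific flags):

```bash
# Expose the WebUI on the local network using Gradio's standard env vars
GRADIO_SERVER_NAME=0.0.0.0 GRADIO_SERVER_PORT=7860 python -m tools.e2e_webui
```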
The first time you use the model, it will take a short while to compile (if `--compile` is enabled), so please wait patiently.
## Gradio WebUI

<p align="center">
   <img src="../assets/figs/agent_gradio.png" width="75%">
</p>

Have a good time!
## Performance

In our tests, a laptop 4060 just barely runs the model, reaching only about 8 tokens/s. A 4090 reaches around 95 tokens/s with compilation, which is what we recommend.
# About Agent

The demo is an early alpha test version: the inference speed needs to be optimised, and there are a lot of bugs waiting to be fixed. If you've found a bug or want to fix one, we'd be very happy to receive an issue or a pull request.