---
datasets:
- gozfarb/ShareGPT_Vicuna_unfiltered
---
Conversion tool
https://github.com/practicaldreamer/vicuna_to_alpaca
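The conversion step above maps ShareGPT-style conversations into Alpaca-style instruction records. A minimal sketch of that idea is below; the field names (`from`/`value` turns in, `instruction`/`input`/`output` out) follow the common ShareGPT and Alpaca dump layouts and are assumptions, not the exact schema vicuna_to_alpaca emits.

```python
import json

def sharegpt_to_alpaca(conversation):
    # conversation: a list of {"from": "human"|"gpt", "value": str} turns,
    # as found in typical ShareGPT dumps (assumed layout).
    pairs = []
    for turn, reply in zip(conversation, conversation[1:]):
        # Pair each human prompt with the model reply that follows it.
        if turn["from"] == "human" and reply["from"] == "gpt":
            pairs.append({
                "instruction": turn["value"],
                "input": "",
                "output": reply["value"],
            })
    return pairs

if __name__ == "__main__":
    convo = [
        {"from": "human", "value": "What is a LoRA?"},
        {"from": "gpt", "value": "A low-rank adapter for fine-tuning."},
    ]
    print(json.dumps(sharegpt_to_alpaca(convo), indent=2))
```

The real tool also handles multi-turn context packing; this sketch only shows the per-pair mapping.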
Training tool
https://github.com/oobabooga/text-generation-webui
I'm currently using version 2023.05.04v0 of the dataset and training at full context length.
How to test?
- Download LLaMA-30B-HF: https://huggingface.co/Neko-Institute-of-Science/LLaMA-30B-HF
- Replace special_tokens_map.json and tokenizer_config.json with the ones from this repo.
- Rename LLaMA-30B-HF to vicuna-30b
- Load ooba:
python server.py --listen --model vicuna-30b --load-in-8bit --chat --lora checkpoint-xxxx
- Set instruct mode to Vicuna-v1 (it will load Vicuna-v0 by default).