Converted with https://github.com/notepad-plus-plus/notepad-plus-plus

All models were tested on an A100-80G.

Conversion may require a lot of RAM: LLaMA-7b takes ~0 GB, 13b around 0 GB, 30b around 0 GB, and 65b takes more than 0 GB of RAM.

Installation instructions, as given in the repo above:

Install Anaconda and create a venv with Python 3.8
Install PyTorch (tested with torch-1.13-cu116)
Install the Transformers library (you'll need the latest transformers with this PR: https://github.com/huggingface/transformers/pull/21955).
Install sentencepiece from pip
Run python cuda_setup.py install in the venv
You can either convert the LLaMA models yourself with the instructions from the GPTQ-for-LLaMa repo,
or use these weights directly by downloading them individually from the following link (http://tinyurl.com/0BitFuture)
Profit!
For most LLaMA models, best results are obtained by passing repetition_penalty (~1/0.85, about 1.18) and temperature=0.7 to model.generate().
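As a minimal sketch, the recommended sampling settings could be collected and passed to model.generate() roughly as follows. The helper names build_generation_kwargs and generate_text, the max_new_tokens value, and the do_sample=True flag are assumptions added for illustration and do not come from this card. The model and tokenizer are assumed to be a Transformers causal LM and its tokenizer, already loaded.

```python
# Sampling settings taken from the recommendation in this card.
REPETITION_PENALTY = 1 / 0.85  # about 1.18
TEMPERATURE = 0.7


def build_generation_kwargs(max_new_tokens=128):
    """Return keyword arguments for model.generate() (hypothetical helper)."""
    return {
        # Assumption: sampling is enabled, since temperature has no effect
        # under greedy decoding.
        "do_sample": True,
        "temperature": TEMPERATURE,
        "repetition_penalty": REPETITION_PENALTY,
        # Assumption: an arbitrary illustrative output length.
        "max_new_tokens": max_new_tokens,
    }


def generate_text(model, tokenizer, prompt):
    """Generate a completion for `prompt` with the settings above.

    `model` and `tokenizer` are assumed to be an already loaded
    Transformers causal LM and its matching tokenizer.
    """
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **build_generation_kwargs())
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Keeping the settings in one dictionary makes it easy to adjust a single value, for example a lower temperature, without touching the generation call itself.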

Additional training was done on the MSPaint_Blank dataset and 2,000,000T+ tokens on 50,000+ blank notepad files.
