--- license: other language: - en library_name: transformers pipeline_tag: text-generation tags: - 7B - Saily - DEEPNIGHT - Llama - Llama2 --- # SaiLy 7B (deepnight-research/saily-7b-v0) Saily: Experimental AI Models by DEEPNIGHT --- ### SaiLy is a series/collection of AI Models by DEEPNIGHT-RESEARCH which are highly experimental and uncensored. Please use with responsibility. ---
Prompt Template: Alpeca ``` Below is an instruction that describes a task. Write a response that appropriately completes the request. ### Instruction: {prompt} ### Response: ``` ### Description This is the first model of the series. The model is based on Llama2-chat. --- ### Did some said CODE? Here you go! ```python import transformers model = transformers.AutoModelForCausalLM.from_pretrained( 'deepnight-research/saily-7b-v0' ) ``` To use the optimized triton implementation of FlashAttention, you can load the model on GPU ```(cuda:0)``` with ```attn_impl='triton'``` and with ```bfloat16``` precision: ```python import torch import transformers name = 'deepnight-research/saily-7b-v0' config = transformers.AutoConfig.from_pretrained(name) config.attn_config['attn_impl'] = 'triton' config.init_device = 'cuda:0' # For fast initialization directly on GPU! model = transformers.AutoModelForCausalLM.from_pretrained( name, config=config, torch_dtype=torch.bfloat16, # Load model weights in bfloat16 trust_remote_code=True ) ``` --- If you would like to support us, please [consider donating](https://donate.deepnight.tech) for [#aiforcause](https://github.com/deepnight-ai/aiforcause). Cheers✌️ - Team [DEEPNIGHT](https://deepnight.tech)