SaiLy 7B (deepnight-research/saily-7b-v0)

Saily: Experimental AI Models by DEEPNIGHT

SaiLy is a series of highly experimental, uncensored AI models by DEEPNIGHT-RESEARCH. Please use them responsibly.



Prompt Template: Alpaca
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{prompt}
### Response:
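The Alpaca template above can be filled in programmatically before tokenization; a minimal sketch (the `build_prompt` helper name is illustrative, not part of the model card):

```python
# Alpaca-style prompt template used by this model
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\n"
    "{prompt}\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    # Insert the user instruction into the template
    return ALPACA_TEMPLATE.format(prompt=instruction)

text = build_prompt("Summarize the plot of Hamlet in one sentence.")
```

The resulting string is what you pass to the tokenizer; the model is expected to continue generating after the final `### Response:` marker.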

Description

This is the first model in the series. It is based on Llama2-chat.


Did someone say CODE?

Here you go!

import transformers
model = transformers.AutoModelForCausalLM.from_pretrained(
  'deepnight-research/saily-7b-v0'
)

To use the optimized FlashAttention implementation, load the model on GPU (cuda:0) with attn_implementation='flash_attention_2' and bfloat16 precision (this requires the flash-attn package to be installed):

import torch
import transformers

name = 'deepnight-research/saily-7b-v0'

model = transformers.AutoModelForCausalLM.from_pretrained(
  name,
  torch_dtype=torch.bfloat16, # Load model weights in bfloat16
  attn_implementation='flash_attention_2', # Use the FlashAttention-2 kernel
  device_map='cuda:0' # Place the model directly on GPU
)

If you would like to support us, please consider donating to #aiforcause.

Cheers✌️
