hipnologo/GPT-Neox-20b-QLoRA-FineTune-english_quotes_dataset
Training procedure
The following bitsandbytes quantization config was used during training:
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16
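As a rough illustration of what these settings mean for memory, the sketch below estimates the storage cost of the quantized weights with and without double quantization. Several things here are assumptions rather than values read from this config: the 20e9 parameter count is a round number for GPT-NeoX-20B, and the block sizes (64 weights per block, 256 constants per second-level block) are the defaults described in the QLoRA paper. The estimate ignores layers kept in higher precision, activations, and the LoRA adapter itself.

```python
def bits_per_param(double_quant: bool, block_size: int = 64, dq_block_size: int = 256) -> float:
    """Approximate storage cost per weight for 4-bit blockwise quantization."""
    weight_bits = 4.0
    if double_quant:
        # One 8-bit quantized constant per block of weights, plus one fp32
        # constant shared by every dq_block_size first-level constants.
        overhead = 8 / block_size + 32 / (block_size * dq_block_size)
    else:
        # One fp32 constant per block of weights.
        overhead = 32 / block_size
    return weight_bits + overhead


def weight_gib(n_params: float, double_quant: bool) -> float:
    """Estimated size of the quantized weights in GiB."""
    return n_params * bits_per_param(double_quant) / 8 / 2**30


N_PARAMS = 20e9  # assumed round figure, not taken from this model card

for dq in (False, True):
    print(f"double_quant={dq}: {bits_per_param(dq):.3f} bits/param, "
          f"about {weight_gib(N_PARAMS, dq):.1f} GiB of weights")
```

Under these assumptions, double quantization trims the per-parameter overhead of the quantization constants from 0.5 bits to roughly 0.127 bits.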
Model description
This model is a fine-tuned version of EleutherAI/gpt-neox-20b, trained with QLoRA using the PEFT library.
How to use
The code below performs the following steps:
- Imports torch and the AutoTokenizer, AutoModelForCausalLM, and BitsAndBytesConfig classes from the transformers library.
- Sets base_model_id to "EleutherAI/gpt-neox-20b" and model_id to "hipnologo/GPT-Neox-20b-QLoRA-FineTune-english_quotes_dataset".
- Defines a BitsAndBytesConfig object named bnb_config with the following configuration:
  - load_in_4bit set to True
  - bnb_4bit_use_double_quant set to True
  - bnb_4bit_quant_type set to "nf4"
  - bnb_4bit_compute_dtype set to torch.bfloat16
- Initializes an AutoTokenizer object named tokenizer by loading the tokenizer for base_model_id.
- Initializes an AutoModelForCausalLM object named model by loading the fine-tuned model for model_id, passing bnb_config as quantization_config. The model is loaded on device cuda:0.
- Defines a variable text with the value "Twenty years from now".
- Defines a variable device with the value "cuda:0", the device on which the model runs.
- Encodes text with the tokenizer as PyTorch tensors, moves them to device, and assigns the result to inputs.
- Generates text with model.generate, passing inputs and setting max_new_tokens to 20. The result is assigned to outputs.
- Decodes outputs with the tokenizer, skipping special tokens, and assigns the result to generated_text.
- Prints generated_text.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# The tokenizer is taken from the base pre-trained model
base_model_id = "EleutherAI/gpt-neox-20b"
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Fine-tuned model
model_id = "hipnologo/GPT-Neox-20b-QLoRA-FineTune-english_quotes_dataset"

# 4-bit NF4 quantization with double quantization and bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the fine-tuned model in 4-bit on GPU 0
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map={"": 0})

# Tokenize the prompt and move the tensors to the same GPU as the model
text = "Twenty years from now"
device = "cuda:0"
inputs = tokenizer(text, return_tensors="pt").to(device)

# Generate up to 20 new tokens and decode them back to text
outputs = model.generate(**inputs, max_new_tokens=20)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
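An alternative way to load the model, sketched here under the assumption that this repository holds a PEFT adapter (as the PEFT version listed under Framework versions suggests), is to load the quantized base model first and attach the adapter with PeftModel. The helper name load_with_peft and its device_index parameter are hypothetical and chosen for illustration. The imports sit inside the function so that defining it does not require a GPU or trigger any download.

```python
ADAPTER_ID = "hipnologo/GPT-Neox-20b-QLoRA-FineTune-english_quotes_dataset"


def load_with_peft(adapter_id: str = ADAPTER_ID, device_index: int = 0):
    """Load the 4-bit base model, then attach the adapter weights on top of it."""
    import torch
    from peft import PeftConfig, PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    # The adapter config records which base checkpoint it was trained against.
    peft_config = PeftConfig.from_pretrained(adapter_id)
    base_id = peft_config.base_model_name_or_path

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        quantization_config=bnb_config,
        device_map={"": device_index},
    )

    # Wrap the quantized base model with the fine-tuned adapter.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling tokenizer, model = load_with_peft() should give objects that can be used with the same tokenize, generate, and decode calls as in the example above.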
Framework versions
- PEFT 0.4.0.dev0
Training procedure
- Trainable params: 8650752
- all params: 10597552128
- trainable%: 0.08162971878329976
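The trainable percentage follows directly from the two parameter counts. A minimal check in Python, using only the numbers reported in this section:

```python
trainable_params = 8_650_752
all_params = 10_597_552_128

# Share of parameters that receive gradient updates, as a percentage.
trainable_pct = 100 * trainable_params / all_params
print(f"trainable%: {trainable_pct:.4f}")
```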
License
This model is licensed under Apache 2.0. Please see the LICENSE for more information.