# Model Card for Ayansk11/Mental_health_Llama3.2-1B_conversationalBot

## Model Details

### Model Description
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.
- **Developed by:** Ayan Javeed Shaikh and Srushti Sonavane
- **Finetuned from model:** unsloth/Llama-3.2-1B-bnb-4bit
### Model Sources

- Mental Health Llama 3.2 - 1B ConversationalBot
## Inference
```python
from unsloth import FastLanguageModel

# Load the fine-tuned model and its tokenizer in 4-bit precision
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Ayansk11/Mental_health_Llama3.2-1B_conversationalBot",
    max_seq_length = 5020,
    dtype = None,          # auto-detect the compute dtype
    load_in_4bit = True,   # load quantized weights to reduce VRAM usage
)
```
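Note: `unsloth` must be installed in the environment (it is typically available via `pip install unsloth`), and 4-bit loading relies on `bitsandbytes`, so a CUDA-capable GPU is assumed in the snippets below.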
Use the following text as the model input to generate a response:

```python
text = "I'm going through some things with my feelings and myself. I barely sleep and I do nothing but think about how I'm worthless and how I shouldn't be here. I've never tried or contemplated suicide. I've always wanted to fix my issues, but I never get around to it. How can I change my feeling of being worthless to everyone?"
```
### Key Points to Note

- The `model = FastLanguageModel.for_inference(model)` call prepares the model specifically for inference, ensuring it is optimized for generating responses efficiently.
- The input text is processed by the `tokenizer`, which converts it into a format suitable for the model. The `data_prompt` template is used to structure the input text, leaving a placeholder for the model's response (a sketch of a possible `data_prompt` follows this list). The `return_tensors = "pt"` argument returns the encoding as PyTorch tensors, which are then moved to the GPU with `.to("cuda")` for faster processing.
- The `model.generate` function produces a response from the tokenized input. Parameters such as `max_new_tokens = 5020` and `use_cache = True` let the model generate long, coherent outputs efficiently by reusing cached key/value states from previous decoding steps.
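The snippets on this card reference `data_prompt` without defining it. The block below is a minimal sketch of what such a template might look like, assuming an Alpaca-style instruction/response layout whose `### Response:` marker matches the split used when decoding; the exact template used for fine-tuning is not shown on this card.

```python
# Hypothetical prompt template -- the actual data_prompt used for fine-tuning is
# not included in this card. It needs one slot for the instruction, one for the
# (empty) answer, and a "### Response:" marker matching the decoding step below.
data_prompt = """Below is an instruction describing a mental-health question. Write a response that appropriately completes the request.

### Instruction:
{}

### Response:
{}"""
```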
```python
# Switch the model into unsloth's optimized inference mode
model = FastLanguageModel.for_inference(model)

# Fill the prompt template with the instruction and an empty answer slot,
# tokenize it, and move the tensors to the GPU
inputs = tokenizer(
    [
        data_prompt.format(
            text,  # instruction
            "",    # answer left blank for the model to fill in
        )
    ],
    return_tensors = "pt",
).to("cuda")

# Generate the response, reusing cached key/value states for efficiency
outputs = model.generate(**inputs, max_new_tokens = 5020, use_cache = True)

# Decode the generated tokens and keep only the text after the response marker
answer = tokenizer.batch_decode(outputs)
answer = answer[0].split("### Response:")[-1]
print("Answer of the question is:", answer)
```
## Model tree for Ayansk11/Mental_health_Llama3.2-1B_conversationalBot

- Base model: meta-llama/Llama-3.2-1B