
Model Description: armaGPT is a fine-tuned version of Gemma 7B, a pre-trained language model developed by Google. It is designed to generate human-like text from the input it receives, and it was fine-tuned with DPO (Direct Preference Optimization) training for fair and safe generation.
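
The card does not include the training code; the following is only a minimal sketch of what a DPO fine-tuning run over the Gemma 7B base could look like, using the trl library's DPOTrainer. Argument names vary between trl versions, and the dataset path below is a placeholder rather than the data actually used.

# Hypothetical DPO fine-tuning sketch with trl; NOT the authors' training recipe.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model = AutoModelForCausalLM.from_pretrained("google/gemma-7b")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")

# Placeholder: any preference dataset with "prompt", "chosen" and "rejected" columns.
train_dataset = load_dataset("your-preference-dataset", split="train")

training_args = DPOConfig(output_dir="armaGPT-dpo", beta=0.1)
trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()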

Model Architecture: The architecture of armaGPT is based on the transformer, a neural network architecture that uses self-attention to process input sequences in parallel, rather than the recurrence used by recurrent neural networks (RNNs).

Model Size: The model has approximately 7 billion parameters (the safetensors checkpoint reports 8.54B parameters in total, stored in BF16).

Context Length

The model is trained on a context length of 8192 tokens.
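
Prompts longer than this window need to be truncated before generation; below is a minimal sketch (not part of the original card) using the tokenizer's standard truncation options.

# Hypothetical example: keep prompts within the 8192-token context window.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
long_prompt = "Write me a poem about Machine Learning. " * 2000  # deliberately too long
inputs = tokenizer(
    long_prompt,
    return_tensors="pt",
    truncation=True,
    max_length=8192,  # matches the model's context length
)
print(inputs["input_ids"].shape)  # at most (1, 8192)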

Running the model on a CPU

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
model = AutoModelForCausalLM.from_pretrained("sidharthsajith7/armaGPT")

# Tokenize the prompt, generate a continuation, and decode it back to text.
input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))

Running the model on a single / multi GPU

# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
# device_map="auto" places the weights on the available GPU(s) automatically.
model = AutoModelForCausalLM.from_pretrained("sidharthsajith7/armaGPT", device_map="auto")

input_text = "Write me a poem about Machine Learning."
# Move the tokenized prompt to the GPU before generating.
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
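
Since the checkpoint is stored in BF16, it can also be loaded directly in that precision to roughly halve memory use compared with float32; a sketch using the standard torch_dtype argument (not part of the original card):

# Load the model in bfloat16 on GPU (requires hardware with bfloat16 support).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
model = AutoModelForCausalLM.from_pretrained(
    "sidharthsajith7/armaGPT",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)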