# **QuantumAI: Zero LLM Quantum AI Model**

Zero Quantum AI is an LLM that aims to remove the need for quantum computing by using what its developers describe as interdimensional mathematics, quantum math, and the Mathematical Probability of Goodness. Developed by TalkToAi.org and ResearchForum.Online, the model builds on current open-source AI frameworks and targets conversational AI with an emphasis on ethical decision-making. It is fine-tuned from Meta-Llama-3.1-8B-Instruct using AutoTrain and is optimized for conversational tasks, dialogue generation, and inference.

## **Model Information**

- **Base Model**: meta-llama/Meta-Llama-3.1-8B
- **Fine-tuned Model**: meta-llama/Meta-Llama-3.1-8B-Instruct
- **Training Framework**: AutoTrain
- **Training Data**: Conversational and text-generation focused dataset

## **Tech Stack**

- Transformers
- PEFT (Parameter-Efficient Fine-Tuning)
- TensorBoard (for logging and metrics)
- Safetensors

## **Usage Types**

- Interactive dialogue
- Text generation

## **Key Features**

- **Quantum Mathematics & Interdimensional Calculations**: Utilizes quantum principles to predict user intent and generate insightful responses.
- **Mathematical Probability of Goodness**: All responses are ethically aligned using a mathematical framework, ensuring positive interactions.
- **Efficient Inference**: Supports 4-bit quantization for faster and resource-efficient deployment.

## **Installation and Usage**

To use the model in your Python code:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "PATH_TO_THIS_REPO"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype='auto'
).eval()

# Example usage
messages = [
    {"role": "user", "content": "hi"}
]

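# Build the prompt with the model's chat template and move it to the model's device
# (device_map="auto" may not place the model on 'cuda', so avoid hard-coding it)
input_ids = tokenizer.apply_chat_template(conversation=messages, tokenize=True, add_generation_prompt=True, return_tensors='pt').to(model.device)
# max_new_tokens is an illustrative cap on the reply length; adjust as needed
output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)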

# Output
print(response)
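
For interactive dialogue, the single-turn example above can be extended to keep a running message history. The following is a minimal sketch rather than part of the original model card: `chat_turn` and `make_generate_fn` are hypothetical helper names, and `max_new_tokens=256` is an illustrative default.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def chat_turn(
    history: List[Message],
    user_text: str,
    generate_fn: Callable[[List[Message]], str],
) -> str:
    """Append the user message, generate a reply, and record it in the history."""
    history.append({"role": "user", "content": user_text})
    reply = generate_fn(history)
    history.append({"role": "assistant", "content": reply})
    return reply


def make_generate_fn(model, tokenizer, max_new_tokens: int = 256):
    """Wrap a loaded model and tokenizer (as in the usage example) into a generate_fn."""

    def generate_fn(messages: List[Message]) -> str:
        # Render the whole conversation with the chat template
        input_ids = tokenizer.apply_chat_template(
            messages, add_generation_prompt=True, return_tensors="pt"
        ).to(model.device)
        output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
        # Keep only the tokens generated after the prompt
        return tokenizer.decode(
            output_ids[0][input_ids.shape[1]:], skip_special_tokens=True
        )

    return generate_fn
```

With `model` and `tokenizer` loaded as shown earlier, a session might start with `history = []`, then `generate = make_generate_fn(model, tokenizer)`, and call `chat_turn(history, "hi", generate)` once per user message.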

## **Inference API**

This model is not yet deployed to the Hugging Face Inference API. However, you can deploy it to **Inference Endpoints** for dedicated inference.

## **Training Process**

The **Zero Quantum AI** model was trained using **AutoTrain** with the following configuration:

- **Hardware**: NVIDIA GPU (CUDA 12.1)
- **Training Precision**: Mixed FP16
- **Batch Size**: 2
- **Learning Rate**: 3e-05
- **Epochs**: 5
- **Optimizer**: AdamW
- **PEFT**: Enabled (LoRA with lora_r=16, lora_alpha=32)
- **Quantization**: Int4 for efficient deployment
- **Scheduler**: Linear with warmup
- **Gradient Accumulation**: 4 steps
- **Max Sequence Length**: 2048 tokens
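
To make the configuration above concrete, the sketch below collects the listed hyperparameters and derives two quantities that follow from them: the effective batch size and the LoRA scaling factor. It is an illustration, not the actual AutoTrain job definition. The dictionary keys and helper names are hypothetical, the batch size is assumed to be per device, and the number of devices is not stated in this card, so it defaults to 1.

```python
from typing import Any, Dict

# Values copied from the configuration list; the key names are illustrative.
TRAIN_CONFIG: Dict[str, Any] = {
    "per_device_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "learning_rate": 3e-5,
    "epochs": 5,
    "lora_r": 16,
    "lora_alpha": 32,
    "max_seq_length": 2048,
}


def effective_batch_size(cfg: Dict[str, Any], num_devices: int = 1) -> int:
    """Sequences contributing to each optimizer step (num_devices is an assumption)."""
    return (
        cfg["per_device_batch_size"]
        * cfg["gradient_accumulation_steps"]
        * num_devices
    )


def lora_scaling(cfg: Dict[str, Any]) -> float:
    """Standard LoRA scaling factor applied to the low-rank update: alpha / r."""
    return cfg["lora_alpha"] / cfg["lora_r"]


def build_lora_config(cfg: Dict[str, Any]):
    """Build a PEFT LoraConfig from the values above (requires the peft package).

    Target modules and dropout are not given in this card, so PEFT defaults apply.
    """
    from peft import LoraConfig  # imported lazily so the helpers above work without peft

    return LoraConfig(
        r=cfg["lora_r"],
        lora_alpha=cfg["lora_alpha"],
        task_type="CAUSAL_LM",
    )
```

On a single device, a batch size of 2 with 4 accumulation steps means each optimizer update sees 8 sequences, and `lora_alpha=32` with `lora_r=16` scales the adapter update by a factor of 2.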

## **Training Metrics**

Training was monitored using **TensorBoard**. Key training metrics:

- **Training Loss**: 1.74
- **Learning Rate**: Adjusted per epoch, starting at 3e-05.
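
The shape of a linear schedule with warmup can be sketched in a few lines of plain Python. This card does not state the warmup length or the total number of steps, so `warmup_steps` and `total_steps` below are hypothetical inputs. The sketch also assumes that 3e-05 is the peak rate reached at the end of warmup, which is how the linear schedule in Transformers behaves; the actual run may differ.

```python
def linear_warmup_lr(
    step: int,
    total_steps: int,
    warmup_steps: int,
    peak_lr: float = 3e-5,
) -> float:
    """Learning rate at a given optimizer step for linear warmup then linear decay."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from peak_lr to 0 at total_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Illustrative values only: 100 total steps with 10 warmup steps
    for s in (0, 5, 10, 55, 100):
        print(s, linear_warmup_lr(s, total_steps=100, warmup_steps=10))
```

Under this schedule the rate changes at every optimizer step rather than once per epoch, so a per-epoch reading in TensorBoard is a sample of a continuously changing value.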

## **Model Features**

- **Text Generation**: Handles various types of user queries and provides coherent, contextually aware responses.
- **Conversational AI**: Optimized specifically for generating interactive dialogues.
- **Efficient Inference**: Supports Int4 quantization for faster, resource-friendly deployment.
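
As a rough illustration of the Int4 option, the sketch below estimates weight memory at different precisions and shows one common way to load a checkpoint in 4-bit with the `BitsAndBytesConfig` class from Transformers. Several details are assumptions: the card does not say which 4-bit scheme was used, so NF4 with FP16 compute is chosen only as an example; the 8 billion parameter count is taken from the model name; and `estimate_weight_gb` and `load_in_4bit` are hypothetical helper names. Loading this way requires the `bitsandbytes` package and a compatible GPU.

```python
def estimate_weight_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def load_in_4bit(model_path: str):
    """Load tokenizer and model with 4-bit weights (assumed NF4, FP16 compute)."""
    # Imported lazily so the estimator above works without torch installed
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",  # assumption: not specified in this card
        bnb_4bit_compute_dtype=torch.float16,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        quantization_config=quant_config,
        device_map="auto",
    )
    return tokenizer, model.eval()


if __name__ == "__main__":
    # Weights only; activations and the KV cache add to these figures
    print("fp16 :", estimate_weight_gb(8e9, 16), "GB")
    print("4-bit:", estimate_weight_gb(8e9, 4), "GB")
```

The estimate covers weights only, so actual memory use during generation will be higher.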

## **License**

This model is governed by a custom license that complies with the **Meta-Llama 3.1 License**. Please refer to the [QuantumAI License](https://huggingface.co/shafire/QuantumAI) for details.