Alpha-Ophiuchi-mini-128k-v0.1

Disclaimer

Note: All models and LoRAs from the Ophiuchus series were created with the sole purpose of research. The usage of this model and/or its related LoRA implies agreement with the following terms:

The user is responsible for what they might do with it, including how the output of the model is interpreted and used;
The user should not use the model and its outputs for any illegal purposes;
The user is the only one responsible for any misuse or negative consequences from using this model and/or its related LoRA.

I do not endorse any particular perspectives presented in the training data.

Ophiuchus Series

This series aims to develop highly uncensored Large Language Models (LLMs) with the following focuses:

Science, Technology, Engineering, and Mathematics (STEM)
Computer Science (including programming)
Social Sciences

And several key cognitive skills, including but not limited to:

Reasoning and logical deduction
Critical thinking
Analysis

While maintaining strong overall knowledge and expertise, the models will undergo refinement through:

Fine-tuning processes
Model merging techniques including Mixture of Experts (MoE)

Please note that these models are experimental and may vary in effectiveness. Feedback, critiques, and questions are welcome and help improve them.

Base

This model and its related LoRA were fine-tuned from https://huggingface.co/failspy/Phi-3-mini-128k-instruct-abliterated-v3.

LoRA

The LoRA merged with the base model is available at https://huggingface.co/fearlessdots/Alpha-Ophiuchi-mini-128k-v0.1-LoRA.

Datasets

https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal

Fine Tuning

- Quantization Configuration

load_in_4bit=True
bnb_4bit_quant_type="fp4"
bnb_4bit_compute_dtype=compute_dtype
bnb_4bit_use_double_quant=False
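
For readers who want to reproduce the setup, the values above map onto the `BitsAndBytesConfig` class from the `transformers` library. The following is a minimal sketch, not the exact script used for this model. The value of `compute_dtype` is not stated in this card, so `torch.float16` below is only a placeholder assumption.

```python
import torch
from transformers import BitsAndBytesConfig

# ASSUMPTION: the card does not say which dtype `compute_dtype` was.
# torch.float16 is a placeholder; substitute the dtype you actually need.
compute_dtype = torch.float16

# 4-bit quantization settings, copied from the values listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # load weights in 4-bit precision
    bnb_4bit_quant_type="fp4",             # FP4 data type (as opposed to "nf4")
    bnb_4bit_compute_dtype=compute_dtype,  # dtype used for matrix multiplications
    bnb_4bit_use_double_quant=False,       # no nested quantization of the constants
)
```

A config object like this is typically passed as `quantization_config=bnb_config` to `AutoModelForCausalLM.from_pretrained` when loading the base model.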

- PEFT Parameters

lora_alpha=64
lora_dropout=0.05
r=128
bias="none"
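
To make these numbers more concrete, here is a small sketch of what `r` and `lora_alpha` imply. It assumes standard LoRA scaling (`lora_alpha / r`, i.e. rank-stabilized scaling is not enabled), and the 3072 x 3072 layer shape is a hypothetical example. The target modules actually adapted are not listed in this card.

```python
# Values from the PEFT parameters above.
lora_alpha = 64
r = 128


def lora_scaling(lora_alpha: int, r: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update B @ A."""
    return lora_alpha / r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one linear layer: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


scaling = lora_scaling(lora_alpha, r)

# HYPOTHETICAL layer shape, chosen only for illustration.
d_in, d_out = 3072, 3072
adapter_params = lora_param_count(d_in, d_out, r)
full_params = d_in * d_out

print(f"scaling factor: {scaling}")
print(f"adapter params per layer: {adapter_params:,}")
print(f"fraction of the full matrix: {adapter_params / full_params:.4f}")
```

With `lora_alpha` at half of `r`, the low-rank update is scaled by 0.5 before being added to the frozen weights.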

- Training Arguments

num_train_epochs=1
per_device_train_batch_size=1
gradient_accumulation_steps=4
optim="adamw_bnb_8bit"
save_steps=25
logging_steps=25
learning_rate=2e-4
weight_decay=0.001
fp16=False
bf16=False
max_grad_norm=0.3
max_steps=-1
warmup_ratio=0.03
group_by_length=True
lr_scheduler_type="constant"
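
As a rough illustration of how the batch-related settings above interact, the sketch below computes the effective batch size and an approximate number of optimizer steps and checkpoints for one epoch. The single-device setup and the dataset size of 10,000 examples are assumptions made only for this example. Neither is stated in this card, and the exact step count can differ slightly depending on the trainer version.

```python
import math

# Values from the training arguments above.
per_device_train_batch_size = 1
gradient_accumulation_steps = 4
num_train_epochs = 1
save_steps = 25


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Number of examples that contribute to each optimizer update."""
    return per_device * accum_steps * num_devices


def optimizer_steps(num_examples: int, batch: int, epochs: int = 1) -> int:
    """Approximate number of optimizer updates over the whole run."""
    return math.ceil(num_examples / batch) * epochs


# ASSUMPTIONS: one GPU and a hypothetical dataset of 10,000 examples.
batch = effective_batch_size(per_device_train_batch_size, gradient_accumulation_steps)
steps = optimizer_steps(10_000, batch, num_train_epochs)
checkpoints = steps // save_steps

print(f"effective batch size: {batch}")
print(f"optimizer steps: {steps}")
print(f"checkpoints written: {checkpoints}")
```

Since `max_steps=-1`, the run length is determined by `num_train_epochs` rather than by a fixed step budget.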

Credits

Microsoft (https://huggingface.co/microsoft): for the original Phi-3;
HuggingFace: for hosting this model and for creating the fine-tuning tools used;
failspy (https://huggingface.co/failspy): for the base model and the orthogonalization implementation;
NobodyExistsOnTheInternet (https://huggingface.co/NobodyExistsOnTheInternet): for the incredible dataset.

A huge thank you to all of them ☺️
