
ML1 Previews

This repository contains the preview checkpoints for the ML1 model - Reddit Post

Watch training live here: https://api.wandb.ai/links/nickmitchko/t5d47kzr

Checkpoints

| Model | Epoch Pct | Link |
| --- | --- | --- |
| ML1-34b | 15% | Directory |
| ML1-34b | 50% | ~ |
| ML1-34b | 100% | ~ |
| ML1-mistral-7b | 50% | ~ |
| ML1-mistral-7b | 100% | ~ |
| ML1-70b | 15% | ~ |
| ML1-70b | 50% | ~ |
| ML1-70b | 100% | ~ |

Model Description

The goal is to develop a series of models that demonstrate superior performance when trained on high-quality data. To achieve this, I plan to experiment with the lovely dataset produced by /u/docsoc1. Huge shout out to them! If you'd like to view that dataset, the link is below.

Dataset: emrgnt-cmplxty/sciphi-textbooks-are-all-you-need
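For reference, the dataset can be pulled down with the Hugging Face datasets library. This is a minimal sketch; the "train" split name is an assumption.

```python
from datasets import load_dataset

# Load the SciPhi textbook-style dataset from the Hugging Face Hub
# (using the "train" split is an assumption)
dataset = load_dataset("emrgnt-cmplxty/sciphi-textbooks-are-all-you-need", split="train")
print(dataset[0])
```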

Prompt Format

The model is trained using the Alpaca format. Please see here or below for that format:

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
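A minimal sketch of filling this template in Python; the helper function name is illustrative and not part of the model's API.

```python
def build_alpaca_prompt(instruction: str) -> str:
    """Format an instruction using the Alpaca template shown above."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n"
        f"{instruction}\n\n"
        "### Response:\n"
    )

prompt = build_alpaca_prompt("Summarize the key ideas behind LoRA fine-tuning.")
print(prompt)
```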

Architecture

nmitchko/ML1-34b-previews is a repository of LoRA checkpoints for a large language model, fine-tuned on textbook-style synthesized data in the style of Phi 1/1.5. It is based on codellama-34b-hf, a model with 34 billion parameters.

The primary goal of this model is to test various fine-tuning methods on high-quality data. It was trained with LoRA, specifically QLoRA across multiple GPUs, to reduce the memory footprint.

See Training Parameters for more info. This LoRA supports 4-bit and 8-bit modes.

Requirements

bitsandbytes>=0.41.0
peft@main
transformers@main

Steps to load this model (a minimal sketch follows the list):

  1. Load base model (codellama-34b-hf) using transformers
  2. Download a checkpoint folder (checkpoint-1)
  3. Apply LoRA using peft
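The sketch below walks through these three steps with transformers and peft. The exact base-model Hub id and the local checkpoint path are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "codellama/CodeLlama-34b-hf"  # assumed Hub id for codellama-34b-hf

# 1. Load the base model (4-bit here, matching the quantization config below)
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

# 2. + 3. Apply a downloaded LoRA checkpoint folder (e.g. ./checkpoint-1)
model = PeftModel.from_pretrained(model, "./checkpoint-1")
model.eval()
```

After the adapter is applied, generation works as with any transformers causal LM via model.generate.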

Training Parameters

The model is currently training on emrgnt-cmplxty/sciphi-textbooks-are-all-you-need, which contains textbook-style synthesized data.

| Item | Amount | Units |
| --- | --- | --- |
| LoRA Rank | 64 | ~ |
| LoRA Alpha | 16 | ~ |
| Learning Rate | 1e-4 | SI |
| Dropout | 5 | % |
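A peft LoraConfig mirroring these values might look like the sketch below; bias, task_type, and target modules are not stated above and are assumptions. The 1e-4 learning rate is an optimizer/trainer setting rather than part of LoraConfig.

```python
from peft import LoraConfig

# LoRA hyperparameters from the table above; bias and task_type are assumptions
lora_config = LoraConfig(
    r=64,               # LoRA Rank
    lora_alpha=16,      # LoRA Alpha
    lora_dropout=0.05,  # Dropout of 5%
    bias="none",            # assumption
    task_type="CAUSAL_LM",  # assumption
)
```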

Training procedure

The following bitsandbytes quantization config was used during training:

  • quant_method: QuantizationMethod.BITS_AND_BYTES
  • load_in_8bit: False
  • load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: nf4
  • bnb_4bit_use_double_quant: True
  • bnb_4bit_compute_dtype: bfloat16
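The same settings can be expressed as a transformers BitsAndBytesConfig; the sketch below simply restates the values listed above.

```python
import torch
from transformers import BitsAndBytesConfig

# BitsAndBytes quantization config mirroring the settings listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    load_in_8bit=False,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=None,
    llm_int8_enable_fp32_cpu_offload=False,
    llm_int8_has_fp16_weight=False,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
```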

Framework versions

  • PEFT 0.6.0.dev0