Model description
This model is a fine-tuned version of Open Llama 7B V2, trained to generate explanations for Cardano smart contracts.
- Base Model: Open Llama 7B V2
- Fine-Tuning Framework: Axolotl
- Hardware Used: NVIDIA L40 GPU
- Objective: Simplify and explain Cardano smart contracts.
For more information on the model, see UnboundedMarket AI Explainer Models; to visualize and browse the explained smart contracts, see the UnboundedMarket AI Explainer Interface.
Intended uses & limitations
The model is designed to help users and developers better understand Cardano smart contracts.
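The snippet below is a minimal inference sketch rather than an official usage recipe: it assumes the repository hosts a PEFT LoRA adapter on top of openlm-research/open_llama_7b_v2 (as the config below indicates) and that prompts follow the standard Alpaca template matching the `type: alpaca` loader; the instruction wording and generation settings are illustrative.

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base_id = "openlm-research/open_llama_7b_v2"
adapter_id = "unboundedmarket/smart_contract_explainer_open_llama_7b_v2"

# Load the base model and attach the LoRA adapter (assumes a PEFT adapter repo).
tokenizer = LlamaTokenizer.from_pretrained(base_id)
model = LlamaForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

contract_source = "..."  # paste the Cardano smart contract to explain here

# Standard Alpaca-style prompt; the exact instruction used during training is an assumption.
prompt = (
    "Below is an instruction that describes a task, paired with an input that provides "
    "further context. Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain the following Cardano smart contract.\n\n"
    f"### Input:\n{contract_source}\n\n### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=512, temperature=0.2, do_sample=True)

# Strip the prompt tokens and print only the generated explanation.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```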
Training and evaluation data
The model was trained on a dataset of Cardano smart contracts; see train_data.jsonl.
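Each record in the JSONL file is expected to follow the Alpaca schema consumed by the `type: alpaca` loader in the config below (instruction / input / output fields). The record shown here is purely illustrative, not an actual entry from the dataset:

```json
{"instruction": "Explain the following Cardano smart contract.", "input": "<contract source code>", "output": "<plain-language explanation of what the contract does and validates>"}
```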
Training procedure
The model was instruction-tuned with LoRA via Axolotl. The full Axolotl config used for training is shown below.
axolotl version: 0.6.0
```yaml
base_model: openlm-research/open_llama_7b_v2
# optionally might have model_type or tokenizer_type
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer
# Automatically upload checkpoint and final model to HF
# hub_model_id: username/custom_model_name
load_in_8bit: true
load_in_4bit: false
strict: false
push_dataset_to_hub:
datasets:
  - path: train_dataset.jsonl
    type: alpaca
dataset_prepared_path:
val_set_size: 0.1
adapter: lora
lora_model_dir:
sequence_len: 1024
sample_packing: false
lora_r: 8
lora_alpha: 16
lora_dropout: 0.0
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
lora_fan_in_fan_out:
wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:
output_dir: ./outputs/open_llama_7b_v2_explain_contracts
gradient_accumulation_steps: 1
micro_batch_size: 2
num_epochs: 4
optimizer: adamw_bnb_8bit
torchdistx_path:
lr_scheduler: cosine
learning_rate: 0.0002
train_on_inputs: false
group_by_length: false
bf16: false
fp16: true
tf32: false
gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
gptq_groupsize:
s2_attention:
gptq_model_v1:
warmup_steps: 20
evals_per_epoch: 4
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.1
fsdp:
fsdp_config:
special_tokens:
  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"
```
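For reference, the lora_* fields above correspond roughly to the following peft.LoraConfig. Axolotl builds this configuration internally, so the snippet is only an illustrative mapping; bias and task_type are shown with assumed defaults.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,                        # lora_r
    lora_alpha=16,              # lora_alpha
    lora_dropout=0.0,           # lora_dropout
    target_modules=[            # lora_target_modules
        "gate_proj", "down_proj", "up_proj",
        "q_proj", "v_proj", "k_proj", "o_proj",
    ],
    bias="none",                # assumed default
    task_type="CAUSAL_LM",
)
```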
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: AdamW (8-bit, bitsandbytes) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 20
- num_epochs: 4
- mixed_precision_training: Native AMP
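These settings map roughly onto Hugging Face TrainingArguments as sketched below. Axolotl constructs the actual arguments internally, so this is an illustrative equivalent rather than the exact call used; weight_decay is taken from the config above.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./outputs/open_llama_7b_v2_explain_contracts",
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=1,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=20,
    num_train_epochs=4,
    weight_decay=0.1,
    optim="adamw_bnb_8bit",      # 8-bit AdamW from bitsandbytes
    fp16=True,                   # mixed precision (Native AMP)
    gradient_checkpointing=True,
    logging_steps=1,
    seed=42,
)
```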
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.0718 | 0.0049 | 1 | 0.9605 |
| 0.7756 | 0.2512 | 51 | 0.6831 |
| 0.7316 | 0.5025 | 102 | 0.6300 |
| 0.5161 | 0.7537 | 153 | 0.5952 |
| 0.2465 | 1.0049 | 204 | 0.5775 |
| 0.3408 | 1.2562 | 255 | 0.5715 |
| 0.5834 | 1.5074 | 306 | 0.5610 |
| 0.4347 | 1.7586 | 357 | 0.5540 |
| 0.272 | 2.0099 | 408 | 0.5428 |
| 0.2509 | 2.2611 | 459 | 0.5885 |
| 0.2044 | 2.5123 | 510 | 0.5848 |
| 0.4006 | 2.7635 | 561 | 0.5771 |
| 0.2471 | 3.0148 | 612 | 0.5739 |
| 0.0865 | 3.2660 | 663 | 0.6318 |
| 0.1475 | 3.5172 | 714 | 0.6396 |
| 0.3631 | 3.7685 | 765 | 0.6394 |
Framework versions
- PEFT 0.14.0
- Transformers 4.47.1
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.21.0