Ruckus-PyAssi-13b
This model is a fine-tuned version of meta-llama/Llama-2-13b-hf, trained on 10,000 examples from the flytech/llama-python-codes-30k dataset.
Model description
The model was trained in 4-bit precision using SFT (Supervised Fine-Tuning) and LoRA (Low-Rank Adaptation). Further fine-tuning is possible.
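To give a feel for why LoRA makes further fine-tuning cheap, here is a minimal sketch that counts trainable parameters for one weight matrix. The LoRA rank used for this model is not stated in this card, so `RANK = 16` is purely hypothetical, and `HIDDEN = 5120` is assumed to be the hidden size of the 13B base model.

```python
# Sketch: trainable parameters for one square projection matrix,
# full fine-tuning vs. a LoRA adapter (W + B @ A, with B: d_out x r, A: r x d_in).

def full_params(d_out: int, d_in: int) -> int:
    """Parameters updated when the whole weight matrix is trained."""
    return d_out * d_in

def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)

HIDDEN = 5120  # assumption: hidden size of the 13B base model
RANK = 16      # hypothetical: the rank actually used is not given in this card

full = full_params(HIDDEN, HIDDEN)
lora = lora_params(HIDDEN, HIDDEN, RANK)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.5f}")
```

Under these assumptions the adapter trains well under 1% of the parameters of the matrix it modifies, which is what lets a 13B model fit on a single GPU when combined with 4-bit weights.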
Intended uses & limitations
Code generation. Like all Ruckus models, it is:
- Created to serve as an execution layer
- Trained on data rich in Python code and instructional tasks
- Specially formatted for chat (see Inference)
Training procedure
The model was trained for 13 hours on a single A6000 GPU with 48 GB of VRAM.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 32
- eval_batch_size: 32 * 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- num_epochs: 5
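As a rough sanity check on these settings, the sketch below estimates how many optimizer steps the run would take. It assumes all 10,000 examples are used for training, that `train_batch_size` is the effective batch size (no gradient accumulation), and that the last partial batch is kept; none of this is confirmed by the card.

```python
import math

# Values taken from the hyperparameter list above.
NUM_EXAMPLES = 10_000
TRAIN_BATCH_SIZE = 32
NUM_EPOCHS = 5
LEARNING_RATE = 2e-4

# Assumption: one optimizer step per batch, final partial batch included.
steps_per_epoch = math.ceil(NUM_EXAMPLES / TRAIN_BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS

def constant_lr(step: int) -> float:
    """A constant schedule returns the same learning rate at every step."""
    return LEARNING_RATE

print(steps_per_epoch, total_steps, constant_lr(0), constant_lr(total_steps - 1))
```

If the actual run used gradient accumulation or held out part of the data for evaluation, the real step count would differ.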
Inference
- Make sure to wrap your prompt in [INST] tags: [INST]This is my prompt[/INST]
- Example: [INST]Ruckus, open google[/INST]
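The prompt format above can be applied with a small helper. This is a minimal sketch: `format_prompt` and `extract_reply` are hypothetical names, not part of any library, and `extract_reply` assumes the generated text echoes the prompt before the answer, which depends on how you run generation.

```python
def format_prompt(user_message: str) -> str:
    """Wrap a user message in the [INST]...[/INST] tags the model expects."""
    return f"[INST]{user_message.strip()}[/INST]"

def extract_reply(generated_text: str) -> str:
    """Return the text after the last [/INST] tag, or the whole text if absent."""
    _, found, tail = generated_text.rpartition("[/INST]")
    return tail.strip() if found else generated_text.strip()

prompt = format_prompt("Ruckus, open google")
print(prompt)
```

The resulting string is what you would pass to the tokenizer; `extract_reply` can then be applied to the decoded output to drop the echoed prompt.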
Framework versions
- Transformers 4.34.0
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.14.1
Model tree for flytech/Ruckus-PyAssi-13b
Base model
meta-llama/Llama-2-13b-hf