# CodeLlama_7B_nlp_pp
This model is a fine-tuned version of codellama/CodeLlama-7b-hf on the AshtonIsNotHere/nlp_pp_code_dataset dataset. It achieves the following results on the evaluation set:
- Loss: 0.4129
- Accuracy: 0.8968
## Model description
This model has been fine-tuned for code completion on a dataset of NLP++ code.
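As a rough, hedged illustration (not part of the original card), the checkpoint can be loaded for completion with the standard `transformers` API. The prompt below is a made-up NLP++ fragment and the generation settings are arbitrary:

```python
# Minimal sketch of code completion with this checkpoint.
# Assumes transformers, torch, and accelerate are installed and a GPU with
# enough memory for a 7B model is available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AshtonIsNotHere/CodeLlama_7B_nlp_pp"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit the 7B weights on one GPU
    device_map="auto",
)

prompt = "@NODES _ROOT\n\n@RULES\n"  # hypothetical NLP++ snippet to complete
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```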
## Intended uses & limitations
More information needed
## Training and evaluation data
The dataset consists of a combination of scraped NLP++ code and NLP++ code examples from the VisualText website.
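A minimal sketch for inspecting the data with the `datasets` library (the split and field names are assumptions, not taken from the original card):

```python
# Hedged sketch: load and peek at the NLP++ code dataset used for fine-tuning.
from datasets import load_dataset

dataset = load_dataset("AshtonIsNotHere/nlp_pp_code_dataset")
print(dataset)               # split names and sizes
print(dataset["train"][0])   # one NLP++ code example (assumes a "train" split)
```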
## Training procedure
This model was trained in a multi-node, multi-GPU setup with DeepSpeed ZeRO Stage 3. For more information on the training setup, check out the GitHub repo.
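For orientation, a hedged sketch of what a ZeRO Stage 3 configuration for this kind of run might look like; the values below are illustrative DeepSpeed defaults, not the exact settings used for this model:

```python
# Hedged sketch: a DeepSpeed ZeRO Stage 3 config consumed by the Hugging Face
# Trainer. "auto" values are filled in from TrainingArguments at launch time.
import json

ds_config = {
    "zero_optimization": {
        "stage": 3,  # ZeRO-3: shard optimizer states, gradients, and parameters
        "overlap_comm": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "bf16": {"enabled": "auto"},
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}

with open("ds_config_zero3.json", "w") as f:
    json.dump(ds_config, f, indent=2)

# The Trainer picks this up via TrainingArguments(deepspeed="ds_config_zero3.json"),
# launched across nodes with the deepspeed launcher or torchrun.
```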
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.00012
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 7.0
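For reference, a hedged sketch of these hyperparameters expressed as `transformers` `TrainingArguments`; `output_dir` and the DeepSpeed config path are placeholders, not values from the original run:

```python
# Hedged sketch: the hyperparameters above as TrainingArguments.
# Adam betas/epsilon match the transformers defaults (0.9, 0.999, 1e-8).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="CodeLlama_7B_nlp_pp",      # placeholder
    learning_rate=1.2e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,          # 4 GPUs x batch 1 x 4 steps = total batch 16
    num_train_epochs=7.0,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    seed=42,
    deepspeed="ds_config_zero3.json",       # placeholder path to the ZeRO-3 config
)
```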
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 61   | 0.5100          | 0.8726   |
| No log        | 1.99  | 122  | 0.4129          | 0.8968   |
| No log        | 2.99  | 183  | 0.4166          | 0.9072   |
| No log        | 4.0   | 245  | 0.4595          | 0.9090   |
| No log        | 5.0   | 306  | 0.5181          | 0.9093   |
| No log        | 5.99  | 367  | 0.5553          | 0.9090   |
| No log        | 6.97  | 427  | 0.5603          | 0.9089   |
### Framework versions
- Transformers 4.30.2
- Pytorch 2.0.1+cu117
- Datasets 2.13.0
- Tokenizers 0.13.3