license: apache-2.0
base_model: codeparrot/codeparrot-small
tags:
- generated_from_trainer
model-index:
- name: solidity-generator
  results: []
datasets:
- mwritescode/slither-audited-smart-contracts
pipeline_tag: text-generation
language:
- en
library_name: transformers
widget:
- text: contract MyToken is ERC20{
solidity-generator
This model generates Solidity smart contract code. It is a fine-tuned version of codeparrot/codeparrot-small, trained on a dataset of Solidity contracts, and is suited to drafting or suggesting contract structures.
Model description
This model is designed specifically for generating Solidity contracts. As a derivative of codeparrot-small, it builds on the parent model's general code-generation ability and is specialized for understanding and generating Solidity source code.
Performance
The model reached a loss of 0.2180 on the evaluation set (step 18000 in the training results).
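For a causal language model, a cross-entropy loss is often easier to interpret as perplexity. The sketch below assumes the reported 0.2180 is a mean per-token cross-entropy in natural-log units, which is how the transformers Trainer typically reports evaluation loss; the card itself does not state this.

```python
import math

# Evaluation loss reported in this card.
eval_loss = 0.2180

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.4f}")
```

Under that assumption the perplexity is roughly 1.24. A low perplexity on held-out contracts says nothing about whether generated code is secure or even compiles.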
Intended Uses & Limitations
Intended Uses:
- Assist developers by auto-generating contract code snippets based on prompts.
- Help in understanding and drafting complex contract structures.
Limitations:
- The generated code must be reviewed for security and functional correctness.
- The clarity of the generated code largely depends on the specificity of the prompt.
Training Details
Dataset
The model was fine-tuned on the mwritescode/slither-audited-smart-contracts dataset, which comprises a range of Solidity contracts.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 7e-05
- train_batch_size: 5
- eval_batch_size: 5
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 144
- num_epochs: 8
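To make the `lr_scheduler_type: linear` and `lr_scheduler_warmup_steps: 144` settings concrete, here is a minimal pure-Python sketch of a linear warmup followed by linear decay to zero, the same shape as the `get_linear_schedule_with_warmup` schedule in transformers. The function name `linear_lr` and the value of `TOTAL_STEPS` are illustrative assumptions; the card does not state the total number of optimizer steps.

```python
def linear_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


BASE_LR = 7e-05        # learning_rate from the list above
WARMUP_STEPS = 144     # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 46_000   # hypothetical: not stated in this card

for step in (0, 72, 144, 23_072, TOTAL_STEPS):
    print(step, linear_lr(step, BASE_LR, WARMUP_STEPS, TOTAL_STEPS))
```

With these numbers the rate peaks at 7e-05 right after warmup, which is a very short ramp relative to the full run.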
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
0.302 | 0.35 | 2000 | 0.3237 |
0.298 | 0.69 | 4000 | 0.2871 |
0.232 | 1.04 | 6000 | 0.2645 |
0.2415 | 1.38 | 8000 | 0.2522 |
0.2261 | 1.73 | 10000 | 0.2431 |
0.1924 | 2.07 | 12000 | 0.2332 |
0.1913 | 2.42 | 14000 | 0.2282 |
0.2152 | 2.76 | 16000 | 0.2215 |
0.1508 | 3.11 | 18000 | 0.2180 |
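The table can be used for a quick sanity check of the run. The sketch below copies the epoch, step, and validation-loss columns, estimates the number of optimizer steps per epoch from the last row, and confirms that validation loss fell at every logged checkpoint. The epoch values are rounded to two decimals in the table, so the steps-per-epoch figure is only an estimate.

```python
# (epoch, step, validation_loss) rows copied from the table above.
rows = [
    (0.35, 2000, 0.3237),
    (0.69, 4000, 0.2871),
    (1.04, 6000, 0.2645),
    (1.38, 8000, 0.2522),
    (1.73, 10000, 0.2431),
    (2.07, 12000, 0.2332),
    (2.42, 14000, 0.2282),
    (2.76, 16000, 0.2215),
    (3.11, 18000, 0.2180),
]

# The last row has the smallest relative rounding error in the epoch column.
last_epoch, last_step, _ = rows[-1]
steps_per_epoch = last_step / last_epoch

# Check that each checkpoint improves on the previous one.
val_losses = [loss for _, _, loss in rows]
monotonic = all(later < earlier for earlier, later in zip(val_losses, val_losses[1:]))

print(f"estimated steps per epoch: {steps_per_epoch:.0f}")
print(f"validation loss decreased at every checkpoint: {monotonic}")
```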
Framework versions
- Transformers 4.31.0
- Pytorch 2.0.1+cu118
- Datasets 2.14.3
- Tokenizers 0.13.3
How to Use
To generate Solidity contract code with this model, load it with the transformers library and sample from a code prompt:
from transformers import AutoTokenizer, AutoModelForCausalLM
# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("ckandemir/solidity_generator")
model = AutoModelForCausalLM.from_pretrained("ckandemir/solidity_generator")
# Input your code prompt
input_text = "contract MyToken is ERC20{"
input_ids = tokenizer.encode(input_text, return_tensors='pt')
sample_output = model.generate(input_ids, do_sample=True, max_length=400, num_return_sequences=1, temperature=0.7)
# Decode and print the generated text
generated_text = tokenizer.decode(sample_output[0], skip_special_tokens=True)
print(generated_text)
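Because generation stops at `max_length` tokens, the sampled text may end mid-function or run on into a second contract. One possible post-processing step, sketched below, keeps only the text up to the brace that closes the first opened block. The helper `truncate_to_balanced` is hypothetical and not part of this model or of transformers, and it is deliberately naive: it does not account for braces inside strings or comments.

```python
def truncate_to_balanced(text: str) -> str:
    """Cut `text` after the brace that closes the first opened block.

    If the braces never balance, the text is returned unchanged.
    """
    depth = 0
    for i, ch in enumerate(text):
        if ch == "{":
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                # First top-level block just closed; drop everything after it.
                return text[: i + 1]
    return text


# Stand-in for model output; in practice pass `generated_text` from above.
sample = "contract A {\n    function f() public {}\n}\ncontract B {"
print(truncate_to_balanced(sample))
```

The trimmed output still needs the security and correctness review described under Limitations; balanced braces do not imply that the contract compiles.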