solidity-t5 / README.md
hululuzhu's picture
Change language to EN
fdb96c8
|
raw
history blame
2.75 kB
metadata
language:
  - en
license: apache-2.0
tags:
  - solidity
  - web3
  - code generation
widget:
  - text: |-
      pragma solidity ^0.5.7;
      // Context: ParentA | Functions: helloA helloB | Constants: constantA 
      contract HelloWorld is ParentA {

A code autocomplete T5 model for solidity

  • Hello world example to use this model, notice the input text includes
    • Header solidity version like pragma solidity ^0.5.7
    • Ancestor class/library info, e.g. public functions and constants from ParentA
    • Contract/Library/Interface declaration header, e.g. HelloWorld ended with {
from transformers import AutoTokenizer, T5ForConditionalGeneration
tokenizer = AutoTokenizer.from_pretrained("hululuzhu/solidity-autocomplete")
model = T5ForConditionalGeneration.from_pretrained("hululuzhu/solidity-autocomplete")

text = """pragma solidity ^0.5.7;
// Context: ParentA | Functions: helloA helloB | Constants: constantA 
contract HelloWorld is ParentA {"""
input_ids = model.tokenizer(text, return_tensors="pt", truncation=True).input_ids.to('cuda')

# Need to tune beam/topk/topp params to get good outcome
generated_ids = model.model.generate(input_ids, max_length=256, num_beams=5, top_p=0.95, top_k=50)
print(model.tokenizer.decode(generated_ids[0], skip_special_tokens=True))
  • Base T5 code model: https://huggingface.co/Salesforce/codet5-large
  • Source data: https://huggingface.co/datasets/mwritescode/slither-audited-smart-contracts
    • Processing steps: Clean, contract-level segmentation sepration, split in and out

    • After processing input sample

      pragma solidity 0.5.7;
      // Context: PauserRole | Functions: isPauser addPauser renouncePauser | Constants: 
      contract Pausable is PauserRole {
      
    • After processing output sample (notice indentation is bad, this is intentional to reduce token size)

      event Paused(address account);
      event Unpaused(address account);
      bool private _pausableActive;
      bool private _paused;
      constructor () internal {
      _paused = false;
      }
      function paused() public view returns (bool) {
      return _paused;
      }
      modifier whenNotPaused() {
      require(!_paused);
      _;
      }
      modifier whenPaused() {
      require(_paused);
      _;
      }
      function pause() public onlyPauser whenNotPaused whenPausableActive {
      _paused = true;
      emit Paused(msg.sender);
      }
      function unpause() public onlyPauser whenPaused whenPausableActive {
      _paused = false;
      emit Unpaused(msg.sender);
      }
      function _setPausableActive(bool _active) internal {
      _pausableActive = _active;
      }
      modifier whenPausableActive() {
      require(_pausableActive);
      _;
      }
      }
      
  • Source training code: To be added