hululuzhu committed on
Commit
7f55b90
1 Parent(s): 53c0d8a

Add readme with examples and inference code sample

  1. README.md +82 -1
README.md CHANGED
@@ -1,3 +1,84 @@
  ---
- license: mit
+ language:
+ - zh
+ license: apache-2.0
+ tags:
+ - solidity
+ - web3
+ - code generation
+ widget:
+ - text: "pragma solidity ^0.5.7;\n// Context: ParentA | Functions: helloA helloB | Constants: constantA \ncontract HelloWorld is ParentA {"
  ---
+
13
+ # A code autocomplete T5 model for solidity
14
+ - Hello world example to use this model, notice the input `text` includes
15
+ - Header solidity version like `pragma solidity ^0.5.7`
16
+ - Ancestor class/library info, e.g. public functions and constants from `ParentA`
17
+ - Contract/Library/Interface declaration header, e.g. `HelloWorld` ended with `{`
18
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, T5ForConditionalGeneration
+
+ # Use a GPU when one is available; otherwise fall back to CPU
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ tokenizer = AutoTokenizer.from_pretrained("hululuzhu/solidity-autocomplete")
+ model = T5ForConditionalGeneration.from_pretrained("hululuzhu/solidity-autocomplete").to(device)
+
+ text = """pragma solidity ^0.5.7;
+ // Context: ParentA | Functions: helloA helloB | Constants: constantA
+ contract HelloWorld is ParentA {"""
+ input_ids = tokenizer(text, return_tensors="pt", truncation=True).input_ids.to(device)
+
+ # Tune the beam/top-k/top-p parameters to get good results.
+ # Note: top_p and top_k only take effect when do_sample=True is also passed.
+ generated_ids = model.generate(input_ids, max_length=256, num_beams=5, top_p=0.95, top_k=50)
+ print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
+ ```
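
The input layout described in the README (version header, `// Context:` comment, declaration header ending with `{`) can be assembled programmatically. The following is a minimal sketch, not the author's preprocessing code: the helper name `build_prompt` and its parameters are hypothetical, it assumes a single parent, and the exact whitespace the model expects is an assumption based on the samples shown here.

```python
from typing import Sequence


def build_prompt(
    pragma: str,
    kind: str,
    name: str,
    parent: str,
    parent_functions: Sequence[str] = (),
    parent_constants: Sequence[str] = (),
) -> str:
    """Assemble the three-line input: pragma, context comment, declaration header.

    Hypothetical helper; the layout mirrors the README samples only.
    """
    context = (
        f"// Context: {parent} | Functions: {' '.join(parent_functions)}"
        f" | Constants: {' '.join(parent_constants)}"
    )
    # kind is "contract", "library", or "interface"
    header = f"{kind} {name} is {parent} {{"
    return "\n".join([f"pragma solidity {pragma};", context, header])


prompt = build_prompt(
    "^0.5.7", "contract", "HelloWorld", "ParentA",
    parent_functions=["helloA", "helloB"],
    parent_constants=["constantA"],
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the hand-written `text` in the inference sample.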
+
+
+ - Base T5 code model: https://huggingface.co/Salesforce/codet5-large
+ - Source data: https://huggingface.co/datasets/mwritescode/slither-audited-smart-contracts
+ - Processing steps: clean the data, segment it at the contract level, and split each contract into input and output
+ - Input sample after processing
+
+ ```
+ pragma solidity 0.5.7;
+ // Context: PauserRole | Functions: isPauser addPauser renouncePauser | Constants:
+ contract Pausable is PauserRole {
+ ```
+
+ - Output sample after processing (**note that the indentation is stripped; this is intentional, to reduce the token count**)
+
+ ```
+ event Paused(address account);
+ event Unpaused(address account);
+ bool private _pausableActive;
+ bool private _paused;
+ constructor () internal {
+ _paused = false;
+ }
+ function paused() public view returns (bool) {
+ return _paused;
+ }
+ modifier whenNotPaused() {
+ require(!_paused);
+ _;
+ }
+ modifier whenPaused() {
+ require(_paused);
+ _;
+ }
+ function pause() public onlyPauser whenNotPaused whenPausableActive {
+ _paused = true;
+ emit Paused(msg.sender);
+ }
+ function unpause() public onlyPauser whenPaused whenPausableActive {
+ _paused = false;
+ emit Unpaused(msg.sender);
+ }
+ function _setPausableActive(bool _active) internal {
+ _pausableActive = _active;
+ }
+ modifier whenPausableActive() {
+ require(_pausableActive);
+ _;
+ }
+ }
+ ```
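
Because the output comes back without indentation, a caller may want to restore it before displaying the code. The sketch below is one possible approach, not part of the model or its tooling: the function name `reindent` is hypothetical, and it simply tracks brace depth line by line, so it would be confused by braces inside string literals or comments. It starts at depth 1 on the assumption that the output is the body of a declaration whose opening `{` was in the input.

```python
def reindent(flat_code: str, indent: str = "    ", start_depth: int = 1) -> str:
    """Re-indent flattened Solidity by tracking brace depth (naive sketch)."""
    depth = start_depth
    out = []
    for raw in flat_code.splitlines():
        line = raw.strip()
        if not line:
            continue
        closes_first = line.startswith("}")
        # A line that starts with "}" closes a block, so dedent before emitting it
        if closes_first:
            depth = max(depth - 1, 0)
        out.append(indent * depth + line)
        # Remaining braces on the line set the depth for the lines that follow
        opens = line.count("{")
        closes = line.count("}") - (1 if closes_first else 0)
        depth = max(depth + opens - closes, 0)
    return "\n".join(out)


flat = "\n".join([
    "bool private _paused;",
    "function paused() public view returns (bool) {",
    "return _paused;",
    "}",
    "}",
])
formatted = reindent(flat)
print(formatted)
```

For production use, a real Solidity formatter would be a more robust choice than this brace counter.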
+ - Source training code: To be added