---
datasets:
- mlabonne/Evol-Instruct-Python-26k
language:
- en
library_name: adapter-transformers
tags:
- code
---

## Model Details

### Model Description

- **Developed by:** Maulida Suryaning Aisha
- **Model type:** large language model for code generation
- **Language(s) (NLP):** English
- **License:** [More Information Needed]
- **Finetuned from model:** llama2

### Model Sources

- **Repository:** https://github.com/unslothai/unsloth
- **Developed by:** unsloth

### Model Parameters

The LoRA adapter was configured with the following parameters (a sketch of the corresponding unsloth call follows the list):

- r = 16,
- target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
- lora_alpha = 16,
- lora_dropout = 0,
- bias = "none",
- use_gradient_checkpointing = "unsloth",
- random_state = 3407,
- use_rslora = False,
- loftq_config = None

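A minimal sketch of how these values would be passed to unsloth's `FastLanguageModel.get_peft_model`. The base checkpoint name, `max_seq_length`, and `load_in_4bit` below are assumptions, since the card only states that the base model is llama2:

```python
from unsloth import FastLanguageModel

# Load the base model; checkpoint name and sequence length are assumptions.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-2-7b-bnb-4bit",  # hypothetical base checkpoint
    max_seq_length=2048,                       # assumption
    load_in_4bit=True,                         # assumption
)

# Attach the LoRA adapter with the parameters listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
    use_rslora=False,
    loftq_config=None,
)
```
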
## Usage and limitations

This model generates code from instructions given by the user. Note that it can produce many programming languages, since it inherits this ability from the Llama 2 base model. However, after fine-tuning it is best at generating Python code, because it has so far only been trained on a Python code dataset.

## How to Get Started with the Model

Use the link below to try the model; a loading sketch is also shown below.
//

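As a rough sketch (not taken from the card), the fine-tuned model could be loaded for inference with unsloth. The repository id below is a hypothetical placeholder:

```python
from unsloth import FastLanguageModel

# Hypothetical repository id; replace with the actual model repo.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="suka-nlp/llama2-python-codegen",  # placeholder, not a real repo id
    max_seq_length=2048,                          # assumption
    load_in_4bit=True,                            # assumption
)
FastLanguageModel.for_inference(model)  # enable unsloth's faster inference mode

prompt = "Write a Python function that checks whether a number is prime."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
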
### Training Data

https://huggingface.co/datasets/mlabonne/Evol-Instruct-Python-26k

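A minimal sketch of loading this dataset and formatting it for fine-tuning. The column names (`instruction`, `output`) and the Alpaca-style template are assumptions, not confirmed by this card:

```python
from datasets import load_dataset

# Instruction-tuning data used for fine-tuning (26k Python-focused examples).
dataset = load_dataset("mlabonne/Evol-Instruct-Python-26k", split="train")

def to_prompt(example):
    # Assumed schema and template; the exact prompt format used in training is not stated.
    example["text"] = (
        "### Instruction:\n" + example["instruction"] +
        "\n\n### Response:\n" + example["output"]
    )
    return example

dataset = dataset.map(to_prompt)
```
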
#### Training Hyperparameters

The model was fine-tuned with the following settings (a sketch of the corresponding trainer configuration follows the list):

- **Warmup_steps:** 5
- **lr_scheduler_type:** linear
- **Learning Rate:** 0.0002
- **Batch Size:** 8
- **Activation Function:** SiLU (Sigmoid Linear Unit), GeLU (Gaussian Error Linear Unit), Exact GeLU, Approximate GeLU
- **Weight_decay:** 0.001
- **Epochs:** 60
- **Optimizer:** adamw_8bit

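A rough sketch of how these hyperparameters could be expressed with `transformers.TrainingArguments` and TRL's `SFTTrainer`, the usual pairing with unsloth. Whether the batch size of 8 is per device or effective, and the exact TRL version used, are not stated in the card:

```python
from transformers import TrainingArguments
from trl import SFTTrainer

args = TrainingArguments(
    output_dir="outputs",            # assumption
    per_device_train_batch_size=8,   # "Batch Size: 8"; per-device vs. effective is an assumption
    warmup_steps=5,
    learning_rate=2e-4,
    lr_scheduler_type="linear",
    weight_decay=0.001,
    num_train_epochs=60,
    optim="adamw_8bit",
)

# Older TRL releases accept tokenizer/dataset_text_field directly;
# newer ones move these into SFTConfig.
trainer = SFTTrainer(
    model=model,                # the PEFT model from the sketch above
    tokenizer=tokenizer,
    train_dataset=dataset,      # the formatted dataset from the Training Data sketch
    dataset_text_field="text",
    max_seq_length=2048,        # assumption
    args=args,
)
trainer.train()
```
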
#### Testing Data

https://huggingface.co/datasets/google-research-datasets/mbpp/viewer/full

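A minimal sketch of loading the MBPP benchmark linked above; the `full` config and field names follow the dataset's published schema, not this card's evaluation script, which is not included:

```python
from datasets import load_dataset

# MBPP "full" configuration, test split.
mbpp = load_dataset("google-research-datasets/mbpp", "full", split="test")

example = mbpp[0]
print(example["text"])       # natural-language task description
print(example["test_list"])  # assert statements used to check generated code
```
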
### Testing Document

https://docs.google.com/spreadsheets/d/1hr8R4nixQsDC5cGGENTOLUW1jPCS_lltVRIeOzenBvA/edit?usp=sharing

### Results

Before fine-tuning:
- Accuracy: 17%
- Consistency: 0%

After fine-tuning:
- Accuracy: 67%
- Consistency: 100%