jonathanjordan21 committed on
Commit
0364e70
1 Parent(s): 8a9936a

Update README.md

  1. README.md +57 -23
README.md CHANGED
@@ -1,47 +1,81 @@
1
  ---
2
  library_name: peft
3
  base_model: declare-lab/flan-alpaca-base
 
 
4
  ---
5
 
6
- # Model Card for Model ID
7
 
8
- <!-- Provide a quick summary of what the model is/does. -->
9
 
 
10
 
 
 
 
 
 
11
 
12
- ## Model Details
13
 
14
- ### Model Description
 
 
15
 
16
- <!-- Provide a longer summary of what this model is. -->
 
 
 
 
17
 
 
 
 
 
 
 
 
 
18
 
 
 
19
 
20
- - **Developed by:** [More Information Needed]
21
- - **Funded by [optional]:** [More Information Needed]
22
- - **Shared by [optional]:** [More Information Needed]
23
- - **Model type:** [More Information Needed]
24
- - **Language(s) (NLP):** [More Information Needed]
25
- - **License:** [More Information Needed]
26
- - **Finetuned from model [optional]:** [More Information Needed]
27
 
28
- ### Model Sources [optional]
 
 
 
29
 
30
- <!-- Provide the basic links for the model. -->
 
 
31
 
32
- - **Repository:** [More Information Needed]
33
- - **Paper [optional]:** [More Information Needed]
34
- - **Demo [optional]:** [More Information Needed]
35
 
36
- ## Uses
 
 
37
 
38
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
39
 
40
- ### Direct Use
41
 
42
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 
 
 
 
 
 
 
 
 
43
 
44
- [More Information Needed]
45
 
46
  ### Downstream Use [optional]
47
 
@@ -205,4 +239,4 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
205
  ### Framework versions
206
 
207
 
208
- - PEFT 0.6.2
 
1
  ---
2
  library_name: peft
3
  base_model: declare-lab/flan-alpaca-base
4
+ datasets:
5
+ - knowrohit07/know_sql
6
  ---
7
 
8
+ ## Model Details
9
 
10
+ ### Model Description
11
 
12
+ This model is declare-lab/flan-alpaca-base fine-tuned on the knowrohit07/know_sql dataset.
13
 
14
+ - **Developed by:** Jonathan Jordan
15
+ - **Model type:** FLAN Alpaca
16
+ - **Language(s) (NLP):** English
17
+ - **License:** [More Information Needed]
18
+ - **Finetuned from model [optional]:** declare-lab/flan-alpaca-base
19
 
20
+ ## Uses
21
 
22
+ The model generates a SQL query string from a question and a MySQL table schema.
23
+ If you use a different SQL database (e.g. PostgreSQL, Oracle), rewrite the table schema in MySQL form before passing it to the model.
24
+ The generated query can be executed through a Python database connection (e.g. psycopg2, mysql.connector).
25
 
26
+ ### Limitations
27
+ 1. The question must be in English.
28
+ 2. Data type names differ between MySQL and other SQL databases; use MySQL type names in the table schema.
29
+ 3. The output always starts with `SELECT *`; you cannot choose which columns to retrieve.
30
+ 4. Aggregate functions are not supported.
31
 
32
+ ### Input Example
33
+ ```python
34
+ """Question: what is What was the result of the election in the Florida 18 district?\nTable: table_1341598_10 (result VARCHAR, district VARCHAR)\nSQL: """
35
+ ```
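The prompt layout in the input example can be assembled with a small helper. This is a minimal sketch: `build_prompt` is a hypothetical name, not part of the model or of PEFT, and it assumes the `Question: ... Table: ... SQL: ` layout shown above is the format the model expects.

```python
def build_prompt(question: str, table: str) -> str:
    """Format a question and a MySQL-style table schema into the prompt layout.

    Hypothetical helper for illustration; not part of the released model.
    """
    # Collapse stray whitespace so each field stays on its own line.
    question = " ".join(question.split())
    table = " ".join(table.split())
    return f"Question: {question}\nTable: {table}\nSQL: "


prompt = build_prompt(
    "What was the result of the election in the Florida 18 district?",
    "table_1341598_10 (result VARCHAR, district VARCHAR)",
)
print(prompt)
```

The returned string can be passed straight to the tokenizer in place of a hand-written f-string.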
36
+ ### Output Example
37
+ ```python
38
+ 'SELECT * FROM table_1341598_10 WHERE district = "Florida 18"'
39
+ ```
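Since the output is generated text, it may be worth checking that it has the expected `SELECT * FROM ...` shape before sending it to a database. The sketch below is one possible check, not something shipped with the model: `looks_like_single_select` is a hypothetical name, and the rule it applies (starts with `SELECT *`, no second statement) is an assumption based on the limitations listed in this card.

```python
import re

# Matches the "SELECT * FROM <table>" prefix, ignoring case.
_SELECT_PREFIX = re.compile(r"^select\s+\*\s+from\s+\w+", re.IGNORECASE)


def looks_like_single_select(sql: str) -> bool:
    """Return True if `sql` starts with SELECT * and holds no second statement.

    Hypothetical sanity check for illustration; it is not a SQL parser.
    """
    # Drop surrounding whitespace and any trailing semicolons.
    stripped = sql.strip().rstrip(";").strip()
    if not _SELECT_PREFIX.match(stripped):
        return False
    # Conservative: any remaining semicolon is treated as a stacked statement,
    # even if it sits inside a string literal.
    return ";" not in stripped


generated = 'SELECT * FROM table_1341598_10 WHERE district = "Florida 18"'
print(looks_like_single_select(generated))
```

A check like this is deliberately strict; it does not replace database permissions or parameterized queries.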
40
 
41
+ ### How to use
42
+ Load model
43
 
44
+ ```python
+ from peft import PeftConfig, PeftModel
+ from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+
+ model_id = "jonathanjordan21/flan-alpaca-base-finetuned-lora-knowSQL"
+ # Read the adapter config to find the base model it was trained on
+ config = PeftConfig.from_pretrained(model_id)
+ model_ = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path, return_dict=True)
+ tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
+
+ # Attach the trained adapter to the base model; no extra get_peft_model() call is needed
+ model = PeftModel.from_pretrained(model_, model_id)
+ ```
57
 
58
+ Model inference
 
 
59
 
60
+ ```python
+ import torch
+
+ question = "server of user id 11 with status active and server id 10"
+ table = "table_name_77 ( user id INTEGER, status VARCHAR, server id INTEGER )"
+
+ test = f"""Question: {question}\nTable: {table}\nSQL: """
+
+ # Move the model and the tokenized prompt to the same device
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ p = tokenizer(test, return_tensors='pt').to(device)
+ model = model.to(device)
+
+ output_ids = model.generate(**p, max_new_tokens=50)
+ print("output :", tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0])
+ ```
73
+
74
+ ## Performance
75
+
76
+ ### Speed Performance
77
+ Model inference takes about 2-3 seconds on a Google Colab free-tier CPU.
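To check a latency figure like this on other hardware, one option is to time the generation call directly. The sketch below is an assumption about how such a measurement could be done, not the method used for the number above; `time_call` is a hypothetical helper, and the lambda stands in for a real `model.generate(...)` call.

```python
import time
from statistics import mean


def time_call(fn, repeats: int = 3) -> float:
    """Run `fn` `repeats` times and return the mean wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return mean(durations)


# Stand-in workload; swap in a call such as
# lambda: model.generate(**p, max_new_tokens=50)
elapsed = time_call(lambda: sum(range(100_000)))
print(f"mean latency: {elapsed:.4f} s")
```

Averaging over a few runs smooths out one-off costs such as the first call after loading the model.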
78
 
 
79
 
80
  ### Downstream Use [optional]
81
 
 
239
  ### Framework versions
240
 
241
 
242
+ - PEFT 0.6.2