---
license: mit
pipeline_tag: text2text-generation
inference:
  parameters:
    max_new_tokens: 100
    do_sample: false
widget:
- text: 'translate English to SQL: Show all employees with salary greater than $50000'
- text: >-
    translate English to SQL: How many models were finetuned using BERT as base
    model?
- text: 'translate English to SQL: how many cars are in blue color'
datasets:
- b-mc2/sql-create-context
language:
- en
tags:
- code
metrics:
- bleu 34.962700
---
# T5-SQL-Translator

## Overview
T5-SQL-Translator is a fine-tuned version of the Google T5-small model that translates English natural-language questions into SQL SELECT queries. It is useful for automating natural-language-to-SQL translation, particularly for SELECT operations.

## Model Details
- **Model Name**: T5-SQL-Translator
- **Model Type**: Text-to-Text Transformers
- **Base Model**: Google T5-small
- **Language**: English
- **Task**: English to SQL SELECT Translation
- **Training Data**: Combination of English natural language queries paired with corresponding SQL SELECT queries from diverse domains.
- **Fine-tuning**: The model has been fine-tuned on a dataset of English-to-SQL SELECT translations to optimize its performance for this specific task.

## Example Use Cases
- Automatically translating English questions into SQL SELECT queries for database querying.
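
Because the model's output is free text, it is worth checking a generated query before running it against a real database. The following is a minimal sketch of such a check, not part of the model itself. It assumes an SQLite database; the `employees` table, its rows, the helper names `is_safe_select` and `run_generated_query`, and the hard-coded `generated_sql` string (standing in for model output) are all hypothetical.

```python
import sqlite3


def is_safe_select(sql: str) -> bool:
    """Return True only for a single statement that starts with SELECT."""
    # Drop surrounding whitespace and any trailing semicolons.
    stripped = sql.strip().rstrip(";").strip()
    # Reject empty text and anything that still contains a semicolon
    # (conservative: this also rejects semicolons inside string literals).
    if not stripped or ";" in stripped:
        return False
    return stripped.lower().startswith("select")


def run_generated_query(conn: sqlite3.Connection, sql: str) -> list:
    """Execute the query only if it passes the SELECT-only check."""
    if not is_safe_select(sql):
        raise ValueError(f"Refusing to run non-SELECT statement: {sql!r}")
    return conn.execute(sql).fetchall()


# Hypothetical in-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ada", 72000), ("Bob", 41000)],
)

# Stand-in for text returned by the model (assumption, not real output).
generated_sql = "SELECT name FROM employees WHERE salary > 50000"
rows = run_generated_query(conn, generated_sql)
print(rows)  # [('Ada',)]
```

The check is deliberately strict: it favors rejecting a valid query over running an unintended statement. A read-only database connection is a stronger safeguard where one is available.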

## How to Use
1. **Install Hugging Face Transformers and PyTorch** (the inference code returns PyTorch tensors):
   ```bash
   pip install transformers torch
   ```
## Inference
```python
# Load the tokenizer and model from the Hugging Face Hub
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("suriya7/t5-base-text-to-sql")
model = AutoModelForSeq2SeqLM.from_pretrained("suriya7/t5-base-text-to-sql")

def translate_to_sql_select(english_query):
    # Prepend the task prefix used in the widget examples
    input_text = "translate English to SQL: " + english_query
    input_ids = tokenizer.encode(input_text, return_tensors="pt")
    # Generation settings are passed to generate(), not to the tokenizer
    outputs = model.generate(input_ids, max_new_tokens=100, do_sample=False)
    sql_query = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return sql_query

# Example usage
english_query = "Show all employees with salary greater than $50000"
sql_query = translate_to_sql_select(english_query)
print("SQL Query:", sql_query)
```
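
When several questions need translating, they can be tokenized and generated as one padded batch instead of one call per question. The sketch below is one possible way to do that, under stated assumptions: `build_prompts` and `translate_batch` are hypothetical helper names, and `translate_batch` expects the `tokenizer` and `model` objects loaded as in the example above to be passed in.

```python
PREFIX = "translate English to SQL: "


def build_prompts(english_queries):
    """Prepend the task prefix to every question."""
    return [PREFIX + query for query in english_queries]


def translate_batch(english_queries, tokenizer, model, max_new_tokens=100):
    """Translate a list of questions in a single generate() call.

    `tokenizer` and `model` are assumed to be the objects returned by
    AutoTokenizer.from_pretrained and AutoModelForSeq2SeqLM.from_pretrained.
    """
    prompts = build_prompts(english_queries)
    # Pad to the longest prompt so the batch forms one rectangular tensor.
    inputs = tokenizer(prompts, return_tensors="pt", padding=True, truncation=True)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


# Example (requires the tokenizer and model to be loaded first):
# queries = ["how many cars are in blue color"]
# print(translate_batch(queries, tokenizer, model))
```

Passing the tokenizer and model as arguments keeps the helper free of global state, so it can be reused with a different checkpoint.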
## Performance
- **Evaluation Metric**: BLEU score of 34.962700

## Acknowledgments
- The original T5 model was developed by Google Research.
- Training data was sourced from https://huggingface.co/datasets/b-mc2/sql-create-context
- Special thanks to Hugging Face for providing the Transformers library and the Model Hub for easy model sharing.

## Contact
For any inquiries or issues regarding the model, feel free to contact:
- thesuriya3@gmail.com