---
license: apache-2.0
tags:
- generated_from_trainer
model-index:
- name: Ita2SqlModel
  results: []
datasets:
- OMazzuzi90/Ita2Sql_data_train
language:
- it
- sql
---

# Ita2SqlModel

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the [OMazzuzi90/Ita2Sql_data_train](https://huggingface.co/datasets/OMazzuzi90/Ita2Sql_data_train) dataset.

The model was created to test whether Italian sentences can be translated into SQL scripts. It is trained on a self-created dataset that pairs each SQL script with several Italian sentences, which use synonyms to express the same request.

For example, the script SELECT * FROM CLIENTI can be expressed in Italian by any of the following sentences:

- "MOSTRAMI TUTTI I CLIENTI"
- "VISUALIZZA TUTTI I CLIENTI"
- "SELEZIONA TUTTI I CLIENTI"

With a much larger and more comprehensive dataset, for instance one that includes all possible synonyms of an Italian verb, the model would likely be significantly more accurate. Currently, the dataset consists of about 21k rows.

## Model description

The model takes uppercase sentences as input, with the prefix "TRANSLATE ITA TO SQL: " followed by the sentence that you want to translate.
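The input convention described above (uppercase text plus the task prefix) can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: `build_prompt` and `translate` are hypothetical helper names, and `model_id` is left as a parameter because the card does not state the model's Hub repository id. The `translate` function assumes the `transformers` library and PyTorch are installed.

```python
PREFIX = "TRANSLATE ITA TO SQL: "  # prefix stated in the model description


def build_prompt(sentence: str) -> str:
    """Uppercase the Italian sentence and prepend the task prefix."""
    return PREFIX + sentence.strip().upper()


def translate(sentence: str, model_id: str, max_new_tokens: int = 64) -> str:
    """Generate an SQL script for one Italian sentence.

    model_id is an assumption supplied by the caller: the Hub id or a
    local path of the fine-tuned checkpoint (not given in this card).
    """
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize the prefixed, uppercased prompt and decode the generated ids.
    inputs = tokenizer(build_prompt(sentence), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Normalizing the input in one helper keeps prompts consistent with the format the model description asks for, whatever casing the user types.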
Here are some working examples:

- TRANSLATE ITA TO SQL: MOSTRAMI TUTTI I [TABLE]
  - Ex: TRANSLATE ITA TO SQL: MOSTRAMI TUTTI I CLIENTI
  - Output: SELECT * FROM CLIENTI
- TRANSLATE ITA TO SQL: MOSTRAMI [COLUMN] DEI [TABLE]
  - Ex: TRANSLATE ITA TO SQL: MOSTRAMI NOME DEI CLIENTI
  - Output: SELECT NOME FROM CLIENTI
- TRANSLATE ITALIAN TO SQL: MOSTRAMI [COLUMN] DEI [TABLE] QUANDO [COLUMN] COMPRESA TRA [PARAM1] E [PARAM2]
  - Ex: TRANSLATE ITALIAN TO SQL: MOSTRAMI NOME DEI CLIENTI QUANDO ETA COMPRESA TRA 18 E 23
  - Output: SELECT NOME FROM CLIENTI WHERE ETA BETWEEN 18 AND 23

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 11

### Framework versions

- Transformers 4.28.0
- Pytorch 2.0.1+cu118
- Datasets 2.12.0
- Tokenizers 0.13.3
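
For readers who want to set up a similar run, the hyperparameters listed under "Training hyperparameters" could be mapped onto the Transformers `Seq2SeqTrainingArguments` class roughly as follows. This is a sketch under assumptions, not the original training script: it assumes a single device (so the batch size of 16 is the per-device value), reads the optimizer line as the Trainer's default Adam-family optimizer with the listed betas and epsilon, and uses the hypothetical names `HYPERPARAMS` and `make_training_args`.

```python
# Values copied from the "Training hyperparameters" list; the keys are
# TrainingArguments field names (assumed single-device training).
HYPERPARAMS = {
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 11,
}


def make_training_args(output_dir: str):
    """Build training arguments from HYPERPARAMS.

    output_dir is a caller-chosen path; it is not specified in this card.
    """
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```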