Synthesized Dataset (#2), opened by Tottowich
Hi, great job on this project!
Is the synthesized dataset of 200k examples going to be open-sourced?
Looking to train a similar model with Llama3 as base instead.
We released a 25k subset. Hope that helps :) https://huggingface.co/datasets/motherduckdb/duckdb-text2sql-25k
Hello there, do you also have any databases with query-answer pairs that we could train the model on? My issue is that I can only find SQLite databases in text2sql training datasets.