Are there plans to distribute this model on Ollama.ai?

#4
by cboettig - opened

The original (deprecated) sqlcoder model is still available at https://ollama.com/library/sqlcoder , where it is one of only two SQL-specific LLMs. Are there plans to list this newer model there instead? Given the excellent integration of Ollama with platforms such as LangChain and Hugging Face for this kind of application, I think this could benefit many users and further improve the model's discoverability.
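For reference, this is the kind of integration I have in mind. A minimal sketch, assuming a local Ollama server with the current listing pulled via `ollama pull sqlcoder`:

from langchain_community.llms import Ollama

# Point LangChain at the locally served Ollama model;
# "sqlcoder" here is the existing (deprecated) library listing.
llm = Ollama(model="sqlcoder", temperature=0)
print(llm.invoke("Generate a SQL query that counts rows in a table named users."))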

Fantastic work on this project, much appreciated!

Let me try that... will let you know.

Thanks @ucalyptus , much appreciated!

This is great, thanks! I'll have to explore a bit how to get it to play nicely with LangChain's agents (https://python.langchain.com/docs/use_cases/sql/quickstart/); I think they use a slightly different template, which keeps it from separating the SQL command from the explanation successfully.

You can change the prompt for the SQL agent. There is an example at this link: https://python.langchain.com/v0.1/docs/use_cases/sql/agents/.

example:

from langchain_community.agent_toolkits import create_sql_agent
from langchain_core.prompts import PromptTemplate

# Prompt in the llama3-sqlcoder chat format from the model card
template = """
<|begin_of_text|><|start_header_id|>user<|end_header_id|>

Generate a SQL query to answer this question: `{user_question}`
{instructions}

DDL statements:
{create_table_statements}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

The following SQL query best answers the question `{user_question}`:
```sql
"""

prompt = PromptTemplate(
    input_variables=["user_question", "create_table_statements", "instructions"],
    template=template,
    validate_template=False,
)

# llm and db are assumed to be defined elsewhere
# (e.g., a local model and a SQLDatabase connection)
agent = create_sql_agent(
    llm=llm,
    db=db,
    prompt=prompt,
    verbose=True,
    agent_type="openai-tools",
)
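Once created, the agent can be invoked like any other LangChain runnable (the question below is just a placeholder):

agent.invoke({"input": "How many users signed up last month?"})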

Hi guys! Maybe you can help me. Here's the thing: I tried the quantized model by mannix provided above using Ollama, and it is great. But when I try a quantized GGUF version of the original model by defog (by others or by myself), I get random characters as the output. I use LangChain and LlamaCpp and cannot figure out what the issue is.

If you explain more, I can help @liashchynskyi

So I've been trying to build an app that converts text to SQL queries. I use LangChain and llama-cpp-python for local model inference. I found already-quantized GGUF models of defog/llama3-sqlcoder made by other people and tried q4_k_m, q5_k_m, and a couple more. I loaded the model with llama-cpp-python as per the LangChain docs and used the prompt from the defog model card. Whenever I ask the model a question, I always get a random, meaningless response that loops like "5bd-ba-ba-ba…" and so on.

But if I try the same with the quantized defog model by mannix (https://ollama.com/mannix/defog-llama3-sqlcoder-8b), everything works and I get a SQL query as the response. In that case I use Ollama for inference instead of llama-cpp. I thought maybe the GGUF models were badly quantized, so today I quantized the original model myself: first I converted it to fp16 using llama.cpp, then quantized it to q4_k_m GGUF. Loading that model with llama-cpp-python again gave random characters, not a SQL query.

I don't know what I'm doing wrong. Other GGUF models like Mistral work well with llama-cpp-python. So for this defog model, Ollama works and llama-cpp-python does not, even though all prompts, model configs, etc. are the same for both.

First, run pip install --upgrade llama-cpp-python to get the latest build. Then download the GGUF model from this repo (omeryentur) and try the code below.


from langchain_core.prompts import PromptTemplate
from langchain_community.llms import LlamaCpp

# Same llama3-sqlcoder prompt format as in the model card
template = """
<|begin_of_text|><|start_header_id|>user<|end_header_id|>

Generate a SQL query to answer this question: `{user_question}`
{instructions}

DDL statements:
{create_table_statements}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

The following SQL query best answers the question `{user_question}`:
```sql
"""

prompt = PromptTemplate(
    input_variables=["user_question", "create_table_statements", "instructions"],
    template=template,
    validate_template=False,
)

llm = LlamaCpp(
    model_path="model.gguf",
    temperature=0,
    n_ctx=2048,
    top_p=1,
    verbose=True,
)

user_question = """
Question: 
"""

instructions = ""

create_table_statements = ""

# Pipe the prompt into the model so the dict of variables is formatted
# into the template before inference
chain = prompt | llm
print(chain.invoke({
    "user_question": user_question,
    "create_table_statements": create_table_statements,
    "instructions": instructions,
}))

@omeryentur Thanks. I tried your quantized model with the chain (prompt | llm) so I could pass the arguments dict, and used the question and DDL statements from https://defog.ai/sqlcoder-demo/. With those it generates a correct SQL query, not random characters. But if I pass DDL statements from my real database and another question, without changing a line of code, it generates random characters again. It seems like the issue is in the DDL statements themselves, which is strange, because everything works fine when using Ollama for inference.

I also tried models quantized by myself and by other people. They work well with the question and DDL statements from the demo, but as soon as I pass my own DDL and question, I always get random characters.

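One thing I still want to rule out (just a guess on my part): the LlamaCpp config above uses n_ctx=2048, and my real DDL statements are much longer than the demo's, so the prompt might simply be getting truncated. Something like this should raise the context window:

from langchain_community.llms import LlamaCpp

llm = LlamaCpp(
    model_path="model.gguf",
    temperature=0,
    n_ctx=8192,  # larger context so long DDL statements are not truncated
    top_p=1,
    verbose=True,
)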

I understand. For repetition, you can use parameters such as llama.cpp's repeat_penalty. If you want, use llama.cpp directly instead of llama-cpp-python.
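For example (the value here is just a starting point to tune):

from langchain_community.llms import LlamaCpp

llm = LlamaCpp(
    model_path="model.gguf",
    temperature=0,
    n_ctx=2048,
    repeat_penalty=1.1,  # penalizes repeated tokens, which can break "ba-ba-ba" loops
    verbose=True,
)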

Tried it, no results. Anyway, thanks for the pointers. I will probably stick with Ollama for this one. Frankly, there's no difference for me between using llama-cpp-python or Ollama for inference; I just noticed that llama-cpp seems a bit faster. I have an M1 Pro.

When using Ollama for inference, everything works. But as soon as I use llama-cpp-python, no matter which quantized version of the model I use (by myself or other people), I always get random characters as the result.

Could this model support other databases like MySQL?
