# LLaMa_3.2_3B_Instruct_Text2SQL-Q4_K_M-GGUF
This is a GGUF quantized version of the LLaMa 3.2 3B Text2SQL model.
## Model Details
- Architecture: LLaMa 3.2 3B
- Task: Text to SQL Generation
- Quantization: Q4_K_M
- Context Length: 65,536 tokens (64K)
- Format: GGUF (Compatible with llama.cpp)
## Usage
```python
from llama_cpp import Llama

# Initialize the model
llm = Llama(
    model_path="downloaded_model.gguf",
    n_ctx=65536,  # 64K context
    n_threads=8   # Adjust based on your CPU
)

# Generate SQL
response = llm(
    "Convert this to SQL: Find all users who signed up in January 2024",
    max_tokens=1024,
    temperature=0.7
)
print(response['choices'][0]['text'])
```
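The prompt template this fine-tune was trained on is not documented in this card, and Text2SQL models generally produce better output when the table schema is supplied with the question. As an illustration only, a small helper (the `build_prompt` function and its wording are assumptions, not the model's documented format) might assemble the input like this:

```python
# Hypothetical prompt builder -- the exact template this fine-tune expects
# is not documented in this card, so adjust to match your training format.
def build_prompt(question, schema=None):
    """Combine an optional table schema with a natural-language question."""
    parts = []
    if schema:
        parts.append(f"Schema:\n{schema}")
    parts.append(f"Convert this to SQL: {question}")
    return "\n\n".join(parts)

prompt = build_prompt(
    "Find all users who signed up in January 2024",
    schema="CREATE TABLE users (id INT, name TEXT, signup_date DATE);",
)
```

The resulting string can be passed to `llm(prompt, ...)` as in the example above. Including the `CREATE TABLE` statement gives the model the real table and column names to generate against; without it, the model has to guess identifiers.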
## Model Source
This is a quantized version of XeAI/LLaMa_3.2_3B_Instruct_Text2SQL.