LLaMa_3.2_3B_Instruct_Text2SQL-Q4_K_M-GGUF.gguf

This is a GGUF quantized version of the LLaMa 3.2 3B Instruct Text2SQL model.

Model Details

  • Architecture: LLaMa 3.2 3B
  • Task: Text to SQL Generation
  • Quantization: Q4_K_M
  • Context Length: 65536 tokens (2^16)
  • Format: GGUF (Compatible with llama.cpp)
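To gauge hardware requirements before downloading, the file size of a quantized GGUF can be approximated from the parameter count and the quantization's average bits per weight. The figure below (roughly 4.85 bits per weight for Q4_K_M, since it mixes 4-bit blocks with some higher-precision tensors) is an approximation, not an exact llama.cpp specification:

```python
# Rough size estimate for a quantized GGUF file.
# Assumption: Q4_K_M averages ~4.85 bits per weight (mixed-precision
# blocks); the exact figure varies by tensor layout.
def gguf_size_gb(n_params: float, bits_per_weight: float = 4.85) -> float:
    """Approximate on-disk / in-memory size in gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

size = gguf_size_gb(3.21e9)
print(f"~{size:.1f} GB")
```

Note that actual RAM usage at inference time is higher, since the KV cache grows with the configured context length.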

Usage
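Instruct-tuned Llama 3.x models expect the Llama 3 chat template. When using the raw completion call below, wrapping the request in that template typically improves output quality; alternatively, llm.create_chat_completion() applies the template embedded in the GGUF automatically. A minimal sketch, assuming the standard Llama 3 format (the system message here is an illustrative placeholder):

```python
# Hand-rolled Llama 3-family chat template. Prefer
# llm.create_chat_completion(), which reads the template from the GGUF;
# this sketch assumes the standard Llama 3 special tokens.
def format_llama3_prompt(
    user_msg: str,
    system_msg: str = "You translate natural language to SQL.",  # placeholder
) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_msg}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_msg}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_prompt("Find all users who signed up in January 2024")
```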

from llama_cpp import Llama

# Initialize model
llm = Llama(
    model_path="downloaded_model.gguf",
    n_ctx=65536,  # 64K context
    n_threads=8   # Adjust based on your CPU
)

# Generate SQL
response = llm(
    "Convert this to SQL: Find all users who signed up in January 2024",
    max_tokens=1024,
    temperature=0.7
)

print(response['choices'][0]['text'])
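The completion dict returned by llama-cpp-python follows the OpenAI completion shape, with the generated text at response['choices'][0]['text']. Since instruct models often wrap SQL in markdown code fences, a small post-processing helper can be handy; the fence-stripping logic here is an assumption about typical model output, not a documented behavior:

```python
# Extract the SQL string from a llama-cpp-python completion response.
# Fence stripping is an assumption about how the model formats output.
def extract_sql(response: dict) -> str:
    text = response["choices"][0]["text"].strip()
    if text.startswith("```"):
        # Drop the opening fence (possibly "```sql") and the closing fence.
        lines = [ln for ln in text.splitlines() if not ln.startswith("```")]
        text = "\n".join(lines).strip()
    return text

# Usage with a mock response in the documented shape:
mock = {
    "choices": [
        {"text": "```sql\nSELECT * FROM users\n"
                 "WHERE signup_date >= '2024-01-01'\n"
                 "  AND signup_date < '2024-02-01';\n```"}
    ]
}
print(extract_sql(mock))
```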

Model Source

This is a quantized version of XeAI/LLaMa_3.2_3B_Instruct_Text2SQL.

Downloads last month: 43

GGUF Metadata

  • Model size: 3.21B params
  • Architecture: llama
  • Quantization: 4-bit
Note: The Hugging Face Inference API (serverless) does not yet support llama.cpp models for this pipeline type; run the model locally as shown in the Usage section above.