LLaMa_3.2_3B_Instruct_Text2SQL-Q4_K_M-GGUF.gguf

This is a GGUF quantized version of the LLaMa 3.2 3B Instruct Text2SQL model.

Model Details

  • Architecture: LLaMa 3.2 3B
  • Task: Text to SQL Generation
  • Quantization: Q4_K_M
  • Context Length: 65536 tokens (2^16)
  • Format: GGUF (Compatible with llama.cpp)
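To gauge hardware requirements before downloading, the file size of a quantized GGUF can be approximated from the parameter count and the quantization's average bits per weight. The figure below (roughly 4.85 bits per weight for Q4_K_M, since it mixes 4-bit blocks with some higher-precision tensors) is an approximation, not an exact llama.cpp specification:

```python
# Rough size estimate for a quantized GGUF file.
# Assumption: Q4_K_M averages ~4.85 bits per weight (mixed-precision
# blocks); the exact figure varies by tensor layout.
def gguf_size_gb(n_params: float, bits_per_weight: float = 4.85) -> float:
    """Approximate on-disk / in-memory size in gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

size = gguf_size_gb(3.21e9)
print(f"~{size:.1f} GB")
```

Note that actual RAM usage at inference time is higher, since the KV cache grows with the configured context length.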

Usage
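Instruct-tuned Llama 3.x models expect the Llama 3 chat template. When using the raw completion call below, wrapping the request in that template typically improves output quality; alternatively, llm.create_chat_completion() applies the template embedded in the GGUF automatically. A minimal sketch, assuming the standard Llama 3 format (the system message here is an illustrative placeholder):

```python
# Hand-rolled Llama 3-family chat template. Prefer
# llm.create_chat_completion(), which reads the template from the GGUF;
# this sketch assumes the standard Llama 3 special tokens.
def format_llama3_prompt(
    user_msg: str,
    system_msg: str = "You translate natural language to SQL.",  # placeholder
) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_msg}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_msg}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_prompt("Find all users who signed up in January 2024")
```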

from llama_cpp import Llama

# Initialize model
llm = Llama(
    model_path="downloaded_model.gguf",
    n_ctx=65536,  # 64K context
    n_threads=8   # Adjust based on your CPU
)

# Generate SQL
response = llm(
    "Convert this to SQL: Find all users who signed up in January 2024",
    max_tokens=1024,
    temperature=0.7
)

print(response['choices'][0]['text'])
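The completion dict returned by llama-cpp-python follows the OpenAI completion shape, with the generated text at response['choices'][0]['text']. Since instruct models often wrap SQL in markdown code fences, a small post-processing helper can be handy; the fence-stripping logic here is an assumption about typical model output, not a documented behavior:

```python
# Extract the SQL string from a llama-cpp-python completion response.
# Fence stripping is an assumption about how the model formats output.
def extract_sql(response: dict) -> str:
    text = response["choices"][0]["text"].strip()
    if text.startswith("```"):
        # Drop the opening fence (possibly "```sql") and the closing fence.
        lines = [ln for ln in text.splitlines() if not ln.startswith("```")]
        text = "\n".join(lines).strip()
    return text

# Usage with a mock response in the documented shape:
mock = {
    "choices": [
        {"text": "```sql\nSELECT * FROM users\n"
                 "WHERE signup_date >= '2024-01-01'\n"
                 "  AND signup_date < '2024-02-01';\n```"}
    ]
}
print(extract_sql(mock))
```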

Model Source

This is a quantized version of XeAI/LLaMa_3.2_3B_Instruct_Text2SQL.

Downloads last month: 43

GGUF Metadata

  • Model size: 3.21B params
  • Architecture: llama
  • Quantization: 4-bit
Note: The Hugging Face Inference API (serverless) does not yet support llama.cpp models for this pipeline type; run the model locally as shown in the Usage section above.