Building TextQL — an LLM-Powered Text-to-SQL Pipeline
> September 5, 2024
TextQL is an open-source project from org45 that turns plain English into SQL queries. You describe what you want, and TextQL generates the SQL, runs it, and returns the results — no SQL knowledge required. This post walks through the pipeline architecture and the prompt engineering decisions that made it work reliably.
> Why Text-to-SQL is Hard
LLMs can write SQL, but making them write *correct* SQL for *your specific schema* is the challenge:
- They hallucinate table and column names
- They use the wrong JOIN type when the schema is ambiguous
- They produce valid SQL that answers the wrong question
- They format output inconsistently, making parsing brittle
The solution is structured prompting + schema injection + output validation.
> The Pipeline
User query → Schema injection → Prompt construction → Gemini → SQL extraction → Validation → Execution → Result
> FastAPI Backend
```python
import os

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import google.generativeai as genai
import sqlalchemy

app = FastAPI()
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

class QueryRequest(BaseModel):
    question: str
    database_url: str

@app.post("/query")
async def run_query(req: QueryRequest):
    schema = get_schema(req.database_url)
    sql = generate_sql(req.question, schema)
    results = execute_sql(req.database_url, sql)
    return {"sql": sql, "results": results}
```
> Schema Injection
The LLM needs to know the exact schema — table names, column names, and types:
```python
def get_schema(database_url: str) -> str:
    engine = sqlalchemy.create_engine(database_url)
    inspector = sqlalchemy.inspect(engine)
    lines = []
    for table_name in inspector.get_table_names():
        columns = inspector.get_columns(table_name)
        col_defs = ", ".join(f"{c['name']} {c['type']}" for c in columns)
        lines.append(f"CREATE TABLE {table_name} ({col_defs});")
    return "\n".join(lines)
```

Injecting the actual schema (not a description of it) dramatically reduces hallucination.
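The same idea can be sanity-checked without SQLAlchemy. Here is a minimal sketch against an in-memory SQLite database using only the standard library — `get_schema_sqlite` and the toy tables are illustrative, not part of TextQL:

```python
import sqlite3

def get_schema_sqlite(conn: sqlite3.Connection) -> str:
    """Rebuild compact CREATE TABLE lines from a live SQLite connection."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    ).fetchall()
    for (table_name,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f"PRAGMA table_info({table_name})").fetchall()
        col_defs = ", ".join(f"{c[1]} {c[2]}" for c in cols)
        lines.append(f"CREATE TABLE {table_name} ({col_defs});")
    return "\n".join(lines)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT, created_at TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, user_id INTEGER, total REAL)")
schema = get_schema_sqlite(conn)
print(schema)
```

The output is exactly the kind of compact DDL text that gets pasted into the prompt, one `CREATE TABLE` line per table.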
> Prompt Engineering
The prompt structure that worked best follows a few rules: provide the schema, give a clear instruction, demand a specific output format, and show one example:
```python
def build_prompt(question: str, schema: str) -> str:
    return f"""You are a SQL expert. Given the following database schema, write a SQL query that answers the user's question.

SCHEMA:
{schema}

RULES:
- Return ONLY the SQL query, no explanation.
- Wrap the SQL in <sql> tags.
- Use only table and column names that appear in the schema above.
- Prefer simple queries over complex ones.

EXAMPLE:
Question: How many users signed up last month?
<sql>SELECT COUNT(*) FROM users WHERE created_at >= DATE_TRUNC('month', NOW() - INTERVAL '1 month') AND created_at < DATE_TRUNC('month', NOW());</sql>

Question: {question}
"""
```

The `<sql>` tags make extraction trivial and reliable:
```python
import re

def extract_sql(response_text: str) -> str:
    match = re.search(r"<sql>(.*?)</sql>", response_text, re.DOTALL)
    if not match:
        raise ValueError("LLM did not return SQL in expected format")
    return match.group(1).strip()
```
> SQL Validation Before Execution
Never execute LLM-generated SQL without validation. We use SQLAlchemy's text() with a dry-run explain:
```python
def validate_sql(database_url: str, sql: str) -> None:
    engine = sqlalchemy.create_engine(database_url)
    with engine.connect() as conn:
        conn.execute(sqlalchemy.text(f"EXPLAIN {sql}"))
```

This catches hallucinated table names before they hit production data.
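One piece the `/query` endpoint calls but the post doesn't show is `generate_sql`. A minimal sketch might wire `build_prompt`, the model call, and `extract_sql` together like this — the `complete` callable and the trimmed-down prompt are assumptions for illustration; in TextQL the real prompt from the previous section and a direct Gemini call would be used:

```python
import re

def build_prompt(question: str, schema: str) -> str:
    # Trimmed stand-in for the full prompt shown earlier.
    return f"SCHEMA:\n{schema}\n\nWrap the SQL in <sql> tags.\nQuestion: {question}\n"

def extract_sql(response_text: str) -> str:
    match = re.search(r"<sql>(.*?)</sql>", response_text, re.DOTALL)
    if not match:
        raise ValueError("LLM did not return SQL in expected format")
    return match.group(1).strip()

def generate_sql(question: str, schema: str, complete) -> str:
    """`complete` maps prompt text to model response text, e.g.
    lambda p: model.generate_content(p).text for the Gemini model above."""
    return extract_sql(complete(build_prompt(question, schema)))

# Exercised with a canned response instead of a live model call:
fake = lambda prompt: "Sure!\n<sql>SELECT COUNT(*) FROM users;</sql>"
sql = generate_sql("How many users are there?", "CREATE TABLE users (id INTEGER);", fake)
print(sql)  # SELECT COUNT(*) FROM users;
```

Injecting the completion function also makes the pipeline easy to unit-test with canned responses, as above.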
> Executing and Returning Results
```python
def execute_sql(database_url: str, sql: str) -> list[dict]:
    validate_sql(database_url, sql)
    engine = sqlalchemy.create_engine(database_url)
    with engine.connect() as conn:
        result = conn.execute(sqlalchemy.text(sql))
        return [dict(row._mapping) for row in result.fetchall()]
```
> Results on a Sample Dataset
Testing TextQL on a 10-table e-commerce database with 50 natural language questions:
- **Correct SQL generated**: 82%
- **Schema hallucination rate**: 4% (down from 31% without schema injection)
- **Average response time**: 1.1s (Gemini Flash)
The 18% failure cases were mostly ambiguous questions where even a human would need clarification.
> Lessons Learned
- **Schema injection is the single most impactful prompt technique** for text-to-SQL.
- **Structured output tags** (`<sql>`) are more reliable than asking the model to "only output SQL".
- **Always validate before executing** — an `EXPLAIN` is cheap; a bad query on production data is not.
- **Short, fast models often outperform larger ones** for well-structured prompts — Gemini 1.5 Flash matched Gemini 1.5 Pro on this task.
TextQL is open source under org45. Contributions welcome.