~/krishna_dhakal
#AI#Python#SQL#LLM#FastAPI

Building TextQL — an LLM-Powered Text-to-SQL Pipeline

> September 5, 2024

TextQL is an open-source project from org45 that turns plain English into SQL queries. You describe what you want, and TextQL generates the SQL, runs it, and returns the results — no SQL knowledge required. This post walks through the pipeline architecture and the prompt engineering decisions that made it work reliably.


> Why Text-to-SQL is Hard


LLMs can write SQL, but making them write *correct* SQL for *your specific schema* is the challenge:


  • They hallucinate table and column names
  • They use the wrong JOIN type when the schema is ambiguous
  • They produce valid SQL that answers the wrong question
  • They format output inconsistently, making parsing brittle

The solution is structured prompting + schema injection + output validation.


> The Pipeline


User query → Schema injection → Prompt construction → Gemini → SQL extraction → Validation → Execution → Result

> FastAPI Backend


import os

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import google.generativeai as genai
import sqlalchemy

app = FastAPI()
genai.configure(api_key=os.environ["GEMINI_API_KEY"])  # key read from the environment
model = genai.GenerativeModel("gemini-1.5-flash")

class QueryRequest(BaseModel):
    question: str
    database_url: str

@app.post("/query")
async def run_query(req: QueryRequest):
    schema = get_schema(req.database_url)
    try:
        sql = generate_sql(req.question, schema)
        results = execute_sql(req.database_url, sql)
    except ValueError as e:
        # Raised when the model's output has no <sql> tags (see extract_sql below)
        raise HTTPException(status_code=422, detail=str(e))
    return {"sql": sql, "results": results}

> Schema Injection


The LLM needs to know the exact schema — table names, column names, and types:


def get_schema(database_url: str) -> str:
    engine = sqlalchemy.create_engine(database_url)
    inspector = sqlalchemy.inspect(engine)
    lines = []
    for table_name in inspector.get_table_names():
        columns = inspector.get_columns(table_name)
        col_defs = ", ".join(f"{c['name']} {c['type']}" for c in columns)
        lines.append(f"CREATE TABLE {table_name} ({col_defs});")
    return "\n".join(lines)

Injecting the actual schema (not a description of it) dramatically reduces hallucination.
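If you want to sanity-check the schema string format without a full SQLAlchemy setup, the same DDL-style summary can be produced for SQLite using only the standard library. A sketch (SQLite-specific; `get_sqlite_schema` is a hypothetical helper for illustration, not part of TextQL):

```python
import sqlite3

def get_sqlite_schema(db_path: str) -> str:
    # SQLite-only equivalent of get_schema above: list user tables from
    # sqlite_master, then read column names and types via PRAGMA table_info.
    conn = sqlite3.connect(db_path)
    try:
        tables = [r[0] for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' "
            "AND name NOT LIKE 'sqlite_%' ORDER BY name")]
        lines = []
        for t in tables:
            cols = conn.execute(f'PRAGMA table_info("{t}")').fetchall()
            # Each row is (cid, name, type, notnull, dflt_value, pk)
            col_defs = ", ".join(f"{c[1]} {c[2]}" for c in cols)
            lines.append(f"CREATE TABLE {t} ({col_defs});")
        return "\n".join(lines)
    finally:
        conn.close()
```

Either way, the output is compact DDL the model can pattern-match against, rather than a prose description it has to interpret.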


> Prompt Engineering


The prompt structure that worked best follows a few rules: provide the schema, give a clear instruction, demand a specific output format, and show one example:


def build_prompt(question: str, schema: str) -> str:
    return f"""You are a SQL expert. Given the following database schema, write a SQL query that answers the user's question.

SCHEMA:
{schema}

RULES:
- Return ONLY the SQL query, no explanation.
- Wrap the SQL in <sql> tags.
- Use only table and column names that appear in the schema above.
- Prefer simple queries over complex ones.

EXAMPLE:
Question: How many users signed up last month?
<sql>SELECT COUNT(*) FROM users WHERE created_at >= DATE_TRUNC('month', NOW() - INTERVAL '1 month') AND created_at < DATE_TRUNC('month', NOW());</sql>

Question: {question}
"""

The `<sql>` tags make extraction trivial and reliable:


import re

def extract_sql(response_text: str) -> str:
    match = re.search(r"<sql>(.*?)</sql>", response_text, re.DOTALL)
    if not match:
        raise ValueError("LLM did not return SQL in expected format")
    return match.group(1).strip()

> SQL Validation Before Execution


Never execute LLM-generated SQL without validation. We use SQLAlchemy's `text()` to run a dry-run `EXPLAIN` first:


def validate_sql(database_url: str, sql: str) -> None:
    engine = sqlalchemy.create_engine(database_url)
    with engine.connect() as conn:
        conn.execute(sqlalchemy.text(f"EXPLAIN {sql}"))

This catches hallucinated table names before they hit production data.
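One caveat: `EXPLAIN` catches schema errors, but it will happily explain a DELETE. A cheap extra guard worth layering in front of it is to reject anything other than a single SELECT statement. A sketch (`assert_read_only` is a hypothetical addition, not part of TextQL's code above):

```python
import re

def assert_read_only(sql: str) -> None:
    # Reject multi-statement payloads and anything that isn't a SELECT
    # (or a WITH ... SELECT CTE). Crude string-level checks, but they
    # block obvious writes before the EXPLAIN ever runs.
    stmt = sql.strip().rstrip(";").strip()
    if ";" in stmt:
        raise ValueError("multiple SQL statements are not allowed")
    if not re.match(r"(?is)^(select|with)\b", stmt):
        raise ValueError("only read-only SELECT queries are allowed")
```

For stronger guarantees, run the query through a database role with read-only permissions; the string check is just a fast first line of defense.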


> Executing and Returning Results


def execute_sql(database_url: str, sql: str) -> list[dict]:
    validate_sql(database_url, sql)
    engine = sqlalchemy.create_engine(database_url)
    with engine.connect() as conn:
        result = conn.execute(sqlalchemy.text(sql))
        return [dict(row._mapping) for row in result.fetchall()]
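A practical detail the sketch above glosses over: an unbounded SELECT can return millions of rows, all of which get serialized into the JSON response. One option is to cap at fetch time with `fetchmany()`. A sketch (`fetch_capped` and its `max_rows` default are assumptions, not TextQL code):

```python
def fetch_capped(result, max_rows: int = 1000) -> list[dict]:
    # Pull at most max_rows rows from a SQLAlchemy result so one runaway
    # SELECT cannot produce an enormous JSON payload. `result` is anything
    # with fetchmany(n) returning rows that expose a ._mapping dict,
    # matching SQLAlchemy 1.4+ Row objects.
    return [dict(row._mapping) for row in result.fetchmany(max_rows)]
```

Swapping this in for the `fetchall()` call above keeps response sizes bounded without touching the generated SQL.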

> Results on a Sample Dataset


Testing TextQL on a 10-table e-commerce database with 50 natural language questions:


  • **Correct SQL generated**: 82%
  • **Schema hallucination rate**: 4% (down from 31% without schema injection)
  • **Average response time**: 1.1s (Gemini Flash)

The remaining 18% were mostly ambiguous questions where even a human would need clarification before writing the query.


> Lessons Learned


  1. **Schema injection is the single most impactful prompt technique** for text-to-SQL.
  2. **Structured output tags** (`<sql>`) are more reliable than asking the model to "only output SQL".
  3. **Always validate before executing** — an `EXPLAIN` is cheap; a bad query on production data is not.
  4. **Short, fast models often outperform larger ones** for well-structured prompts — Gemini 1.5 Flash matched Gemini 1.5 Pro on this task.

TextQL is open source under org45. Contributions welcome.