Building TextQL — an LLM-Powered Text-to-SQL Pipeline
> September 5, 2024
TextQL is an open-source project from org45 that turns plain English into SQL queries. You describe what you want, and TextQL generates the SQL, runs it, and returns the results — no SQL knowledge required. This post walks through the pipeline architecture and the prompt engineering decisions that made it work reliably.
> Why Text-to-SQL is Hard
LLMs can write SQL, but making them write *correct* SQL for *your specific schema* is the challenge:
- They hallucinate table and column names
- They use the wrong JOIN type when the schema is ambiguous
- They produce valid SQL that answers the wrong question
- They format output inconsistently, making parsing brittle
The solution is structured prompting + schema injection + output validation.
> The Pipeline
User query → Schema injection → Prompt construction → Gemini → SQL extraction → Validation → Execution → Result
> FastAPI Backend
```python
import os

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import google.generativeai as genai
import sqlalchemy

app = FastAPI()
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

class QueryRequest(BaseModel):
    question: str
    database_url: str

@app.post("/query")
async def run_query(req: QueryRequest):
    schema = get_schema(req.database_url)
    sql = generate_sql(req.question, schema)
    results = execute_sql(req.database_url, sql)
    return {"sql": sql, "results": results}
```
> Schema Injection
The LLM needs to know the exact schema — table names, column names, and types:
```python
def get_schema(database_url: str) -> str:
    engine = sqlalchemy.create_engine(database_url)
    inspector = sqlalchemy.inspect(engine)
    lines = []
    for table_name in inspector.get_table_names():
        columns = inspector.get_columns(table_name)
        col_defs = ", ".join(f"{c['name']} {c['type']}" for c in columns)
        lines.append(f"CREATE TABLE {table_name} ({col_defs});")
    return "\n".join(lines)
```

Injecting the actual schema (not a description of it) dramatically reduces hallucination.
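The same idea can be sanity-checked without SQLAlchemy. Here is a minimal sketch against an in-memory SQLite database using only the standard library — `get_schema_sqlite` and the toy tables are illustrative, not part of TextQL:

```python
import sqlite3

def get_schema_sqlite(conn: sqlite3.Connection) -> str:
    """Rebuild compact CREATE TABLE lines from a live SQLite connection."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    ).fetchall()
    for (table_name,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f"PRAGMA table_info({table_name})").fetchall()
        col_defs = ", ".join(f"{c[1]} {c[2]}" for c in cols)
        lines.append(f"CREATE TABLE {table_name} ({col_defs});")
    return "\n".join(lines)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT, created_at TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, user_id INTEGER, total REAL)")
schema = get_schema_sqlite(conn)
print(schema)
```

The output is exactly the kind of compact DDL text that gets pasted into the prompt, one `CREATE TABLE` line per table.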
> Prompt Engineering
The prompt structure that worked best follows a few rules: provide the schema, give a clear instruction, demand a specific output format, and show one example:
```python
def build_prompt(question: str, schema: str) -> str:
    return f"""You are a SQL expert. Given the following database schema, write a SQL query that answers the user's question.

SCHEMA:
{schema}

RULES:
- Return ONLY the SQL query, no explanation.
- Wrap the SQL in <sql> tags.
- Use only table and column names that appear in the schema above.
- Prefer simple queries over complex ones.

EXAMPLE:
Question: How many users signed up last month?
<sql>SELECT COUNT(*) FROM users WHERE created_at >= DATE_TRUNC('month', NOW() - INTERVAL '1 month') AND created_at < DATE_TRUNC('month', NOW());</sql>

Question: {question}
"""
```

The `<sql>` tags make extraction trivial and reliable:
```python
import re

def extract_sql(response_text: str) -> str:
    match = re.search(r"<sql>(.*?)</sql>", response_text, re.DOTALL)
    if not match:
        raise ValueError("LLM did not return SQL in expected format")
    return match.group(1).strip()
```
> SQL Validation Before Execution
Never execute LLM-generated SQL without validation. We use SQLAlchemy's text() with a dry-run explain:
```python
def validate_sql(database_url: str, sql: str) -> None:
    engine = sqlalchemy.create_engine(database_url)
    with engine.connect() as conn:
        conn.execute(sqlalchemy.text(f"EXPLAIN {sql}"))
```

This catches hallucinated table names before they hit production data.
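One piece the `/query` endpoint calls but the post doesn't show is `generate_sql`. A minimal sketch might wire `build_prompt`, the model call, and `extract_sql` together like this — the `complete` callable and the trimmed-down prompt are assumptions for illustration; in TextQL the real prompt from the previous section and a direct Gemini call would be used:

```python
import re

def build_prompt(question: str, schema: str) -> str:
    # Trimmed stand-in for the full prompt shown earlier.
    return f"SCHEMA:\n{schema}\n\nWrap the SQL in <sql> tags.\nQuestion: {question}\n"

def extract_sql(response_text: str) -> str:
    match = re.search(r"<sql>(.*?)</sql>", response_text, re.DOTALL)
    if not match:
        raise ValueError("LLM did not return SQL in expected format")
    return match.group(1).strip()

def generate_sql(question: str, schema: str, complete) -> str:
    """`complete` maps prompt text to model response text, e.g.
    lambda p: model.generate_content(p).text for the Gemini model above."""
    return extract_sql(complete(build_prompt(question, schema)))

# Exercised with a canned response instead of a live model call:
fake = lambda prompt: "Sure!\n<sql>SELECT COUNT(*) FROM users;</sql>"
sql = generate_sql("How many users are there?", "CREATE TABLE users (id INTEGER);", fake)
print(sql)  # SELECT COUNT(*) FROM users;
```

Injecting the completion function also makes the pipeline easy to unit-test with canned responses, as above.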
> Executing and Returning Results
```python
def execute_sql(database_url: str, sql: str) -> list[dict]:
    validate_sql(database_url, sql)
    engine = sqlalchemy.create_engine(database_url)
    with engine.connect() as conn:
        result = conn.execute(sqlalchemy.text(sql))
        return [dict(row._mapping) for row in result.fetchall()]
```
> Results on a Sample Dataset
Testing TextQL on a 10-table e-commerce database with 50 natural language questions:
- **Correct SQL generated**: 82%
- **Schema hallucination rate**: 4% (down from 31% without schema injection)
- **Average response time**: 1.1s (Gemini Flash)
The 18% failure cases were mostly ambiguous questions where even a human would need clarification.
> Lessons Learned
- **Schema injection is the single most impactful prompt technique** for text-to-SQL.
- **Structured output tags** (`<sql>`) are more reliable than asking the model to "only output SQL".
- **Always validate before executing** — an `EXPLAIN` is cheap; a bad query on production data is not.
- **Short, fast models often outperform larger ones** for well-structured prompts — Gemini 1.5 Flash matched Gemini 1.5 Pro on this task.
TextQL is open source under org45. Contributions welcome.