Stop getting hallucinated column names from your LLM. Learn schema injection, validation patterns, and copy-paste SQL prompt templates for PostgreSQL, BigQuery, and SQLite.
Your LLM just wrote a JOIN on a column that doesn't exist. Again. Here's how to make sure it never does.
Paste your CREATE TABLE DDL into the prompt - schema context is the single biggest factor in SQL generation accuracy.

LLMs are surprisingly capable at generating syntactically valid SQL. The problem is semantic validity - writing a query that runs correctly against your specific database. Without knowing your schema, the model extrapolates from whatever it saw in training. That means generic table names, invented column names, and plausible-sounding joins that reference nothing real.
Research on text-to-SQL systems confirms that schema linking - correctly mapping natural language terms to actual table and column names - is the hardest part of the problem [1]. When you skip schema injection, you're asking the model to solve that problem blind.
Multi-turn conversations make this worse. Recent work on multi-turn text-to-SQL shows that models lose track of which columns and tables are relevant as the conversation progresses, leading to compounding errors across follow-up questions [1]. If you're building a conversational data assistant, you need to explicitly re-anchor the schema at each turn.
The rule is simple: paste your CREATE TABLE statements into the prompt, every time. Not a description. Not a bullet list of column names. The actual DDL.
Here's why this works. The DDL encodes data types, constraints, foreign key relationships, and naming conventions - everything the model needs to reason about joins, aggregations, and filter conditions. Research on table question answering shows that context engineering that reduces ambiguity is one of the most reliable ways to cut hallucination rates [3].
For large schemas with dozens of tables, don't dump everything. Include only the tables relevant to the question, and add a short inline comment on any column whose business meaning isn't obvious from the name.
-- Schema context for the LLM
-- orders table: one row per customer purchase
CREATE TABLE orders (
    order_id    SERIAL PRIMARY KEY,
    customer_id INTEGER NOT NULL,     -- FK to customers.id
    created_at  TIMESTAMPTZ NOT NULL,
    total_cents INTEGER NOT NULL,     -- stored in cents, not dollars
    status      TEXT CHECK (status IN ('pending', 'shipped', 'cancelled'))
);

CREATE TABLE customers (
    id           SERIAL PRIMARY KEY,
    email        TEXT UNIQUE NOT NULL,
    country_code CHAR(2) NOT NULL     -- ISO 3166-1 alpha-2
);
That comment about total_cents matters more than it looks. Without it, the model might write SUM(total_cents) and label it as a dollar figure in the output - technically correct SQL, semantically wrong analysis.
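In code, the injection step is just string assembly. Here is a minimal sketch; `SCHEMAS` and `build_prompt` are illustrative names, not a real API, and the DDL is abbreviated from the example above:

```python
# Assemble a SQL-generation prompt with only the relevant DDL injected.
# SCHEMAS and build_prompt are illustrative names, not a real API.

SCHEMAS = {
    "orders": """CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    customer_id INTEGER NOT NULL,  -- FK to customers.id
    total_cents INTEGER NOT NULL   -- stored in cents, not dollars
);""",
    "customers": """CREATE TABLE customers (
    id SERIAL PRIMARY KEY,
    email TEXT UNIQUE NOT NULL
);""",
}

def build_prompt(question: str, tables: list[str]) -> str:
    """Inject the DDL for just the tables relevant to this question."""
    ddl = "\n\n".join(SCHEMAS[t] for t in tables)
    return (
        "You are a PostgreSQL SQL expert. Use only the tables and columns "
        "defined below. Do not invent column names.\n\n"
        f"## Schema\n{ddl}\n\n## Question\n{question}\n"
    )

prompt = build_prompt("Total revenue per customer?", ["orders", "customers"])
```

The per-question table filter is where the "don't dump everything" advice lives: for a 50-table schema, you pass only the two or three tables the question touches.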
Here's the structure I use for every SQL generation task. It's opinionated, but it works consistently across GPT-4, Claude, and Gemini.
You are a [DIALECT] SQL expert. Use only the tables and columns defined below.
Do not invent column names. If you cannot answer the question with the given schema, say so.
## Schema
[PASTE CREATE TABLE STATEMENTS HERE]
## Business Rules
- [Any constraints or conventions the model should know - e.g., "deleted rows have status='archived'"]
- [Currency is always stored in cents]
## Question
[Your natural language question]
## Output Format
1. The SQL query, formatted and readable
2. A plain-English explanation of what the query does, step by step
3. Any assumptions you made
The "Output Format" section is doing heavy lifting here. Asking for a step-by-step explanation forces the model to reason through its own query, which catches a surprising number of self-generated errors before you ever see the SQL.
Each SQL engine has quirks that trip up generic prompts. Here are the key differences to communicate explicitly.
| Feature | PostgreSQL | BigQuery | SQLite |
|---|---|---|---|
| String quotes | Single quotes `'` | Single quotes `'` | Single quotes `'` |
| Identifier quotes | Double quotes `"` | Backticks `` ` `` | Double quotes or none |
| Date truncation | `DATE_TRUNC('month', ts)` | `DATE_TRUNC(ts, MONTH)` | `strftime('%Y-%m', ts)` |
| Array support | `ARRAY[]`, `unnest()` | `ARRAY<T>`, `UNNEST()` | Not native |
| Window functions | Full support | Full support | Supported (3.25+) |
| Regex | `~` operator | `REGEXP_CONTAINS()` | No built-in `REGEXP` (`GLOB`/`LIKE` only) |
Tell the model which dialect you're targeting at the top of every prompt. "Write a PostgreSQL query" and "Write a BigQuery query" will produce meaningfully different outputs for anything involving dates, arrays, or JSON.
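To make the differences concrete, here is the same "monthly revenue" question written out per dialect. The queries are illustrative sketches against the example orders table, not model output; BigQuery uses TIMESTAMP_TRUNC here because created_at is a timestamp, not a DATE:

```python
# The same "revenue by month" question rendered per dialect.
# Queries are illustrative; table/column names come from the example schema.

MONTHLY_REVENUE = {
    "postgresql": """
        SELECT DATE_TRUNC('month', created_at) AS month,
               SUM(total_cents) / 100.0 AS revenue_dollars
        FROM orders GROUP BY 1 ORDER BY 1;
    """,
    "bigquery": """
        SELECT TIMESTAMP_TRUNC(created_at, MONTH) AS month,
               SUM(total_cents) / 100.0 AS revenue_dollars
        FROM `project.dataset.orders` GROUP BY 1 ORDER BY 1;
    """,
    "sqlite": """
        SELECT strftime('%Y-%m', created_at) AS month,
               SUM(total_cents) / 100.0 AS revenue_dollars
        FROM orders GROUP BY 1 ORDER BY 1;
    """,
}
```

Three different date expressions for one question - exactly the kind of divergence a dialect-blind prompt gets wrong.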
You are a PostgreSQL 16 expert. Use only the schema below. Use double-quote identifiers only when necessary.
## Schema
[DDL here]
## Question
[Question here]
You are a Google BigQuery SQL expert (standard SQL dialect).
Use backtick identifiers for all table and column references in the format `project.dataset.table`.
## Schema
[DDL or BQ table schema JSON here]
## Question
[Question here]
You are a SQLite expert. Target SQLite 3.45; if the deployed version may be older, avoid window functions (added in 3.25) and FULL OUTER JOIN (added in 3.39).
Use strftime() for all date operations.
## Schema
[DDL here]
## Question
[Question here]
Data transformation prompts need one additional ingredient: sample data. Even two or three example rows help the model understand the actual shape of your data - especially for messy CSVs, mixed-type columns, or inconsistently formatted dates.
Recent research on deep tabular reasoning highlights that LLMs struggle specifically with non-canonical layouts and ambiguous headers [2]. The fix is the same: give explicit context before asking for the transformation.
Transform the following CSV data into a normalized SQL INSERT statement for the orders table.
## Target Schema
CREATE TABLE orders (
order_id INTEGER PRIMARY KEY,
created_at DATE NOT NULL, -- format: YYYY-MM-DD
total_cents INTEGER NOT NULL
);
## Sample Input Data (first 3 rows)
order_id,date,amount
1001,2026-01-15,$45.99
1002,01/16/2026,12.00
1003,Jan 17 2026,8.50
## Task
Write a SQL INSERT statement that handles the inconsistent date formats and converts
the amount column from dollar strings to integer cents.
That prompt will get you a working transformation with explicit CASE logic for the date formats. Without the sample rows, you'd get a generic template that ignores the actual problem.
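For comparison, here is a minimal sketch of the same cleanup in Python rather than SQL - try each date format in turn, strip the dollar sign, convert to integer cents, then load into an in-memory SQLite table. The format list is inferred from the three sample rows and is an assumption, not a spec:

```python
import csv, io, sqlite3
from datetime import datetime

# Same logic the prompt asks the model to express in SQL:
# normalize three date formats and convert dollar strings to cents.
RAW = """order_id,date,amount
1001,2026-01-15,$45.99
1002,01/16/2026,12.00
1003,Jan 17 2026,8.50
"""

# Assumed formats, inferred from the sample rows above.
DATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%b %d %Y"]

def parse_date(s: str) -> str:
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(s, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {s!r}")

def to_cents(s: str) -> int:
    return round(float(s.lstrip("$")) * 100)

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    created_at DATE NOT NULL,
    total_cents INTEGER NOT NULL)""")
rows = [(int(r["order_id"]), parse_date(r["date"]), to_cents(r["amount"]))
        for r in csv.DictReader(io.StringIO(RAW))]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
```

Whether the transformation lives in SQL CASE expressions or a loader script, the point is the same: the model can only write this logic if it has seen the messy rows.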
Never run LLM-generated SQL directly against production. This isn't about distrust - it's about the gap between "syntactically valid" and "semantically correct."
Three-step validation that takes under a minute:
First, check the explanation the model provided. Does the logic actually match your question? A model that generates correct SQL for the wrong interpretation is still a problem.
Second, verify column and table names programmatically. In PostgreSQL, query information_schema.columns. In BigQuery, use INFORMATION_SCHEMA.COLUMNS. In SQLite, use PRAGMA table_info(tablename). A quick script that diffs the SQL's referenced identifiers against your actual schema will catch hallucinated columns before execution.
Third, run with LIMIT 10 on a read-only replica or development database first. Inspect the result. Does the row count and data look reasonable? If you're aggregating revenue and getting a negative number, something is wrong upstream.
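The identifier diff in the second step can start as a regex pass. This sketch uses SQLite's PRAGMA table_info as the source of truth; a production version would use a real SQL parser (for example sqlglot) so that aliases and quoted identifiers are handled properly:

```python
import re, sqlite3

# Naive identifier check: diff the words in a generated query against the
# real columns. Swap the PRAGMA for information_schema.columns on
# Postgres/BigQuery. Aliases (e.g. "revenue") will also be flagged -
# treat the output as a review list, not a hard failure.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, total_cents INTEGER)")

known = {row[1] for row in conn.execute("PRAGMA table_info(orders)")}
known |= {"orders"}  # table names are valid identifiers too

SQL_KEYWORDS = {"select", "sum", "from", "group", "by", "as", "where"}

def suspicious_identifiers(sql: str) -> set[str]:
    words = set(re.findall(r"[a-z_][a-z0-9_]*", sql.lower()))
    return words - SQL_KEYWORDS - known

generated = "SELECT customer_id, SUM(total_amount) AS revenue FROM orders GROUP BY customer_id"
flagged = suspicious_identifiers(generated)  # catches the hallucinated total_amount
```

A check like this takes seconds to run and catches the single most common LLM SQL failure - a column that simply does not exist.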
Tools like Rephrase can help you tighten the prompt itself before you submit it - catching vague phrasing that tends to produce ambiguous SQL before the model ever sees it.
If you're building a data assistant where users ask follow-up questions, schema drift is your biggest enemy. The Track-SQL research [1] found that generative models lose track of relevant schema elements across conversation turns, which causes them to reference stale or incorrect columns in follow-up queries.
The practical fix: treat each turn as a fresh prompt with the full schema injected again. Yes, this uses more tokens. No, there's no reliable shortcut. For very large schemas, at minimum re-inject the tables that were referenced in the previous query.
[Previous query and result summary]
The last query used: orders, customers
## Current Schema (relevant tables)
[Re-paste orders and customers DDL]
## Follow-up Question
[Next question]
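The re-injection loop above can be sketched as follows. The table-name-to-DDL map is assumed to come from your migrations, and the regex table detector is a deliberate simplification, not a robust parser:

```python
import re

# Per-turn schema re-anchoring: detect which tables the previous query
# touched and re-inject their DDL into the follow-up prompt.
# SCHEMAS is an assumed table-name -> DDL map; the LLM call is omitted.

SCHEMAS = {
    "orders": "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, total_cents INTEGER);",
    "customers": "CREATE TABLE customers (id INTEGER, email TEXT);",
}

def tables_referenced(sql: str) -> set[str]:
    # Crude: any known table name appearing as a word in the query.
    return {t for t in SCHEMAS if re.search(rf"\b{t}\b", sql, re.I)}

def follow_up_prompt(prev_sql: str, question: str) -> str:
    # Fall back to the full schema if nothing was detected.
    tables = tables_referenced(prev_sql) or set(SCHEMAS)
    ddl = "\n".join(SCHEMAS[t] for t in sorted(tables))
    return (
        f"The last query used: {', '.join(sorted(tables))}\n\n"
        f"## Current Schema (relevant tables)\n{ddl}\n\n"
        f"## Follow-up Question\n{question}\n"
    )

prompt = follow_up_prompt("SELECT * FROM orders", "Which day had the most orders?")
```

Each turn pays the token cost of the relevant DDL again; that is the trade that keeps follow-up queries anchored to real columns.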
This pattern, combined with explicit schema injection and dialect specification, covers the majority of real-world SQL generation failures. The model isn't bad at SQL - it just needs the same context a junior analyst would need on their first day.
More prompt engineering techniques for data workflows are on the Rephrase blog.
**Why does my LLM hallucinate column names?** LLMs generate SQL by pattern matching from training data, not from your actual schema. Without explicit schema context in the prompt, the model guesses column names. Always inject your CREATE TABLE statements or schema definitions directly into the prompt.

**Do I need different prompts for different SQL engines?** Yes. Each engine has dialect-specific syntax - BigQuery uses backtick identifiers and ARRAY/STRUCT types, PostgreSQL uses double-quoted identifiers and has richer window functions, and SQLite lacks many aggregate features. Tell the LLM explicitly which dialect to target.

**How should I structure a SQL generation prompt?** Use a three-part structure: schema context first, then business rules or constraints, then the specific question. For multi-turn conversations, restate the relevant schema at each turn to prevent context drift.