Blog / Prompt tips / Structured Output Prompting: How to Forc…

Structured Output Prompting: How to Force Any AI to Return Clean JSON, Tables, or CSV

Practical prompting patterns (plus a reality check) for getting machine-parseable JSON, tables, and CSV from LLMs-reliably enough for production.

Ilia Ilinskii
Rephrase · Mar 08, 2026

Prompt tips9 min

On this page

The two layers: prompt contracts vs constrained decoding The Prompt Contract: what actually works in practice 1) Delimit the input like you're writing a parser 2) Demand "output-only" with zero extra text 3) Use a schema the model can "see" and copy 4) Specify deterministic behavior for unknowns Practical examples (JSON, tables, CSV)Example 1: "JSON-only" extraction with a strict schema and null rules Example 2: Markdown table output that stays parseable Example 3: CSV that won't explode when commas show up "Forcing" JSON in production: validation, retries, and canonicalization Closing thought References

If you've ever shipped an "LLM → JSON" feature, you've felt this pain: the model nails the content… then slips a trailing comma into the JSON, wraps it in a Markdown code fence, or adds a friendly "Sure! Here you go:" that blows up your parser.

And here's the uncomfortable truth: prompting alone can't "force" perfect structure in the mathematical sense. You can increase reliability a lot, but hard guarantees usually require constrained decoding or a post-processor that validates and retries.

What's interesting is that research is now pretty blunt about where structure fails. On complex extraction tasks, frontier models still collapse as schemas get wide and outputs get long, and a huge chunk of failures are boring formatting issues like truncation and trailing commas [1]. So the goal of "structured output prompting" is really two goals: make the model want to comply, and make non-compliance cheap to detect and correct.

The two layers: prompt contracts vs constrained decoding

I separate "structured output" into two layers.

Layer 1 is the prompt contract: you give explicit formatting rules, an explicit schema, and explicit failure behavior. This is the layer you can use with basically any model, anywhere.

Layer 2 is constrained decoding / structured output mode: you pass a schema to the API as a response format constraint, and the provider compiles it into a grammar that restricts which tokens the model is allowed to emit. This is the closest thing we have to "forcing" JSON.

ExtractBench (a 2026 benchmark for PDF-to-JSON extraction) describes this approach clearly: providers compile schemas into grammar artifacts (think finite-state automata) and use constrained decoding so output is syntactically valid and schema-conforming [1]. The same paper also notes a real production gotcha: the first request with a schema can have extra latency because the provider needs to process/compile it [1].

Also: constrained decoding isn't magic. ExtractBench found that "structured output mode" can introduce a new failure class-schema rejection-because every provider supports only a subset of JSON Schema, and big/deep schemas may be refused outright [1]. Even when accepted, accuracy can degrade because the model is juggling both extraction and strict grammar state across long outputs [1].

So when someone asks, "How do I force any AI to return clean JSON?" my honest answer is:

If the platform supports structured outputs, use it. If it doesn't, use prompt contracts plus validation-and-retry.

The Prompt Contract: what actually works in practice

When you're stuck with plain text generation, you want a contract the model can follow without interpretation.

The contract has four parts: a hard delimiter boundary, a single output format, a schema, and a deterministic "failure" format.

1) Delimit the input like you're writing a parser

Models leak. They blend instructions, data, and examples unless you make it painfully obvious where the data starts and ends.

Use clear, boring delimiters and tell the model what they mean.

2) Demand "output-only" with zero extra text

This is the single highest-leverage instruction. ExtractBench's baseline extraction prompt is basically the simplest form of this: "Please return ONLY valid JSON… Do not include any explanatory text before or after the JSON." [1]

It sounds too simple, but it matters because the model's default behavior is conversational. You're trying to override that default.

3) Use a schema the model can "see" and copy

If you can't pass a schema as a response format (Layer 2), embed it directly in the prompt (Layer 1). But keep it readable. If you paste a massive JSON Schema blob, you may be trading one failure mode (bad formatting) for another (the model runs out of budget or loses the plot).

ScrapeGraphAI-100k is interesting here because it's based on real-world extraction telemetry and keeps track of schema complexity. It reports sharp drops in validation at certain thresholds (like very deep schemas or very high key counts) [2]. That matches what most of us see: you can get perfect JSON on small schemas, then things get weird when you scale.

My take: if your schema is large, you should consider splitting the job. Extract one section at a time and merge.

4) Specify deterministic behavior for unknowns

If the model can't find a value, you want a consistent representation (null, empty string, omit field). ExtractBench's evaluation framework makes a big deal of distinguishing missing vs null, because they mean different failure modes (omission vs hallucination) [1]. Your pipeline should care too.

I usually pick: required fields must always exist; unknown values become null. Optional fields can be omitted, but only if explicitly optional.

Practical examples (JSON, tables, CSV)

Below are three prompt patterns I actually use. They're written to work even without structured-output APIs.

Example 1: "JSON-only" extraction with a strict schema and null rules

You are a data extraction function.

Rules:
1) Output MUST be valid JSON. Output JSON ONLY. No markdown. No code fences. No commentary.
2) Follow the schema exactly. Do not add extra keys.
3) If a value is not present in the input, use null (do not guess).
4) Strings must be plain text (no newlines). Dates must be ISO-8601 if present.
5) If you cannot comply, output exactly: {"error":"NON_COMPLIANT"}

INPUT (between <doc> tags):
<doc>
{{paste text here}}
</doc>

SCHEMA:
{
  "customer_name": "string|null",
  "invoice_number": "string|null",
  "invoice_date": "string|null",
  "line_items": [
    {
      "description": "string|null",
      "quantity": "number|null",
      "unit_price": "number|null"
    }
  ],
  "total": "number|null"
}

This is basically the "ExtractBench baseline prompt" idea, but with explicit null semantics and an explicit non-compliance escape hatch [1]. The escape hatch is important because it gives your caller something predictable to detect.

A Reddit example I've seen repeated (and that works decently) is the blunt "Your output MUST be in valid JSON… Do not include conversational text" pattern [3]. It's not deep, but it's directionally correct.

Example 2: Markdown table output that stays parseable

Tables are tricky because Markdown tables are easy for humans and annoying for parsers. If you truly need a table, constrain it hard.

Return a Markdown table ONLY (no other text).

Constraints:
- Exactly 4 columns with headers: Feature | Type | Required | Notes
- No pipes inside cell values (replace '|' with '/')
- Notes must be <= 80 characters
- If unknown, use "N/A"

Generate the table for this API description:
{{paste description}}

If you control the downstream, I'd rather generate JSON and render a table myself. But if you need the table, this constraint set prevents the most common Markdown breakages.

Example 3: CSV that won't explode when commas show up

CSV is deceptively hard because commas and quotes are everywhere. You need quoting rules.

Return CSV ONLY (no other text).

CSV rules:
- Use comma delimiter
- First row must be headers: id,name,score,comment
- All fields MUST be double-quoted
- Any double quote inside a field MUST be escaped as two double quotes
- Use \n for new lines inside a field (do not output raw newlines)

Create 10 rows based on:
{{paste input}}

If you're automating, I'd still validate and retry. CSV parsers are unforgiving.

"Forcing" JSON in production: validation, retries, and canonicalization

Even with great prompts, you'll still see occasional formatting failures-especially on long outputs. ExtractBench found that trailing commas and truncation made up a big share of invalid outputs, motivating structured output mode in the first place [1]. That's your signal to build a system, not a wish.

Here's what works well:

You validate output strictly (JSON parse + schema validation). If it fails, you retry with a short "repair prompt" that includes the exact parser error. And if you care about diffs, caching, or deterministic pipelines, you canonicalize the JSON (stable key ordering, stable whitespace).

A community pattern I like is treating the model's output as something that should be "diffable" and "verifiable," including canonical formatting rules like minified JSON and key sorting [4]. I wouldn't copy every constraint from that post, but the mindset is right: machine output is an interface, not prose.

Closing thought

Structured output prompting is less about clever wording and more about being a little strict, a little paranoid, and very explicit about failure. When you can use constrained decoding, do it. When you can't, write contracts, validate everything, and assume you'll need retries.

If you want a concrete next step, take the JSON prompt above, wire in schema validation, and log every failure case. After a week, you'll have your own "format failure taxonomy," and your prompts will stop being vibes and start being engineering.

References

Documentation & Research

ExtractBench: A Benchmark and Evaluation Methodology for Complex Structured Extraction - arXiv cs.LG - https://arxiv.org/abs/2602.12247
ScrapeGraphAI-100k: A Large-Scale Dataset for LLM-Based Web Information Extraction - arXiv cs.CL - https://arxiv.org/abs/2602.15189

Community Examples

The "Taxonomy Architect" for organizing messy data sets - r/PromptEngineering - https://www.reddit.com/r/PromptEngineering/comments/1rd8p9v/the_taxonomy_architect_for_organizing_messy_data/
Strict JSON Prompt Generator: One TASK → One Canonical EXECUTOR_PROMPT_JSON (Minified + Key-Sorted) - r/PromptEngineering - https://www.reddit.com/r/PromptEngineering/comments/1qwv3sa/strict_json_prompt_generator_one_task_one/