ChatGPT Prompts for Data Analysis and Excel: The Playbook I Actually Use
A practical prompt library for cleaning data, writing Excel formulas, building pivots, and generating SQL, plus how to make the outputs reliable.
The most expensive Excel mistake in 2026 isn't a wrong VLOOKUP.
It's trusting a fluent model with a vague prompt, pasting the result into a spreadsheet, and shipping a decision that feels data-driven but isn't. The catch is that data analysis is a game of constraints: definitions, units, time grains, joins, missing values, and "what counts" rules. When you don't provide those constraints, ChatGPT happily invents them.
What's interesting is that research is starting to quantify this. On complex "PDF to JSON" extraction, performance collapses as schemas and required outputs get large: models can produce valid-looking structure while still being wrong, and long structured outputs trigger formatting failures and truncation [1]. Different task, same lesson for Excel work: you'll get "clean" tables and "nice" insights that hide silent errors unless you force validation.
So here's my playbook: prompts that turn ChatGPT into a careful analyst for Excel workflows (cleaning, formulas, pivots, charts, forecasting, and even SQL) without pretending the model is a calculator.
The rule: treat analysis like structured extraction
If you do nothing else, adopt this mental model: every analysis task is "extract a structured result from messy context." When you phrase prompts like that, you naturally add the guardrails that prevent hallucinations.
ExtractBench's evaluation work makes a point that's easy to miss: valid structure doesn't imply correct content, and error modes like omission vs hallucination matter a lot downstream [1]. In Excel terms, a formula that returns a number is not the same thing as a correct number, and a tidy pivot table can still be built on broken assumptions.
A second useful idea comes from Text-to-SQL research: split the work into stages. The IESR framework improves Text-to-SQL by decoupling "understand the question + schema link" from "generate the query," and it adds explicit verification steps to reduce drift [2]. In practice, this is exactly how you should prompt for Excel too: first audit + clarify, then generate, then validate.
You'll see that pattern baked into the prompts below.
Prompt patterns that work best for Excel + analysis
The prompts in this post are "copy, paste, and fill in brackets." I'm opinionated about formatting: always demand tables, always demand checks, and always demand assumptions.
1) "Data audit first" (before you touch formulas)
Use this right after you paste a sample or describe your sheet.
You are my data analyst. Before suggesting any formulas or insights, run a data audit.
Context: [what this sheet represents]
Columns: [paste headers]
Sample rows (10-30): [paste]
Business rules/definitions: [paste if you have them]
Tasks:
1) Identify likely data types per column, including date grain.
2) List missing values, weird categories, duplicates, and outliers you can detect from the sample.
3) Identify 5-10 "definition questions" that could change the answer (e.g., what counts as revenue, timezone, refunds).
4) Propose a cleaning plan that is Excel-native (Power Query steps OR worksheet formulas), and label which you chose.
Output format:
- Section A: Audit table (column | guessed type | issues | suggested fix)
- Section B: Clarifying questions (only the ones that matter)
- Section C: Proposed cleaning steps (numbered, Excel-specific)
Why it works: you're forcing the model into an "information understanding" stage instead of letting it jump straight to solutions, very similar to the staged decomposition in IESR [2].
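If you want to dry-run this audit yourself (or verify the audit table the model hands back), the same checks are a few lines of pandas. Everything below, including the column names, is a hypothetical sketch, not part of the prompt:

```python
import pandas as pd

def audit(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column audit table: guessed type, missing count, distinct values."""
    rows = []
    for col in df.columns:
        s = df[col]
        rows.append({
            "column": col,
            "guessed_type": str(s.dtype),
            "missing": int(s.isna().sum()),
            "unique": int(s.nunique(dropna=True)),
        })
    return pd.DataFrame(rows)

# Tiny hypothetical sample: a sales export with a text date and a blank
df = pd.DataFrame({
    "order_date": ["2024-01-05", "2024-01-06", None],
    "revenue": [100.0, 250.0, 99.0],
})
report = audit(df)
print(report)
```

Pasting a report like this into the prompt's "Sample rows" slot also gives the model far more to work with than raw cells.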
2) Formula generation that doesn't break in real workbooks
This is my go-to "write the formula, but also defend it" prompt.
Act as an Excel power user.
Goal: Write an Excel formula for: [describe outcome]
Constraints:
- Excel version: [Microsoft 365 / 2019 / Google Sheets]
- Locale: [comma vs semicolon]
- Must handle blanks and errors safely (use IFERROR/LET as needed)
- Do not assume sorted data unless I say so
Data layout:
- Table name or range: [e.g., SalesTbl]
- Key columns: [list]
- Example of expected output: [describe]
Deliver:
1) The final formula (single line).
2) A short explanation of each component.
3) 3 edge cases and what the formula returns.
4) A quick "sanity check" calculation I can do in Excel to verify.
The "edge cases + sanity check" step is your explicit defense against silent failures, the same family of failure modes highlighted in structured extraction research [1].
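The sanity check itself can be automated: recompute the formula's answer independently, outside the workbook, and compare the two numbers. A minimal pandas sketch, assuming a hypothetical SalesTbl and a SUMIFS-style formula:

```python
import pandas as pd

# Hypothetical SalesTbl sample; in Excel the formula under test might be
# =SUMIFS(SalesTbl[Revenue], SalesTbl[Region], "West")
sales = pd.DataFrame({
    "Region": ["West", "East", "West", None],
    "Revenue": [100.0, 50.0, 25.0, 10.0],
})

# Independent recomputation of the SUMIFS result (blank regions excluded,
# just as SUMIFS would exclude them)
west_total = sales.loc[sales["Region"] == "West", "Revenue"].sum()
print(west_total)  # compare this against the cell value in the workbook
```

If the two numbers disagree, one of your assumptions (grain, blanks, duplicates) is wrong, and that's exactly what you want to find out before shipping.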
3) Pivot tables and Power Query: stop asking for "a dashboard"
If you say "make a dashboard," you'll get generic advice. Instead, specify decisions and grains.
You are a BI analyst working in Excel.
Decision this analysis must enable: [e.g., cut marketing spend, forecast inventory]
Audience: [exec / ops / finance]
Time grain: [daily/weekly/monthly]
Dimensions that matter: [region, product, channel]
Metrics definitions:
- Revenue = [...]
- Active customer = [...]
- Churn = [...]
Task:
Propose an Excel approach using PivotTables + Power Query.
1) Power Query steps to create a clean fact table (name it, list columns).
2) The PivotTable layout (rows/columns/filters/values).
3) 5 calculated fields or measures (if needed) and how to implement them in Excel.
4) Validation checks: totals reconciliation, duplicate detection, and a "spot check" method.
Output as a structured plan with headings and specific field names.
This forces the "schema linking" behavior: map concepts to columns, then build the transformation, then validate-again echoing IESR's split between understanding and generation [2].
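The totals-reconciliation check in step 4 is worth internalizing: the pivot's grand total must equal the fact table's total, or rows were dropped or double-counted somewhere in the transformation. A minimal pandas sketch with hypothetical data:

```python
import pandas as pd

# Hypothetical fact table, shaped like the clean table the plan produces
fact = pd.DataFrame({
    "region": ["NA", "NA", "EU", "EU"],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "revenue": [100.0, 120.0, 80.0, 90.0],
})

pivot = fact.pivot_table(index="region", columns="month",
                         values="revenue", aggfunc="sum")

# Totals reconciliation: the pivot must not drop or double-count rows
assert pivot.to_numpy().sum() == fact["revenue"].sum()
print(pivot)
```

The same one-line reconciliation works in Excel: compare the PivotTable's grand total cell against a plain SUM over the source column.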
Practical prompt library (copy/paste)
Now the fun part: prompts you can use today.
A) Cleaning messy Excel exports
I have an Excel export that is messy.
Here are the column headers: [paste]
Here are 15 sample rows: [paste]
Known problems: [merged cells / extra header rows / totals row / dates as text / currency symbols]
Give me:
1) Power Query M steps (described, not necessarily full code) to clean it.
2) If Power Query is not possible, give worksheet formulas instead.
3) A validation checklist to confirm the cleaned table matches the raw export (row counts, totals, uniqueness).
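The validation checklist in step 3 maps directly onto a few assertions. A sketch in pandas, assuming a hypothetical export with a totals row and currency stored as text:

```python
import pandas as pd

# Hypothetical raw export with a trailing totals row and "$" amounts
raw = pd.DataFrame({
    "order_id": ["1001", "1002", "TOTAL"],
    "amount": ["$100.00", "$250.50", "$350.50"],
})

# Cleaning: drop the totals row, parse currency text into numbers
clean = raw[raw["order_id"] != "TOTAL"].copy()
clean["amount"] = clean["amount"].str.replace("$", "", regex=False).astype(float)

# Validation checklist from step 3, as executable checks:
assert len(clean) == len(raw) - 1        # row count (minus the totals row)
assert clean["amount"].sum() == 350.50   # reconciles to the export's own total
assert clean["order_id"].is_unique       # key uniqueness
```

Note the second assertion reuses the export's own totals row as a free reconciliation target before discarding it.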
B) "Explain this formula" (for inherited spreadsheets)
Explain this Excel formula like I'm onboarding to the workbook.
Formula: [paste]
What the cell is supposed to represent: [describe]
Related columns/ranges: [paste]
Output:
- What it does (plain English)
- Step-by-step breakdown
- Hidden assumptions
- Safer rewrite using LET (if Excel 365) or a more maintainable alternative
C) Build charts that answer a decision question (not "make it pretty")
A community prompt that's surprisingly solid is the "visualization brief" approach: specify decision, audience, constraints, and require iteration and validation [3]. I simplify it like this:
I uploaded a dataset/spreadsheet.
Decision the chart must enable: [one sentence]
Audience: [who]
Constraints: [max 3 charts, color rules, must show units, etc.]
Do this:
1) Give me 4 chart options, each with: chart type, fields used, and what decision it supports.
2) Pick the best option and provide:
- The exact Excel steps to build it (or Python code if I ask)
- A title and 1-sentence takeaway
3) Validation: list 5 checks that confirm the chart is not misleading (denominators, missing data, time grain).
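The "denominators" check in step 3 catches the most misleading charts: raw counts compared across groups with wildly different bases. A hypothetical sketch of what the check computes:

```python
import pandas as pd

# Hypothetical funnel data; a chart of raw signups per region would make
# NA look 5x "better" while hiding the very different visitor bases
df = pd.DataFrame({
    "region": ["NA", "EU"],
    "visitors": [10000, 500],
    "signups": [300, 60],
})
df["conversion_rate"] = df["signups"] / df["visitors"]
print(df)  # EU converts at 12% vs NA's 3%; the raw-count chart inverts the story
```

When the rate and the raw count tell opposite stories, the chart needs both the rate and the denominator on it, or it's misleading by construction.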
D) Convert an Excel question into SQL (for when Excel is the wrong tool)
This is where the Text-to-SQL research becomes directly useful. IESR shows that Text-to-SQL improves when you explicitly do schema selection, entity extraction, and verification rather than "write SQL in one shot" [2]. Here's a prompt that mimics that workflow:
You are a database analyst. I will give you:
- A business question
- A database schema (tables + columns)
- A few metric definitions
Business question: [paste]
Schema: [paste]
Definitions: [paste]
Step 1: Ask up to 5 clarifying questions ONLY if needed.
Step 2: Propose the relevant tables/columns (schema selection) and explain why.
Step 3: Write the SQL.
Step 4: Provide 6 validation queries/checks (row counts, duplicates, join cardinality, null handling, reconciliation to known totals).
Output format:
- Clarifying questions (if any)
- Selected schema elements
- SQL
- Validation checks
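Before running the generated SQL against production, I like to replay Step 4's checks on a throwaway SQLite copy. The schema and tables below are hypothetical stand-ins:

```python
import sqlite3

# Hypothetical two-table schema mirroring the prompt's workflow
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 10, 100.0), (2, 10, 50.0), (3, 11, 25.0);
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    INSERT INTO customers VALUES (10, 'West'), (11, 'East');
""")

# Check: join cardinality -- the join must not fan out the row count
base = con.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
joined = con.execute("""
    SELECT COUNT(*) FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
""").fetchone()[0]
assert joined == base  # a fan-out here means duplicate keys in customers

# Check: reconciliation to a known total
total = con.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(base, joined, total)
```

The point is not the toy data; it's that every check the model proposes in Step 4 should be a query you can actually execute, not a sentence.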
The habit that changes everything: force a "validation contract"
Here's what I noticed after using these prompts for months: the highest leverage move is not "better wording." It's forcing the model to commit to a validation plan every time.
ExtractBench calls out that formatting issues, truncation, and schema complexity can dominate failures in structured tasks [1]. In Excel, your equivalent is: partial data pasted, wrong grain, duplicate keys, mismatched joins, and unspoken metric definitions. When you require explicit checks, you catch the same class of "looks right, is wrong" problems before they become decisions.
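For instance, "wrong grain" and "duplicate keys" are one assertion away once you name the grain explicitly. A hypothetical pandas sketch:

```python
import pandas as pd

# Hypothetical table claimed to be at one-row-per-(customer, month) grain
df = pd.DataFrame({
    "customer": ["a", "a", "b", "a"],
    "month": ["Jan", "Feb", "Jan", "Jan"],   # ("a", "Jan") appears twice
    "revenue": [10, 20, 30, 40],
})

# Grain check: count duplicate keys at the claimed grain
dupes = int(df.duplicated(subset=["customer", "month"]).sum())
print(dupes)  # anything > 0 means the grain assumption is wrong
```

The Excel equivalent is a COUNTIFS over the key columns; either way, the check is cheap and the failure it catches is expensive.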
Next time you open a workbook and feel the urge to type "analyze this," try this instead: ask for an audit, then a plan, then the formula/query, then validation. You'll get fewer magical answers, and far more correct ones.
References
Documentation & Research
- ExtractBench: A Benchmark and Evaluation Methodology for Complex Structured Extraction - arXiv cs.LG. https://arxiv.org/abs/2602.12247
- IESR: Efficient MCTS-Based Modular Reasoning for Text-to-SQL with Large Language Models - arXiv cs.CL. https://arxiv.org/abs/2602.05385
Community Examples
- The ChatGPT prompt that turns spreadsheets into stunning visualizations that drive decisions - r/ChatGPTPromptGenius. https://www.reddit.com/r/ChatGPTPromptGenius/comments/1qo0z5h/the_chatgpt_prompt_that_turns_spreadsheets_into/
