A practical prompt playbook for PM docs (PRDs, user stories, competitor briefs, and roadmap drafts), grounded in oversight research and citation-aware workflows.
Product managers don't ship documents. We ship decisions.
But here's the annoying truth: most "PM docs" are just decision-shaped paperwork. PRDs, user stories, competitor briefs, roadmap drafts. They're all artifacts of the same thing: turning messy input (interviews, sales calls, logs, exec vibes) into something a team can execute.
LLMs are great at the paperwork part. They're also great at confidently inventing details you never approved. So the win isn't "use AI to write a PRD." The win is "use AI to force clarity, surface missing choices, and keep everything traceable."
Two ideas changed how I prompt for PM work.
First, treat prompting like interactive oversight, not a one-shot request. There's research showing that when non-experts need expert-level outputs, you get better alignment by breaking the job into small, closed-form decisions, collecting low-burden feedback, and iteratively rolling that up into a final spec [1]. That maps perfectly onto PM work: we're constantly decomposing ambiguity.
Second, make your drafts grounded. If you can't point to where a claim came from, it's not a "draft," it's fan fiction. Tools like NotebookLM push you in the right direction by drafting from uploaded sources and letting you hover over citations to trace them back to the original notes, provided you constrain the tool properly ("based only on these sources," explicit structure, explicit exclusions) [3]. And research on citation preferences shows models systematically misjudge what deserves evidence (they often under-cite numbers and named entities), which is exactly where PMs get burned in reviews [2]. Translation: you need to explicitly demand sourcing for risky claims.
When PMs say "write a PRD," what they really mean is "help me choose." AI should behave like a structured interviewer that reduces your cognitive load, not like an intern vomiting a 10-page doc.
The oversight framing in [1] is basically: decompose, ask closed questions, accumulate preferences, then generate. You can use that pattern manually in chat with a simple rule: every prompt should either (a) produce a draft constrained by sources, or (b) ask you questions that shrink ambiguity.
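That rule can be sketched as a plain loop. This is a minimal illustration, not any tool's API: `ask_model` is a hypothetical stand-in for a chat-completion call, and `answer_fn` stands in for you answering the closed-form questions.

```python
def oversight_loop(sources, answer_fn, ask_model, max_rounds=5):
    """Alternate between clarifying questions and drafting until the
    model has no open questions left (or we run out of rounds).

    `ask_model(mode, sources, decisions)` is a hypothetical chat call:
    mode "questions" returns a list of closed-form questions,
    mode "draft" returns the grounded draft.
    `answer_fn(question)` is the human's low-burden pick/rank answer.
    """
    decisions = {}              # accumulated closed-form answers
    for _ in range(max_rounds):
        questions = ask_model("questions", sources, decisions)
        if not questions:       # ambiguity resolved: generate the draft
            return ask_model("draft", sources, decisions)
        for q in questions:     # low-burden feedback, not essays
            decisions[q] = answer_fn(q)
    return ask_model("draft", sources, decisions)
```

The point of the shape: every iteration either shrinks ambiguity (questions answered) or terminates in a draft constrained by everything decided so far.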
I bake that into prompts with three constraints:
- Ground everything: draft only from the sources I provide, and label anything else UNSOURCED.
- Ask before drafting: a short batch of mostly closed-form clarifying questions (pick/rank) so my feedback stays low-burden.
- Demand receipts for risk: any number, date, pricing claim, or named entity gets a citation to a source or an UNSOURCED flag.
Below are prompt templates I use. They're written to work in plain ChatGPT/Claude/Gemini-style chats, but they're even better if you're using a grounded workspace (NotebookLM or any "docs as sources" flow), because you can enforce "based only on these sources" like [3] recommends.
You are my Principal Product Manager and requirements editor.
Goal: produce a PRD draft that is grounded in the sources I provide. If a detail is not in the sources, mark it as UNSOURCED and ask me for confirmation.
Before drafting, ask me up to 7 clarifying questions that materially affect scope, UX, and success metrics. Use mostly closed-form questions (pick/rank) to reduce my effort.
Sources (paste or attach):
- User research notes:
- Stakeholder notes:
- Constraints (legal/security/platform):
- Existing product context:
- Any competitor notes:
Draft the PRD with exactly these sections:
1) Problem statement (who, pain, why now)
2) Goals + non-goals (explicit)
3) Personas + primary use cases
4) Proposed solution (bulleted, prioritized)
5) User journeys (happy path + edge cases)
6) Requirements
- Functional (with acceptance criteria)
- Non-functional (latency, privacy, accessibility, reliability)
7) Analytics: events + success metrics (include metric definitions)
8) Open questions + risks
9) Rollout plan (phased) + dependencies
Hard constraints:
- Prioritize user pain points over brainstorm ideas.
- Do not invent integrations, timelines, or KPIs.
- Any numbers, dates, pricing, or named-entity claims require a citation to sources or must be marked UNSOURCED.
This is basically the NotebookLM idea ("based only on these sources," enforce structure, exclude irrelevant brainstorms) but generalized beyond that tool [3], with an evidence discipline informed by citation research [2].
Act as a product + QA pair.
Context (paste):
- Feature summary:
- Target persona:
- Constraints:
- Analytics goal:
Task:
Generate user stories in two formats:
A) Job story: "When __, I want to __, so I can __."
B) Agile story: "As a __, I want __, so that __."
For each story, include:
- Acceptance criteria in Gherkin (Given/When/Then)
- Edge cases (max 3)
- Instrumentation: events to track + properties
Rules:
- If you lack info, ask questions first.
- Do not invent backend systems; propose alternatives as options with tradeoffs.
You're explicitly forcing the model into verifiable, testable outputs. That's oversight again: smaller units, easier review [1].
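"Verifiable" also means machine-checkable. A crude sketch of a structural lint for the Gherkin criteria the template demands (a keyword check, not a real Gherkin parser):

```python
import re

def lint_gherkin(criteria):
    """Flag acceptance criteria missing any of the three Gherkin steps.
    Returns (index, missing_keywords) pairs for criteria that fail."""
    problems = []
    for i, criterion in enumerate(criteria):
        missing = [kw for kw in ("Given", "When", "Then")
                   if not re.search(rf"\b{kw}\b", criterion)]
        if missing:
            problems.append((i, missing))
    return problems
```

Run it over the model's output before review: a story whose criteria fail this lint isn't testable enough to estimate.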
You are a competitive intelligence analyst.
I will provide competitor inputs (URLs, notes, reviews, screenshots, pricing pages).
If I don't provide a source for a claim, label it UNSOURCED.
Inputs:
- Competitor list:
- Market segment + ICP:
- Our current positioning (1 paragraph):
- Sources (paste excerpts with links):
Output a competitor brief with:
- One-line positioning per competitor
- Target user + "why they buy"
- Feature comparison (only sourced)
- Pricing + packaging summary (only sourced)
- Distribution channels + messaging themes (quote exact phrases from sources where possible)
- Weaknesses / gaps (separate: "sourced" vs "hypotheses")
- What we should copy / avoid (with rationale)
- 5 questions to validate with customers next week
The "quote exact phrases" trick is my favorite. It stops the model from laundering your sources into generic MBA-speak, and it gives you language you can reuse in messaging. Community prompt libraries talk a lot about "structured competitor tables" in practice, but they rarely enforce sourcing; this is how you keep it honest [4].
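You can enforce the exact-phrase rule mechanically: check that every quote in the brief appears verbatim in a pasted source excerpt. A small sketch (whitespace is normalized so line wrapping doesn't cause false alarms):

```python
def verify_quotes(quotes, sources):
    """Return quotes from the brief that do not appear verbatim in any
    pasted source excerpt; those should be reworded or marked UNSOURCED."""
    norm = lambda s: " ".join(s.split())   # collapse whitespace/newlines
    pool = [norm(src) for src in sources]
    return [q for q in quotes
            if not any(norm(q) in src for src in pool)]
```

Anything this returns is the model paraphrasing (or inventing), not quoting.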
You are my Head of Product. Help me draft a roadmap proposal, not a commitment.
Context:
- Company goals (top 3):
- Current metrics baseline:
- Team capacity constraints:
- Known dependencies:
- Candidate initiatives (list):
- Must-do dates (if any):
Step 1: Ask me up to 5 clarifying questions, then propose a shortlist of 6-10 initiatives.
Step 2: Produce a roadmap draft with:
- Now / Next / Later (or Q2/Q3/Q4 if I request)
- For each initiative: problem, bet, expected impact, confidence, key dependencies, and "kill criteria"
- A risk register (top 5)
- What we are explicitly not doing
Rules:
- No invented dates. Use sequencing and dependency logic instead.
- Where impact is uncertain, propose how to measure it.
"Kill criteria" is the secret sauce. It turns the roadmap from political theater into an experiment plan.
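The "no invented dates" rule still leaves you with ordering, and that part is mechanical: dependency logic is a topological sort. A sketch using the standard library (the initiative names are made up):

```python
from graphlib import TopologicalSorter

def sequence(initiatives):
    """Order initiatives purely by dependency logic: no dates, just
    'X cannot start before Y'. `initiatives` maps each initiative to
    the set of initiatives it depends on. Raises CycleError if the
    dependency list is circular, which is itself a useful roadmap review."""
    return list(TopologicalSorter(initiatives).static_order())
```

If the sort raises a cycle error, your candidate initiatives contain a circular dependency, which is a finding worth surfacing before anyone argues about quarters.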
Here's what works well for me.
I start by feeding the model raw material (interview notes, support tickets, sales call snippets, analytics snapshots). If I'm in a tool that supports grounded generation, I lean into it: "based only on these sources," plus a strict output skeleton, like the NotebookLM PRD example [3]. That gives me a draft that's at least anchored.
Then I switch modes and run an oversight loop. I ask the model to identify the top decisions embedded in the draft and convert them into closed questions: rank these goals, pick a primary persona, choose between two UX flows with pros/cons. This is straight out of the "low-burden feedback" idea in scalable interactive oversight research [1]. It's also the fastest way to expose stakeholder disagreements early, before engineering estimates turn into sunk costs.
Finally, I do an evidence pass. I literally prompt: "Highlight every sentence containing a number, date, pricing claim, or named entity. For each, show the supporting source excerpt or mark UNSOURCED." That's me operationalizing the finding that models often under-cite numeric/name-heavy statements even though humans expect citations there [2]. In PM terms: it's how you avoid getting shredded in a roadmap review because your "TAM is $4B" line has no origin story.
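That evidence pass is also easy to pre-screen yourself before prompting. A crude Python sketch of the same filter (the capitalization heuristic is a rough stand-in for real named-entity detection):

```python
import re

def evidence_pass(draft):
    """Return sentences carrying 'risky' claims: any digit, a currency
    symbol, or a capitalized word that is not sentence-initial (a crude
    proxy for named entities). Each hit needs a source or UNSOURCED."""
    risky = []
    for sentence in re.split(r"(?<=[.!?])\s+", draft.strip()):
        words = sentence.split()
        named = any(w[0].isupper() for w in words[1:] if w and w[0].isalpha())
        if named or re.search(r"[\d$€£]", sentence):
            risky.append(sentence)
    return risky
```

Whatever survives this filter is the only prose you can safely leave uncited.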
If you take one thing from this: stop asking AI to write your PM docs. Ask it to interrogate your thinking, force tradeoffs into the open, and keep a paper trail.
Your PRD shouldn't be "what the model thinks you meant." It should be "what you decided," plus receipts.
Documentation & Research
Steering LLMs via Scalable Interactive Oversight - arXiv (cs.AI) - https://arxiv.org/abs/2602.04210
Aligning Large Language Model Behavior with Human Citation Preferences - arXiv (cs.CL) - https://arxiv.org/abs/2602.05205
Grounded PRD Generation with NotebookLM - KDnuggets - https://www.kdnuggets.com/grounded-prd-generation-with-notebooklm
Community Examples
Curated AI prompt library for founders, marketers, and builders - r/PromptEngineering - https://www.reddit.com/r/PromptEngineering/comments/1r3u4bv/curated_ai_prompt_library_for_founders_marketers/