
Prompt Tips•Mar 12, 2026•10 min

Prompt Engineering for Roblox Development: NPC Dialogue, Game Logic, and Luau Script Generation

How I prompt LLMs to write Roblox NPC dialogue, design gameplay logic, and generate Luau scripts without shipping broken code or weird, untestable AI sludge.


Roblox dev has a very specific kind of pain: you can almost describe what you want in plain English ("make the NPC talk, then open the door, then give the player a key"), but the last 10% is where your game breaks. Dialogue becomes inconsistent. Logic becomes hand-wavy. Generated Luau looks plausible… until it hits Studio and explodes.

Prompt engineering is what closes that gap. Not "be polite and add more detail" prompt engineering. I mean designing prompts like you're writing a mini-spec and a test harness, so the model produces assets you can actually ship.

What's interesting is that research on tool-using agents keeps rediscovering the same lesson: models degrade when tasks get more complex, especially when they must output precise structured arguments (think "payloads") and keep state consistent over time [2]. That maps perfectly to Roblox: NPCs are stateful, gameplay is stateful, and scripts are nothing but structured payloads.

So here's the approach I use: treat the LLM like a junior teammate who's great at drafting, mediocre at constraints, and needs guardrails for anything that will run in production.


The core trick: separate "what happens" from "how it's implemented"

If you ask for "write me the script," you're forcing the model to make game design decisions, system design decisions, and Luau implementation decisions in one shot. That's how you get brittle code and nonsense features.

Instead, I split prompts into three artifacts, in order:

  1. A dialogue/quest spec (player-facing intent, NPC personality, constraints, state transitions).
  2. A state model (variables, events, allowed transitions, failure modes).
  3. The Luau implementation (modules, remotes, server/client split, and code).

This isn't just vibes. Benchmarks for tool-use agents show that "argument generation" (structured, exact outputs) becomes a major bottleneck under complexity [2]. If you don't isolate it, you make it worse. In Roblox terms: don't ask for final Luau until you've pinned down the state and the exact I/O.

I also keep outputs small. ExtractBench found that reliability drops sharply as structured output volume grows: models hit formatting errors, truncation, or silent failures as schemas get bigger [1]. The same thing happens when you ask an LLM to write 600 lines of Roblox code with three systems intertwined. So I prompt for "one module, one responsibility" and iterate.


NPC dialogue: prompt like a narrative designer, validate like an engineer

Most Roblox NPC dialogue has two jobs: entertain and drive state (quest flags, shop open/closed, reputation, cooldowns). LLMs are good at the entertaining part. They're unreliable at the state part unless you force structure.

A pattern I like is "dialogue as structured events," similar to what folks building LLM-driven games do with inline markup and an engine-mediated "intent parser" [3]. The model writes the what, your game decides the truth.

Here's a prompt template I actually use:

You are writing NPC dialogue for a Roblox game. The dialogue must be grounded in GAME STATE.
Return ONLY a JSON object.

NPC:
- name: "Mara"
- tone: dry humor, short sentences
- role: quest giver for "Power Cell Recovery"

GAME STATE (authoritative):
- playerLevel: 7
- hasPowerCell: false
- maraTrust: 2  (0-5)
- questStage: "NotStarted" | "Accepted" | "Complete"

PLAYER INPUT (verbatim):
"{playerMessage}"

RULES:
- Mara cannot claim the player has the power cell unless hasPowerCell=true.
- Mara cannot advance questStage unless she explicitly gives the quest or receives the item.
- If info is missing, Mara must ask one clarifying question.

OUTPUT SCHEMA:
{
  "npcLine": "string",
  "choices": [{"id":"string","text":"string"}],
  "stateChanges": {
     "questStage": "NotStarted|Accepted|Complete|NO_CHANGE",
     "maraTrustDelta": -1|0|1
  }
}

Why JSON? Because you can parse it, enforce it, and reject it. This lines up with what structured-extraction work calls treating the schema as an "executable specification", so you can score and validate outputs field by field [1]. You don't need a fancy evaluator; basic checks catch most failures.
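To make "basic checks" concrete, here's a minimal Python validator sketch for the schema above. The field names come from the template; the grounding rules are the ones in the RULES block. This is an illustration, not a complete evaluator.

```python
def validate_dialogue(reply: dict, state: dict) -> list[str]:
    """Return a list of problems; an empty list means the reply is usable."""
    errors = []

    # Structural checks: required fields and types.
    if not isinstance(reply.get("npcLine"), str) or not reply["npcLine"].strip():
        errors.append("npcLine missing or empty")
    if not isinstance(reply.get("choices"), list):
        errors.append("choices must be a list")
    changes = reply.get("stateChanges", {})

    # Enum checks: only values the schema allows.
    if changes.get("questStage") not in {"NotStarted", "Accepted", "Complete", "NO_CHANGE"}:
        errors.append("questStage outside allowed values")
    if changes.get("maraTrustDelta") not in {-1, 0, 1}:
        errors.append("maraTrustDelta outside allowed values")

    # Grounding check: Mara can't complete a quest the player hasn't earned.
    if changes.get("questStage") == "Complete" and not state.get("hasPowerCell"):
        errors.append("quest completed without power cell")

    return errors
```

If the list comes back non-empty, the engine discards the reply and reprompts. The model never mutates game state directly; it only proposes changes that pass these checks.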

One more thing: keep the dialogue turn short. If you're building a full conversation tree in one prompt, you're back in "large output volume" land where quality and validity drop [1].


Game logic: prompt for a state machine, not a script

Roblox gameplay bugs are often state bugs: doors that open twice, quests that skip stages, NPCs that forget what they told you.

ASTRA-bench is an agent benchmark, not a game benchmark, but its failure modes are very game-like: performance drops as tasks require more planning, more context, and more correct tool arguments (payloads) [2]. That's your warning sign: if your prompt doesn't explicitly define state and transitions, the model will invent them.

So I ask the model for a state machine before code. Not a bullet list. A compact, checkable representation.

Example prompt:

Design the quest logic as a finite state machine.

Quest: "Power Cell Recovery"
States: NotStarted, Accepted, HasItem, Complete
Events: TalkToMara, CollectPowerCell, ReturnToMara, Abandon

Constraints:
- CollectPowerCell can only transition Accepted -> HasItem
- ReturnToMara can only transition HasItem -> Complete
- Any invalid event in a state must produce "NO_OP" plus a reason

Return ONLY JSON:
{
  "initialState": "...",
  "transitions": [
    {"from":"...", "event":"...", "to":"...", "effects":["..."], "guard":"..."}
  ],
  "invalidEventHandling": {"default":"NO_OP", "reasonStyle":"short"}
}

Now you've got something you can unit test without even opening Roblox Studio. You can validate that every (state, event) pair is handled. You can generate tests. You can later prompt the LLM to implement exactly this machine in Luau.
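For instance, a few lines of Python can confirm the returned machine only uses declared states and events, handles each (state, event) pair at most once, and either covers every pair or declares a NO_OP default. The field names follow the JSON schema above; the checks themselves are just one plausible set.

```python
def check_fsm(fsm: dict, states: set[str], events: set[str]) -> list[str]:
    """Validate an FSM JSON object against the declared states and events."""
    errors = []
    seen = set()

    for t in fsm.get("transitions", []):
        key = (t["from"], t["event"])
        if t["from"] not in states or t["to"] not in states:
            errors.append(f"unknown state in transition {key}")
        if t["event"] not in events:
            errors.append(f"unknown event in transition {key}")
        if key in seen:
            errors.append(f"duplicate handling for {key}")  # nondeterministic machine
        seen.add(key)

    if fsm.get("initialState") not in states:
        errors.append("initialState is not a declared state")

    # Unhandled pairs are fine only if the machine declares a NO_OP default.
    if fsm.get("invalidEventHandling", {}).get("default") != "NO_OP":
        for s in states:
            for e in events:
                if (s, e) not in seen:
                    errors.append(f"unhandled pair ({s}, {e})")

    return errors
```

Run this on every FSM the model hands back; anything that fails gets rejected before you ever ask for Luau.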


Script generation: make the model write code like it's submitting a PR

When I finally ask for Luau, I'm strict about boundaries: what runs on the server, what runs on the client, and what is "authoritative." The model is not allowed to "decide" game truth on the client.

I also force a PR-style output: file names, modules, and a narrow scope.

A good starting prompt:

Generate Luau code for Roblox Studio.

Goal:
Implement the quest FSM described below on the SERVER.
The client can request actions, but server is authoritative.

Requirements:
- Use a ModuleScript "QuestService" in ServerScriptService
- Provide functions:
  - GetPlayerQuestState(player) -> stateTable
  - HandleEvent(player, eventName) -> (ok:boolean, newStateTable, message:string)
- Persist per-player state in memory (no DataStore in this version)
- Include minimal input validation and clear error messages
- Return code ONLY. No explanations.

FSM JSON:
{...paste the FSM...}

Why this works: you're constraining the "payload generation" problem to a small surface area (a couple functions with explicit I/O). ASTRA-bench's analysis basically says: models can often retrieve info, but they struggle to translate it into correct structured actions as complexity rises [2]. So we don't let complexity rise.
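One way to see how small that surface area is: the whole contract fits in a short reference implementation outside Studio. Here's a Python sketch of the same two-function interface, which I'd use as an oracle when testing the generated Luau. The names GetPlayerQuestState and HandleEvent come from the prompt above; everything else (the FSM dict shape, the return tuples) is my assumption, not a Roblox API.

```python
class QuestService:
    """Server-authoritative quest FSM; a Python stand-in for the Luau ModuleScript."""

    def __init__(self, fsm: dict):
        # Index transitions by (state, event) for O(1) dispatch.
        self._table = {(t["from"], t["event"]): t for t in fsm["transitions"]}
        self._initial = fsm["initialState"]
        self._players = {}  # per-player state, in memory only (no DataStore)

    def GetPlayerQuestState(self, player: str) -> dict:
        return {"state": self._players.get(player, self._initial)}

    def HandleEvent(self, player: str, eventName: str):
        current = self._players.get(player, self._initial)
        t = self._table.get((current, eventName))
        if t is None:
            # Invalid (state, event) pairs are NO_OPs that never mutate state.
            return False, {"state": current}, f"NO_OP: {eventName} not valid in {current}"
        self._players[player] = t["to"]
        return True, {"state": t["to"]}, f"{current} -> {t['to']}"
```

Feed both this oracle and the model's Luau the same event sequences; any divergence is a bug in the generated code, not a design debate.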

Also: I never request multiple systems in one go (NPC dialogue + quest logic + inventory + UI). ExtractBench shows output volume and breadth correlate with catastrophic failure modes (invalid structure, truncation, silent empties) [1]. If you want reliability, keep outputs tight.


Practical examples: the "engine is the user" mindset

One of the best practical framing tricks I've seen (from an LLM game dev write-up) is: in a game, the "user" is really the engine, not the player [3]. The engine asks the model for a specific artifact (a line, an action resolution, a structured event). The player is just part of the input.

That mindset stops you from building a fragile "LLM decides reality" game. You're using the LLM to propose content and actions. Your Roblox code enforces rules.

Even community discussions about "briefing" agents for games tend to circle around this idea: you're architecting intent and constraints, not just generating text [4]. The difference is that in Roblox you have the luxury of being strict. Your engine can reject and reprompt.
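A reject-and-reprompt loop is only a few lines. In this sketch, call_llm and validate are hypothetical stand-ins for your model client and whatever field-level checks you run; the feedback-on-failure pattern is the point.

```python
import json

def request_artifact(call_llm, validate, prompt: str, max_attempts: int = 3):
    """Ask the model for a JSON artifact; reject and reprompt until it validates."""
    feedback = ""
    for _ in range(max_attempts):
        raw = call_llm(prompt + feedback)
        try:
            artifact = json.loads(raw)
        except json.JSONDecodeError:
            feedback = "\n\nYour last reply was not valid JSON. Return ONLY a JSON object."
            continue
        errors = validate(artifact)
        if not errors:
            return artifact  # the engine accepts this; nothing else ships
        # Feed the concrete failures back instead of retrying blind.
        feedback = "\n\nYour last reply was rejected: " + "; ".join(errors)
    raise RuntimeError("model failed validation after %d attempts" % max_attempts)
```

The engine stays in charge: the model proposes, the validator disposes, and a reply that never validates simply never reaches the game.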


Closing thought: treat prompts as interfaces, not requests

The moment you're generating NPC dialogue, game logic, and scripts, your prompt is basically an API contract. If it's fuzzy, the output will be fuzzy. If it's structured and small, you can validate it. If it's too big, it'll fail in ways that look "random" but are actually very predictable: formatting errors, missing fields, invented state, and broken payloads [1], [2].

My rule: if I can't write a simple validator for the output, I'm not done prompt-engineering yet.


References

Documentation & Research

  1. ExtractBench: A Benchmark and Evaluation Methodology for Complex Structured Extraction - arXiv cs.LG
    https://arxiv.org/abs/2602.12247

  2. ASTRA-bench: Evaluating Tool-Use Agent Reasoning and Action Planning with Personal User Context - arXiv cs.AI
    https://arxiv.org/abs/2603.01357

Community Examples

  3. Intra: Design notes on an LLM-driven text adventure - Ian Bicking
    https://ianbicking.org/blog/2025/07/intra-llm-text-adventure

  4. Beyond Chatbots: Using Prompt Engineering to "Brief" Autonomous Game Agents - r/PromptEngineering
    https://www.reddit.com/r/PromptEngineering/comments/1rioa1b/beyond_chatbots_using_prompt_engineering_to_brief/

Ilia Ilinskii

Founder of Rephrase-it. Building tools to help humans communicate with AI.


