Prompt Tips•Mar 09, 2026•8 min

Apple Intelligence Prompting Is Not ChatGPT Prompting: What to Write for Siri and On‑Device AI

Apple's AI is a router, not a chatbot. Here's how prompts change when the system prefers intents, privacy boundaries, and on‑device constraints.


If you've been "prompt engineering" in ChatGPT-style chat boxes, Apple Intelligence can feel weird at first. Not because the models are worse. Because the product is different.

Most chat LLMs are a single surface: you write text, you get text back. Apple Intelligence is closer to a traffic controller. Your input might become an on-device generation request, a structured App Intent, a search across local content, or a handoff to a stronger model somewhere else. And on Apple platforms, that routing decision is a feature, not a bug.

That changes what "a good prompt" even means.


Apple Intelligence is a router, so your prompt competes with everything else

Here's the mental shift I've found most useful: in Apple's world, the best "prompt" often isn't a prompt. It's a clean, unambiguous intent that can be executed deterministically.

When Siri (and now Apple Intelligence features across the OS) can satisfy a request via an intent (set a timer, send a message, log something in your app), that path is usually lower latency, more private, and more reliable than free-form generation. On-device inference stacks are heavily optimized for throughput and latency, but they still have real constraints: context length, bandwidth, concurrency, and memory pressure all bite on consumer hardware [1], and decoding tends to run into memory-bandwidth limits as sequences get longer [2]. That's not an academic detail. It's why long, meandering prompts feel "expensive" on-device, and why concise, structured instructions win.

So when you write for Siri and on-device AI, you're effectively writing for a system that's asking: "Can I turn this into a tool call or intent? If yes, do that. If no, fall back to generation."

The practical consequence is brutal: verbosity that helps in a chatbot can hurt you on-device. You're adding tokens that increase compute and memory traffic for little gain [2]. If the system can't extract the actionable core quickly, it may route you down a different path, or ask follow-ups that feel "dumber" than you expected.
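The routing idea can be sketched in a few lines. This is a toy illustration of the "intent first, generation as fallback" decision, not Apple's implementation; the expense pattern and field names are hypothetical.

```python
# Toy sketch of the routing idea: try to extract a structured intent
# first; only fall back to free-form generation if no slots match.
import re

def route(utterance: str) -> dict:
    # Hypothetical intent pattern: "add an expense: $<amount> for <note>"
    m = re.search(r"expense:?\s*\$(\d+(?:\.\d+)?)\s+for\s+(\w+)", utterance, re.I)
    if m:
        return {"path": "intent", "action": "AddExpense",
                "amount": float(m.group(1)), "note": m.group(2)}
    # No clean slot match: hand the raw text to a generative model.
    return {"path": "generate", "prompt": utterance}

print(route("Add an expense: $42.60 for lunch"))
print(route("can you maybe do something about my spending"))
```

Notice that the vague second utterance never reaches the deterministic path. That is the routing penalty for prompts whose actionable core is buried.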


The on-device constraint that changes everything: treat tokens like battery

Even if you never touch Apple's frameworks directly, you should write prompts as if every extra sentence costs battery and latency, because it does.

Research on Apple Silicon inference shows big gains come from batching, caching, and avoiding repeated computation (like re-encoding the same image every turn) [1]. The point isn't "you need prefix caches in your prompt." The point is: on-device systems are designed to reward reuse and punish waste. If you can phrase a request so the system can reuse context (or avoid needing it), you'll see more consistent behavior.

Roofline-style benchmarking work on on-device LLMs makes the same theme painfully clear: decoding is often memory-bound, and operational intensity changes with sequence length and architecture; quantization helps most in memory-bound scenarios [2]. Translation: shorter prompts, fewer turns, clearer slots.
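A back-of-envelope calculation makes the memory-bound point concrete: each decoded token re-reads the model weights plus the growing KV cache, so longer contexts mean more bytes moved per token. The numbers below are illustrative stand-ins, not measurements of any Apple model.

```python
# Rough bytes moved per decoded token: full weight read + KV cache read.
# Illustrative parameters only; real models vary in layout and precision.
def decode_bytes_per_token(params_b: float, bytes_per_weight: float,
                           seq_len: int, layers: int, kv_dim: int) -> float:
    weights = params_b * 1e9 * bytes_per_weight       # read every step
    kv_cache = seq_len * layers * 2 * kv_dim * 2      # K and V, fp16
    return weights + kv_cache

# A ~3B-parameter 4-bit model: KV traffic grows with every prompt token.
short = decode_bytes_per_token(3, 0.5, 200, 28, 3072)
long_ = decode_bytes_per_token(3, 0.5, 2000, 28, 3072)
print(f"{short/1e9:.2f} GB vs {long_/1e9:.2f} GB per token")  # → 1.57 GB vs 2.19 GB per token
```

Ten times the context adds hundreds of megabytes of traffic to every single token generated, which is exactly the "verbosity tax" the roofline analysis predicts.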

So I aim for prompts that are short, slot-like, and completion-friendly. Not because it's prettier. Because it's cooperative with the constraints.


What to write for Siri: make the "action" legible, then allow one clarification

When the user's speaking to Siri, you're not really writing prompts in the classic sense. You're designing utterances and the system's response strategy. The system needs to decide whether it can safely do something, or whether it needs a clarifying question.

So the best Siri-oriented instruction style is:

  1. state the action in the first clause,
  2. provide necessary parameters in natural language,
  3. include a single disambiguation hook.

I like to write with a "first token test": if Siri only heard the first half of the sentence, would it still know what action family this belongs to?

This is the opposite of "role + context + long constraints." For Siri, long roleplay is mostly noise. If you need reliability, your app should expose the action as an intent so Siri can call it deterministically, and your user should speak in ways that map cleanly to that intent.
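The "first token test" can be approximated mechanically: check whether the first clause alone identifies an action family. The verb-to-family map below is a made-up illustration, not Siri's actual grammar.

```python
# A rough "first token test": can the action family be recognized from
# the first clause alone? (Hypothetical keyword map, not Siri's grammar.)
ACTION_FAMILIES = {
    "add": "create", "log": "create", "schedule": "create",
    "send": "communicate", "reply": "communicate",
    "summarize": "transform", "rewrite": "transform",
}

def first_token_test(utterance: str):
    first_clause = utterance.split(",")[0].lower()
    for verb, family in ACTION_FAMILIES.items():
        if first_clause.startswith(verb):
            return family
    return None  # fails the test: the action is not legible up front

print(first_token_test("Schedule a focus block, tomorrow morning"))  # create
print(first_token_test("So I was thinking about my calendar"))       # None
```

Utterances that fail this check are the ones that end up in clarifying-question loops, so it's a cheap sanity filter when designing example phrases for an intent.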


What to write for on-device generation: fewer instructions, tighter format

When Apple Intelligence does generate text locally, you still benefit from classic prompting ideas (be specific, ask for a format, provide examples). The difference is you should compress those ideas.

Because on-device, the system is juggling more than "make text good." It's also juggling "make text fast."

A pattern that works well is a compact "goal + constraints + output schema" prompt. No preamble. No motivational fluff. And I'm cautious with multi-step reasoning demands, because they tend to balloon tokens and time.

Another subtlety: when you're prompting on-device, you often have local context available implicitly (your note, your email thread, the selected text). Don't restate the whole thing in the prompt. Point to it. The OS already knows what's selected; your prompt should say what to do with it.
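The "goal + constraints + output schema" pattern is simple enough to sketch as a builder. The helper below is hypothetical; the point is what it refuses to include: no role preamble, no restated context, no motivational filler.

```python
# Minimal sketch of the "goal + constraints + output schema" pattern.
# Goal first, one terse constraint per line, explicit output shape last.
def compact_prompt(goal: str, constraints: list, schema: str) -> str:
    lines = [goal]
    lines += constraints
    lines.append(f"Output: {schema}")
    return "\n".join(lines)

print(compact_prompt(
    "Rewrite the selected text to be clearer and shorter.",
    ["Keep the meaning.", "Keep names and numbers unchanged."],
    "rewritten text only",
))
```

Everything the builder emits is load-bearing, which is the whole discipline on-device: if a line doesn't change the output, it's just memory traffic.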


Practical prompts that play nicely with Siri + on-device AI

Below are prompts I'd actually ship in a product or recommend to a team. They're short on purpose.

1) Siri-style action request (intent-friendly)

Add an expense: $42.60 for lunch, category Meals, date today.
If anything is missing, ask one question.

2) On-device rewrite (Writing Tools vibe)

Rewrite the selected text to be clearer and shorter.
Keep the meaning. Keep names and numbers unchanged.
Return only the rewritten text.

3) Summarize a long note without copying it into the prompt

Summarize the selected note for a standup update.
Format:
1) What I did
2) What I'm doing next
3) Blockers
Max 80 words.

4) A "Siri fallback" prompt for ambiguity

This is the one place I borrow a community trick: if the system can't proceed, have it interview the user, but keep it bounded. The Reddit version asks for 5 questions; on-device I usually ask for 1-2 to reduce turn cost [3].

I want to do this: schedule a focus block this week.
Before acting, ask me the single most important question you need answered.

5) A prompt that reduces repeated back-and-forth

Because repeated turns cost tokens and time, I'll often ask for a draft plus options in one go:

Draft a reply to the selected email.
Tone: friendly, direct.
Give me two versions: short (2 sentences) and medium (5 sentences).

The takeaway I'd bet on: "prompting" on Apple becomes "interface design"

If you're building for Apple Intelligence, the craft is less "write a magical system prompt" and more: design the shortest path to a reliable outcome.

When an intent can do it, let the system do it. When generation is needed, keep prompts tight, specify output shape, and minimize turns. The hardware and serving research around on-device inference keeps pointing to the same truth: long contexts and repeated work are the enemy; caching, reuse, and shorter sequences are your friend [1], [2].

Try this the next time you test: rewrite your best prompt to half its length, but keep the first clause action-oriented and keep the output schema explicit. If the output gets better, you're feeling the Apple-style routing and on-device constraints in real time.


References

Documentation & Research

  1. Native LLM and MLLM Inference at Scale on Apple Silicon - arXiv cs.LG - https://arxiv.org/abs/2601.19139
  2. RooflineBench: A Benchmarking Framework for On-Device LLMs via Roofline Analysis - arXiv cs.LG - https://arxiv.org/abs/2602.11506

Community Examples
  3. The "Logic Architect" Prompt: Engineering your own AI path - r/PromptEngineering - https://www.reddit.com/r/PromptEngineering/comments/1rilcm3/the_logic_architect_prompt_engineering_your_own/

Ilia Ilinskii

Founder of Rephrase-it. Building tools to help humans communicate with AI.
