Learn how MCP Tasks use async call-now, fetch-later patterns to dodge 30-second tool timeouts and keep agents moving. See examples inside.
I keep seeing the same failure mode in agent workflows: the model is doing the right thing, but the tool platform gives up too soon. A 30-second timeout is fine for a quick lookup. It is brutal for anything that needs queued work, retries, or external processing.
Async MCP solves the mismatch between how agents think and how tools respond. In real workflows, a tool may need to queue work, wait on downstream services, or run longer than the platform's timeout window. Instead of forcing the model to block, async lets it launch the job, keep the conversation alive, and return for the result later [1].
That sounds simple, and it is. But it changes how the agent spends its time: waiting on a tool becomes optional instead of mandatory.
"Call now, fetch later" works because it treats tool use like a distributed system, not a synchronous function call. The agent makes progress by acknowledging uncertainty instead of pretending the result must be immediate. The model can continue other work, preserve task state, and resume when the response arrives. In benchmark terms, that's exactly the kind of temporal coordination AsyncTool measures [1].
The practical effect: fewer timeouts, fewer retries, and less lost context.
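Here is a minimal sketch of "call now, fetch later" using Python's `asyncio`. It is not MCP code: `slow_tool` and `agent_turn` are hypothetical names, and the short sleeps stand in for queued work. The point is only the shape: start the job, do something else, and block at the moment the result is actually needed.

```python
import asyncio


async def slow_tool(report_name: str) -> str:
    """Hypothetical long-running tool; the sleep stands in for queued work."""
    await asyncio.sleep(0.2)
    return f"{report_name}: done"


async def agent_turn() -> list[str]:
    log: list[str] = []

    # Call now: schedule the job and get a handle back immediately.
    job = asyncio.create_task(slow_tool("q3-report"))
    log.append("job started")

    # Keep moving: do unrelated work while the job runs in the background.
    await asyncio.sleep(0.05)
    log.append("answered a follow-up question")

    # Fetch later: wait only at the point where the result is needed.
    log.append(await job)
    return log


if __name__ == "__main__":
    print(asyncio.run(agent_turn()))
```

In a real agent the "handle" would be a job ID that survives across turns rather than an in-process task object, but the ordering of events is the same.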
The strongest signal from the research is that latency is not a side issue; it is part of the task itself. AsyncTool shows that delayed tool feedback creates clear performance degradation when agents cannot coordinate task switching, dependency tracking, and state maintenance [1]. In other words, the model does not just need reasoning. It needs timing awareness.
That is why synchronous tool loops break down so fast in production.
Async behavior needs to be explicit in the schema and the surrounding instructions. A recent MCP-focused paper argues that schemas should encode semantic completeness, explicit action boundaries, failure mode documentation, and inter-tool relationships [2]. For async tasks, that means the tool description should tell the model what gets returned immediately, what gets returned later, and how to interpret delayed results.
If you hide that behavior, the model will guess. And guessing is where agents go sideways.
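To make that concrete, here is a sketch of a tool definition that spells out its async behavior, plus a small check that flags descriptions that leave it out. The tool names (`start_report_job`, `fetch_report_result`), the hint labels, and the `missing_async_hints` helper are all assumptions for illustration. The `name` / `description` / `inputSchema` layout loosely follows the general shape of an MCP tool definition, but treat it as a sketch rather than a spec-accurate example.

```python
# Hypothetical tool definition; names and wording are illustrative only.
START_REPORT_TOOL = {
    "name": "start_report_job",
    "description": (
        "Starts an asynchronous report job and returns a tracking ID immediately. "
        "Returned now: job_id and status='pending'. "
        "Returned later: the finished report, fetched with the job_id. "
        "Failure modes: status='failed' with an error message. "
        "Related tool: fetch_report_result."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {"report_name": {"type": "string"}},
        "required": ["report_name"],
    },
}

# The four things the description should state explicitly (an assumed convention).
REQUIRED_HINTS = ("Returned now", "Returned later", "Failure modes", "Related tool")


def missing_async_hints(tool: dict) -> list[str]:
    """Return the async hints that the tool description fails to mention."""
    return [hint for hint in REQUIRED_HINTS if hint not in tool["description"]]
```

A string match like this is crude, but even a crude lint run over your tool descriptions will catch the ones that never say what is immediate and what is deferred.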
A good async workflow has three parts: kickoff, continuation, and reconciliation. First, the agent submits a request that starts work. Second, the system stores enough state to reconnect the later result with the original task. Third, the agent receives a continuation message or fetches the result with an ID and picks up from there [1][2].
Here's the mental model I use: the tool call is not the answer. It is the receipt.
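The three parts can be sketched with an in-memory job store. Everything here is hypothetical (`Job`, `JOBS`, `kickoff`, `complete`, `fetch`), and a real system would persist jobs somewhere more durable than a dict. The sketch only shows how a job ID ties a late result back to the original request.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    task: str                    # the original request, kept so results can be reconciled
    status: str = "pending"      # "pending" or "done"
    result: Optional[str] = None


# Stored state that links a later result to the original task.
JOBS: dict[str, Job] = {}


def kickoff(task: str) -> str:
    """Start the work and hand back a receipt (the job ID), not the answer."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = Job(task=task)
    return job_id


def complete(job_id: str, result: str) -> None:
    """Worker side: record the finished result against the stored job."""
    job = JOBS[job_id]
    job.status, job.result = "done", result


def fetch(job_id: str) -> dict:
    """Agent side: look up the job by ID and pick up where the task left off."""
    job = JOBS[job_id]
    return {"task": job.task, "status": job.status, "result": job.result}
```

Because `fetch` returns the original task alongside the result, the agent does not have to remember what the job was for; the receipt carries that context.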
| Approach | What happens | Risk | Best for |
|---|---|---|---|
| Synchronous MCP | Call tool, wait inline, continue only when done | Timeouts, blocked context, brittle retries | Fast lookups, local utilities |
| Async MCP Tasks | Start job, keep state, fetch later | More orchestration complexity | Long jobs, queued work, external APIs |
| Hybrid async + schema hints | Tool schema explains delayed results and continuation rules | Requires disciplined tool design | Agentic workflows that span multiple steps |
The table shows the real tradeoff. Async adds implementation work, but it removes a whole class of timeout failures.
Here's the kind of before/after transformation that matters in an agent workflow.
Before:
Generate the report and return it to me.
After:
Start the report generation job now. Return a job ID immediately.
Do not wait for the report to finish inside this call.
When the report is ready, fetch it using the job ID and continue the workflow.
If the job is delayed, keep the session open and preserve task state.
That single rewrite changes the agent's behavior from "block and hope" to "launch and reconnect."
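On the client side, "fetch it using the job ID" often comes down to a bounded polling loop. The sketch below assumes a hypothetical `fetch_status` callable that returns a dict with a `status` key. The interval and maximum wait are arbitrary placeholders, not recommended values.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[str], dict],
    job_id: str,
    interval_s: float = 0.01,
    max_wait_s: float = 1.0,
) -> dict:
    """Poll a hypothetical status function until the job leaves 'pending'."""
    deadline = time.monotonic() + max_wait_s
    while True:
        status = fetch_status(job_id)
        if status["status"] != "pending":
            return status
        if time.monotonic() >= deadline:
            # Still pending: say so honestly instead of blocking forever.
            return {"status": "pending", "job_id": job_id}
        time.sleep(interval_s)
```

Returning a "still pending" value rather than raising lets the agent preserve task state and try again later, which matches the last line of the rewritten prompt.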
A more advanced version of the prompt moves the same rule into tool-level instructions:
This tool starts an asynchronous task and returns a tracking ID.
The result will arrive later in a separate message.
When the result arrives, treat it as a continuation of the original call, not a new task.
This is the kind of prompt hygiene that tools like Rephrase can automate in seconds when you are working across chat, IDEs, or internal ops tools.
Async MCP Tasks make the most sense anywhere a tool call can exceed a short timeout or produce results out of order. I'd use them for report generation, code analysis jobs, data imports, search pipelines, browser automation, and multi-step backend workflows. I would not use them for instant reads that already return in a fraction of a second [1].
The rule is simple: if the tool might stall, make the stall explicit.
The catch is that async only works if your system is disciplined about state. You need a reliable job ID, clear result handoff, and a way to handle late or failed completions. That's why schema design matters so much: the model has to know what is pending, what is final, and what to do if a result arrives after it has moved on [2].
Without that, "async" just becomes "confusing later."
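One way to keep state disciplined is to decide up front what each kind of completion means. The tracker below is a sketch under my own assumptions: the `JobTracker` class, its method names, and the returned action strings are all hypothetical, and a production system would likely return structured values instead of strings.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    DONE = "done"
    FAILED = "failed"


class JobTracker:
    """Tracks job state so late, failed, or unknown results are handled explicitly."""

    def __init__(self) -> None:
        self._status: dict[str, Status] = {}
        self._abandoned: set[str] = set()

    def start(self, job_id: str) -> None:
        self._status[job_id] = Status.PENDING

    def abandon(self, job_id: str) -> None:
        # The agent has moved on; a later result should not hijack the conversation.
        self._abandoned.add(job_id)

    def on_result(self, job_id: str, ok: bool) -> str:
        """Decide what to do with an incoming completion."""
        if job_id not in self._status:
            return "ignore: unknown job"
        if self._status[job_id] is not Status.PENDING:
            return "ignore: duplicate completion"
        self._status[job_id] = Status.DONE if ok else Status.FAILED
        if job_id in self._abandoned:
            return "store: agent moved on, surface only if asked"
        return "resume: continue original task" if ok else "report: job failed"
```

For example, a job that finishes normally resumes the original task, a second completion for the same job is ignored, and a result for an abandoned job is stored rather than injected into the current conversation.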
If I were building an MCP agent today, I'd start with one slow tool and convert it to async before touching the rest. I'd define the tracking ID, write the continuation rule into the schema, and test what happens when the tool finishes after the model has already moved on in the conversation. That small change usually reveals whether your stack is truly agent-ready.
For more prompt and workflow ideas, browse the Rephrase blog or improve your own async tool instructions on the Rephrase homepage. It is a small step that prevents a lot of broken runs.
Documentation & Research
Community Examples
MCP Tasks are an async tool pattern where the agent starts a long-running job now and retrieves the result later. That avoids blocking the conversation on slow tools.
Normal MCP is usually synchronous: call a tool, wait, then continue. Async MCP adds delayed results, state tracking, and continuation handling so long jobs don't break the flow.