Await Inline First, Workflow Engine Later: A Tactical Migration Heuristic
April 11, 2026 · 9 min read · AI Agents / Infrastructure
Do not migrate a working function to a workflow engine (Inngest, Temporal, Step Functions) until that engine has run at least 5 times successfully in production on another use case and you have observed a real retry scenario that needed orchestration. Start with await inline plus maxDuration on your serverless function. The migration from await-inline to a workflow engine is 20-30 minutes of work. The risk of premature migration is invisible job failures in a system you cannot yet monitor.
Why does the conventional wisdom get this wrong?
The platform engineering world has a clear prescription: use a proper queue or workflow engine from day one. Do not write fire-and-forget code. Do not rely on the request lifecycle. Adopt durable execution early and avoid technical debt later. This advice is correct for teams with a dedicated platform engineer, an established monitoring stack, and dozens of background job types. It is wrong for most serverless applications in their first year.
According to the Inngest durable execution guide, the median time to first production function on Inngest is 45 minutes. That sounds fast. But the median time to a fully trusted production pipeline -- with monitoring, alerting, retry verification, and dead-letter queue confidence -- is measured in weeks, not minutes. The 45-minute setup is the easy part. The trust-building is where teams stumble, and where premature adoption creates invisible risk.
I learned this the hard way. Three production bugs in a single evening, all caused by the same root pattern. The fix for each was different -- and only one of them belonged on a workflow engine. I have written previously about agent architecture patterns that reduce this class of error, but this post is specifically about the migration decision: when to stay simple and when to graduate.
What is the 4-layer mental model for background work?
Before deciding whether you need a workflow engine, understand which layer your problem lives on. Most teams conflate these four layers because marketing teams for different tools all use the word "workflow."
| Layer | Pattern | Tools | When to Use |
|---|---|---|---|
| L1: Await Inline | await someFn() + maxDuration | Native serverless (Vercel, Lambda) | Admin paths, <5 min work, single handler |
| L2: Background Queue | Enqueue message, worker processes | SQS, QStash, pgmq, Supabase Queues | Single-step jobs <30s, decouple request/response |
| L3: Workflow Engine | Multi-step durable functions with retries | Inngest, Temporal, Step Functions, Trigger.dev | Multi-step pipelines, fan-out, retries with state |
| L4: Orchestration Platform | DAG scheduling, data pipeline coordination | Airflow, Prefect, Dagster, Argo | ETL, ML training, nightly batch, BI reports |
The critical insight: L3 and L4 both get called "workflow orchestrators" by their vendors, but they solve fundamentally different problems. L3 is for application events (user uploads a document, fire a 5-stage pipeline). L4 is for data schedules (every night at 2 AM, transform yesterday's logs into analytics tables). Picking L4 when you need L3 -- or vice versa -- is a category error that creates pain on day one. According to a Temporal blog post on durable execution, 60% of teams adopting workflow engines cite "replacing fragile cron + database polling" as their primary motivation. That is an L1-to-L3 migration, not an L4 migration -- the distinction matters.
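To make that boundary concrete, here is a minimal sketch of the two trigger shapes. It uses Inngest syntax purely as illustration (Inngest happens to support both); the function IDs, event name, and cron schedule are hypothetical:

```typescript
import { Inngest } from "inngest";

const inngest = new Inngest({ id: "my-app" });

// L3: application event -- fires once per user action.
export const onUpload = inngest.createFunction(
  { id: "on-document-upload" },
  { event: "app/document.uploaded" },
  async ({ event }) => {
    // kick off the per-upload pipeline here
  }
);

// L4-shaped: data schedule -- fires at 2 AM regardless of user activity.
export const nightlyRollup = inngest.createFunction(
  { id: "nightly-rollup" },
  { cron: "0 2 * * *" },
  async () => {
    // transform yesterday's logs into analytics tables
  }
);
```

The shape of the trigger, not the vendor's label, tells you which layer you are on.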
What does the heuristic look like in practice?
Here is the decision framework I use after fixing three production bugs in a tax-tech application, all caused by un-awaited async work on Vercel serverless:
The 3-condition migration signal: Migrate from await-inline to a workflow engine only when ALL three conditions are true: (1) the engine has run 5+ times successfully in production on a different use case, (2) you have witnessed at least one failure and successful retry in the workflow engine's dashboard, and (3) you are about to add a 4th background job that would benefit from the same infrastructure.
The reasoning behind each condition:
- 5+ successful runs: One successful run proves the happy path. It does not prove the system handles payload edge cases, handler timeouts, RLS policies on service clients, missing environment variables in the execution context, or concurrent handler race conditions. Five runs across real production data builds minimum confidence.
- Seen a retry: If you have never observed the workflow engine recover from a failure, you do not know whether your retry configuration actually works. Dead-letter queues, backoff policies, and idempotency guards (see the sketch after this list) all need to be exercised at least once before you trust them with critical paths.
- 4th background job: Workflow engine overhead (SDK dependency, serve endpoint, dashboard monitoring, event schema management) compounds only when you have 3+ job types. For 1-2 jobs, the overhead exceeds the benefit. At 4+ jobs, the investment pays for itself through shared infrastructure.
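On the idempotency point, here is a minimal sketch of the kind of guard that needs to be exercised before you trust it. It is illustrative, not from the codebase in question: the processed_events table and a unique constraint on event_id are assumptions.

```typescript
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

// Run `work` at most once per event. Relies on a UNIQUE constraint on
// processed_events.event_id: the second delivery fails the insert and skips.
export async function processOnce(eventId: string, work: () => Promise<void>) {
  const { error } = await supabase
    .from("processed_events")
    .insert({ event_id: eventId });

  if (error) {
    // Duplicate delivery (unique violation) or a transient DB error --
    // either way, do not blindly run the work a second time.
    console.warn(`skipping ${eventId}:`, error.message);
    return;
  }

  await work();
}
```

Until you have watched a guard like this reject a real duplicate delivery in production, it is a hypothesis, not a safety net.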
How did this play out with three real production bugs?
On a single evening, I discovered three bugs in a production serverless application. All three had the same architectural root cause: un-awaited async work at Vercel request boundaries. Vercel kills the serverless container as soon as return NextResponse.json(...) executes. Any fetch(), IIFE, or .then() callback still pending gets silently dropped. No error. No log. The work just vanishes.
The three bugs and their correct fixes:
| Bug | Symptom | Correct Fix | Time to Fix |
|---|---|---|---|
| Email deduplication failure | 13 duplicate emails in 2 minutes | Gate check (not queue-shaped) | 30 min |
| Video walkthrough never fires | Feature silently broken for all clients | Inngest (5-stage pipeline, needs retries) | 4 hours |
| Draft notification never sends | Admin-uploaded drafts produce no email | Await inline + maxDuration=300 | 30 min |
Bug #1 was not a background-job problem at all -- it was a missing gate check where the email logger never wrote its deduplication record. Inngest would have been the wrong tool entirely. Bug #2 was genuinely queue-shaped: a 5-stage pipeline (PDF rendering, script generation, audio synthesis, image generation, persistence) that took 60-90 seconds and needed per-stage retries. That one went to Inngest. Bug #3 was an admin-facing path where a longer spinner (2 seconds to 60-90 seconds) was acceptable. The fix was adding await and export const maxDuration = 300.
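For contrast, here is roughly what the Bug #1 gate-check fix looks like. This is a sketch, not the actual code: the email_log table, its columns, and the sendEmail helper are illustrative stand-ins, and it assumes a configured Supabase client.

```typescript
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

// Gate check: look for the dedup record first, and write it BEFORE sending.
// The original bug was that the record was never written, so the gate never
// closed and 13 duplicates went out in 2 minutes.
async function sendOnce(recipient: string, template: string) {
  const { data: existing } = await supabase
    .from("email_log")
    .select("id")
    .eq("recipient", recipient)
    .eq("template", template)
    .maybeSingle();

  if (existing) return; // already sent -- the gate closes here

  await supabase.from("email_log").insert({ recipient, template });
  await sendEmail(recipient, template); // hypothetical mail helper
}
```

No queue, no retries, no dashboard: a read and a write in the right order.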
Total time to fix all three with the heuristic: about 5 hours. Estimated time if I had migrated all three to Inngest: 8-12 hours, plus the risk of introducing queue-related failures in a system that had exactly one successful Inngest run to its name.
What does the await-inline fix look like in code?
The before and after is almost embarrassingly simple:
```typescript
// BEFORE: fire-and-forget (silently broken on Vercel)
// The IIFE starts but Vercel kills the container before it finishes
(async () => {
  await generateAssets(returnId);
  await sendNotificationEmail(returnId);
})().catch(console.error);

return NextResponse.json({ success: true }); // container dies here
```
```typescript
// AFTER: await inline (works on Vercel with maxDuration)
export const maxDuration = 300; // seconds, at top of route file

const assets = await generateAssets(returnId);
await sendNotificationEmail(returnId);

return NextResponse.json({ success: true }); // container lives until here
```

The migration from this to a workflow engine later is equally simple:
```typescript
// LATER: migrate to Inngest when the 3 conditions are met
// Replace the inline await with an event send (~7 lines changed)
await inngest.send({
  name: "app/assets.requested",
  data: { returnId },
});

return NextResponse.json({ success: true });
```

The business logic inside generateAssets does not change. Only the call site changes. This is why the migration from L1 to L3 is cheap: you are moving where the function is called, not rewriting the function itself. That 20-30 minute migration cost is the key insight. There is no penalty for starting at L1.
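For completeness, here is roughly what the receiving side looks like: a sketch of an Inngest function subscribed to that event, reusing the generateAssets and sendNotificationEmail helpers from above. The function ID and retry count are illustrative assumptions.

```typescript
import { Inngest } from "inngest";

export const inngest = new Inngest({ id: "my-app" });

export const generateAssetsFn = inngest.createFunction(
  { id: "generate-assets", retries: 3 },   // per-function retry policy
  { event: "app/assets.requested" },       // matches the inngest.send above
  async ({ event, step }) => {
    // Each step.run is checkpointed: a failure retries this step only,
    // without re-running steps that already succeeded.
    await step.run("generate-assets", () =>
      generateAssets(event.data.returnId)
    );
    await step.run("send-notification", () =>
      sendNotificationEmail(event.data.returnId)
    );
  }
);
```

The bodies of the step.run callbacks are the same plain TypeScript you were awaiting inline.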
When should you skip L1 and go directly to a workflow engine?
The heuristic has clear exceptions. Some work is inherently queue-shaped and should use a workflow engine from day one, even if the engine is new and unproven:
- Client-facing paths where a 60-90 second spinner breaks UX. Document upload flows, search indexing triggered by user action, email sends that depend on external APIs. The user cannot wait. Await-inline is not a viable alternative.
- Retry-with-backoff work expected to fail. Webhook delivery to unreliable endpoints, third-party API calls with rate limits, payment reconciliation. The retry semantics are the entire point.
- Fan-out patterns where one event triggers N handlers. A user uploads a document and you need to extract text, generate a thumbnail, update search index, and notify the team -- all independently. Sequential await is wasteful; parallel fire-and-forget is fragile.
- Long-running sagas spanning hours or days. Human-in-the-loop approval steps, multi-day onboarding sequences, scheduled follow-ups. These need step.sleep or step.waitForEvent -- primitives that only workflow engines provide (see the sketch after this list).
- Scheduled work that is inherently cron-shaped. Nightly digests, weekly reports, monthly billing runs. These belong on a scheduler or workflow engine from the start.
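Here is a minimal sketch of those saga primitives in Inngest syntax, assuming an inngest client as in the earlier snippets. The event names, the three-day timeout, and the notifyApprover/remindApprover helpers are all hypothetical:

```typescript
export const approvalFlow = inngest.createFunction(
  { id: "draft-approval" },
  { event: "app/draft.submitted" },
  async ({ event, step }) => {
    await step.run("notify-approver", () =>
      notifyApprover(event.data.draftId) // hypothetical helper
    );

    // Durable pause: no server process stays alive while we wait.
    const approval = await step.waitForEvent("wait-for-approval", {
      event: "app/draft.approved",
      timeout: "3d",         // give up after three days
      match: "data.draftId", // correlate the approval with this draft
    });

    if (!approval) {
      // Timed out: nudge instead of failing silently.
      await step.run("send-reminder", () =>
        remindApprover(event.data.draftId) // hypothetical helper
      );
    }
  }
);
```

There is no await-inline equivalent of a three-day pause; this is the clearest case where L1 is simply not an option.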
The distinguishing question: can the user or admin tolerate a longer response time on this path? If yes, await-inline is safe. If no, you need a queue from day one. According to the Vercel documentation on function duration, the Pro plan supports maxDuration up to 300 seconds (5 minutes). If your work fits within 5 minutes and the caller can wait, L1 is sufficient.
How do you compare workflow engines when you are ready to migrate?
When your application hits the 3-condition signal, here is the comparison that matters. I evaluated these options for a Vercel/Next.js/Supabase stack:
| Criteria | Inngest | Temporal | Step Functions | Supabase Queues |
|---|---|---|---|---|
| New infrastructure | None (single API route) | Worker process required | AWS account required | None (existing DB) |
| Time to first function | 45 minutes | 3-5 days | 1-2 days | 1 day |
| Cost at startup scale | $0 (free tier: 50K steps/mo) | ~$200/mo + hosting | Pay-per-transition | $0 |
| Multi-step durability | Yes (step.run) | Yes (activities) | Yes (state machines) | No (hand-roll) |
| Local development | npx inngest-cli dev | Local server (complex) | None (push to AWS) | Real DB locally |
| Lock-in risk | Medium (TS portable) | Medium (open source) | High (AWS-only ASL) | Low |
| Best for | Small TS team on Vercel | Platform eng team at scale | AWS-native enterprise | Single-step <30s jobs |
For a TypeScript team on Vercel with no dedicated platform engineer, Inngest is the structural match. It adds zero new infrastructure (one Next.js API route at /api/inngest), the functions are async TypeScript you already know how to write, and the free tier covers startup-scale usage by an order of magnitude. Temporal is technically superior for complex orchestration, but the worker process requirement and $200/month minimum make it wrong for most early-stage serverless applications. I have discussed similar build-vs-adopt tradeoffs for AI infrastructure in earlier posts.
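"Zero new infrastructure" is concrete: the entire wiring on Vercel is one route file. A sketch, assuming the client and function from the earlier snippets live under lib/ (the paths are illustrative):

```typescript
// app/api/inngest/route.ts -- the single API route Inngest needs on Vercel
import { serve } from "inngest/next";
import { inngest } from "@/lib/inngest";
import { generateAssetsFn } from "@/lib/functions";

// Inngest calls this endpoint to discover and invoke your functions.
export const { GET, POST, PUT } = serve({
  client: inngest,
  functions: [generateAssetsFn],
});
```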
What are the three questions to ask before any infrastructure adoption?
Beyond workflow engines, this heuristic generalizes. Every time you are evaluating a new infrastructure dependency, ask three questions in order:
- Does it match the deployment topology you have already committed to? If you are on Vercel and the tool requires a long-running worker process, the answer is no, regardless of how technically superior the tool is. The operational tax of running parallel infrastructure topologies wipes out feature advantages.
- Does it match the cognitive bandwidth your team has? A solo founder building with AI agents has different bandwidth than a 10-person backend team. The tool should feel like "writing the same code I already write, with one new wrapper" rather than "learning a new paradigm."
- What is the lock-in cost if you are wrong? Business logic should be portable. Only the wiring should be vendor-specific. If your function code lives inside step.run callbacks, it is plain TypeScript that can be extracted and called from anywhere.
When comparing two tools that both use the word "workflow" in their marketing but solve different problems, read the homepage example code. If the example moves data from S3 to Snowflake, it is data orchestration (L4). If it handles a Stripe webhook, it is application orchestration (L3). The example shape tells you more than the feature list.
The meta-rule: When your instinct says "let me rebuild this the right way while I am here," the right move is usually to fix the symptom tactically now and schedule the refactor for when you have more data. The migration from await-inline to a workflow engine is 20-30 minutes. The risk of premature migration -- invisible job failures in a system you cannot yet monitor -- is unbounded.
Frequently Asked Questions
Is await-inline just technical debt that you will have to pay back later?
No. Await-inline is a valid production pattern for admin-facing paths and single-handler triggers. The code you write with await-inline is structurally identical to what you would write inside a workflow engine handler -- the business logic does not change, only the call site. When you do migrate, you replace one line (await someFn() becomes await inngest.send(...)). That is 20-30 minutes of work, not a refactor.
What if my serverless function times out at 300 seconds?
If your work consistently exceeds 5 minutes, you have hit the legitimate boundary of L1. Break the work into stages: the first stage does the fast work inline and fires an event to a workflow engine for the slow stages. This is the hybrid pattern -- L1 for the request path, L3 for the heavy lifting. You do not need to migrate the entire route; only the slow steps need the workflow engine.
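A sketch of that hybrid shape in a Next.js route handler; persistUpload is an illustrative helper, and the inngest client is assumed from the earlier snippets:

```typescript
export const maxDuration = 60; // the inline stage stays well under the limit

export async function POST(req: Request) {
  const { returnId } = await req.json();

  // Stage 1 (fast, inline): validate and persist while the caller waits.
  await persistUpload(returnId);

  // Stages 2+ (slow): hand off to the workflow engine and return now.
  await inngest.send({ name: "app/assets.requested", data: { returnId } });

  return Response.json({ success: true, queued: true });
}
```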
Does this heuristic apply to non-Vercel serverless platforms?
Yes, with adjusted thresholds. AWS Lambda supports up to 15 minutes (900 seconds) of execution time. Cloudflare Workers have a 30-second limit on the free plan. The principle is the same: if your work fits within the platform's duration limit and the caller can tolerate the wait, await-inline is the simplest correct solution. The 3-condition migration signal applies regardless of platform.
How do you monitor await-inline functions for failures?
The same way you monitor any serverless function: structured logging, error tracking (Sentry or equivalent), and database state assertions. After the function completes, verify the expected state change occurred (database row updated, email sent, file created). If the function throws, the serverless platform captures the error in its logs. This is simpler to monitor than a workflow engine because there is no separate dashboard, no event bus, and no dead-letter queue to check.
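A sketch of the state-assertion idea, assuming a Supabase client and an email_log table that records sends (both assumptions, as in the earlier snippets):

```typescript
await sendNotificationEmail(returnId);

// Assert the expected side effect actually landed before reporting success;
// a throw here surfaces in the platform logs / Sentry instead of vanishing.
const { data: logged } = await supabase
  .from("email_log")
  .select("id")
  .eq("return_id", returnId)
  .maybeSingle();

if (!logged) {
  throw new Error(`no email_log row recorded for return ${returnId}`);
}
```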
When should I skip this heuristic entirely and adopt a workflow engine on day one?
When your application's core value proposition depends on durable execution -- payment processing pipelines, multi-day approval workflows, IoT event processing, or any system where "the job ran exactly once and completed" is a compliance requirement, not a nice-to-have. For these use cases, the workflow engine is not infrastructure overhead; it is a product requirement.
Published April 11, 2026. Part of a series on AI agent infrastructure and serverless architecture patterns. Based on production experience fixing three fire-and-forget bugs in a single evening and the migration decision that followed.
Dinesh Challa is an AI Product Manager building production software with Claude Code. Follow him on LinkedIn.