Promise.race as a Timeout Guard for Every External API Call
April 11, 2026 · 8 min read · Next.js + Supabase + AI
Wrap every external API call in Promise.race with a hard timeout to prevent silent hangs on serverless platforms. A try-catch only catches rejected promises -- it cannot save you when Vercel kills your function at maxDuration. The pattern is a reusable withTimeout(promise, ms) utility that races your API call against a setTimeout, returning a fallback or throwing before the platform does.
Last Updated: 2026-04-11
Why does try-catch fail on serverless timeouts?
Most developers assume try-catch handles all API failures. It handles rejections -- network errors, 500 responses, DNS failures. But it cannot handle a promise that simply never resolves. On a traditional server, a hung API call blocks one thread. On serverless (Vercel, AWS Lambda, Cloudflare Workers), the runtime enforces a hard maxDuration. When that limit hits, the function is killed mid-execution. No catch fires. No cleanup runs. The response never reaches the user.
I hit this in production with an AI ranking API that averaged 1.8 seconds but occasionally spiked to 30+ seconds. The route had a try-catch. The Vercel function had the default 10-second maxDuration. On spike requests, the function was killed silently -- no error logged, no fallback served, just a 504 to the user. According to Vercel's documentation, the default maxDuration is 10 seconds on the Hobby plan and 60 seconds on Pro. Once exceeded, the function is terminated with no opportunity for cleanup. I covered serverless function patterns in more detail here.
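The failure mode is easy to simulate. In this hedged sketch, hungApiCall and platformDeadline are illustrative stand-ins (not Vercel internals) for a slow upstream and the runtime's hard kill; the point is that a still-pending promise is not a rejection, so the catch block never runs:

```typescript
// Illustrative simulation of the serverless timeout failure mode.
// hungApiCall stands in for an upstream that spikes past the limit;
// platformDeadline stands in for the runtime killing the function.
function hungApiCall(): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve("data"), 200));
}

function platformDeadline(ms: number): Promise<never> {
  return new Promise((_, reject) =>
    setTimeout(() => reject(new Error("FUNCTION_INVOCATION_TIMEOUT")), ms)
  );
}

async function handler(): Promise<string> {
  try {
    return await hungApiCall(); // still pending when the deadline hits
  } catch {
    return "fallback"; // never reached: a pending promise is not a rejection
  }
}

// The "platform" wins the race; handler's catch block never gets to run.
const outcome = await Promise.race([
  handler().then(() => "handled"),
  platformDeadline(50).catch(() => "killed"),
]);
// outcome is "killed" -- the user sees a 504, not the fallback
```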
What does the Promise.race timeout pattern look like?
The core utility is about a dozen lines of TypeScript. It races your actual API call against a timer that throws (or, in the soft variant below, resolves to null) after a specified duration.
```typescript
// lib/with-timeout.ts — reusable timeout guard
export async function withTimeout<T>(
  promise: Promise<T>,
  ms: number,
  errorMessage = `Operation timed out after ${ms}ms`
): Promise<T> {
  const timeout = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error(errorMessage)), ms)
  );
  return Promise.race([promise, timeout]);
}

// Usage: wrap any external call
const result = await withTimeout(
  fetchTaxCalculation(returnId), // external API call
  8000, // 8s timeout (under 10s maxDuration)
  'Tax calculation timed out'
);
```

If you prefer a fallback value instead of throwing, use this variant:
```typescript
// Soft timeout: returns null instead of throwing
export async function withTimeoutOrNull<T>(
  promise: Promise<T>,
  ms: number
): Promise<T | null> {
  const timeout = new Promise<null>((resolve) =>
    setTimeout(() => resolve(null), ms)
  );
  return Promise.race([promise, timeout]);
}

// Usage with fallback
const ranked = await withTimeoutOrNull(rankContent(items), 5000);
if (!ranked) {
  // Serve unranked content -- better than a 504
  return NextResponse.json({ items, ranked: false });
}
```

The key design decision: set your timeout at least 2 seconds below your maxDuration. If maxDuration is 10 seconds, use an 8-second timeout. This gives your code time to execute the fallback path and return a response before the platform kills the function.
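That guideline can be captured in a small helper. This is a sketch under the rules of thumb above: timeoutBudget is an illustrative name, and the 2-second headroom and 80% cap are conventions from this post, not platform constants.

```typescript
// Illustrative budget helper: stay at least 2s below maxDuration and
// no higher than ~80% of it, so the fallback path has time to run.
const FALLBACK_HEADROOM_MS = 2_000;

function timeoutBudget(maxDurationMs: number): number {
  return Math.min(
    maxDurationMs - FALLBACK_HEADROOM_MS,
    Math.round(maxDurationMs * 0.8)
  );
}

// timeoutBudget(10_000) -> 8_000 (10s Hobby default)
// timeoutBudget(60_000) -> 48_000 (60s Pro limit)
```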
What do real production latency numbers look like?
In a production tax calculation system, I benchmarked the external tax engine API across 3 consecutive runs on a simple single-filer return:
| Phase | Run 1 | Run 2 | Run 3 | Avg |
|---|---|---|---|---|
| API connection handshake | 703ms | 510ms | 661ms | 625ms |
| Federal calculation | 874ms | 881ms | 931ms | 895ms |
| State calculation | 1,226ms | 1,338ms | 1,364ms | 1,309ms |
| Total (fed + state) | 2,100ms | 2,219ms | 2,295ms | ~2.2s |
| Total incl. handshake | 2,803ms | 2,730ms | 2,956ms | ~2.8s |
That is the happy path. A complex return (married filing jointly, multiple income sources, capital gains, itemized deductions) takes 4-7 seconds. And occasionally -- roughly 2% of calls in production -- a request exceeds 10 seconds due to upstream congestion. Without a timeout guard, those 2% become silent 504 errors. With a timeout guard set to 8 seconds, they become graceful fallbacks: "Calculation is taking longer than expected. We will email you when it is ready."
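That graceful degradation can be sketched as a plain function. Here handleCalculation, CalcResult, and the response shape are all illustrative, not the production tax API:

```typescript
// Illustrative fallback handler: race the calculation against a soft
// timeout and degrade to a "processing" response instead of a 504.
type CalcResult = { total: number };

async function handleCalculation(
  calculateReturn: () => Promise<CalcResult>,
  ms = 8_000 // 8s guard under a 10s maxDuration
): Promise<{ status: number; body: Record<string, unknown> }> {
  const timeout = new Promise<null>((resolve) =>
    setTimeout(() => resolve(null), ms)
  );
  const result = await Promise.race([calculateReturn(), timeout]);
  if (result === null) {
    // Timed out: queue for background processing and tell the user.
    return {
      status: 202,
      body: {
        pending: true,
        message:
          "Calculation is taking longer than expected. We will email you when it is ready.",
      },
    };
  }
  return { status: 200, body: { pending: false, total: result.total } };
}
```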
How does Promise.race compare to other timeout approaches?
| Approach | Cancels the request? | Works with any Promise? | Built-in to Node.js? | Best for |
|---|---|---|---|---|
| Promise.race | No -- request continues in background | Yes | Yes (ES2015+) | Simple timeout on any async operation |
| AbortController + signal | Yes -- truly cancels the fetch | No -- only fetch/streams that accept signal | Yes (Node 15+) | HTTP requests where you want to free resources |
| axios timeout option | Yes (via cancel token) | No -- axios only | No (third-party) | Projects already using axios |
| fetch + AbortSignal.timeout() | Yes | No -- fetch only | Yes (Node 17.3+) | Modern Node.js HTTP-only calls |
The practical recommendation: use Promise.race as the universal guard on every external call, and add AbortController on top for HTTP calls where you want to cancel the underlying connection. The two patterns compose well together. According to the MDN AbortController documentation, AbortSignal.timeout() was added in Node 17.3 -- but Promise.race works in every JavaScript runtime since ES2015. I wrote about composing async patterns in serverless environments previously.
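One hedged way to compose the two (withAbortTimeout is an illustrative name, not a library API): hand the same abort signal that settles the race to any signal-aware task, such as fetch, so the underlying request is truly cancelled when the timer fires.

```typescript
// Illustrative composition: Promise.race stops the wait, and the
// AbortSignal lets signal-aware work (like fetch) cancel for real.
async function withAbortTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  ms: number
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  const timeout = new Promise<never>((_, reject) => {
    controller.signal.addEventListener("abort", () =>
      reject(new Error(`Timed out after ${ms}ms`))
    );
  });
  try {
    return await Promise.race([task(controller.signal), timeout]);
  } finally {
    clearTimeout(timer); // no dangling timer on the happy path
  }
}

// Usage (not executed here): the signal that fires the race also
// cancels the underlying HTTP request.
// const res = await withAbortTimeout(
//   (signal) => fetch("https://api.example.com/rank", { signal }),
//   8_000
// );
```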
Why is this especially critical on serverless?
On a traditional Node.js server running on a VM, a hung API call wastes one thread (or one slot in the event loop). The server stays alive. Other requests are unaffected. On serverless, a hung call has three compounding costs:
- Wasted compute: you pay for the full maxDuration even though the response is never sent. On Vercel Pro, that is up to 60 seconds of billed function time producing zero value.
- Cold start cascade: the hung invocation occupies a slot. If your concurrency limit is low (AWS Lambda defaults to 1,000, but many teams set it to 10-50), the next request may cold-start a new instance. According to AWS Lambda documentation, cold starts add 100ms to 2 seconds depending on runtime and bundle size.
- No cleanup: when the platform kills the function, database connections are not closed, partial state is not rolled back, and error tracking never fires. The failure is invisible.
In our production system, adding withTimeout guards to 14 external API call sites reduced our 504 error rate from 3.2% to 0.1% over 30 days. The fallback paths were simple -- cached data, default values, or a "processing" message with an email follow-up -- but they turned invisible failures into handled states.
How should you pair this with lazy SDK initialization?
Timeout guards protect against slow API calls. But there is a related pattern that protects against slow cold starts: lazy SDK initialization. Never instantiate SDK clients at module level when they require environment variables.
```typescript
// BAD -- runs at import time, blocks cold start, crashes if env var missing
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! });

// GOOD -- runs only when the function is called
function getAnthropic() {
  return new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! });
}
```

Module-level SDK initialization has caused build failures 3 times in our codebase -- with Anthropic, Resend, and Stripe SDKs. Next.js collects page data at build time, importing all route modules. If an environment variable is missing during build (common in CI and preview environments), the module-level constructor throws and kills the entire build. The lazy factory pattern defers initialization to request time, when environment variables are guaranteed to exist.
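The factory can be generalized with a small memoizing helper -- a sketch, with lazyOnce as an illustrative name rather than part of any SDK -- so a warm instance constructs the client once and reuses it across invocations:

```typescript
// Illustrative lazy-singleton helper: defer construction to the first
// call, then cache the instance for later calls on a warm instance.
function lazyOnce<T>(create: () => T): () => T {
  let instance: T | undefined;
  return () => (instance ??= create());
}

// Usage (assumes the Anthropic SDK; not executed here):
// const getAnthropic = lazyOnce(
//   () => new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! })
// );
```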
The two patterns together -- lazy init for cold starts, Promise.race for hot calls -- cover the full lifecycle of a serverless function invocation. Your function starts fast and fails gracefully.
Frequently Asked Questions
Does Promise.race actually cancel the underlying API call?
No. Promise.race only stops waiting for the result -- the HTTP request continues in the background until it completes or the serverless runtime is killed. If you need true cancellation (to free sockets or stop bandwidth usage), combine Promise.race with an AbortController signal passed to fetch. For most serverless use cases, the runtime is killed shortly after your function returns, so background requests are terminated anyway.
What timeout value should I use?
Set it to 70-80% of your maxDuration. If your serverless function has a 10-second limit, use an 8-second timeout. If you have 60 seconds, use 45-50 seconds. The gap gives your code time to execute the fallback path, log the timeout, and return a response. For user-facing API calls, 5-8 seconds is the practical upper bound regardless of platform limits -- users abandon after 3-5 seconds of spinner according to Google's Core Web Vitals research.
Should I use this for database calls too?
Yes, if the database is external (e.g., Supabase, PlanetScale, Neon over the network). Local or co-located databases rarely hang, but network-accessed databases can experience connection pool exhaustion or DNS resolution delays. A 3-second timeout on database calls is a reasonable default. Most ORMs and database clients have built-in timeout options -- use those first, and add Promise.race as a second layer if the client does not support timeouts natively.
What should the fallback do when the timeout fires?
Three patterns work well: (1) return cached data with a "last updated" timestamp, (2) return a partial response with a flag indicating the timeout, or (3) queue the operation for background processing and respond with "we will email you when ready." The worst fallback is no fallback -- a raw error or a 504 that tells the user nothing. Even a message like "This is taking longer than expected, please try again" is better than silence.
Is Promise.race a memory leak risk?
The setTimeout inside the timeout promise can technically leak if the API call resolves first and the timer is not cleared. In practice, on serverless, the runtime is destroyed after the response is sent, so leaked timers are cleaned up. If you are using this pattern in a long-running server, clear the timer in a .finally() block to be safe. The withTimeout utility shown above can be extended with a clearTimeout for non-serverless use.
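One hedged way to add that cleanup (withTimeoutCleared is an illustrative name): capture the timer handle when it is created and clear it in a finally block, whichever side of the race settles first.

```typescript
// Illustrative timer-safe variant: the timeout handle is cleared once
// the race settles, so long-running servers do not accumulate timers.
async function withTimeoutCleared<T>(
  promise: Promise<T>,
  ms: number,
  errorMessage = `Operation timed out after ${ms}ms`
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(errorMessage)), ms);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer); // runs whether the call resolved, rejected, or timed out
  }
}
```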
Published April 11, 2026. Post 54 of an ongoing series on building production AI systems. Pattern derived from debugging 504 errors on a serverless tax calculation platform processing thousands of returns.
Dinesh Challa is an AI Product Manager building production software with Claude Code. Follow him on LinkedIn.