SDK Init With Env Vars: Module-Level Is a Timebomb
Module-level SDK init with env vars breaks builds, leaks production keys to staging, and crashes tests. The lazy-init fix in 10 lines of TypeScript.
Module-Level SDK Init With Env Vars Is a Staging Timebomb -- Lazy-Init Instead
April 11, 2026 · 7 min read · Next.js + Supabase + AI
Last Updated: 2026-04-11
Initializing SDK clients at module scope -- const client = new SomeSDK(process.env.API_KEY) -- executes at import time, before environment variables may be loaded, during builds, or in test environments where credentials do not exist. This causes build failures, staging-to-production key leaks, and test crashes. The fix is a lazy factory function that defers initialization to first use: 10 lines of TypeScript that eliminate an entire class of deployment failures.
What is the module-level SDK init antipattern?
This code looks perfectly reasonable:
// lib/anthropic.ts
import Anthropic from "@anthropic-ai/sdk";
// Runs the instant this file is imported -- not when you call it
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
export { anthropic };
The problem: this line executes at import time, not at call time. In a Next.js application, import time is not when you think it is. Next.js collects page data at build time by importing all route modules. If the environment variable is missing during build -- which is the default in CI, preview deployments, and most Docker build stages -- this line throws and kills the entire build. I have hit this exact failure with three different SDKs (Anthropic, Resend, and Stripe) across five separate incidents in production deployments.
According to the Next.js deployment documentation, runtime environment variables are only available during request handling, not during the build phase. Module-level initialization ignores this distinction entirely.
What are the three ways this pattern fails?
Failure 1: Build-time crashes
The SDK validates the API key at construction time. During next build in CI, process.env.ANTHROPIC_API_KEY is undefined. The SDK throws:
// Build output:
Error: Missing required API key for Anthropic client
at new Anthropic (/node_modules/@anthropic-ai/sdk/src/index.ts:42)
at Object.<anonymous> (lib/anthropic.ts:3)
// Build fails. Deploy blocked. Every developer on the team gets a Slack alert.
// Time to debug: 15-45 minutes (because the error points to the SDK internals,
// not your code)
This happened three times in one month on our project. Each time the error message pointed into SDK internals, not to the problematic import. Average time to identify the root cause: 25 minutes per incident.
Failure 2: Staging uses production keys
Module-level init captures the environment variable value at the moment the module is first imported. In serverless environments like Vercel, if the module is imported before environment overrides are applied, the client may silently use the wrong credentials. I have seen staging environments make API calls against production services because the SDK client was initialized before the staging override took effect. No error. No warning. Just staging data in production.
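The capture-at-import behavior is easy to reproduce in isolation. The sketch below is illustrative only: `FakeClient` is a hypothetical stand-in for an SDK client, and the plain `env` object stands in for `process.env` so the ordering is visible in a single file without real credentials.

```typescript
// Hypothetical stand-in for an SDK client: it just remembers the key it was given.
class FakeClient {
  constructor(readonly apiKey: string | undefined) {}
}

// Stand-in for process.env, so the example runs without real credentials.
const env: Record<string, string | undefined> = { API_KEY: "prod_key" };

// Eager (module-level) init: captures whatever value exists right now.
const eagerClient = new FakeClient(env.API_KEY);

// Lazy init: reads the value only when the client is first requested.
let lazyClient: FakeClient | null = null;
function getClient(): FakeClient {
  if (!lazyClient) {
    lazyClient = new FakeClient(env.API_KEY);
  }
  return lazyClient;
}

// The staging override is applied after the module has already been evaluated.
env.API_KEY = "staging_key";

const eagerKey = eagerClient.apiKey; // the value captured at construction
const lazyKey = getClient().apiKey;  // the value present at first use
```

The eager client keeps the key it saw when it was constructed, while the lazy getter picks up the override because it does not read the variable until someone asks for the client.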
Failure 3: Tests crash on import
Test files that import any module which transitively imports the SDK file will throw before a single test runs. The error is not "test failed" but "module failed to load." Jest and Vitest report this as a suite that failed to run, not as a failing test, making it harder to diagnose.
// my-feature.test.ts
import { processDocument } from "../lib/processor";
// processor.ts imports anthropic.ts at the top
// anthropic.ts throws because ANTHROPIC_API_KEY is not set in test env
// Result: the runner reports a failed suite with the SDK's error, not a failed assertion
// The test file never even loads
What is the lazy-init fix?
Replace the module-level constant with a factory function. The entire fix is 10 lines:
// lib/anthropic.ts -- BEFORE (crashes at import time)
import Anthropic from "@anthropic-ai/sdk";
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
export { anthropic };
// lib/anthropic.ts -- AFTER (initializes on first use)
import Anthropic from "@anthropic-ai/sdk";
let _client: Anthropic | null = null;
export function getAnthropic(): Anthropic {
  if (!_client) {
    const apiKey = process.env.ANTHROPIC_API_KEY;
    if (!apiKey) throw new Error("ANTHROPIC_API_KEY is not set");
    _client = new Anthropic({ apiKey });
  }
  return _client;
}
// Usage: getAnthropic().messages.create({ ... })
Key properties of this pattern:
- Near-zero cost at import time. The module exports a function, not a live client. No client is constructed during next build.
- Singleton behavior. The client is created once and reused. No performance penalty on subsequent calls.
- Explicit error at the call site. If the env var is missing, the error occurs where you actually use the client, with a clear message you wrote, not an SDK stack trace.
- Test-friendly. Tests that do not call getAnthropic() never trigger initialization. Tests that do can mock the function directly.
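When several SDK clients need the same treatment, the getter can be factored into a small generic helper. This is a sketch under stated assumptions, not part of any SDK: `lazySingleton`, its `reset` method, and `FakeSdk` are hypothetical names introduced here for illustration. The `reset` hook is one possible way to give tests a clean slate between cases.

```typescript
// Generic lazy-singleton helper. `lazySingleton` and `reset` are illustrative names,
// not part of any SDK.
function lazySingleton<T>(create: () => T): { get: () => T; reset: () => void } {
  let instance: T | undefined;
  let created = false;
  return {
    get(): T {
      if (!created) {
        // Runs on first use only. If create() throws, `created` stays false,
        // so the next call retries instead of caching a broken state.
        instance = create();
        created = true;
      }
      return instance as T;
    },
    reset(): void {
      // Intended for tests: forget the cached instance.
      instance = undefined;
      created = false;
    },
  };
}

// Hypothetical stand-in for an SDK client.
class FakeSdk {
  constructor(readonly apiKey: string) {}
}

let constructions = 0;
const sdk = lazySingleton(() => {
  constructions += 1;
  return new FakeSdk("test_key");
});

const countBeforeFirstUse = constructions; // nothing has been constructed yet
const first = sdk.get();
const second = sdk.get();                  // same instance as `first`
sdk.reset();
const third = sdk.get();                   // a fresh instance after reset
```

In real code the callback would read the environment variable and construct the actual client, for example `lazySingleton(() => new Anthropic({ apiKey: ... }))`.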
How do the four initialization patterns compare?
| Pattern | Module-Level Init | Lazy Singleton | Factory Function | Dependency Injection |
|---|---|---|---|---|
| When client is created | Import time | First call | Every call | Caller decides |
| Build-safe | No | Yes | Yes | Yes |
| Env var timing | Must exist at import | Must exist at first use | Must exist at each use | Managed externally |
| Performance | One instance (good) | One instance (good) | New instance per call (bad for stateful SDKs) | One instance (good) |
| Testability | Hard (side effect on import) | Medium (mock the getter) | Medium (mock the factory) | Easy (pass mock directly) |
| Complexity | 1 line | 8-10 lines | 3-5 lines | 20+ lines (container setup) |
| Best for | Pure config, no validation | Most SDK clients | Clients needing fresh config | Large apps with DI framework |
The lazy singleton is the right default for 90% of cases. Factory functions make sense when the client configuration changes between calls (e.g., per-tenant API keys). Dependency injection is appropriate only when you already have a DI container. Module-level init is appropriate only for pure configuration objects that perform no validation and make no network calls.
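For contrast with the lazy singleton, here is a minimal sketch of the other two build-safe columns in the table: a factory function that builds a client per call, and dependency injection done by plain parameter passing rather than a container. `TextClient`, `FakeTenantClient`, `createClientForTenant`, and `summarize` are hypothetical names used only for this illustration.

```typescript
// Hypothetical client shape with a single method.
interface TextClient {
  complete(prompt: string): string;
}

// Hypothetical stand-in for a real SDK client.
class FakeTenantClient implements TextClient {
  constructor(readonly apiKey: string) {}
  complete(prompt: string): string {
    return `[${this.apiKey}] ${prompt}`;
  }
}

// Factory function: a new client on every call, configured with the caller's key.
function createClientForTenant(tenantApiKey: string): TextClient {
  const apiKey = tenantApiKey.trim();
  if (!apiKey) throw new Error("tenant API key is empty");
  return new FakeTenantClient(apiKey);
}

// Dependency injection without a container: the caller passes the client in.
function summarize(client: TextClient, text: string): string {
  return client.complete(`Summarize: ${text}`);
}

const tenantA = createClientForTenant("key_a");
const tenantB = createClientForTenant("key_b");
const summaryA = summarize(tenantA, "hello");

// In a test, any object with the same shape can be injected.
const mockClient: TextClient = { complete: () => "mocked" };
const mocked = summarize(mockClient, "hello");
```

Neither function touches an environment variable at import time, which is why both columns are marked build-safe.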
Why does Vercel make this worse?
Vercel adds a compounding problem: environment variables pasted via the dashboard or piped in with echo can silently acquire a trailing \n character (echo appends a newline unless you use echo -n or printf). This means process.env.API_KEY is actually "sk_live_abc123\n" -- and the SDK sends that literal newline in the HTTP header. The API rejects it with a 401 or 400, and the error message says "invalid API key" when the key is correct.
With module-level init, you cannot catch this at the call site. The client is already constructed with the mangled key. With lazy init, you can add a .trim() or validation step inside the factory:
export function getAnthropic(): Anthropic {
  if (!_client) {
    const apiKey = process.env.ANTHROPIC_API_KEY?.trim(); // strips trailing \n
    if (!apiKey) throw new Error("ANTHROPIC_API_KEY is not set");
    _client = new Anthropic({ apiKey });
  }
  return _client;
}
According to Vercel's own environment variables documentation, values are stored as-is with no sanitization. The .trim() in the factory is your defense. This single line prevented three separate auth failures in our production deployment over a two-month period. Integration failures like these are the real last mile problem.
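If several factories need the same defense, the trim-and-validate step can live in one small helper. This is a sketch, not an established API: `readRequiredEnv` and the `Env` type are hypothetical names, and the environment is passed in as a parameter (you would pass `process.env` in real code) so the behavior can be checked without touching real configuration.

```typescript
type Env = Record<string, string | undefined>;

// Reads a required variable, trimming surrounding whitespace (including a trailing "\n").
// `readRequiredEnv` is a hypothetical helper name; pass process.env as `env` in real code.
function readRequiredEnv(name: string, env: Env): string {
  const value = env[name]?.trim();
  if (!value) {
    // Covers undefined, empty, and whitespace-only values.
    throw new Error(`${name} is not set`);
  }
  return value;
}

const sampleEnv: Env = {
  API_KEY: "sk_live_abc123\n", // value with a stray trailing newline
  EMPTY_KEY: "   ",            // whitespace only, treated as missing
};

const cleaned = readRequiredEnv("API_KEY", sampleEnv);

let missingMessage = "";
try {
  readRequiredEnv("EMPTY_KEY", sampleEnv);
} catch (err) {
  missingMessage = (err as Error).message;
}
```

Treating a whitespace-only value as missing is a design choice in this sketch; it turns a confusing "invalid API key" response into an explicit configuration error.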
When is module-level init actually fine?
Not every module-level initialization is dangerous. The pattern is safe when all three conditions are true:
- No validation. The constructor does not validate the input or throw on invalid values.
- No network calls. The constructor does not make HTTP requests, open connections, or ping a health endpoint.
- No secrets. The value is a public configuration constant, not a credential that varies between environments.
// SAFE at module level -- pure config object, no validation, no secrets
const config = {
  maxRetries: 3,
  timeout: 30_000,
  region: "us-east-1",
};
// UNSAFE at module level -- validates key, may make network call
const stripe = new Stripe(process.env.STRIPE_SECRET_KEY);
// UNSAFE at module level -- throws if key is missing
const resend = new Resend(process.env.RESEND_API_KEY);
// SAFE at module level -- Supabase client with public (anon) key
const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!
);
// Note: NEXT_PUBLIC_ vars are inlined at build time by Next.js,
// so they ARE available during build. Server-only vars are not.
The distinction is subtle but critical: NEXT_PUBLIC_ prefixed variables are statically replaced at build time, so they are available at import time as long as they were set in the build environment. Server-only variables (without the prefix) are runtime-only. Module-level init with server-only variables is the timebomb.
Quick audit: Search your codebase for new.*process\.env\. at the top level of any file in app/api/ or lib/. Every match that does not use a NEXT_PUBLIC_ variable is a candidate for lazy init. In our codebase, this search found 5 files that needed the fix. Each one had caused at least one build failure.
How do you audit and migrate an existing codebase?
The migration from module-level to lazy init is mechanical. Here is the process I used across 5 SDK clients:
- Find all module-level SDK inits: grep -rn "^const.*new.*process\.env" app/ lib/
- For each match: wrap in a getter function with the singleton pattern shown above.
- Update all call sites: replace anthropic.messages.create() with getAnthropic().messages.create().
- Add .trim() to every process.env read inside the factory (Vercel newline defense).
- Test the build: run next build without the env vars set. If it passes, the migration is complete.
Total time for 5 files: 45 minutes. Build failures prevented per month after migration: 2-3. The ROI is immediate.
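The grep in the first step cannot easily skip NEXT_PUBLIC_ variables or catch `export const` declarations. One possible refinement is a small scanner like the sketch below. It is a heuristic under stated assumptions: "top level" is approximated as "no leading indentation", and `findModuleLevelInits`, `MODULE_LEVEL_INIT`, and the `Widget` class in the sample are hypothetical names, not tools from the original migration.

```typescript
// Flags unindented declarations that construct something from a non-public env var.
// Heuristic only: "top level" is approximated as "no leading indentation".
const MODULE_LEVEL_INIT =
  /^(export\s+)?(const|let|var)\s.*\bnew\b.*process\.env\.(?!NEXT_PUBLIC_)/;

// Returns the 1-based line numbers in `source` that match the heuristic.
function findModuleLevelInits(source: string): number[] {
  return source
    .split("\n")
    .map((line, index) => (MODULE_LEVEL_INIT.test(line) ? index + 1 : 0))
    .filter((lineNumber) => lineNumber > 0);
}

// Sample file contents used to exercise the scanner.
const sample = [
  'import Stripe from "stripe";',
  "const stripe = new Stripe(process.env.STRIPE_SECRET_KEY);",
  "export const resend = new Resend(process.env.RESEND_API_KEY);",
  "const pub = new Widget(process.env.NEXT_PUBLIC_WIDGET_ID);",
  "function getClient() {",
  "  const c = new Stripe(process.env.STRIPE_SECRET_KEY);",
  "  return c;",
  "}",
].join("\n");

// Lines 2 and 3 are module-level inits with server-only variables.
// Line 4 uses a NEXT_PUBLIC_ variable and line 6 is inside a function.
const flagged = findModuleLevelInits(sample);
```

A line-based regex will miss multi-line constructor calls, so treat the result as a starting list for manual review rather than a complete audit.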
Frequently Asked Questions
Does lazy init add latency to the first API call?
Yes, but the overhead is negligible -- typically under 1ms for client construction. The SDK constructor allocates memory and stores configuration; the expensive work (HTTP connection, TLS handshake) happens on the first actual API call regardless of when you construct the client. In serverless environments, the cold start overhead of the function itself (50-500ms) dwarfs the client construction time.
Is this pattern specific to Next.js?
The build-time crash is Next.js-specific because Next.js imports route modules during next build. However, the staging-key-leak and test-crash problems affect any JavaScript application. Nuxt, Remix, SvelteKit, and plain Express apps all benefit from lazy init when deployed to serverless platforms where environment variables are injected at runtime, not at build time.
Should I use a dependency injection framework instead?
For most TypeScript applications with fewer than 20 injectable services, a DI framework adds complexity without proportional benefit. The lazy singleton pattern provides the same build safety and testability with zero additional dependencies. Consider DI (InversifyJS, tsyringe) only when you have a large service graph with complex interdependencies.
What about the Supabase client -- should that be lazy too?
It depends on which client. The browser client using NEXT_PUBLIC_ variables is safe at module level because those values are inlined at build time. The server client using SUPABASE_SERVICE_ROLE_KEY (a server-only secret) should use lazy init. The rule is simple: if the variable lacks the NEXT_PUBLIC_ prefix, use lazy init.
Does this pattern work with edge functions?
Yes. Edge functions on Vercel and Cloudflare Workers have the same runtime-only environment variable behavior. Lazy init is actually more important in edge functions because the runtime is more constrained and build-time errors surface as deployment failures with less descriptive error messages.
Dinesh Challa is an AI Product Manager building production software with Claude Code. Follow him on LinkedIn.
Published April 11, 2026. Part of a series on production engineering patterns for AI-native applications.