The Lock-Column Pattern: Preventing Races When Async Workers Fight Over the Same Row
April 11, 2026 · 8 min read · Next.js + Supabase + AI
The lock-column pattern prevents database race conditions in serverless environments by adding a column that records which writer last updated a row and its priority level. Before any write, the caller checks whether its source priority is high enough to overwrite the current value. Higher-priority sources (CPA manual edit) can overwrite lower ones (calculation engine), but not vice versa. No deadlocks, no shared transaction context required, and it works across disconnected serverless function invocations.
What problem does the lock-column pattern solve?
In any system with multiple async writers targeting the same database row, you get a coordination problem. An extraction pipeline writes estimated numbers. A calculation engine overwrites them with computed values. A human expert manually corrects the numbers. Then the extraction pipeline runs again on a new document and silently overwrites the human expert's corrections.
This is not a theoretical concern. In a production tax-tech application I built, we discovered that a nightly cron job was silently nulling manually entered financial data on every run. The cron job called a buildReturnIntelligence() function that initialized five estimated fields to null and never populated them. Every night at 02:30 UTC, it wiped whatever the calculation engine or human reviewers had written. The same function also ran on every document upload, compounding the damage.
The result: four stacked bugs silently corrupting financial estimates across 45 returns in production. We discovered the problem only when a reviewer asked why their manually entered numbers kept disappearing. I wrote about fire-and-forget patterns causing similar silent failures in serverless environments.
Why don't traditional locking approaches work in serverless?
The classic database answer to concurrent writes is row-level locking: SELECT ... FOR UPDATE, advisory locks, or optimistic locking with version columns. Each has a fundamental limitation in serverless architectures.
Why does SELECT FOR UPDATE fail in serverless?
SELECT FOR UPDATE requires all competing writers to share a transaction context. In a monolithic server with a connection pool, this works. In serverless, each function invocation gets its own short-lived database connection. A Vercel serverless function processing a document upload has no way to hold a lock for another function invocation that might fire 30 seconds later from a webhook. According to the PostgreSQL documentation on explicit locking, row-level locks are held only until the end of the current transaction. Serverless functions typically run in auto-commit mode or very short transactions, making sustained locks impractical.
Why does optimistic locking add complexity without solving priority?
Optimistic locking (a version column incremented on every write, with a WHERE version = expected_version clause) prevents lost updates but does not encode which writer should win. If the extraction pipeline and a human reviewer both read version 3, one will succeed and one will get a conflict error. But the system has no way to know that the human reviewer's write should always win. You end up building retry logic on top of version checks, which gets complex fast.
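For contrast, here is a minimal optimistic-locking sketch with supabase-js, assuming a hypothetical version column on the same tax_returns table (our schema does not actually use one). Notice that it can detect a conflict but cannot rank the writers:

import { SupabaseClient } from '@supabase/supabase-js';

// Optimistic-locking sketch: the version filter prevents lost updates,
// but a conflict says nothing about which writer should win.
// The version column here is hypothetical -- illustration only.
async function updateWithVersionCheck(
  supabase: SupabaseClient,
  returnId: string,
  expectedVersion: number,
  patch: Record<string, unknown>
): Promise<{ conflicted: boolean }> {
  const { data, error } = await supabase
    .from('tax_returns')
    .update({ ...patch, version: expectedVersion + 1 })
    .eq('id', returnId)
    .eq('version', expectedVersion) // matches zero rows if another writer bumped it first
    .select();

  if (error) throw error;
  // Zero rows updated means we lost the race. Retry? Abort?
  // The version number alone cannot answer that.
  return { conflicted: (data?.length ?? 0) === 0 };
}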
Why are advisory locks overkill for row-level writer priority?
PostgreSQL advisory locks (pg_advisory_lock) are session-level or transaction-level. They require the caller to acquire and release the lock explicitly. In a serverless environment where functions can be killed mid-execution (Vercel terminates containers after maxDuration), advisory locks risk being held indefinitely. The Supabase documentation on advisory locks specifically warns about this in pooled connection scenarios.
How does the lock-column pattern work?
The lock-column pattern replaces runtime locking with a persistent priority marker. The core idea: add a column to the row that records who last wrote the data and use that value as a priority gate for future writes.
What does the SQL migration look like?
-- Migration: add numbers_source lock column
-- The CHECK constraint limits values to known writer identities
ALTER TABLE tax_returns
  ADD COLUMN IF NOT EXISTS numbers_source text
  CHECK (numbers_source IN ('calculation_engine', 'cpa_draft'));

-- Backfill existing rows that already have CPA-reviewed numbers
-- DISTINCT ON picks the preferred extraction source per return
UPDATE tax_returns tr
SET numbers_source = 'cpa_draft'
FROM (
  SELECT DISTINCT ON (de.return_id)
    de.return_id
  FROM document_extractions de
  JOIN documents d ON d.id = de.document_id
  WHERE d.doc_category = 'cpa_draft_1040'
    AND de.status = 'done'
  ORDER BY de.return_id,
    CASE de.provider
      WHEN 'azure-cu' THEN 1       -- prefer CU over DI
      WHEN 'azure-doc-intel' THEN 2
    END
) sub
WHERE tr.id = sub.return_id;

The CHECK constraint is important. It prevents any writer from setting an unknown source value, catching integration bugs at the database level rather than in application code. In production, this constraint caught two bugs in the first week: a test helper that was writing 'manual' and an old code path that was writing 'estimated'.
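For illustration, here is how that rejection surfaces in application code with supabase-js (the helper is hypothetical). Postgres reports a CHECK violation as SQLSTATE 23514, which appears in the error's code field:

import { SupabaseClient } from '@supabase/supabase-js';

// Hypothetical helper: attempt a write and surface CHECK violations loudly.
// Postgres SQLSTATE 23514 = check_violation (e.g., an unknown numbers_source).
async function persistNumbersSource(
  supabase: SupabaseClient,
  returnId: string,
  source: string
): Promise<void> {
  const { error } = await supabase
    .from('tax_returns')
    .update({ numbers_source: source })
    .eq('id', returnId);

  if (error?.code === '23514') {
    // A writer sent a value the CHECK constraint does not allow --
    // an integration bug, not a transient failure. Fail loudly.
    throw new Error(`Unknown numbers_source '${source}' rejected by CHECK constraint`);
  }
  if (error) throw error;
}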
What does the TypeScript check-before-write logic look like?
// lib/tax/numbers-source.ts
// Priority order: higher number = higher authority
const SOURCE_PRIORITY: Record<string, number> = {
  calculation_engine: 1, // automated calculation
  cpa_draft: 2,          // human-reviewed CPA draft
};

export function canWriteNumbers(
  currentSource: string | null,
  incomingSource: string
): boolean {
  // If no current source, any writer can claim the row
  if (!currentSource) return true;
  const currentPriority = SOURCE_PRIORITY[currentSource] ?? 0;
  const incomingPriority = SOURCE_PRIORITY[incomingSource] ?? 0;
  // Equal or higher priority can overwrite
  return incomingPriority >= currentPriority;
}

// Usage in an API route:
// ---- Input: currentSource = 'cpa_draft', incomingSource = 'calculation_engine'
// ---- Output: false (calculation engine cannot overwrite CPA draft)
// ---- Input: currentSource = 'calculation_engine', incomingSource = 'cpa_draft'
// ---- Output: true (CPA draft can overwrite calculation engine)
// ---- Input: currentSource = null, incomingSource = 'calculation_engine'
// ---- Output: true (no current source, anyone can write)

How does the write path use this check?
// app/api/returns/[id]/persist/route.ts -- the persist route handler
import { NextResponse } from 'next/server';
import { canWriteNumbers } from '@/lib/tax/numbers-source';
import { supabase } from '@/lib/supabase'; // server-side client (path may differ in your app)

export async function POST(req: Request, { params }: { params: { id: string } }) {
  const returnId = params.id;
  // Computed values arrive in the calculation engine's request body
  const { computedAgi, computedTax, computedRefund, computedRate } = await req.json();

  // 1. Read current lock state
  const { data: taxReturn } = await supabase
    .from('tax_returns')
    .select('numbers_source')
    .eq('id', returnId)
    .single();

  // 2. Check priority before writing
  if (!canWriteNumbers(taxReturn?.numbers_source ?? null, 'calculation_engine')) {
    return NextResponse.json(
      { error: 'Row locked by higher-priority source', current_source: taxReturn?.numbers_source },
      { status: 409 } // Conflict -- caller knows why the write was rejected
    );
  }

  // 3. Write with source claim (see the limitations section on the TOCTOU window)
  const { error } = await supabase
    .from('tax_returns')
    .update({
      estimated_agi: computedAgi,
      estimated_total_tax: computedTax,
      estimated_refund: computedRefund,
      effective_tax_rate: computedRate, // stored as percentage: 27.2, not 0.272
      numbers_source: 'calculation_engine',
      numbers_updated_at: new Date().toISOString(),
    })
    .eq('id', returnId);

  if (error) {
    return NextResponse.json({ error: error.message }, { status: 500 });
  }

  // 4. Return success with source provenance
  return NextResponse.json({ success: true, numbers_source: 'calculation_engine' });
}

The 409 status code is intentional. It tells the caller exactly why the write failed and what currently holds the lock, enabling clean error handling in the UI. In our case, the UI hides the "Get Estimate" CTA entirely when numbers_source === 'cpa_draft' because the CPA's numbers are always authoritative. This pattern of status-aware UI components is something I covered in a previous post on building trust through interface design.
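As a sketch of that status-aware pattern (component and prop names are hypothetical), the CTA simply never renders when the lock column says a CPA owns the numbers:

// Hypothetical React component: hide the CTA when a higher-priority
// source holds the lock, instead of letting the user trigger a 409.
function EstimateCta({ numbersSource }: { numbersSource: string | null }) {
  if (numbersSource === 'cpa_draft') return null; // CPA numbers are authoritative
  return <button type="button">Get Estimate</button>;
}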
How does the lock-column pattern compare to other approaches?
| Feature | Lock Column | SELECT FOR UPDATE | Optimistic Locking (version) | Advisory Locks |
|---|---|---|---|---|
| Works across disconnected invocations | Yes | No (requires shared transaction) | Yes | No (session/transaction scoped) |
| Encodes writer priority | Yes (core feature) | No | No | No |
| Deadlock risk | None | Yes (multi-row) | None | Yes (if mismanaged) |
| Prevents lost updates | Priority-based (intentional) | Yes (serialized) | Yes (conflict detection) | Yes (serialized) |
| Serverless-safe | Yes | No | Yes | Risky (container termination) |
| Retry complexity | None (409 is final) | Low (retry transaction) | High (re-read, re-check, retry) | Medium (re-acquire lock) |
| Audit trail | Built-in (column value = provenance) | None | Version number only | None |
| Best for | Multiple async writers with clear priority hierarchy | Short-lived transactions in monoliths | Concurrent edits where all writers are equal | Application-level mutexes |
When should you use the lock-column pattern?
The lock-column pattern is the right choice when three conditions are met:
- Multiple async writers target the same row and they do not share a transaction context. This is the default in serverless: each function invocation is isolated.
- Writers have a clear priority hierarchy. Human review beats automated calculation. Automated calculation beats raw extraction. If all writers are equal (like multiple users editing a shared document), optimistic locking with a version column is a better fit.
- Silent overwrites are worse than rejected writes. The lock column does not queue writes or retry them. It rejects lower-priority writes with a 409. If your system needs every write to eventually succeed, you need a queue, not a lock column.
The rule of thumb: If you find yourself building a cron job to "fix" data that keeps getting overwritten, you have a lock-column problem. The cron job treats the symptom. The lock column treats the cause. In our case, deleting the nightly rebuild cron and adding the lock column eliminated an entire class of data corruption bugs.
What are the operational results of this pattern?
After deploying the lock-column pattern across 45 production returns with a backfill migration, three things happened:
- Silent overwrites dropped to zero. Before the lock column, the nightly cron was nulling human-reviewed numbers every night. After: zero overwrites of CPA-reviewed data by automated processes across 3 weeks of monitoring.
- We deleted 340 lines of "fixing" code. The cron job that rebuilt intelligence nightly (02:30 UTC), the fire-and-forget calculation engine calls on every document upload, and three manual SQL scripts used to patch corrupted data -- all deleted. The lock column made them unnecessary.
- The 409 response caught two integration bugs in the first week. A test helper writing an unknown source value and a deprecated code path both hit the CHECK constraint or the priority gate, surfacing bugs that would previously have caused silent data corruption.
The effective tax rate storage is worth noting as a concrete example of the kind of bug this pattern prevents. Before the lock column, the calculation engine stored rates as fractions (0.272) while the dashboard rendered them as percentages. A 27.2% effective rate displayed as "0.3%". The lock column forced us to standardize: all writers now store percentages (27.2), capped between 0.1 and 50, and the column is only writable through the priority-checked path. Data format consistency across writers is a recurring theme in production AI systems.
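As a hedged sketch of that standardization rule (the helper and its fraction heuristic are mine, not the production code; the 0.1-50 clamp bounds come from the paragraph above):

// Hypothetical normalizer: every writer stores effective tax rate as a
// percentage (27.2, not 0.272), clamped to the 0.1-50 range.
function normalizeEffectiveRate(rate: number): number {
  // Heuristic: treat sub-1 values as fractions left over from the old format
  const pct = rate < 1 ? rate * 100 : rate;
  return Math.min(50, Math.max(0.1, pct));
}

// normalizeEffectiveRate(0.272) === 27.2
// normalizeEffectiveRate(27.2)  === 27.2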
What are the limitations?
The lock-column pattern is not a silver bullet. Three limitations to know:
- No atomicity guarantee between read and write. The check-then-write pattern has a TOCTOU (time-of-check-time-of-use) window. Two calculation engine invocations could both read numbers_source = null and both proceed to write. In practice, this is a non-issue for systems where writers run sequentially (document upload triggers extraction, which triggers calculation). For truly concurrent equal-priority writers, add a WHERE numbers_source IS NULL OR numbers_source = 'calculation_engine' clause to the UPDATE statement to make the check atomic (see the sketch after this list).
- Priority changes require a migration. Adding a new writer or changing the priority order means updating the CHECK constraint and the TypeScript priority map. This is intentional friction -- changing who can overwrite whom should be a deliberate, reviewed decision.
- Not suitable for collaborative editing. If multiple users need to edit the same row simultaneously (Google Docs-style), you need CRDTs or operational transforms, not a lock column. The lock column is for system-to-system coordination, not user-to-user collaboration.
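Here is a minimal sketch of that atomic variant with supabase-js, under the schema from earlier (the helper name is mine). Because the condition travels with the UPDATE, Postgres evaluates the check and the write as one statement and the TOCTOU window disappears:

import { SupabaseClient } from '@supabase/supabase-js';

// Atomic variant: the priority check rides along as a filter on the UPDATE,
// so Postgres evaluates check and write in a single statement.
async function claimRowAtomically(supabase: SupabaseClient, returnId: string): Promise<boolean> {
  const { data, error } = await supabase
    .from('tax_returns')
    .update({ numbers_source: 'calculation_engine' }) // real code would include the computed fields too
    .eq('id', returnId)
    // Only claim rows that are unclaimed or already ours
    .or('numbers_source.is.null,numbers_source.eq.calculation_engine')
    .select();

  if (error) throw error;
  // Zero rows back means a higher-priority source holds the lock.
  return (data?.length ?? 0) > 0;
}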
Frequently Asked Questions
Can the lock-column pattern work with Supabase Row Level Security?
Yes. The lock column is a regular Postgres column, so RLS policies apply normally. You can add a policy like USING (numbers_source IS DISTINCT FROM 'cpa_draft' OR auth.role() = 'service_role') to prevent even authenticated API calls from overwriting CPA-reviewed data. The application-level priority check and the RLS policy act as defense in depth.
How is this different from a state machine?
A state machine controls transitions between states (draft, review, approved). The lock-column pattern controls which writer can modify specific fields within any state. They are complementary: you might have a return in "review" state where both the calculation engine and a CPA can write, but the CPA's write takes priority. The lock column handles the priority; the state machine handles the workflow.
Does this pattern scale to more than two writers?
Yes. The priority map is a simple numeric ranking. We started with two sources (calculation_engine at priority 1, cpa_draft at priority 2) and could add a third (e.g., irs_confirmed at priority 3) by adding one line to the CHECK constraint and one entry to the priority map. The pattern scales linearly with the number of writers.
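For illustration, the TypeScript side of that change is one new map entry (irs_confirmed is the hypothetical third source from the answer above; the matching CHECK constraint change lands in a migration):

// Extended priority map: one new entry, highest number wins.
const SOURCE_PRIORITY: Record<string, number> = {
  calculation_engine: 1, // automated calculation
  cpa_draft: 2,          // human-reviewed CPA draft
  irs_confirmed: 3,      // hypothetical third writer: confirmed IRS figures
};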
What happens if you need to downgrade a row's source?
Admin-only. In our system, only a service-role database call can set numbers_source to a lower-priority value or NULL. This is a deliberate safety mechanism: if a CPA's draft is withdrawn, an admin explicitly unlocks the row. The application code cannot do this through normal API routes.
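A minimal sketch of that admin unlock, assuming a server-side script with the service-role key in environment variables (names are illustrative). This client bypasses RLS, which is exactly why it must never ship to the browser:

import { createClient } from '@supabase/supabase-js';

// Admin-only unlock: runs with the service-role key, server-side only.
const admin = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY! // bypasses RLS -- keep out of client bundles
);

async function unlockReturn(returnId: string): Promise<void> {
  const { error } = await admin
    .from('tax_returns')
    .update({ numbers_source: null }) // drop the lock back to unclaimed
    .eq('id', returnId);
  if (error) throw error;
}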
Should I use this pattern for every table with multiple writers?
No. Use it only when writers have a clear priority hierarchy and silent overwrites cause data corruption. For tables where the most recent write is always correct (like a user updating their profile), a simple updated_at timestamp is sufficient. The lock column adds value specifically when "most recent" is not the same as "most authoritative."
Published April 11, 2026. Part of a series on production database patterns for serverless applications built with Next.js, Supabase, and AI pipelines.
Dinesh Challa is an AI Product Manager building production software with Claude Code. Follow him on LinkedIn.