My Claude Code Operating System: 20 Agents, 8 Skills, and 11 Hooks for a Complete Dev Workflow
A complete daily workflow system built from Claude Code primitives: morning briefings, 4-reviewer code review, 7-stage deployment pipeline, domain-specific debugger agents, and audit logging.
The Problem: 195 Daily Interactions With No System
I was running 195+ Claude Code interactions per day — building a production SaaS app with AI agents, document extraction, payment processing, and 30 real users. But I was doing it chaotically: unnamed sessions I couldn't find later, manually checking deployments after every push, context windows degrading without me noticing, and 33 MCP servers burning tokens before I typed a word.
So I built an operating system. Not an app — a daily workflow system built entirely from Claude Code's native features: 20 custom agents, 8 slash command skills, 11 hooks, 5 shell aliases, and structured audit logging. Here's the complete guide to using it.
The Architecture: What's Running
20 Custom Agents → specialized AI workers for every domain
8 Skills → one-command workflows that chain agents
11 Hooks → deterministic automation on every action
5 Shell Aliases → terminal shortcuts for named sessions
2 Log Files → audit trail + error tracking
1 Rotation Script → weekly log cleanup
11 MCP Servers → pruned from 33 (estimated 40-60% token savings)
Morning Routine: 3 Commands, 10 Minutes
Command 1: The Morning Briefing
ccmorning
This single alias opens a named Claude Code session and runs the /morning skill. That skill uses dynamic context injection (! backtick syntax) to pre-load your git state before the LLM even starts thinking:
- Branch: !`git branch --show-current`
- Last staging deploy: !`git log --oneline origin/staging -1`
- Staging ahead of prod by: !`git rev-list --count origin/main..origin/staging` commits
- Nightly QA: !`gh run list --workflow=nightly-qa.yml --limit=1`
Then it spawns 3 agents in parallel:
- @client-pulse (background) — queries pipeline distribution, stuck returns, unpaid invoices, missed calls
- @nightly-qa-runner (background) — checks GitHub Actions results, correlates failures with recent commits
- Sentry + Vercel check (foreground) — unresolved errors, latest deployments, 500 error count
Output: a unified brief with a "Recommended First Task" based on everything it found.
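The fan-out and merge step can be pictured with a small sketch. This is not how the /morning skill is implemented (the real checks are Claude Code agents and MCP queries); the three stub functions, their findings, and the "first finding wins" recommendation rule are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical stand-ins for the three checks. The findings are made up.
def client_pulse() -> list[str]:
    return ["2 returns stuck in review"]

def nightly_qa() -> list[str]:
    return []

def sentry_vercel() -> list[str]:
    return ["1 unresolved Sentry error"]

def build_brief(checks: dict[str, Callable[[], list[str]]]) -> str:
    """Run every check concurrently and merge the findings into one brief."""
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        futures = {name: pool.submit(fn) for name, fn in checks.items()}
        findings = {name: fut.result() for name, fut in futures.items()}

    lines = []
    for name, items in findings.items():
        lines.append(f"{name}: {'; '.join(items) if items else 'all clear'}")

    # Assumed rule: recommend the first finding, in the order checks were listed.
    first = next((item for items in findings.values() for item in items), None)
    lines.append(f"Recommended First Task: {first or 'none, start planned work'}")
    return "\n".join(lines)
```

Calling `build_brief({"client-pulse": client_pulse, "nightly-qa": nightly_qa, "sentry-vercel": sentry_vercel})` returns one line per check followed by the recommendation line.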
Command 2: Fix Whatever the Brief Found
The morning brief tells you what to do first. If a client is stuck:
/debug-client "Vikas Goyal"
This chains the @onboarding-debugger agent (full DB diagnosis) + Resend email delivery check + Sentry error search for that user. Outputs ready-to-execute SQL fixes.
Command 3: Deploy if Staging is Ahead
/deploy prod
Per-Task Workflow: Features, Bugs, Deploys
Starting a New Feature
Always start with a named session:
cc feature-tax-planning
The cc alias is a wrapper: claude -n "feature-tax-planning". Named sessions show up in /resume, so you can find them later.
Step 1: Interview before building.
/interview add tax planning page with estimated refund projections
This runs a 7-phase structured interview using Claude's AskUserQuestion tool, asking one question at a time rather than dumping a form:
Phase 1: Requirements (3-5 questions)
Phase 2: Data Model (2-4 questions)
Phase 3: Architecture (2-4 questions)
Phase 4: Edge Cases & Error States (3-5 questions)
Phase 5: Security & Privacy (2-3 questions)
Phase 6: Testing Strategy (2-3 questions)
Phase 7: UX & Design (2-3 questions)
Outputs a complete spec with Mermaid diagrams to docs/specs/feature-name.md.
Step 2: Review the plan before coding.
/review
Spawns 4 parallel review agents, each with domain-specific criteria:
| Reviewer | Checks |
|---|---|
| Security Engineer | SQL injection, XSS, auth bypass, IDOR, PII encryption, RLS policies |
| Backend Engineer | Error handling, maxDuration, type safety, N+1 queries, Supabase client choice |
| Frontend Engineer | React hooks rules, SSR safety, brand compliance, accessibility, dynamic imports |
| QA Engineer | Test coverage, regression risk, E2E selector changes, edge cases, contract test updates |
Verdict: APPROVE, CHANGES REQUESTED, or BLOCK. Any security blocker = hard block.
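The verdict rule is easy to state in code. The sketch below only illustrates the aggregation logic; the `Review` shape and the rule that non-security findings lead to CHANGES REQUESTED are assumptions, since the post only specifies that a security blocker is a hard block.

```python
from dataclasses import dataclass
from enum import Enum

class Verdict(Enum):
    APPROVE = "APPROVE"
    CHANGES_REQUESTED = "CHANGES REQUESTED"
    BLOCK = "BLOCK"

@dataclass
class Review:
    reviewer: str         # hypothetical labels: "security", "backend", "frontend", "qa"
    blockers: int = 0     # issues the reviewer considers must-fix
    suggestions: int = 0  # non-blocking feedback

def combine(reviews: list[Review]) -> Verdict:
    """Fold the four reviewer results into a single verdict."""
    # From the post: any security blocker is a hard block.
    if any(r.reviewer == "security" and r.blockers > 0 for r in reviews):
        return Verdict.BLOCK
    # Assumption: any other finding asks for changes instead of blocking.
    if any(r.blockers > 0 or r.suggestions > 0 for r in reviews):
        return Verdict.CHANGES_REQUESTED
    return Verdict.APPROVE
```

For example, `combine([Review("security", blockers=1), Review("qa")])` evaluates to `Verdict.BLOCK` regardless of what the other reviewers report.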
Step 3: Implement. Then /review again before committing.
Fixing a Bug
Pick the right debugger agent:
| Problem | Command |
|---|---|
| Email not arriving | @email-debugger user@email.com |
| Payment not processing | @stripe-debugger invoice_123 |
| Document not extracting | @extraction-debugger doc_456 |
| Client stuck in pipeline | /debug-client "client name" |
| Production incident | @incident-triage |
Each debugger has a multi-step investigation protocol. For example, @email-debugger traces through 5 layers: caller → email-gate (dedup) → email-brain (AI evaluation) → email-log (recording) → Resend (delivery). It checks trigger consistency, daily caps, and known failure patterns automatically.
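To make the layered trace concrete, here is a minimal sketch of "walk the layers in order, report the first one that fails". The layer names come from the description above; the per-layer check functions are hypothetical placeholders, and the real agent works through prompts and database queries, not Python callables.

```python
from typing import Callable, Optional

# Layer order taken from the description of @email-debugger.
LAYERS = ["caller", "email-gate", "email-brain", "email-log", "resend"]

def first_failing_layer(
    checks: dict[str, Callable[[str], bool]],
    recipient: str,
) -> Optional[str]:
    """Return the first layer whose check fails, or None if every layer passes."""
    for layer in LAYERS:
        if not checks[layer](recipient):
            return layer
    return None
```

Stopping at the first failing layer keeps the diagnosis focused: a message rejected by the dedup gate never reaches Resend, so checking delivery status for it would only add noise.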
Deploying
/deploy
This is a 7-stage pipeline skill, the most advanced thing I built:
STAGE 1: Pre-flight — branch, author, uncommitted changes
STAGE 2: tsc + @pre-push-audit (parallel)
STAGE 3: Push to staging
STAGE 4: Monitor Vercel deployment (polls every 60s for 5 min)
STAGE 5: Post-deploy health check (Sentry + 500s + site load)
STAGE 6: "Promote to prod?" → push staging to main
STAGE 7: Spawn @deploy-monitor in background for 10 min
The skill uses dynamic context injection to pre-load your branch, staged files, and staging-vs-prod diff before the pipeline starts. If tsc finds type errors, it hard-stops. If Sentry finds new errors post-deploy, it warns you.
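Stage 4's "poll every 60s for 5 min" loop can be sketched as a bounded polling function. This is an illustration only: `get_status` is a hypothetical callable standing in for whatever the skill uses to query Vercel, and the status strings are assumed to follow Vercel's deployment states.

```python
import time
from typing import Callable

def wait_for_deploy(
    get_status: Callable[[], str],
    interval_s: float = 60,
    timeout_s: float = 300,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the deployment settles or the time budget runs out."""
    attempts = int(timeout_s // interval_s)  # 5 polls with the defaults
    for attempt in range(attempts):
        status = get_status()
        if status in ("READY", "ERROR"):  # assumed terminal states
            return status
        if attempt < attempts - 1:
            sleep(interval_s)  # no sleep after the final poll
    return "TIMEOUT"
```

Passing `sleep` as a parameter is a deliberate choice: it lets the loop be exercised with a fake sleeper instead of waiting five real minutes.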
The 11 Hooks: What Fires Automatically
These run on every action without you thinking about them:
PreToolUse (4 hooks — fire before commands execute)
| Hook | Trigger | Action |
|---|---|---|
| Safety check | Every Bash command | Blocks: force push, reset --hard, rm -rf, push main→main |
| Type check | git push * only | Runs tsc --noEmit, blocks push on errors |
| Branch isolation | git push * only | Blocks feature/* branches from pushing to staging/main |
| Staging blocker | git add * only | Blocks git add . — forces specific file staging |
Key optimization: Hooks #2-4 use the if field (v2.1.85) to filter by command pattern. Without this, all 4 hooks would spawn a process on every Bash command — ~90% of which aren't git operations. The if field eliminates those wasted process spawns.
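For a sense of what the safety-check hook might look like, here is a minimal sketch. It assumes the hook receives the tool call as JSON on stdin with `tool_name` and `tool_input.command` fields, and that exiting with code 2 blocks the call and feeds stderr back to Claude. The regular expressions are an illustrative guess at the rules in the table, not the actual patterns, and the "push main→main" rule is left out.

```python
import json
import re
import sys
from typing import Optional

# Illustrative patterns only; they are deliberately strict (for example,
# --force-with-lease is also treated as a force push).
BLOCKED = [
    (re.compile(r"\bgit\s+push\b.*(--force\b|\s-f\b)"), "force push"),
    (re.compile(r"\bgit\s+reset\s+--hard\b"), "git reset --hard"),
    (re.compile(r"\brm\s+-(?=\w*r)(?=\w*f)\w+"), "rm -rf"),
]

def reason_to_block(command: str) -> Optional[str]:
    """Return a short reason if the command matches a blocked pattern."""
    for pattern, reason in BLOCKED:
        if pattern.search(command):
            return reason
    return None

def decide(event: dict) -> int:
    """Map a hook event to an exit code: 0 allows, 2 blocks."""
    if event.get("tool_name") != "Bash":
        return 0
    command = event.get("tool_input", {}).get("command", "")
    reason = reason_to_block(command)
    if reason:
        print(f"Blocked by safety check: {reason}", file=sys.stderr)
        return 2
    return 0

def main() -> int:
    # Entry point when wired up as a hook: raise SystemExit(main())
    return decide(json.load(sys.stdin))
```

Keeping `decide` separate from the stdin handling means the rules can be checked against sample commands without running Claude Code at all.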
PostToolUse (3 hooks — fire after actions complete)
| Hook | Trigger | Action |
|---|---|---|
| Prettier | After Write/Edit | Auto-formats .ts/.tsx/.js/.css/.json |
| Scope check | After git commit | Warns if committing >10 files |
| Audit log | After every Bash | Logs {ts, cmd, exit} to ~/.claude/audit.jsonl |
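The audit-log hook boils down to appending one JSON object per line. The sketch below shows one way to build the {ts, cmd, exit} record. It assumes the PostToolUse event carries the command under `tool_input.command`; the `exit_code` key inside `tool_response` is a guess and should be checked against the real payload.

```python
import json
import time
from pathlib import Path

def audit_record(event: dict, now: float) -> dict:
    """Build the {ts, cmd, exit} record for one Bash tool call."""
    command = event.get("tool_input", {}).get("command", "")
    response = event.get("tool_response")
    # Assumption: the exit code lives under "exit_code"; None if not present.
    exit_code = response.get("exit_code") if isinstance(response, dict) else None
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))
    return {"ts": stamp, "cmd": command, "exit": exit_code}

def format_line(record: dict) -> str:
    """Serialize one record as a single JSONL line."""
    return json.dumps(record) + "\n"

def append_record(record: dict, path: Path) -> None:
    """Append the record to the log, creating the directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(format_line(record))
```

In a hook script this would be called as `append_record(audit_record(event, time.time()), Path.home() / ".claude" / "audit.jsonl")`, which matches the log path named in the table.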
SessionStart (2 hooks)
| Hook | Trigger | Action |
|---|---|---|
| Context recovery | After /compact | Re-injects branch + last 5 commits + modified files |
| Directory guard | Session start in ~ | Blocks — forces cd to project directory |
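A context-recovery hook of this kind might look like the following sketch. It assumes that whatever a SessionStart hook prints to stdout is added to the session context; the output layout and the use of `git diff --name-only HEAD` for "modified files" are my own choices for illustration.

```python
import subprocess

def git(*args: str) -> str:
    """Run a git command and return its trimmed stdout (empty on failure)."""
    result = subprocess.run(["git", *args], capture_output=True, text=True, check=False)
    return result.stdout.strip()

def format_context(branch: str, commits: list[str], modified: list[str]) -> str:
    """Render branch, last 5 commits, and modified files as a short text block."""
    lines = [f"Branch: {branch or '(detached)'}", "Last commits:"]
    lines += [f"  {c}" for c in commits[:5]] or ["  (none)"]
    lines.append("Modified files:")
    lines += [f"  {m}" for m in modified] or ["  (none)"]
    return "\n".join(lines)

def collect_context() -> str:
    """Gather the current git state from the working directory."""
    branch = git("branch", "--show-current")
    commits = git("log", "--oneline", "-5").splitlines()
    modified = git("diff", "--name-only", "HEAD").splitlines()
    return format_context(branch, commits, modified)
```

A hook script would simply `print(collect_context())`, so the summary survives a compaction even when the conversation history that produced it does not.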
Stop + StopFailure (2 hooks)
| Hook | Trigger | Action |
|---|---|---|
| Done notification | Claude finishes | macOS notification: "Branch: staging \| 3 files changed" |
| Error notification | Rate limit / API error | macOS alert with sound + logs to stop-failures.log |
The 20 Agents: Your Specialized Workforce
Core Workflow (6 agents)
| Agent | Model | What It Does |
|---|---|---|
| @deploy-monitor | Sonnet | Post-push Vercel + Sentry monitoring (runs in background) |
| @migration-reviewer | Opus | Supabase migration safety: SECURITY DEFINER, FK policies, RLS |
| @scope-check | Sonnet | Pre-commit scope creep detection |
| @incident-triage | Opus | Production incident diagnosis: parallel Sentry + Vercel + code investigation |
| @pre-push-audit | Sonnet | Checks for console.logs, secrets, missing maxDuration, forbidden colors |
| @context-scout | Sonnet | Pre-feature codebase exploration, returns a 50-line brief of the area |
Operations (6 agents)
| Agent | Model | What It Does |
|---|---|---|
| @db-health | Opus | Weekly: RLS coverage, table sizes, missing indexes, orphans, triggers |
| @pr-preparer | Sonnet | Stage → commit → push → create PR in one step |
| @extraction-debugger | Opus | Debug document extraction: DB records + code + Langfuse prompts + storage |
| @client-pulse | Sonnet | Daily pipeline: stuck returns, unpaid invoices, missed calls |
| @blog-publisher | Sonnet | Draft → SEO optimize → format → publish to Ghost |
| @prompt-auditor | Opus | Langfuse vs code drift, anti-regression rules, token costs |
Debugging (4 agents)
| Agent | Model | What It Does |
|---|---|---|
| @nightly-qa-runner | Opus | Analyze QA failures, correlate with commits, check Sentry |
| @email-debugger | Opus | Trace email through gate → brain → log → Resend delivery |
| @stripe-debugger | Opus | Trace payments: webhook → invoice status → tax return status |
| @onboarding-debugger | Opus | Full client diagnosis: docs, tasks, meetings, invoices, emails |
Maintenance (4 agents)
| Agent | Model | What It Does |
|---|---|---|
| @perf-profiler | Sonnet | Bundle size, API latency, Lighthouse, pattern violations |
| @dependency-checker | Sonnet | Outdated deps, security advisories, blocked upgrades |
| @seo-auditor | Sonnet | Audit blog posts against keyword strategy |
| @release-notes | Sonnet | Categorized changelog between any two git refs |
Model selection strategy: Opus for anything that requires judgment (security, debugging, incidents). Sonnet for speed-sensitive operations (monitoring, deployment, maintenance scans).
MCP Server Pruning: 33 → 11
Every connected MCP server costs tokens on every message. Heavy setups (5+ servers) can burn 50,000+ tokens before your first prompt. I was running 33 servers.
After auditing with /context:
- Kept 11: GitHub, Vercel, Supabase, Resend, Stripe, Sentry, Context7, Sequential Thinking, Playwright, Linear, Ideogram
- Disabled 10: duplicate Sentry (two configs), OpenAI Image (overlap with Ideogram), Nano Banana (rarely used), Better Stack (80+ tools!), Azure, Applitools, TestSprite, BrowserStack, Pest, PostHog
Estimated 40-60% reduction in per-message token overhead. Better Stack alone was loading 80+ tool definitions into every conversation.
Mid-Session Commands You Should Memorize
| Situation | Command | Why |
|---|---|---|
| Quick question mid-feature | /btw is staging deployed? | Answers in overlay — never enters context |
| Session getting long | /compact Focus on the auth changes | Manual compaction BEFORE auto-degradation |
| Context feels bloated | /context | See token breakdown per component |
| Finished a sub-task | /clear | Reset before unrelated work |
| Need deep reasoning | Include "ultrathink" in prompt | Bumps effort to high for one turn |
| See hook output | Ctrl+O | Verbose mode — thinking + hooks visible |
| Big feature, need parallel | Ask Claude to create a team | Agent Teams splits work across instances |
Weekly Maintenance Schedule
| Day | What | Command | Time |
|---|---|---|---|
| Monday | Dependency audit | ccaudit | 2 min |
| Monday | DB health check | @db-health | 5 min |
| Wednesday | Prompt drift audit | @prompt-auditor | 5 min |
| Wednesday | Performance profile | @perf-profiler | 3 min |
| Friday | SEO audit | @seo-auditor | 5 min |
| Friday | Release notes | @release-notes | 2 min |
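The weekly log cleanup mentioned in the architecture list fits naturally into this schedule. The post does not show the rotation script, so the following is only a sketch of one possible approach: gzip the current log under a dated name, truncate it, and keep the most recent archives. The retention count of 4 and the archive naming scheme are assumptions.

```python
import gzip
import shutil
import time
from pathlib import Path
from typing import Optional

def archive_name(log: Path, now: float) -> Path:
    """Dated archive path next to the log, e.g. audit.jsonl.2025-01-06.gz."""
    stamp = time.strftime("%Y-%m-%d", time.gmtime(now))
    return log.with_name(f"{log.name}.{stamp}.gz")

def stale_archives(archives: list[Path], keep: int) -> list[Path]:
    """Return the archives to delete, keeping the `keep` newest."""
    # ISO date stamps sort chronologically as plain strings.
    ordered = sorted(archives, key=lambda p: p.name)
    return ordered[:-keep] if keep > 0 else ordered

def rotate(log: Path, keep: int = 4, now: Optional[float] = None) -> None:
    """Compress the current log, truncate it, and prune old archives."""
    if not log.exists() or log.stat().st_size == 0:
        return
    target = archive_name(log, time.time() if now is None else now)
    with log.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    log.write_text("")  # truncate after the archive is written
    for old in stale_archives(list(log.parent.glob(f"{log.name}.*.gz")), keep):
        old.unlink()
```

Run from cron or a launchd job, `rotate(Path.home() / ".claude" / "audit.jsonl")` would handle the audit log; the failure log would need its own call with whatever path it actually lives at.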
The Cheat Sheet
MORNING: ccmorning
NEW FEATURE: cc feature-name → /interview → build → /review → /deploy
BUG FIX: cc bugfix-name → @[debugger] → fix → /review → /deploy
DEPLOY: /deploy (or ccdeploy from terminal)
QUICK Q: /btw question
LONG SESSION: /compact Focus on X
CLIENT STUCK: /debug-client "name"
STANDUP: /standup
INCIDENT: cc incident → @incident-triage
WEEKLY: @dependency-checker @db-health @prompt-auditor @perf-profiler
What Changed After Building This
Before: 195 interactions/day, chaotic, manually checking everything, unnamed sessions I couldn't find, context degrading without me noticing.
After:
- Morning brief in one command instead of checking 4 dashboards
- Zero manual deployment monitoring — the /deploy skill handles the full pipeline
- 4-reviewer code review runs in parallel before every commit
- Every Bash command logged with structured JSON — full audit trail
- Rate limits and API errors trigger macOS notifications with sound
- 40-60% less token overhead from MCP pruning
- Named sessions I can find and resume
The key insight: Claude Code's hooks, agents, and skills aren't just features — they're primitives for building a personal development operating system. The agents handle domain expertise. The skills chain agents into workflows. The hooks enforce rules deterministically. And the aliases make it all muscle memory.
Start with ccmorning tomorrow. That one command replaces 15 minutes of manual checking.