From 9eddc5197925f1203498f371d197680c4547ff38 Mon Sep 17 00:00:00 2001
From: Alex Newman
Date: Thu, 6 Nov 2025 23:10:33 -0500
Subject: [PATCH] feat: Conduct comprehensive overhead analysis of worker
 service

- Added detailed documentation on performance issues within `worker-service.ts`.
- Identified high severity issues including unnecessary polling, artificial delays, and redundant database operations.
- Provided recommendations for immediate and long-term improvements to enhance performance and reduce complexity.
- Suggested architectural changes to replace polling with event-driven patterns and optimize database connection handling.
---
 docs/VIEWER.md                     |  405 ----
 docs/worker-refactor-plan.md       |  303 --------
 docs/worker-server-architecture.md | 1129 ----
 docs/worker-service-analysis.md    |  907 ----
 docs/worker-service-overhead.md    |  959 +++++++++++++++++++++++
 5 files changed, 959 insertions(+), 2744 deletions(-)
 delete mode 100644 docs/VIEWER.md
 delete mode 100644 docs/worker-refactor-plan.md
 delete mode 100644 docs/worker-server-architecture.md
 delete mode 100644 docs/worker-service-analysis.md
 create mode 100644 docs/worker-service-overhead.md

diff --git a/docs/VIEWER.md b/docs/VIEWER.md
deleted file mode 100644
index 182810b1..00000000
--- a/docs/VIEWER.md
+++ /dev/null
@@ -1,405 +0,0 @@
# Viewer UI - Web-Based Memory Stream Visualization

## Overview

The Claude-Mem Viewer UI is a production-ready web interface that provides real-time visualization of your memory stream. Access it at **http://localhost:37777** while the claude-mem worker is running.
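The viewer's feed combines two data sources: records pushed over the SSE stream and records fetched through paginated REST calls, deduplicated by ID before rendering. A minimal sketch of that merge step, assuming records carry `id` and `createdAt` fields and newest-first ordering (the real implementation lives in `src/ui/viewer/utils/data.ts` and may differ):

```typescript
interface FeedRecord {
  id: number;
  createdAt: number; // epoch milliseconds
}

// Merge real-time (SSE) records with paginated history.
// Real-time records win on ID collision; result is newest-first.
function mergeFeed<T extends FeedRecord>(realtime: T[], paginated: T[]): T[] {
  const seen = new Map<number, T>();
  for (const rec of [...realtime, ...paginated]) {
    if (!seen.has(rec.id)) seen.set(rec.id, rec);
  }
  return [...seen.values()].sort((a, b) => b.createdAt - a.createdAt);
}
```

Because the merge is pure, the Feed component can recompute it cheaply inside `useMemo` whenever either source changes.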
- -**Key Features:** -- 🔴 **Real-time Updates** - Server-Sent Events (SSE) stream new observations, sessions, and prompts instantly -- 📜 **Infinite Scroll** - Load historical data progressively with automatic pagination -- 🎯 **Project Filtering** - Focus on specific codebases with smart project selection -- 🎨 **Theme Toggle** - Light, dark, or system preference with persistent settings -- 💾 **Settings Persistence** - Sidebar state and project filters saved automatically -- 🔄 **Auto-Reconnection** - Exponential backoff ensures connection stability -- ⚡ **GPU Acceleration** - Smooth animations and transitions - -## Architecture - -### Technology Stack - -| Component | Technology | Purpose | -|-----------|-----------|---------| -| **Framework** | React + TypeScript | Component-based UI with type safety | -| **Build System** | esbuild | Self-contained HTML bundle (no separate assets) | -| **Real-time** | Server-Sent Events (SSE) | Push-based updates from worker service | -| **State Management** | React hooks | Local state with custom hooks for SSE, pagination, settings | -| **Styling** | Inline CSS | No external stylesheets, fully self-contained | -| **Typography** | Monaspace Radon | Embedded monospace font for code aesthetics | - -### File Structure - -``` -src/ui/viewer/ -├── App.tsx # Main application component -├── types.ts # TypeScript interfaces -├── components/ -│ ├── Header.tsx # Top navigation with logo and theme toggle -│ ├── Sidebar.tsx # Project filter and stats sidebar -│ ├── Feed.tsx # Main feed with infinite scroll -│ ├── ThemeToggle.tsx # Light/dark/system theme selector -│ └── cards/ -│ ├── ObservationCard.tsx # Displays individual observations -│ ├── SummaryCard.tsx # Displays session summaries -│ ├── PromptCard.tsx # Displays user prompts -│ └── SkeletonCard.tsx # Loading placeholder -├── hooks/ -│ ├── useSSE.ts # Server-Sent Events connection -│ ├── usePagination.ts # Infinite scroll logic -│ ├── useSettings.ts # Settings persistence -│ ├── 
useStats.ts # Database statistics -│ └── useTheme.ts # Theme management -└── utils/ - ├── constants.ts # Configuration constants - ├── data.ts # Data merging and deduplication - └── formatters.ts # Date/time formatting helpers -``` - -### Data Flow - -``` -┌─────────────────────────────────────────────────────────────┐ -│ Worker Service (port 37777) │ -│ - Express HTTP API │ -│ - SSE endpoint: /stream │ -│ - REST endpoints: /api/* │ -└─────────────────────────────────────────────────────────────┘ - ↓ -┌─────────────────────────────────────────────────────────────┐ -│ Viewer UI (React App) │ -│ - useSSE hook: Real-time stream │ -│ - usePagination hook: Historical data │ -│ - useSettings hook: Persistent preferences │ -└─────────────────────────────────────────────────────────────┘ - ↓ -┌─────────────────────────────────────────────────────────────┐ -│ Feed Component │ -│ - Merges real-time + paginated data │ -│ - Deduplicates by ID │ -│ - Filters by selected project │ -│ - Infinite scroll triggers pagination │ -└─────────────────────────────────────────────────────────────┘ -``` - -## Features In Detail - -### Real-Time Updates (SSE) - -The viewer uses Server-Sent Events to receive updates instantly: - -```typescript -// SSE message format -{ - "type": "observation" | "summary" | "prompt" | "projects" | "processing", - "data": { /* record data */ } -} -``` - -**Event Types:** -- `observation` - New observation created -- `summary` - Session summary generated -- `prompt` - User prompt captured -- `projects` - Project list updated -- `processing` - Session processing status changed - -**Connection Management:** -- Auto-reconnect on disconnect with exponential backoff -- Visual connection status indicator in header -- Graceful degradation if SSE unavailable - -### Infinite Scroll Pagination - -The feed loads historical data progressively: - -1. **Initial Load**: First 20 records loaded on mount -2. **Scroll Trigger**: When user scrolls to 80% of feed height -3. 
**Batch Load**: Next 20 records fetched via `/api/{type}?offset=X&limit=20` -4. **Deduplication**: Merges with real-time data, removes duplicates by ID -5. **Loading State**: Skeleton cards show while fetching - -**Performance:** -- Requests debounced to prevent spam -- Only visible when scrolled near bottom -- Continues until no more records available - -### Project Filtering - -Filter memory stream by specific projects: - -1. Projects extracted from observations, summaries, and prompts -2. Sidebar shows all unique project names with counts -3. Click project name to filter feed -4. Click "All Projects" to clear filter -5. Filter persisted to localStorage - -**Project Detection:** -- Extracted from `projectPath` or `project` field in records -- Basename of path used as project name -- Empty/null projects shown as "(No Project)" - -### Theme Toggle (v5.1.2) - -Three theme modes available: - -- **Light Mode**: Clean white background, dark text -- **Dark Mode**: Dark background, light text (default) -- **System**: Matches OS preference automatically - -**Implementation:** -```typescript -// Theme preference stored in localStorage -localStorage.setItem('theme-preference', 'light' | 'dark' | 'system'); - -// CSS variables updated dynamically -document.documentElement.setAttribute('data-theme', resolvedTheme); -``` - -**CSS Variables:** -```css -:root[data-theme="light"] { - --bg-primary: #ffffff; - --text-primary: #1f2937; - /* ... */ -} - -:root[data-theme="dark"] { - --bg-primary: #111827; - --text-primary: #f9fafb; - /* ... 
*/ -} -``` - -### Settings Persistence - -Settings automatically saved to worker service: - -**Saved Settings:** -- `sidebarOpen` - Sidebar expanded/collapsed state -- `selectedProject` - Current project filter -- `theme` - Theme preference (light/dark/system) - -**API Endpoints:** -- `GET /api/settings` - Retrieve saved settings -- `POST /api/settings` - Save settings (debounced 500ms) - -**Local Fallback:** -- If API unavailable, settings stored in localStorage -- Synced back to API when connection restored - -## Usage Guide - -### Opening the Viewer - -1. Ensure claude-mem worker is running (auto-starts with Claude Code) -2. Open browser to http://localhost:37777 -3. Viewer loads automatically with recent records - -### Navigating the Feed - -**Cards Displayed:** -- **Observation Cards** (blue accent) - Tool usage observations with title, narrative, concepts, files -- **Summary Cards** (green accent) - Session summaries with request, completion, learnings -- **Prompt Cards** (purple accent) - Raw user prompts with timestamp and project - -**Card Features:** -- Click to expand/collapse full details -- Type indicators (🔴 bugfix, 🟣 feature, 🔄 refactor, etc.) -- Concept tags (clickable for future filtering) -- File references with paths -- Timestamps in relative format ("2 hours ago") - -### Using Project Filters - -1. **Open Sidebar**: Click hamburger menu (☰) in top-left -2. **View Stats**: See total observations, sessions, prompts -3. **Select Project**: Click project name to filter -4. **View Counts**: Numbers show records per project -5. **Clear Filter**: Click "All Projects" to reset - -### Changing Theme - -1. **Open Theme Toggle**: Click theme icon in header -2. **Select Mode**: - - ☀️ Light mode - - 🌙 Dark mode - - 💻 System (follows OS) -3. **Auto-Save**: Preference saved immediately -4. 
**Smooth Transition**: CSS transitions between themes - -### Troubleshooting - -**Viewer Not Loading:** -```bash -# Check worker status -npm run worker:logs - -# Restart worker -npm run worker:restart - -# Check if port 37777 is available -lsof -i :37777 -``` - -**SSE Connection Issues:** -- Check browser console for connection errors -- Verify no proxy/firewall blocking EventSource -- Auto-reconnect attempts every 1-5s with exponential backoff - -**Theme Not Persisting:** -- Check localStorage: `localStorage.getItem('theme-preference')` -- Verify `/api/settings` endpoint responding -- Clear browser cache if stale - -**Infinite Scroll Not Triggering:** -- Scroll to 80% of feed height -- Check browser console for fetch errors -- Verify `/api/{type}` endpoints responding with data - -## Development - -### Building the Viewer - -```bash -# Build viewer UI -npm run build - -# Output: plugin/ui/viewer.html (self-contained) -``` - -### Adding New Features - -**Example: Add a new card component** - -1. Create component: -```typescript -// src/ui/viewer/components/cards/MyCard.tsx -export function MyCard({ data }: { data: MyData }) { - return ( -
    <div>
      <div>{data.title}</div>
      <div>{data.content}</div>
    </div>
  );
}
```

2. Add to Feed component:
```typescript
// src/ui/viewer/components/Feed.tsx
import { MyCard } from './cards/MyCard';

// In render:
{myData.map(item => <MyCard key={item.id} data={item} />)}
```

3. Rebuild:
```bash
npm run build
npm run sync-marketplace
npm run worker:restart
```

### Testing Changes

1. Make changes to `src/ui/viewer/`
2. Rebuild: `npm run build`
3. Restart worker: `npm run worker:restart`
4. Refresh browser (http://localhost:37777)
5. Check browser console for errors

## API Integration

The viewer consumes these worker service endpoints:

### Data Retrieval

```typescript
// Get paginated observations
GET /api/observations?offset=0&limit=20&project=myproject
Response: { observations: Observation[], hasMore: boolean }

// Get paginated summaries
GET /api/summaries?offset=0&limit=20&project=myproject
Response: { summaries: Summary[], hasMore: boolean }

// Get paginated prompts
GET /api/prompts?offset=0&limit=20&project=myproject
Response: { prompts: UserPrompt[], hasMore: boolean }

// Get database stats
GET /api/stats
Response: { totalObservations: number, totalSessions: number, ... }
```

### Real-Time Stream

```typescript
// Server-Sent Events stream
GET /stream

// Message format:
event: observation
data: {"type":"observation","data":{...}}

event: summary
data: {"type":"summary","data":{...}}
```

### Settings

```typescript
// Get settings
GET /api/settings
Response: { sidebarOpen: boolean, selectedProject: string, ... }

// Save settings
POST /api/settings
Body: { sidebarOpen: boolean, selectedProject: string, ...
} -Response: { success: boolean } -``` - -## Performance Considerations - -### Bundle Size -- Self-contained HTML: ~150KB (gzipped) -- No external dependencies loaded at runtime -- Monaspace Radon font embedded (subset) - -### Memory Management -- Virtualization: Only renders visible cards -- Deduplication: Prevents duplicate records in memory -- Cleanup: Old records beyond pagination limit pruned - -### Network Efficiency -- SSE: Single long-lived connection for real-time updates -- REST: Paginated requests (20 records per batch) -- Debouncing: Settings saves debounced 500ms - -### Rendering Performance -- React.memo: Cards memoized to prevent unnecessary re-renders -- useMemo: Data merging/filtering memoized -- CSS transitions: GPU-accelerated for smooth animations - -## Future Enhancements - -Potential features for future versions: - -- **Search**: Full-text search across observations, summaries, prompts -- **Export**: Download data as JSON, CSV, or markdown -- **Charts**: Visualize observation frequency, types, concepts over time -- **Keyboard Shortcuts**: Navigate feed, toggle sidebar, switch themes -- **Notifications**: Browser notifications for important observations -- **Dark/Light Auto-Schedule**: Auto-switch theme based on time of day -- **Custom Themes**: User-defined color schemes -- **Multi-Project Views**: Compare multiple projects side-by-side - -## Resources - -- **Source Code**: `src/ui/viewer/` -- **Built Output**: `plugin/ui/viewer.html` -- **Worker Service**: `src/services/worker-service.ts` -- **Build Script**: `scripts/build-viewer.js` -- **Documentation**: This file - ---- - -**Built with React + TypeScript** | **Powered by Server-Sent Events** | **Self-Contained HTML Bundle** diff --git a/docs/worker-refactor-plan.md b/docs/worker-refactor-plan.md deleted file mode 100644 index 3cd6f1f5..00000000 --- a/docs/worker-refactor-plan.md +++ /dev/null @@ -1,303 +0,0 @@ -# Worker Service Refactor Plan - -**Date**: 2025-11-06 -**Based on**: 
worker-service-analysis.md -**Branch**: cleanup/worker - ---- - -## Decisions Made - -### 🔥🔥🔥🔥🔥 Critical Fixes - -#### Issue #1: Fragile PM2 String Parsing -**Decision**: DELETE all PM2 status checking code -- Remove lines 54-98 in worker-utils.ts (PM2 list parsing) -- Replace with simple: health check → if unhealthy, restart → wait for health -- PM2 restart is idempotent - handles "not started" and "started but broken" -- Rationale: "Just ping localhost:37777" - if unhealthy, restart it - -#### Issue #2: Silent PM2 Error Handling -**Decision**: AUTOMATICALLY RESOLVED by Issue #1 -- Gets deleted with PM2 status checking code -- New approach naturally fails fast on execSync - -#### Issue #3: Session Auto-Creation Duplication -**Decision**: EXTRACT to helper method -- Create `private getOrCreateSession(sessionDbId): ActiveSession` -- Remove 60+ lines of duplicated code from: - - handleInit() (lines 663-733) - - handleObservation() (lines 754-785) - - handleSummarize() (lines 813-844) -- Rationale: DRY principle - -#### Issue #4: No "Running But Unhealthy" Handling -**Decision**: AUTOMATICALLY RESOLVED by Issue #1 -- New approach always restarts if unhealthy -- PM2 restart handles all cases - -#### Issue #5: Useless getWorkerPort() Wrapper -**Decision**: CREATE proper settings reader -- Delete the wrapper function -- Create settings reader that: - 1. Reads from `~/.claude-mem/settings.json` - 2. Falls back to `process.env.CLAUDE_MEM_WORKER_PORT` - 3. 
Falls back to `37777` -- Rationale: UI writes to `~/.claude-mem/settings.json`, worker/hooks must read from there - ---- - -### 🔥🔥🔥 Cleanup - -#### Issue #6: 1500ms Debounce Too Long -**Decision**: SKIP - not a concern - -#### Issue #7: Magic Numbers Throughout -**Decision**: DELETE unnecessary magic numbers, UNIFY required ones -- Remove hardcoded defaults that aren't needed -- Centralize remaining constants with named variables -- Locations: - - worker-utils.ts: timeout values (100ms, 1000ms, 10000ms) - - worker-service.ts: Line 997 (100ms), Line 109 ('50mb'), etc. - -#### Issue #8: Configuration Duplication -**Decision**: AUTOMATICALLY RESOLVED by Issue #7 -- Centralizing constants solves this - -#### Issue #9: Hardcoded Model Validation -**Decision**: AUTOMATICALLY RESOLVED by Issue #7 -- Delete hardcoded model list -- Let SDK handle validation - -#### Issue #10: Hardcoded Version Fallback -**Decision**: READ from package.json -- Line 343: Replace `'5.0.3'` with dynamic read from package.json -- Rationale: Why hardcode a version that gets stale? - -#### Issue #11: Unnecessary this.port Instance Variable -**Decision**: DELETE `this.port` -- worker-service.ts:100 - remove instance variable -- Replace all `this.port` uses with direct constant/settings reader -- Used at lines 351, 738, 742 - ---- - -## Implementation Plan - -### Phase 1: worker-utils.ts Complete Rewrite - -**File**: `src/shared/worker-utils.ts` - -**Changes**: -1. Create settings reader function: -```typescript -function getWorkerPort(): number { - try { - const settingsPath = join(homedir(), '.claude-mem', 'settings.json'); - if (existsSync(settingsPath)) { - const settings = JSON.parse(readFileSync(settingsPath, 'utf-8')); - const port = parseInt(settings.env?.CLAUDE_MEM_WORKER_PORT, 10); - if (!isNaN(port)) return port; - } - } catch {} - return parseInt(process.env.CLAUDE_MEM_WORKER_PORT || '37777', 10); -} -``` - -2. 
Add named constants:
```typescript
const HEALTH_CHECK_TIMEOUT_MS = 100;
const HEALTH_CHECK_POLL_INTERVAL_MS = 100;
const HEALTH_CHECK_MAX_WAIT_MS = 10000;
```

3. Simplify `ensureWorkerRunning()`:
```typescript
export async function ensureWorkerRunning(): Promise<void> {
  if (await isWorkerHealthy()) return;

  const packageRoot = getPackageRoot();
  const pm2Path = path.join(packageRoot, "node_modules", ".bin", "pm2");
  const ecosystemPath = path.join(packageRoot, "ecosystem.config.cjs");

  execSync(`"${pm2Path}" restart "${ecosystemPath}"`, {
    cwd: packageRoot,
    stdio: 'pipe'
  });

  if (!await waitForWorkerHealth()) {
    throw new Error("Worker failed to become healthy after restart");
  }
}
```

4. Update `isWorkerHealthy()` and `waitForWorkerHealth()` to use constants

**Result**: ~50 lines (vs 110 original), all bugs fixed

---

### Phase 2: worker-service.ts Cleanup

**File**: `src/services/worker-service.ts`

**Changes**:

1. **Read version from package.json** (line 343):
```typescript
import { readFileSync } from 'fs';
import { join, dirname } from 'path';
import { fileURLToPath } from 'url';

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);
const packageJson = JSON.parse(readFileSync(join(__dirname, '../../package.json'), 'utf-8'));
const VERSION = packageJson.version;
```

2. 
**Extract getOrCreateSession() helper**:
```typescript
private getOrCreateSession(sessionDbId: number): ActiveSession {
  let session = this.sessions.get(sessionDbId);
  if (session) return session;

  const db = new SessionStore();
  const dbSession = db.getSessionById(sessionDbId);
  if (!dbSession) {
    db.close();
    throw new Error(`Session ${sessionDbId} not found in database`);
  }

  session = {
    sessionDbId,
    claudeSessionId: dbSession.claude_session_id,
    sdkSessionId: null,
    project: dbSession.project,
    userPrompt: dbSession.user_prompt,
    pendingMessages: [],
    abortController: new AbortController(),
    generatorPromise: null,
    lastPromptNumber: 0,
    startTime: Date.now()
  };

  this.sessions.set(sessionDbId, session);

  session.generatorPromise = this.runSDKAgent(session).catch(err => {
    logger.failure('WORKER', 'SDK agent error', { sessionId: sessionDbId }, err);
    const db = new SessionStore();
    db.markSessionFailed(sessionDbId);
    db.close();
    this.sessions.delete(sessionDbId);
  });

  db.close();
  return session;
}
```

3. **Update handleInit(), handleObservation(), handleSummarize()**:
Replace duplication with single line:
```typescript
const session = this.getOrCreateSession(sessionDbId);
```

4. **Delete model validation** (lines 407+):
Remove hardcoded validModels array and validation check

5. **Delete this.port instance variable** (line 100):
- Remove `private port: number = FIXED_PORT;`
- Replace all `this.port` references with `FIXED_PORT` or settings reader

6. **Add named constants** at top of file:
```typescript
const MESSAGE_POLL_INTERVAL_MS = 100;
const MAX_REQUEST_SIZE = '50mb';
```

7. **Use named constants** throughout (lines 109, 997, etc.)

---

### Phase 3: Update Hooks

**Files**:
- `src/hooks/new-hook.ts`
- `src/hooks/save-hook.ts`
- `src/hooks/summary-hook.ts`
- `src/hooks/cleanup-hook.ts`

**Changes**:
1. Import settings reader from worker-utils
2. 
Replace `const FIXED_PORT = parseInt(process.env.CLAUDE_MEM_WORKER_PORT || '37777', 10);` - with call to settings reader -3. Update cleanup-hook.ts line 74 to use settings reader as fallback - ---- - -### Phase 4: Update user-message-hook.ts - -**File**: `src/hooks/user-message-hook.ts` - -**Changes**: -- Line 53: Replace hardcoded `http://localhost:37777/` with dynamic port from settings reader - ---- - -## Files Changed - -1. `src/shared/worker-utils.ts` - Complete rewrite (~50 lines) -2. `src/services/worker-service.ts` - Major cleanup (remove ~60 lines duplication, add helper) -3. `src/hooks/new-hook.ts` - Use settings reader -4. `src/hooks/save-hook.ts` - Use settings reader -5. `src/hooks/summary-hook.ts` - Use settings reader -6. `src/hooks/cleanup-hook.ts` - Use settings reader -7. `src/hooks/user-message-hook.ts` - Dynamic port in message - ---- - -## Testing Checklist - -After implementation: - -- [ ] Build: `npm run build` -- [ ] Sync: `npm run sync-marketplace` -- [ ] Restart worker: `npm run worker:restart` -- [ ] Start new Claude Code session (hooks should work) -- [ ] Change port in UI settings to 38888 -- [ ] Restart worker -- [ ] Verify worker binds to 38888 -- [ ] Verify hooks connect to 38888 -- [ ] Verify UI connects to 38888 -- [ ] Change port back to 37777 -- [ ] Test all endpoints work - ---- - -## Expected Outcomes - -**Lines Removed**: ~130 lines (60 from duplication, 70 from PM2 parsing) -**Lines Added**: ~50 lines (helper method, settings reader, constants) -**Net Change**: -80 lines - -**Bugs Fixed**: -- ✅ PM2 string parsing false positives -- ✅ Silent error handling -- ✅ No restart when unhealthy -- ✅ Port configuration not synchronized with UI - -**Code Quality**: -- ✅ DRY principle applied (no duplication) -- ✅ YAGNI principle applied (removed ceremony) -- ✅ Fail fast error handling -- ✅ Named constants instead of magic numbers -- ✅ Single source of truth for configuration - ---- - -## Notes - -- This plan addresses all Severity 5 and 
Severity 4 issues from the analysis -- Skipped Severity 2 issues that aren't actual problems (debounce timing) -- All "automatically resolved" issues are covered by the main fixes -- Settings synchronization bug (port not working) is now fixed diff --git a/docs/worker-server-architecture.md b/docs/worker-server-architecture.md deleted file mode 100644 index 506acd42..00000000 --- a/docs/worker-server-architecture.md +++ /dev/null @@ -1,1129 +0,0 @@ -# Claude-Mem Worker Server Architecture - -**Document Version:** 1.0 -**Last Updated:** 2025-01-24 -**Author:** Analysis by Claude Code -**Purpose:** Comprehensive technical analysis of the worker server architecture, logic flow, blocking behavior, and component value assessment - ---- - -## Executive Summary - -The claude-mem worker server is a long-running HTTP service managed by PM2 that processes tool execution observations and generates session summaries using the Claude Agent SDK. It implements a **defensive, layered architecture** designed to maximize data persistence while maintaining flexibility. - -### Key Design Principles - -1. **Maximally Permissive Storage** - System defaults to saving data even if incomplete -2. **Auto-Recovery** - Worker restarts don't prevent processing (session state reconstructed from database) -3. **Queue-Based Processing** - HTTP API decoupled from AI processing for reliability -4. **Defensive Programming** - Auto-creates missing database records, accepts null fields -5. 
**Session Isolation** - Each session has independent state and SDK agent - -### Architecture at a Glance - -``` -┌─────────────────────────────────────────────────────────────┐ -│ Layer 1: HTTP API (Express.js) │ -│ - 6 REST endpoints │ -│ - Always queues messages (maximally permissive) │ -└──────────────────┬──────────────────────────────────────────┘ - │ -┌──────────────────▼──────────────────────────────────────────┐ -│ Layer 2: In-Memory Queue │ -│ - pendingMessages array per session │ -│ - VULNERABILITY: Lost on worker restart │ -└──────────────────┬──────────────────────────────────────────┘ - │ -┌──────────────────▼──────────────────────────────────────────┐ -│ Layer 3: SDK Agent (Claude Agent SDK) │ -│ - Processes queued messages via async generator │ -│ - Can fail due to config or AI errors │ -└──────────────────┬──────────────────────────────────────────┘ - │ -┌──────────────────▼──────────────────────────────────────────┐ -│ Layer 4: Parser (XML Extraction) │ -│ - Extracts observations and summaries from AI responses │ -│ - Permissive (v4.2.5/v4.2.6 fixes ensure partial data saved)│ -└──────────────────┬──────────────────────────────────────────┘ - │ -┌──────────────────▼──────────────────────────────────────────┐ -│ Layer 5: Database (SQLite with better-sqlite3) │ -│ - Permanent storage (once here, data persists) │ -│ - Auto-creates missing sessions, accepts nulls │ -└─────────────────────────────────────────────────────────────┘ -``` - -**Critical Insight:** Data can only be lost between layers 2-4. Once it reaches the database (layer 5), it's permanent. - ---- - -## Component Inventory - -### HTTP REST API Endpoints - -| Endpoint | Purpose | Blocks Data? 
| -|----------|---------|--------------| -| `GET /health` | Worker health check | N/A | -| `POST /sessions/:id/init` | Initialize session and start SDK agent | Only if session not in DB (expected) | -| `POST /sessions/:id/observations` | Queue tool observation | ❌ Never (auto-recovery) | -| `POST /sessions/:id/summarize` | Queue summary request | ❌ Never (auto-recovery) | -| `GET /sessions/:id/status` | Get session status | N/A | -| `DELETE /sessions/:id` | Abort session | ⚠️ Queued messages lost | - -### Core Processing Components - -| Component | File | Lines | Purpose | -|-----------|------|-------|---------| -| WorkerService | worker-service.ts | 52-590 | Main service class, manages sessions | -| runSDKAgent | worker-service.ts | 345-404 | Runs SDK agent for a session | -| createMessageGenerator | worker-service.ts | 410-502 | Async generator feeding SDK | -| handleAgentMessage | worker-service.ts | 508-563 | Parses and stores SDK responses | -| parseObservations | parser.ts | 32-96 | Extracts observations from XML | -| parseSummary | parser.ts | 102-157 | Extracts summary from XML | -| SessionStore | SessionStore.ts | 9-1086 | Database operations | - ---- - -## Deep Dive: HTTP Endpoints - -### GET /health (lines 100-109) - -**Purpose:** Health check for monitoring and debugging - -**Logic Flow:** -1. Returns JSON with status, port, PID, active sessions, uptime, memory - -**Blocking Analysis:** ❌ N/A (read-only endpoint) - -**Value Assessment:** ✅ HIGH VALUE -- Essential for monitoring worker health -- Helps debug port conflicts and process state -- Keep as-is - ---- - -### POST /sessions/:sessionDbId/init (lines 115-169) - -**Purpose:** Initialize a new session and start the SDK agent - -**Logic Flow:** -1. Parse `sessionDbId` from URL -2. Extract `project` and `userPrompt` from request body -3. Fetch session from database using `SessionStore.getSessionById()` -4. **CRITICAL CHECK:** Return 404 if session not found in DB -5. 
Retrieve `claudeSessionId` from database record -6. Create `ActiveSession` object with initial state: - ```typescript - { - sessionDbId, claudeSessionId, sdkSessionId: null, - project, userPrompt, pendingMessages: [], - abortController: new AbortController(), - generatorPromise: null, lastPromptNumber: 0, - observationCounter: 0, startTime: Date.now() - } - ``` -7. Store session in memory map (`this.sessions`) -8. Update `worker_port` in database -9. Start `runSDKAgent()` in background (fire-and-forget promise) -10. Return success response immediately - -**Blocking Analysis:** ⚠️ CONDITIONAL -- Returns 404 if session doesn't exist in database -- This is expected behavior - session must be created before init -- Doesn't prevent future initialization attempts -- Error logged and hook can retry - -**Value Assessment:** ✅ HIGH VALUE -- Critical initialization step -- Background SDK agent startup prevents timeout -- Keep as-is - -**Edge Cases:** -- Session exists but SDK agent fails to start → Session marked as failed, but new init can retry -- Multiple init calls for same session → First one wins (subsequent calls find session in memory) - ---- - -### POST /sessions/:sessionDbId/observations (lines 175-230) - -**Purpose:** Queue a tool execution observation for processing - -**Logic Flow:** -1. Parse `sessionDbId` from URL -2. Extract `tool_name`, `tool_input`, `tool_output`, `prompt_number` from body -3. Check if session exists in memory map (`this.sessions.get(sessionDbId)`) -4. **AUTO-RECOVERY** (lines 181-209): If session NOT in memory: - - Fetch session from database - - Recreate `ActiveSession` object - - Start new SDK agent in background - - This enables recovery from worker restarts! -5. Increment `observationCounter` for correlation ID tracking -6. Push observation message to `pendingMessages` queue: - ```typescript - { - type: 'observation', - tool_name, tool_input, tool_output, prompt_number - } - ``` -7. 
Return success with queue length - -**Blocking Analysis:** ❌ NEVER BLOCKS -- Auto-creates session state from database if missing -- Always queues the observation -- HTTP response confirms receipt immediately -- Processing happens asynchronously - -**Value Assessment:** ✅ HIGH VALUE -- Auto-recovery is brilliant design -- Worker restart doesn't lose ability to process observations -- Keep as-is - -**Edge Cases:** -- Worker restart while observation in queue → Lost (queue is in-memory) -- But NEW observations after restart are queued successfully (auto-recovery) -- Database not found → Would throw error, but SessionStore auto-creates sessions - ---- - -### POST /sessions/:sessionDbId/summarize (lines 236-284) - -**Purpose:** Queue a summary generation request - -**Logic Flow:** -1. Parse `sessionDbId` and `prompt_number` from request -2. Check if session exists in memory -3. **AUTO-RECOVERY** (lines 241-270): Same pattern as observations endpoint - - Fetches session from database - - Recreates `ActiveSession` object - - Starts new SDK agent -4. Push summarize message to `pendingMessages` queue: - ```typescript - { - type: 'summarize', - prompt_number - } - ``` -5. Return success with queue length - -**Blocking Analysis:** ❌ NEVER BLOCKS -- Same auto-recovery mechanism as observations -- Always queues the summary request -- Processing happens asynchronously - -**Value Assessment:** ✅ HIGH VALUE -- Auto-recovery pattern prevents data loss -- Keep as-is - -**Code Quality Note:** ⚠️ MEDIUM - Duplicated auto-recovery code (lines 181-209 and 241-270 are nearly identical) -- Could extract to helper function: `getOrCreateSession(sessionDbId)` -- Would reduce duplication and improve maintainability - ---- - -### GET /sessions/:sessionDbId/status (lines 289-304) - -**Purpose:** Get current session status and queue length - -**Logic Flow:** -1. Parse `sessionDbId` from URL -2. Get session from memory map -3. Return 404 if not found -4. 
Return session info: `sessionDbId`, `sdkSessionId`, `project`, `pendingMessages.length` - -**Blocking Analysis:** ❌ N/A (read-only endpoint) - -**Value Assessment:** ✅ MEDIUM VALUE -- Useful for debugging -- Not critical for core functionality -- Keep as-is - ---- - -### DELETE /sessions/:sessionDbId (lines 309-340) - -**Purpose:** Abort a running session and clean up - -**Logic Flow:** -1. Parse `sessionDbId` from URL -2. Get session from memory map -3. Return 404 if not found -4. Call `abortController.abort()` to signal SDK agent to stop -5. Wait for `generatorPromise` to finish (max 5 seconds timeout) -6. Mark session as 'failed' in database -7. Delete session from memory map -8. Return success - -**Blocking Analysis:** ⚠️ BLOCKS QUEUED MESSAGES -- Aborts SDK agent processing -- Any messages in `pendingMessages` queue are lost -- Already-stored observations/summaries remain in database - -**Value Assessment:** ✅ MEDIUM VALUE -- Provides clean shutdown mechanism -- Used for manual cleanup -- As of v4.1.0, SessionEnd hook doesn't call DELETE (graceful cleanup) -- Keep for manual intervention, but not used automatically - -**Historical Note:** -- v4.0.x: SessionEnd hook called DELETE → interrupted summary generation -- v4.1.0+: Graceful cleanup → workers finish naturally - ---- - -## Deep Dive: SDK Agent Processing - -### runSDKAgent (lines 345-404) - -**Purpose:** Core processing engine that runs continuously for each session - -**Logic Flow:** -1. Call `query()` from Claude Agent SDK with: - ```typescript - { - prompt: this.createMessageGenerator(session), - options: { - model: MODEL, // from CLAUDE_MEM_MODEL env var - disallowedTools: DISALLOWED_TOOLS, - abortController: session.abortController, - pathToClaudeCodeExecutable: claudePath - } - } - ``` -2. Iterate over SDK responses using `for await` -3. For each assistant message: - - Extract text content from response - - Log response size - - Call `handleAgentMessage()` to parse and store -4. 
On completion: - - Log session duration - - Mark session as 'completed' in database - - Delete session from memory map -5. On error: - - Log error (or warning for AbortError) - - Mark session as 'failed' in database - - Throw error (caught by `generatorPromise.catch()`) - -**Blocking Analysis:** ⚠️ CAN BLOCK IF: -- Invalid `CLAUDE_MEM_MODEL` → SDK initialization fails -- Invalid `CLAUDE_CODE_PATH` → SDK initialization fails -- SDK crashes → Session marked as failed -- BUT: Doesn't prevent NEW sessions from being created - -**Value Assessment:** ✅ HIGH VALUE -- Core processing engine -- Proper error handling with session status tracking -- Keep as-is - -**Configuration Dependencies:** -- `CLAUDE_MEM_MODEL` (default: 'claude-sonnet-4-5') -- `CLAUDE_CODE_PATH` (default: specific path to claude executable) -- `DISALLOWED_TOOLS` (default: ['Glob', 'Grep', 'ListMcpResourcesTool', 'WebSearch']) - -**Edge Cases:** -- SDK hangs → AbortController provides cancellation mechanism -- Network issues → SDK handles retries internally -- Multiple concurrent sessions → Each has isolated SDK agent - ---- - -### createMessageGenerator (lines 410-502) - -**Purpose:** Async generator that feeds messages to the SDK agent (bridge between HTTP API and SDK) - -**Logic Flow:** -1. Build init prompt using `buildInitPrompt(project, claudeSessionId, userPrompt)` -2. Yield initial user message: - ```typescript - { - type: 'user', - session_id: claudeSessionId, // Real Claude Code session ID - parent_tool_use_id: null, - message: { role: 'user', content: initPrompt } - } - ``` -3. 
Enter infinite loop (`while (true)`): - - Check `abortController.signal.aborted` → break if aborted - - If no pending messages → sleep 100ms and continue - - While `pendingMessages.length > 0`: - - Shift message from queue (FIFO) - - If type === 'summarize': - - Update `lastPromptNumber` - - Fetch session from database - - Build summary prompt using `buildSummaryPrompt()` - - Yield summary prompt as user message - - If type === 'observation': - - Update `lastPromptNumber` - - Build observation prompt using `buildObservationPrompt()` - - Yield observation prompt as user message - -**Blocking Analysis:** ❌ DOESN'T BLOCK -- Continuously processes queue until aborted -- 100ms polling means small delay but no data loss -- Messages shifted from queue and sent to SDK -- If SDK fails, messages lost from queue (but already confirmed via HTTP) - -**Value Assessment:** ✅ HIGH VALUE -- Elegant async generator pattern -- Keep as-is - -**Performance Note:** ⚠️ 100ms polling interval -- Could be improved with event-driven queue (e.g., `AsyncQueue` with notifications) -- Current implementation is simple and works well -- Low priority optimization - -**Data Flow:** -``` -HTTP /observations → pendingMessages.push() → [sleep 100ms] → -pendingMessages.shift() → buildObservationPrompt() → yield to SDK → -SDK processes → handleAgentMessage() -``` - ---- - -### handleAgentMessage (lines 508-563) - -**Purpose:** Parse SDK response and store observations/summaries in database - -**Logic Flow:** -1. Call `parseObservations(content, correlationId)` -2. If observations found: - - For each observation: - - Call `db.storeObservation(claudeSessionId, project, observation, promptNumber)` - - Log success with correlation ID -3. Call `parseSummary(content, sessionId)` -4. If summary found: - - Call `db.storeSummary(claudeSessionId, project, summary, promptNumber)` - - Log success -5. 
If NO summary found:
   - Log warning with content sample

**Blocking Analysis:** ⚠️ CAN BLOCK IF:
- Parser returns empty array/null → Nothing stored (but this is expected for routine operations)
- Database error → Would throw and crash handler (rare with permissive schema)

**Value Assessment:** ✅ HIGH VALUE
- Core storage logic
- Proper logging for debugging
- Keep as-is

**Critical Dependencies:**
- `parseObservations()` must return valid observations
- `parseSummary()` must return valid summary
- Database must accept the data (schema constraints)

**Logging:**
- Extensive logging at INFO, SUCCESS, and WARN levels
- Correlation IDs for tracking individual observations
- Debug mode logs full SDK responses

---

## Deep Dive: Parser System

### parseObservations (parser.ts lines 32-96)

**Purpose:** Extract observation XML blocks from the SDK response and parse them into structured data

**Logic Flow:**
1. Use regex to find all `<observation>...</observation>` blocks (non-greedy):
   ```typescript
   /<observation>([\s\S]*?)<\/observation>/g
   ```
2. For each block:
   - Extract all fields: `type`, `title`, `subtitle`, `narrative`, `facts`, `concepts`, `files_read`, `files_modified`
   - **VALIDATION** (lines 52-67):
     - If `type` is missing or invalid → default to "change"
     - Valid types: `['bugfix', 'feature', 'refactor', 'change', 'discovery', 'decision']`
     - All other fields can be null
   - Filter out `type` from `concepts` array (types and concepts are separate dimensions)
   - Push observation to results array
3. Return all observations

**Blocking Analysis:** ❌ NEVER BLOCKS (as of v4.2.6)
- **CRITICAL FIX** (v4.2.6): Removed validation that required title, subtitle, and narrative
- Comment on line 52: "NOTE FROM THEDOTMACK: ALWAYS save observations - never skip.
10/24/2025"
- Always returns observations with whatever fields exist
- Only transformation: type defaults to "change" if invalid

**Value Assessment:** ✅ HIGH VALUE
- Permissive parsing ensures data is never lost
- v4.2.6 fix was critical for reliability
- Keep as-is

**Historical Context:**
- **Before v4.2.6:** Would skip observations missing required fields → data loss
- **After v4.2.6:** Always saves with defaults → maximally permissive

**Edge Cases:**
1. No `<observation>` tags → Returns empty array (normal for routine operations)
2. All fields empty → Returns observation with null fields and type="change"
3. Malformed XML → Regex won't match → Returns empty array (data loss)
4. Type in concepts → Filtered out (types and concepts are orthogonal)

**Example:**
```xml
<observation>
  <type>feature</type>
  <title>Authentication added</title>
  <subtitle>Implemented OAuth2 flow</subtitle>
  <facts>
    Added OAuth2 provider configuration
    Created callback endpoint
  </facts>
  <narrative>Full OAuth2 authentication...</narrative>
  <concepts>
    how-it-works
    what-changed
  </concepts>
  <files_read>
    src/auth/oauth.ts
  </files_read>
  <files_modified>
    src/auth/oauth.ts
  </files_modified>
</observation>
```

---

### parseSummary (parser.ts lines 102-157)

**Purpose:** Extract the summary XML block from the SDK response

**Logic Flow:**
1. Check for an explicit skip tag (lines 104-113)
   - If found → log reason and return null (intentional skip)
2. Match the `<summary>...</summary>` block (non-greedy):
   ```typescript
   /<summary>([\s\S]*?)<\/summary>/
   ```
   - If not found → return null (SDK didn't provide summary)
3. Extract all fields: `request`, `investigated`, `learned`, `completed`, `next_steps`, `notes` (optional)
4. **VALIDATION REMOVED** (lines 133-147):
   - Comment: "NOTE FROM THEDOTMACK: 100% of the time we must SAVE the summary, even if fields are missing. 10/24/2025"
   - Comment: "NEVER DO THIS NONSENSE AGAIN."
   - Old code checked if all required fields were present → would return null
   - New code returns the summary with whatever fields exist
5.
Return `ParsedSummary` object

**Blocking Analysis:** ⚠️ MINIMAL BLOCKING (as of v4.2.5)
- Explicit skip tag → Returns null (intentional, not a bug)
- Missing `<summary>` tags → Returns null (SDK didn't provide)
- Missing fields within `<summary>` → Does NOT block anymore (v4.2.5 fix)

**Value Assessment:** ✅ HIGH VALUE
- v4.2.5 fix ensures partial summaries are saved
- Keep as-is

**Historical Context:**
- **Before v4.2.5:** Would return null if any required field missing → data loss
- **After v4.2.5:** Returns summary with whatever fields exist → maximally permissive

**Edge Cases:**
1. Explicit skip tag → Returns null, logs reason
2. No `<summary>` tags → Returns null (SDK didn't generate summary)
3. `<summary>` with all empty fields → Returns summary with empty/null strings
4. Malformed XML → Regex won't match → Returns null (data loss)

**Example:**
```xml
<summary>
  <request>Add OAuth2 authentication</request>
  <investigated>Reviewed existing auth system</investigated>
  <learned>System uses JWT tokens for sessions</learned>
  <completed>Implemented OAuth2 provider integration</completed>
  <next_steps>Test with production credentials</next_steps>
  <notes>Need to configure callback URLs in provider dashboard</notes>
</summary>
```

---

## Deep Dive: Database Layer

### SessionStore.storeObservation (SessionStore.ts lines 901-964)

**Purpose:** Store a parsed observation in the database

**Logic Flow:**
1. **AUTO-CREATE SESSION** (lines 920-940):
   - Check if `sdk_session_id` exists in `sdk_sessions` table
   - If NOT found:
     - Auto-create session record
     - Log: "Auto-created session record for session_id: {id}"
   - This prevents foreign key constraint errors
2. Prepare INSERT statement:
   ```sql
   INSERT INTO observations
     (sdk_session_id, project, type, title, subtitle, facts, narrative,
      concepts, files_read, files_modified, prompt_number, created_at, created_at_epoch)
   VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
   ```
3.
Insert observation with: - - `facts`, `concepts`, `files_read`, `files_modified` → JSON.stringify() - - Timestamps auto-generated - - All fields as-is (nulls allowed) - -**Blocking Analysis:** ❌ NEVER BLOCKS -- Auto-creates missing sessions (defensive programming) -- All fields nullable (except required ones) -- No validation checks that could fail -- Schema is permissive - -**Value Assessment:** ✅ HIGH VALUE -- Auto-creation pattern is brilliant -- Prevents foreign key errors -- Keep as-is - -**Schema Constraints:** -- `type` must be one of 6 valid types (CHECK constraint) - - BUT: Parser ensures type is always valid (defaults to "change") -- `sdk_session_id` has foreign key to `sdk_sessions` - - BUT: Auto-creation ensures session exists -- Arrays stored as JSON strings - -**Edge Cases:** -- Session doesn't exist → Auto-created -- Invalid type → Parser prevents this (defaults to "change") -- Null fields → Allowed by schema - ---- - -### SessionStore.storeSummary (SessionStore.ts lines 970-1029) - -**Purpose:** Store a parsed summary in the database - -**Logic Flow:** -1. **AUTO-CREATE SESSION** (lines 987-1007): - - Same defensive pattern as `storeObservation()` - - Ensures session exists before INSERT -2. Prepare INSERT statement: - ```sql - INSERT INTO session_summaries - (sdk_session_id, project, request, investigated, learned, completed, - next_steps, notes, prompt_number, created_at, created_at_epoch) - VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) - ``` -3. 
Insert summary with: - - All content fields as-is (nulls allowed) - - Timestamps auto-generated - -**Blocking Analysis:** ❌ NEVER BLOCKS -- Auto-creates missing sessions -- All content fields nullable -- No validation checks -- Multiple summaries per session allowed (migration 7 removed UNIQUE constraint) - -**Value Assessment:** ✅ HIGH VALUE -- Auto-creation ensures reliability -- Nullable fields allow partial data -- Keep as-is - -**Schema Evolution:** -- **Before migration 7:** `sdk_session_id` had UNIQUE constraint → Only one summary per session -- **After migration 7:** UNIQUE removed → Multiple summaries per session (one per prompt) - -**Edge Cases:** -- Session doesn't exist → Auto-created -- All fields null/empty → Allowed -- Multiple summaries for same session → Allowed (migration 7) - ---- - -### Database Schema Constraints - -#### observations table -```sql -CREATE TABLE observations ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - sdk_session_id TEXT NOT NULL, -- Foreign key - project TEXT NOT NULL, - text TEXT, -- Nullable (deprecated, migration 9) - type TEXT NOT NULL CHECK(type IN ('decision', 'bugfix', 'feature', 'refactor', 'discovery', 'change')), - title TEXT, -- Nullable - subtitle TEXT, -- Nullable - facts TEXT, -- Nullable (JSON array) - narrative TEXT, -- Nullable - concepts TEXT, -- Nullable (JSON array) - files_read TEXT, -- Nullable (JSON array) - files_modified TEXT, -- Nullable (JSON array) - prompt_number INTEGER, -- Nullable - created_at TEXT NOT NULL, - created_at_epoch INTEGER NOT NULL, - FOREIGN KEY(sdk_session_id) REFERENCES sdk_sessions(sdk_session_id) ON DELETE CASCADE -); -``` - -**Blocking Potential:** -- Invalid `type` → CHECK constraint violation - - Mitigated by: Parser defaults to "change" -- Missing `sdk_session_id` → Foreign key violation - - Mitigated by: Auto-creation in storeObservation() - -#### session_summaries table -```sql -CREATE TABLE session_summaries ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - sdk_session_id TEXT 
NOT NULL, -- No longer UNIQUE (migration 7) - project TEXT NOT NULL, - request TEXT, -- Nullable - investigated TEXT, -- Nullable - learned TEXT, -- Nullable - completed TEXT, -- Nullable - next_steps TEXT, -- Nullable - notes TEXT, -- Nullable - prompt_number INTEGER, -- Nullable - created_at TEXT NOT NULL, - created_at_epoch INTEGER NOT NULL, - FOREIGN KEY(sdk_session_id) REFERENCES sdk_sessions(sdk_session_id) ON DELETE CASCADE -); -``` - -**Blocking Potential:** -- Missing `sdk_session_id` → Foreign key violation - - Mitigated by: Auto-creation in storeSummary() - -**Key Design Decisions:** -1. **Nullable fields** - Allows partial data to be saved -2. **Auto-creation** - Prevents foreign key errors -3. **No UNIQUE constraints** (migration 7) - Multiple summaries per session -4. **WAL mode** - Better concurrency for multiple sessions -5. **JSON arrays** - Flexible storage for lists (facts, concepts, files) - ---- - -## Deep Dive: Prompt System - -### buildInitPrompt (prompts.ts lines 24-125) - -**Purpose:** Generate initial prompt that instructs the SDK agent what to observe and how to record - -**Content:** -1. **Role Definition:** "You are observing a development session to create searchable memory FOR FUTURE SESSIONS" -2. **Critical Instruction:** "Record what was BUILT/FIXED/DEPLOYED/CONFIGURED, not what you (the observer) are doing" -3. **What to Record:** Focus on deliverables, capabilities, technical changes -4. **When to Skip:** Routine operations (empty status checks, package installations, file listings) -5. 
**Output Format:** XML structure with `<observation>` tags and required fields

**Blocking Analysis:** ⚠️ CAN CAUSE SKIPPING
- "WHEN TO SKIP" section instructs SDK to not output for routine operations
- "No output necessary if skipping" means no observations stored
- **This is intentional filtering**, not a bug

**Value Assessment:** ✅ HIGH VALUE
- Prevents noise from routine operations
- Focuses on meaningful changes
- Keep as-is, but consider making "WHEN TO SKIP" configurable

**Key Instructions:**
```
WHEN TO SKIP
------------
Skip routine operations:
- Empty status checks
- Package installations with no errors
- Simple file listings
- Repetitive operations you've already documented
- **No output necessary if skipping.**
```

**Impact:**
- Reduces database size by filtering noise
- But could cause "missing" observations for operations the user cares about
- Trade-off between signal and completeness

---

### buildObservationPrompt (prompts.ts lines 130-153)

**Purpose:** Wrap tool execution data in XML for SDK processing

**Content:** an XML wrapper around four fields (tag names shown here are illustrative):
```xml
<tool_execution>
  <tool_name>{name}</tool_name>
  <timestamp>{ISO timestamp}</timestamp>
  <tool_input>{JSON}</tool_input>
  <tool_output>{JSON}</tool_output>
</tool_execution>
```

**Blocking Analysis:** ❌ NO BLOCKING
- Just data presentation, no instructions to skip
- SDK processes based on init prompt rules

**Value Assessment:** ✅ HIGH VALUE
- Simple, clean data wrapper
- Keep as-is

---

### buildSummaryPrompt (prompts.ts lines 158-178)

**Purpose:** Request summary of the session so far

**Content:**
1. **Instruction:** "Think about the last request, and write a summary of what was done, what was learned, and what's next"
2. **Important Note:** "DO NOT summarize the observation process itself - you are summarizing a DIFFERENT claude code session, not this one"
3. **Output Format:** XML `<summary>` with required fields
4.
**Encouragement:** "Always write at least a minimal summary explaining where we are at currently, even if you didn't learn anything new or complete any work"

**Blocking Analysis:** ❌ NO BLOCKING
- Encourages always writing summary
- SDK may still skip if truly nothing to summarize

**Value Assessment:** ✅ HIGH VALUE
- Ensures summaries are generated
- "Always write at least a minimal summary" reduces skip rate
- Keep as-is

---

## Data Flow Analysis

### End-to-End Flow: Tool Execution → Database

```
1. User executes tool in Claude Code
   ↓
2. PostToolUse hook captures execution
   ↓
3. Hook sends HTTP POST to worker /observations endpoint
   ↓
4. Worker queues message in pendingMessages array
   └─→ HTTP 200 response (confirmed receipt)
   ↓
5. createMessageGenerator polls queue (100ms interval)
   ↓
6. Message shifted from queue
   ↓
7. buildObservationPrompt wraps tool data in XML
   ↓
8. Generator yields message to SDK agent
   ↓
9. SDK sends message to Claude API
   ↓
10. Claude processes tool data based on init prompt
   ↓
11. Claude responds with XML (or skips if routine operation)
   ↓
12. SDK returns response to runSDKAgent
   ↓
13. handleAgentMessage receives response
   ↓
14. parseObservations extracts <observation> blocks
   ↓
15. For each observation:
    - db.storeObservation called
    - Auto-creates session if missing
    - Inserts into observations table
   ↓
16.
Data persisted in SQLite database -``` - -**Failure Points:** -- **Point 3:** Worker not running → HTTP request fails → Hook logs error -- **Point 4:** Worker crashes before processing → Queue lost -- **Point 9:** Invalid model config → SDK fails → Session marked failed -- **Point 11:** Malformed XML response → Parser returns empty array -- **Point 15:** Database error (rare) → Throws exception - -**Recovery Mechanisms:** -- **Auto-recovery:** New requests after worker restart auto-create session -- **Graceful degradation:** Partial data saved (v4.2.5/v4.2.6 fixes) -- **Database persistence:** Once stored, data survives all restarts - ---- - -## Blocking Assessment Matrix - -### Components That CAN Block Data Storage - -| Component | Blocking Scenario | Impact | Mitigation | -|-----------|------------------|---------|------------| -| Worker not running | HTTP requests fail | Observations not queued | PM2 auto-restart, health monitoring | -| Invalid CLAUDE_MEM_MODEL | SDK agent fails to start | Queued messages never processed | Validation in settings script | -| Invalid CLAUDE_CODE_PATH | SDK agent fails to start | Queued messages never processed | Default path, env var fallback | -| Malformed XML in SDK response | Parser can't extract | Data lost for that response | Better error handling, partial parsing | -| Worker restart | In-memory queue lost | Queued messages lost | Could persist queue to DB | -| Session abort (DELETE) | Queue processing stopped | Remaining queue lost | Graceful cleanup (v4.1.0) | -| Init prompt "WHEN TO SKIP" | SDK intentionally skips | No observation stored | Intentional filtering, configurable? 
| - -### Components That CANNOT Block Data Storage - -| Component | Reason | Design Pattern | -|-----------|--------|----------------| -| /observations endpoint | Auto-recovery, always queues | Maximally permissive | -| /summarize endpoint | Auto-recovery, always queues | Maximally permissive | -| parseObservations() | Defaults to "change" type, accepts nulls | Permissive (v4.2.6 fix) | -| parseSummary() | Returns partial summaries | Permissive (v4.2.5 fix) | -| storeObservation() | Auto-creates sessions, accepts nulls | Defensive programming | -| storeSummary() | Auto-creates sessions, accepts nulls | Defensive programming | -| Database schema | Nullable fields, no UNIQUE constraints | Flexible storage | - ---- - -## Critical Findings - -### 1. Auto-Recovery Pattern Prevents Worker Restart Data Loss - -**Location:** `/observations` and `/summarize` endpoints (lines 181-209, 241-270) - -**How it works:** -```typescript -if (!session) { - // Fetch session from database - const dbSession = db.getSessionById(sessionDbId); - - // Recreate in-memory state - session = { - sessionDbId, - claudeSessionId: dbSession!.claude_session_id, - sdkSessionId: null, - project: dbSession!.project, - userPrompt: dbSession!.user_prompt, - pendingMessages: [], - abortController: new AbortController(), - generatorPromise: null, - lastPromptNumber: 0, - observationCounter: 0, - startTime: Date.now() - }; - - // Start new SDK agent - session.generatorPromise = this.runSDKAgent(session); -} -``` - -**Value:** ✅ HIGH -- Worker restart doesn't prevent new observations from being processed -- Database is source of truth -- Stateless design enables resilience - -**Recommendation:** Extract to helper function to reduce duplication - ---- - -### 2. Parser Fixes (v4.2.5/v4.2.6) Ensure Partial Data Saved - -**parseObservations (v4.2.6):** -```typescript -// NOTE FROM THEDOTMACK: ALWAYS save observations - never skip. 
10/24/2025 -// All fields except type are nullable in schema -// If type is missing or invalid, use "change" as catch-all fallback - -let finalType = 'change'; // Default catch-all -if (type && validTypes.includes(type.trim())) { - finalType = type.trim(); -} - -// All other fields are optional - save whatever we have -observations.push({ - type: finalType, - title, // Can be null - subtitle, // Can be null - facts, - narrative, // Can be null - concepts, - files_read, - files_modified -}); -``` - -**parseSummary (v4.2.5):** -```typescript -// NOTE FROM THEDOTMACK: 100% of the time we must SAVE the summary, -// even if fields are missing. 10/24/2025 -// NEVER DO THIS NONSENSE AGAIN. - -return { - request, // Can be null - investigated, // Can be null - learned, // Can be null - completed, // Can be null - next_steps, // Can be null - notes // Can be null -}; -``` - -**Value:** ✅ CRITICAL -- Prevents data loss from incomplete AI responses -- LLMs make mistakes - system must be resilient -- Partial data is better than no data - -**Recommendation:** Keep as-is, this is the right design - ---- - -### 3. In-Memory Queue is Main Vulnerability - -**Issue:** `pendingMessages` array is in-memory only -- Worker restart → All queued messages lost -- But HTTP response already confirmed receipt - -**Current behavior:** -1. Hook sends observation → Worker responds "queued" → Hook thinks it's saved -2. Worker crashes before processing → Observation lost -3. BUT: New observations after restart are still processed (auto-recovery) - -**Impact:** ⚠️ MEDIUM -- Data loss window between queue and processing -- But observations are idempotent (can be resent) -- Hooks don't retry on success response - -**Recommendation:** ⚠️ CONSIDER -- Persist queue to database (e.g., `pending_observations` table) -- Mark as processed when SDK handles -- Increases reliability but adds complexity - ---- - -### 4. 
Init Prompt "WHEN TO SKIP" Intentionally Filters - -**Instruction:** -``` -WHEN TO SKIP ------------- -Skip routine operations: -- Empty status checks -- Package installations with no errors -- Simple file listings -- Repetitive operations you've already documented -- **No output necessary if skipping.** -``` - -**Impact:** -- Reduces noise in database -- Focuses on meaningful changes -- BUT: User might wonder why some tool executions aren't recorded - -**Value:** ✅ MEDIUM - Intentional filtering -- Prevents database bloat -- Trade-off between signal and completeness - -**Recommendation:** ⚠️ CONSIDER -- Make "WHEN TO SKIP" configurable (env var or settings) -- Or add verbosity levels (minimal/normal/verbose) - ---- - -## Value Assessment by Component - -### HIGH VALUE - Keep As-Is - -| Component | Reason | -|-----------|--------| -| Auto-recovery pattern | Prevents worker restart data loss | -| Permissive parser (v4.2.5/v4.2.6) | Ensures partial data saved, critical for reliability | -| Nullable database schema | Flexible storage, allows incomplete data | -| WAL mode SQLite | Good concurrency, reliable writes | -| Isolated session state | No cross-contamination between sessions | -| Queue-based architecture | Decouples HTTP from SDK processing | -| storeObservation/storeSummary auto-creation | Defensive programming, prevents foreign key errors | - -### MEDIUM VALUE - Consider Improvements - -| Component | Current State | Potential Improvement | -|-----------|--------------|----------------------| -| In-memory queue | Lost on restart | Persist to DB for durability | -| 100ms polling | Works but inefficient | Event-driven async queue | -| Duplicated auto-recovery code | Lines 181-209 and 241-270 identical | Extract to `getOrCreateSession()` helper | -| No try-catch around DB ops | Errors crash handler | Add error handling with logging | -| Model/port defaults | Hard-coded | Already configurable via env vars ✓ | -| Init prompt filtering | Fixed "WHEN TO SKIP" rules | 
Make configurable (verbosity levels) |

### LOW VALUE - Questionable Design

| Component | Issue | Recommendation |
|-----------|-------|----------------|
| cleanupOrphanedSessions() | Marks ALL active sessions failed on startup | Aggressive, but necessary with fixed port |
| 5-second DELETE timeout | Arbitrary | Make configurable via env var |
| "NO SUMMARY TAGS FOUND" warning | Log level too high | Change to INFO level |

---

## Recommendations

### Priority 1: Critical Reliability Improvements

1. **Persist Message Queue to Database**
   - Create `pending_messages` table
   - Store queued observations/summaries
   - Mark as processed when handled by SDK
   - Prevents data loss on worker restart
   - **Effort:** Medium, **Impact:** High

2. **Add Error Handling Around Database Operations**
   - Wrap `db.storeObservation()` and `db.storeSummary()` in try-catch
   - Log errors with full context
   - Continue processing other messages on error
   - **Effort:** Low, **Impact:** Medium

### Priority 2: Code Quality Improvements

3. **Extract Auto-Recovery to Helper Function**
   ```typescript
   private async getOrCreateSession(sessionDbId: number): Promise<ActiveSession> {
     // Consolidate lines 181-209 and 241-270
   }
   ```
   - **Effort:** Low, **Impact:** Low (code quality)

4. **Make Configuration More Flexible**
   - Add `CLAUDE_MEM_VERBOSITY` env var (minimal/normal/verbose)
   - Adjust init prompt "WHEN TO SKIP" based on verbosity
   - Add `CLAUDE_MEM_DELETE_TIMEOUT` env var
   - **Effort:** Low, **Impact:** Medium

### Priority 3: Performance Optimizations

5. **Replace Polling with Event-Driven Queue**
   - Use `AsyncQueue` with notifications instead of 100ms polling
   - Reduces latency from queue to processing
   - **Effort:** Medium, **Impact:** Low (performance)

6.
**Add Queue Metrics** - - Track queue length over time - - Alert if queue grows unbounded - - Add to `/health` endpoint - - **Effort:** Low, **Impact:** Low (observability) - ---- - -## Appendix: Configuration Reference - -### Environment Variables - -| Variable | Default | Purpose | Blocking Impact | -|----------|---------|---------|----------------| -| `CLAUDE_MEM_MODEL` | `claude-sonnet-4-5` | AI model for processing | Invalid = SDK fails | -| `CLAUDE_MEM_WORKER_PORT` | `37777` | HTTP server port | Invalid = Worker won't start | - -### Constants - -| Constant | Value | Purpose | -|----------|-------|---------| -| `DISALLOWED_TOOLS` | `['Glob', 'Grep', 'ListMcpResourcesTool', 'WebSearch']` | Tools SDK agent can't use | -| Polling interval | `100ms` | Queue polling frequency | -| DELETE timeout | `5000ms` | Max wait for agent shutdown | - ---- - -## Conclusion - -The claude-mem worker server is a well-designed system with a clear **defensive, layered architecture** that prioritizes **data persistence**. The key strengths are: - -1. **Auto-recovery** from worker restarts -2. **Permissive parsing** that saves partial data -3. **Nullable schema** that accepts incomplete information -4. **Session isolation** preventing cross-contamination - -The main vulnerability is the **in-memory queue**, which could be mitigated by persisting to the database. Overall, the system achieves its goal of creating a persistent memory system that survives failures and continues operating even with incomplete data. - -**Design Philosophy:** "Better to save partial data than lose everything." - -This philosophy is evident throughout the codebase, from the v4.2.5/v4.2.6 parser fixes to the auto-creation patterns in the database layer. The system is built to be resilient to AI errors, configuration issues, and process failures. 
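The queue-persistence mitigation mentioned above can be illustrated with a small outbox-style wrapper: persist each message before acknowledging the HTTP request, and delete it only after the SDK has processed it. This is a hypothetical sketch - the `Store` interface, `DurableQueue` class, and the idea of backing it with a `pending_messages` table are assumptions, not code from the repository:

```typescript
// Assumed persistence interface - a real implementation would back this
// with a pending_messages table in the existing SQLite database.
interface Store {
  insert(id: number, payload: string): void;
  delete(id: number): void;
  all(): Array<{ id: number; payload: string }>;
}

class DurableQueue {
  private nextId = 1;
  constructor(private store: Store) {}

  // Persist the message BEFORE the HTTP handler returns "queued",
  // so a worker crash no longer loses confirmed observations.
  enqueue(payload: string): number {
    const id = this.nextId++;
    this.store.insert(id, payload);
    return id;
  }

  // On startup (or per tick), replay whatever survived a crash.
  // A message is deleted only after `process` succeeds.
  async drain(process: (payload: string) => Promise<void>): Promise<number> {
    let handled = 0;
    for (const msg of this.store.all()) {
      await process(msg.payload);
      this.store.delete(msg.id);
      handled++;
    }
    return handled;
  }
}
```

The key design point is the ordering: insert-then-ack on the way in, process-then-delete on the way out, which turns the in-memory `pendingMessages` window into an at-least-once delivery guarantee.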
- ---- - -**End of Document** diff --git a/docs/worker-service-analysis.md b/docs/worker-service-analysis.md deleted file mode 100644 index 03da4793..00000000 --- a/docs/worker-service-analysis.md +++ /dev/null @@ -1,907 +0,0 @@ -# Worker Service & Worker Utils: Comprehensive YAGNI Analysis - -**Date**: 2025-11-06 -**Files Analyzed**: -- `src/services/worker-service.ts` (1228 lines) -- `src/shared/worker-utils.ts` (110 lines) - -**Overall Assessment**: 80% excellent architecture, 20% cleanup needed. Worker-service is well-structured with proper error handling priorities, but worker-utils contains critical bugs and YAGNI violations. - ---- - -## Executive Summary - -### What These Files Do - -**worker-service.ts**: Long-running Express HTTP service managed by PM2. Handles AI compression of observations, session management, SSE streaming for web UI, and Chroma vector sync. This is the heart of claude-mem's async processing. - -**worker-utils.ts**: Utilities for ensuring the worker is running. Called by hooks at session start to verify/start the PM2 worker process. - -### Critical Findings - -#### 🔥🔥🔥🔥🔥 SEVERITY 5 - MUST FIX IMMEDIATELY - -1. **worker-utils.ts:75** - Fragile string parsing of PM2 output causes false positives -2. **worker-service.ts:754-844** - 60+ lines of identical session auto-creation code duplicated 3 times -3. **worker-utils.ts:70** - Silent error handling defers PM2 failures instead of failing fast - -#### 🔥🔥🔥 SEVERITY 3 - FIX SOON - -4. **worker-utils.ts:77-95** - No handling for "running but unhealthy" case -5. **worker-utils.ts:107-109** - Useless `getWorkerPort()` wrapper function -6. **worker-service.ts:316** - 1500ms debounce is 10x too long - -#### 🔥🔥 SEVERITY 2 - CLEANUP WHEN CONVENIENT - -7. Multiple magic numbers (100ms, 1000ms, 10000ms) without named constants -8. Hardcoded default values duplicated across multiple locations -9. 
Hardcoded model validation list that will become stale - ---- - -## Complete Function Catalog - -### worker-utils.ts Functions - -| Function | Lines | Purpose | Status | -|----------|-------|---------|--------| -| `isWorkerHealthy(timeoutMs)` | 10-19 | Check /health endpoint responds | ✅ OK | -| `waitForWorkerHealth(maxWaitMs)` | 24-36 | Poll until worker healthy | 🔥 Inefficient timeout | -| `ensureWorkerRunning()` | 43-102 | Main orchestrator to start worker | 🔥🔥🔥🔥🔥 CRITICAL BUGS | -| `getWorkerPort()` | 107-109 | Returns FIXED_PORT constant | 🔥🔥🔥🔥🔥 DELETE THIS | - -### worker-service.ts Functions - -| Function | Lines | Purpose | Status | -|----------|-------|---------|--------| -| `findClaudePath()` | 35-65 | Find Claude Code executable | ✅ Excellent | -| Constructor | 107-139 | Setup Express routes | ✅ Good | -| `start()` | 141-173 | Start HTTP server, init Chroma | ✅ Excellent prioritization | -| `getUIDirectory()` | 178-189 | Get UI path (CJS/ESM) | ✅ Good defensive code | -| `handleHealth()` | 194-196 | GET /health | ✅ PERFECT | -| `handleViewerHTML()` | 201-211 | GET / | ✅ Good | -| `handleSSEStream()` | 216-245 | GET /stream (SSE) | ✅ Good | -| `broadcastSSE()` | 250-275 | Broadcast to clients | ✅ Excellent defensive code | -| `broadcastProcessingStatus()` | 280-286 | Broadcast processing state | ✅ Good | -| `checkAndStopSpinner()` | 291-318 | Debounced spinner stop | 🔥 1500ms too long | -| `handleStats()` | 323-365 | GET /api/stats | 🔥 Hardcoded paths/version | -| `handleGetSettings()` | 370-397 | GET /api/settings | 🔥 Duplicated defaults | -| `handlePostSettings()` | 402-461 | POST /api/settings | 🔥 Hardcoded model list | -| `handleGetObservations()` | 467-515 | GET /api/observations | ✅ Excellent | -| `handleGetSummaries()` | 517-576 | GET /api/summaries | ✅ Excellent | -| `handleGetPrompts()` | 578-631 | GET /api/prompts | ✅ Excellent | -| `handleGetProcessingStatus()` | 637-639 | GET /api/processing-status | ✅ Good | -| `handleInit()` | 645-744 | POST 
/sessions/:id/init | ✅ Good but has duplication |
-| `handleObservation()` | 750-803 | POST /sessions/:id/observations | 🔥🔥🔥🔥🔥 MASSIVE DUPLICATION |
-| `handleSummarize()` | 809-858 | POST /sessions/:id/summarize | 🔥🔥🔥🔥🔥 MASSIVE DUPLICATION |
-| `handleComplete()` | 864-873 | POST /sessions/:id/complete | ✅ PERFECT |
-| `handleStatus()` | 878-893 | GET /sessions/:id/status | ✅ Good |
-| `runSDKAgent()` | 898-963 | Run SDK agent loop | ✅ Excellent |
-| `createMessageGenerator()` | 969-1060 | Async generator for SDK | ✅ Excellent |
-| `handleAgentMessage()` | 1066-1201 | Parse and store AI response | ✅ EXCELLENT |
-| `main()` | 1205-1225 | Entry point + signals | ✅ Good |
-
----
-
-## Line-by-Line Analysis
-
-### worker-utils.ts
-
-#### Lines 1-5: Imports and Constants
-```typescript
-const FIXED_PORT = parseInt(process.env.CLAUDE_MEM_WORKER_PORT || "37777", 10);
-```
-
-**What**: Parse port from env var with fallback to 37777
-**Why**: Need to know which port to connect to
-**Critique**: ✅ Good - simple constant, no unnecessary abstraction
-
----
-
-#### Lines 10-19: `isWorkerHealthy(timeoutMs = 100)`
-
-```typescript
-async function isWorkerHealthy(timeoutMs: number = 100): Promise<boolean> {
-  try {
-    const response = await fetch(`http://127.0.0.1:${FIXED_PORT}/health`, {
-      signal: AbortSignal.timeout(timeoutMs)
-    });
-    return response.ok;
-  } catch {
-    return false;
-  }
-}
-```
-
-**What**: Checks if /health endpoint responds within timeout
-**Why**: Need to know if worker is running before trying to start it
-**Critique**:
-- Default 100ms is used once (line 45 initial check)
-- Explicit 1000ms passed at line 29 (during startup polling)
-- This inconsistency is actually INTENTIONAL: quick initial check vs. waiting for startup
-- ✅ **VERDICT**: Reasonable pattern
-
-**Why the two timeouts?**
-- 100ms: "Is it already running?" (fast check, don't wait)
-- 1000ms: "Is it starting up?"
(wait for initialization)
-
----
-
-#### Lines 24-36: `waitForWorkerHealth(maxWaitMs = 10000)`
-
-```typescript
-async function waitForWorkerHealth(maxWaitMs: number = 10000): Promise<boolean> {
-  const start = Date.now();
-  const checkInterval = 100; // Check every 100ms
-
-  while (Date.now() - start < maxWaitMs) {
-    if (await isWorkerHealthy(1000)) {
-      return true;
-    }
-    // Wait before next check
-    await new Promise<void>(resolve => setTimeout(resolve, checkInterval));
-  }
-  return false;
-}
-```
-
-**What**: Polls health endpoint every 100ms until healthy or timeout
-**Why**: Worker takes time to start, need to wait
-**Critique**:
-
-🔥 **MAGIC NUMBER #1**: Line 26 `checkInterval = 100` - no units! Is this milliseconds? Should be `CHECK_INTERVAL_MS = 100`
-
-🔥 **MAGIC NUMBER #2**: Line 29 `isWorkerHealthy(1000)` - why 1000ms timeout per check?
-
-🔥 **INEFFICIENCY**: Each health check has a 1000ms timeout, but we check every 100ms. If the worker is down, each check waits 1000ms to time out. We could fail faster with a 100ms timeout since we retry quickly anyway.
-
-**The Math**:
-- Check interval: 100ms
-- Health timeout: 1000ms
-- If worker is down, first check fails after 1000ms, then we wait 100ms, then try again
-- Total time to detect "worker is down" on first check: 1000ms (could be 100ms)
-
-**RECOMMENDED**: Use 100ms timeout for health checks since we retry every 100ms anyway:
-```typescript
-const HEALTH_CHECK_TIMEOUT_MS = 100;
-const HEALTH_CHECK_POLL_INTERVAL_MS = 100;
-const HEALTH_CHECK_MAX_WAIT_MS = 10000;
-
-async function waitForWorkerHealth(): Promise<boolean> {
-  const start = Date.now();
-  while (Date.now() - start < HEALTH_CHECK_MAX_WAIT_MS) {
-    if (await isWorkerHealthy(HEALTH_CHECK_TIMEOUT_MS)) return true;
-    await new Promise<void>(resolve => setTimeout(resolve, HEALTH_CHECK_POLL_INTERVAL_MS));
-  }
-  return false;
-}
-```
-
----
-
-#### Lines 43-102: `ensureWorkerRunning()` - 🔥🔥🔥🔥🔥 THE DISASTER ZONE
-
-```typescript
-export async function ensureWorkerRunning(): Promise<void> {
-  // First, check if worker is already healthy
-  if (await isWorkerHealthy()) {
-    return; // Worker is already running and responsive
-  }
-
-  const packageRoot = getPackageRoot();
-  const pm2Path = path.join(packageRoot, "node_modules", ".bin", "pm2");
-  const ecosystemPath = path.join(packageRoot, "ecosystem.config.cjs");
-
-  // Check PM2 status to see if worker process exists
-  const checkProcess = spawn(pm2Path, ["list", "--no-color"], {
-    cwd: packageRoot,
-    stdio: ["ignore", "pipe", "ignore"],
-  });
-
-  let output = "";
-  checkProcess.stdout?.on("data", (data) => {
-    output += data.toString();
-  });
-
-  // Wait for PM2 list to complete
-  await new Promise<void>((resolve, reject) => {
-    checkProcess.on("error", (error) => reject(error));
-    checkProcess.on("close", (code) => {
-      // PM2 list can fail, but we should still continue - just assume worker isn't running
-      // This handles cases where PM2 isn't installed yet
-      resolve();
-    });
-  });
-
-  // Check if 'claude-mem-worker' is in the PM2 list output and is 'online'
-  const isRunning = 
output.includes("claude-mem-worker") && output.includes("online");
-
-  if (!isRunning) {
-    // Start the worker
-    const startProcess = spawn(pm2Path, ["start", ecosystemPath], {
-      cwd: packageRoot,
-      stdio: "ignore",
-    });
-
-    // Wait for PM2 start command to complete
-    await new Promise<void>((resolve, reject) => {
-      startProcess.on("error", (error) => reject(error));
-      startProcess.on("close", (code) => {
-        if (code !== 0 && code !== null) {
-          reject(new Error(`PM2 start command failed with exit code ${code}`));
-        } else {
-          resolve();
-        }
-      });
-    });
-  }
-
-  // Wait for worker to become healthy (either just started or was starting)
-  const healthy = await waitForWorkerHealth(10000);
-  if (!healthy) {
-    throw new Error("Worker failed to become healthy after starting");
-  }
-}
-```
-
-**What**: Ensure PM2 worker is running - check health, check PM2 status, start if needed, wait for health
-**Why**: Hooks need worker running to process observations
-
-#### 🔥🔥🔥🔥🔥 CRITICAL BUG #1: Fragile String Parsing (Line 75)
-
-```typescript
-const isRunning = output.includes("claude-mem-worker") && output.includes("online");
-```
-
-**THE PROBLEM**: This checks if BOTH strings exist ANYWHERE in the output. This is WRONG.
-
-**Counter-Example**:
-```
-PM2 Process List:
-┌─────┬────────────────────┬─────────┐
-│ id  │ name               │ status  │
-├─────┼────────────────────┼─────────┤
-│ 0   │ claude-mem-worker  │ stopped │
-│ 1   │ some-other-app     │ online  │
-└─────┴────────────────────┴─────────┘
-```
-
-This would return `true` because the output contains "claude-mem-worker" AND "online", even though the worker is STOPPED!
-
-**Impact**:
-- False positive: Worker is stopped, but code thinks it's running
-- Result: Skip starting worker (line 77 `if (!isRunning)`), wait for health
-- Health check fails because worker isn't actually running
-- Entire function fails with "Worker failed to become healthy"
-- User sees cryptic error instead of "Worker is stopped, restarting..."
-
-**THE FIX**: Use PM2's JSON output
-```typescript
-const result = execSync(`"${pm2Path}" jlist`, { encoding: 'utf8' });
-const processes = JSON.parse(result);
-const worker = processes.find(p => p.name === 'claude-mem-worker');
-const isRunning = worker?.pm2_env?.status === 'online';
-```
-
-#### 🔥🔥🔥🔥🔥 CRITICAL BUG #2: Silent Error Handling (Lines 65-72)
-
-```typescript
-await new Promise<void>((resolve, reject) => {
-  checkProcess.on("error", (error) => reject(error));
-  checkProcess.on("close", (code) => {
-    // PM2 list can fail, but we should still continue - just assume worker isn't running
-    // This handles cases where PM2 isn't installed yet
-    resolve(); // ← ALWAYS RESOLVES, NEVER REJECTS
-  });
-});
-```
-
-**THE PROBLEM**:
-1. If PM2 isn't installed, `pm2 list` fails
-2. Line 70: ALWAYS resolves, ignoring the failure
-3. `output` is empty string
-4. Line 75: `isRunning = false` (correct by accident)
-5. Lines 77-94: Try to START the worker... which will ALSO fail because PM2 isn't installed
-6. Lines 85-93: THIS finally rejects with error
-
-**Why This Is Terrible**:
-- Defers error detection to the start command instead of failing fast
-- Confusing error message: "PM2 start command failed" instead of "PM2 not found - run npm install"
-- User wastes time waiting for PM2 list to fail, then waiting for PM2 start to fail
-- The comment is a LIE: "we should still continue" - no, we shouldn't! If PM2 isn't installed, FAIL IMMEDIATELY.
-
-**THE FIX**: Fail fast
-```typescript
-await new Promise<void>((resolve, reject) => {
-  checkProcess.on("error", reject);
-  checkProcess.on("close", (code) => {
-    if (code !== 0 && code !== null) {
-      reject(new Error(`PM2 not found - install dependencies first (npm install)`));
-      return;
-    }
-    resolve();
-  });
-});
-```
-
-#### 🔥🔥🔥🔥 CRITICAL BUG #3: No Handling for "Running But Unhealthy" (Lines 77-98)
-
-**THE LOGIC**:
-1. Line 45: Check if worker is healthy → NO (or we would have returned)
-2. Line 54-75: Check if PM2 says worker is running
-3. 
Line 77: `if (!isRunning)` → start the worker -4. Line 98: Wait for worker to become healthy - -**THE PROBLEM**: What if PM2 says worker IS running but our health check (line 45) failed? - -**Answer**: We do NOTHING. We skip the `if (!isRunning)` block and jump straight to line 98, waiting for it to become healthy. - -**Why This Is Wrong**: If the worker is started but unhealthy, it won't magically heal itself. It needs to be RESTARTED. - -**Scenarios**: -- Worker crashed but PM2 hasn't noticed yet → Status: "online", Health: failed → We wait forever -- Worker is in infinite loop → Status: "online", Health: timeout → We wait forever -- Worker port is wrong → Status: "online", Health: failed → We wait forever - -**THE FIX**: Restart if unhealthy -```typescript -if (!await isWorkerHealthy()) { - // Not healthy - restart it (PM2 restart is idempotent) - execSync(`"${pm2Path}" restart "${ecosystemPath}"`); - if (!await waitForWorkerHealth()) { - throw new Error("Worker failed to become healthy after restart"); - } -} -``` - -Or even simpler: Just always restart if health fails. PM2 handles "not started" vs "started" gracefully. - ---- - -#### Lines 107-109: `getWorkerPort()` - 🔥🔥🔥🔥🔥 DELETE THIS - -```typescript -/** - * Get the worker port number (fixed port) - */ -export function getWorkerPort(): number { - return FIXED_PORT; -} -``` - -**What**: Returns the FIXED_PORT constant -**Why**: ??? -**Critique**: 🔥🔥🔥🔥🔥 **TEXTBOOK YAGNI VIOLATION** - -This is the "wrapper function for a constant" anti-pattern from CLAUDE.md. - -**THE PROBLEM**: This function adds ZERO value. It's pure ceremony. - -**Callers should just**: -```typescript -import { FIXED_PORT } from './worker-utils.js'; -// Use FIXED_PORT directly -``` - -**Instead of**: -```typescript -import { getWorkerPort } from './worker-utils.js'; -const port = getWorkerPort(); // Why??? -``` - -**Why This Exists**: Training bias. Code that looks "professional" often includes ceremonial getters for constants. 
But this is WRONG. Delete it and export the constant.
-
-**THE FIX**:
-```typescript
-export const WORKER_PORT = parseInt(process.env.CLAUDE_MEM_WORKER_PORT || "37777", 10);
-```
-
-Then update all callers to use `WORKER_PORT` instead of `getWorkerPort()`.
-
----
-
-### worker-utils.ts COMPLETE REWRITE
-
-Here's what this file SHOULD be:
-
-```typescript
-import path from "path";
-import { execSync } from "child_process";
-import { getPackageRoot } from "./paths.js";
-
-// Configuration
-export const WORKER_PORT = parseInt(process.env.CLAUDE_MEM_WORKER_PORT || "37777", 10);
-
-const HEALTH_CHECK_TIMEOUT_MS = 100;
-const HEALTH_CHECK_POLL_INTERVAL_MS = 100;
-const HEALTH_CHECK_MAX_WAIT_MS = 10000;
-
-/**
- * Check if worker is responsive by trying the health endpoint
- */
-async function isWorkerHealthy(): Promise<boolean> {
-  try {
-    const response = await fetch(`http://127.0.0.1:${WORKER_PORT}/health`, {
-      signal: AbortSignal.timeout(HEALTH_CHECK_TIMEOUT_MS)
-    });
-    return response.ok;
-  } catch {
-    return false;
-  }
-}
-
-/**
- * Wait for worker to become healthy, polling every 100ms
- */
-async function waitForWorkerHealth(): Promise<boolean> {
-  const start = Date.now();
-  while (Date.now() - start < HEALTH_CHECK_MAX_WAIT_MS) {
-    if (await isWorkerHealthy()) return true;
-    await new Promise<void>(resolve => setTimeout(resolve, HEALTH_CHECK_POLL_INTERVAL_MS));
-  }
-  return false;
-}
-
-/**
- * Ensure worker service is running and healthy
- * Restarts worker if not healthy (PM2 restart is idempotent)
- */
-export async function ensureWorkerRunning(): Promise<void> {
-  if (await isWorkerHealthy()) return;
-
-  const packageRoot = getPackageRoot();
-  const pm2Path = path.join(packageRoot, "node_modules", ".bin", "pm2");
-  const ecosystemPath = path.join(packageRoot, "ecosystem.config.cjs");
-
-  // PM2 restart is idempotent - handles both "not started" and "started but broken"
-  try {
-    const result = execSync(`"${pm2Path}" restart "${ecosystemPath}"`, {
-      cwd: packageRoot,
-      encoding: 'utf8',
-      
stdio: 'pipe' - }); - - if (!await waitForWorkerHealth()) { - throw new Error(`Worker failed to become healthy. PM2 output:\n${result}`); - } - } catch (error: any) { - if (error.code === 'ENOENT' || error.message.includes('not found')) { - throw new Error('PM2 not found - run: npm install'); - } - throw error; - } -} -``` - -**Line Count**: 43 lines (vs 110 original) -**Complexity**: 1/3 of original -**Bugs Fixed**: All of them -**Ceremony Removed**: All of it - -**What Changed**: -1. Removed `getWorkerPort()` wrapper - export constant directly -2. Removed PM2 status checking - just restart if unhealthy -3. Removed string parsing - use PM2's idempotent restart -4. Removed silent error handling - fail fast on PM2 not found -5. Named all magic numbers as constants -6. Simplified to: "Unhealthy? Restart. Wait for health. Done." - ---- - -## worker-service.ts Analysis - -### Overall Structure - -**Lines 1-24**: Imports and constants ✅ -**Lines 27-65**: `findClaudePath()` ✅ Excellent -**Lines 67-96**: Type definitions ✅ -**Lines 98-1228**: WorkerService class - -### Critical Issues in worker-service.ts - -#### 🔥🔥🔥🔥🔥 ISSUE #1: Massive Code Duplication (Lines 754-844) - -**THE PROBLEM**: Session auto-creation logic is COPIED THREE TIMES: -1. `handleInit()` (lines 663-733) -2. `handleObservation()` (lines 754-785) -3. 
`handleSummarize()` (lines 813-844) - -**The Duplicated Code** (20+ lines per copy): -```typescript -let session = this.sessions.get(sessionDbId); -if (!session) { - const db = new SessionStore(); - const dbSession = db.getSessionById(sessionDbId); - db.close(); - - session = { - sessionDbId, - claudeSessionId: dbSession!.claude_session_id, - sdkSessionId: null, - project: dbSession!.project, - userPrompt: dbSession!.user_prompt, - pendingMessages: [], - abortController: new AbortController(), - generatorPromise: null, - lastPromptNumber: 0, - startTime: Date.now() - }; - this.sessions.set(sessionDbId, session); - - session.generatorPromise = this.runSDKAgent(session).catch(err => { - logger.failure('WORKER', 'SDK agent error', { sessionId: sessionDbId }, err); - const db = new SessionStore(); - db.markSessionFailed(sessionDbId); - db.close(); - this.sessions.delete(sessionDbId); - }); -} -``` - -**Impact**: 60+ lines of duplicated code across 3 functions - -**THE FIX**: Extract to helper method -```typescript -private getOrCreateSession(sessionDbId: number): ActiveSession { - let session = this.sessions.get(sessionDbId); - if (session) return session; - - const db = new SessionStore(); - const dbSession = db.getSessionById(sessionDbId); - if (!dbSession) { - db.close(); - throw new Error(`Session ${sessionDbId} not found in database`); - } - - session = { - sessionDbId, - claudeSessionId: dbSession.claude_session_id, - sdkSessionId: null, - project: dbSession.project, - userPrompt: dbSession.user_prompt, - pendingMessages: [], - abortController: new AbortController(), - generatorPromise: null, - lastPromptNumber: 0, - startTime: Date.now() - }; - - this.sessions.set(sessionDbId, session); - - // Start SDK agent in background - session.generatorPromise = this.runSDKAgent(session).catch(err => { - logger.failure('WORKER', 'SDK agent error', { sessionId: sessionDbId }, err); - const db = new SessionStore(); - db.markSessionFailed(sessionDbId); - db.close(); - 
this.sessions.delete(sessionDbId); - }); - - db.close(); - return session; -} -``` - -Then all three functions become: -```typescript -private handleObservation(req: Request, res: Response): void { - const sessionDbId = parseInt(req.params.sessionDbId, 10); - const { tool_name, tool_input, tool_output, prompt_number } = req.body; - - const session = this.getOrCreateSession(sessionDbId); - - session.pendingMessages.push({ - type: 'observation', - tool_name, - tool_input, - tool_output, - prompt_number - }); - - res.json({ status: 'queued', queueLength: session.pendingMessages.length }); -} -``` - -**Savings**: Remove 60 lines, improve maintainability 10x - ---- - -#### 🔥🔥 ISSUE #2: Magic Numbers Throughout - -**Line 316**: `setTimeout(() => { ... }, 1500);` - Why 1500ms debounce? -**Line 997**: `setTimeout(resolve, 100)` - Why 100ms polling? -**Line 343**: `const version = process.env.npm_package_version || '5.0.3';` - Hardcoded fallback -**Line 109**: `express.json({ limit: '50mb' })` - Why 50mb? 
- -**THE FIX**: Named constants -```typescript -const SPINNER_DEBOUNCE_MS = 200; // Debounce spinner to prevent flicker -const MESSAGE_POLL_INTERVAL_MS = 100; // Check for new messages every 100ms -const MAX_REQUEST_SIZE = '50mb'; // Allow large tool outputs -``` - ---- - -#### 🔥🔥 ISSUE #3: Configuration Duplication - -Default values appear in multiple places: -- Line 377-380: Default settings in GET handler -- Line 22: MODEL default -- Throughout: Port defaults, observation count defaults - -**THE FIX**: Centralize -```typescript -export const DEFAULT_CONFIG = { - MODEL: 'claude-haiku-4-5', - CONTEXT_OBSERVATIONS: 50, - WORKER_PORT: 37777, - VALID_MODELS: ['claude-haiku-4-5', 'claude-sonnet-4-5', 'claude-opus-4'], - MAX_CONTEXT_OBSERVATIONS: 200, - MIN_PORT: 1024, - MAX_PORT: 65535 -} as const; -``` - ---- - -#### 🔥 ISSUE #4: Hardcoded Model Validation (Line 407) - -```typescript -const validModels = ['claude-haiku-4-5', 'claude-sonnet-4-5', 'claude-opus-4']; -``` - -**THE PROBLEM**: This list will get stale when new models are released. - -**YAGNI QUESTION**: Do we even need to validate? The SDK will error if model doesn't exist. - -**ANSWER**: Better error messages for users. But this should be a WARNING, not a blocker. - -**THE FIX**: Remove validation or make it advisory -```typescript -// Let SDK handle validation - it knows the current model list -// We don't need to duplicate that logic here -if (CLAUDE_MEM_MODEL) { - settings.env.CLAUDE_MEM_MODEL = CLAUDE_MEM_MODEL; - logger.info('WORKER', `Model changed to ${CLAUDE_MEM_MODEL}`, {}); -} -``` - ---- - -### What worker-service.ts Does RIGHT ✅ - -#### 1. Excellent Error Handling Priority -```typescript -// Store to SQLite FIRST (source of truth) -const { id, createdAtEpoch } = db.storeObservation(...); - -// Broadcast to SSE (real-time UI updates) -this.broadcastSSE({ type: 'new_observation', ... }); - -// Sync to Chroma ASYNC (fire-and-forget, non-critical) -this.chromaSync.syncObservation(...) 
- .catch((error: Error) => { - logger.error('...continuing', ...); - // Don't crash - SQLite has the data - }); -``` - -**Priority**: SQLite > SSE > Chroma -**Philosophy**: Write to source of truth first, update UI second, sync to vector DB last. Chroma failures don't crash the worker. - -#### 2. Clean Pagination APIs - -All data endpoints follow consistent pattern: -- Parse `offset`, `limit`, `project` from query params -- Cap limit at 100 to prevent abuse -- Return `{ items, hasMore, total, offset, limit }` -- Use parameterized queries (SQL injection safe) - -Example: `handleGetObservations()` (lines 467-515) is textbook good API design. - -#### 3. Proper Async Generator Pattern - -`createMessageGenerator()` (lines 969-1060) is an excellent implementation: -- Yields init prompt immediately -- Polls message queue with proper abort signal handling -- No busy-waiting (100ms sleep between polls) -- Clean message type discrimination -- Proper error propagation - -#### 4. Defensive SSE Cleanup - -`broadcastSSE()` (lines 250-275): -- Early return if no clients (optimization) -- Two-phase cleanup (collect failures, then remove) -- Doesn't modify Set during iteration -- Handles disconnected clients gracefully - -This is GOOD defensive programming, not YAGNI violation. 
- ---- - -## Severity-Ranked YAGNI Violations - -### 🔥🔥🔥🔥🔥 SEVERITY 5: CRITICAL - FIX IMMEDIATELY - -| Issue | File | Lines | Problem | Impact | -|-------|------|-------|---------|--------| -| Fragile string parsing | worker-utils | 75 | `output.includes("claude-mem-worker") && output.includes("online")` | False positives cause failures | -| Session auto-creation duplication | worker-service | 754-844 | 60+ lines copied 3 times | Maintenance nightmare | -| Silent PM2 error handling | worker-utils | 70 | Always resolves, defers errors | Confusing error messages | - -### 🔥🔥🔥🔥 SEVERITY 4: MAJOR - FIX SOON - -| Issue | File | Lines | Problem | Impact | -|-------|------|-------|---------|--------| -| No "running but unhealthy" handling | worker-utils | 77-98 | Skip restart if PM2 says running | Worker never recovers | -| Useless getWorkerPort() wrapper | worker-utils | 107-109 | Ceremony for a constant | Code bloat | - -### 🔥🔥🔥 SEVERITY 3: MODERATE - FIX WHEN CONVENIENT - -| Issue | File | Lines | Problem | Impact | -|-------|------|-------|---------|--------| -| 1500ms debounce too long | worker-service | 316 | Should be 100-200ms | Spinner lags | -| Hardcoded model validation | worker-service | 407 | List will get stale | Blocks valid models | -| Hardcoded fallback version | worker-service | 343 | '5.0.3' will get stale | Wrong stats | - -### 🔥🔥 SEVERITY 2: MINOR - CLEANUP - -| Issue | File | Lines | Problem | Impact | -|-------|------|-------|---------|--------| -| Magic numbers everywhere | Both | Multiple | 100, 1000, 1500, etc | Hard to maintain | -| Duplicated default configs | worker-service | Multiple | Defaults in many places | Inconsistency risk | -| Unnecessary this.port | worker-service | 100 | Should use FIXED_PORT | Confusion | - ---- - -## Recommended Action Plan - -### Phase 1: Critical Fixes (Do Today) - -1. 
**Fix worker-utils.ts completely** - Use the rewrite provided above (43 lines) - - Remove getWorkerPort() - - Fix PM2 string parsing → use `pm2 restart` (idempotent) - - Remove silent error handling - - Named constants for all timeouts - -2. **Extract getOrCreateSession()** in worker-service.ts - - Remove 60 lines of duplication - - Update handleInit, handleObservation, handleSummarize - -### Phase 2: Cleanup (Do This Week) - -3. **Centralize configuration** - - Create DEFAULT_CONFIG constant - - Remove duplicated defaults - - Update all references - -4. **Fix magic numbers** - - SPINNER_DEBOUNCE_MS = 200 - - MESSAGE_POLL_INTERVAL_MS = 100 - - HEALTH_CHECK_TIMEOUT_MS = 100 - - etc. - -5. **Remove hardcoded validations** - - Model validation (let SDK handle it) - - Fallback version (read from package.json) - -### Phase 3: Polish (Do Next Week) - -6. **Fix minor issues** - - Remove `this.port` instance variable - - Update debounce to 200ms - - Add constants for all magic numbers - ---- - -## The YAGNI Philosophy Applied - -### What YAGNI Means Here - -**You Aren't Gonna Need It**: Don't build infrastructure for problems you don't have. - -### Examples from This Code - -#### YAGNI Violation ❌ -```typescript -export function getWorkerPort(): number { - return FIXED_PORT; // Wrapper for a constant -} -``` -**Why**: Adds zero value. Pure ceremony. Just export the constant. - -#### YAGNI Compliance ✅ -```typescript -export const WORKER_PORT = parseInt(...); -``` -**Why**: Solves the actual need (get port) without ceremony. - ---- - -#### YAGNI Violation ❌ -```typescript -// Check PM2 status with string parsing -const checkProcess = spawn(pm2Path, ["list", "--no-color"]); -let output = ""; -checkProcess.stdout?.on("data", (data) => { output += data.toString(); }); -// ... 30 lines of promise wrappers and parsing ... 
-const isRunning = output.includes("claude-mem-worker") && output.includes("online"); - -if (!isRunning) { - // Start worker -} -// But what if it's running AND unhealthy? Do nothing! -``` -**Why**: Solving a problem that doesn't exist. PM2 restart is idempotent - it handles both "not started" and "started but broken". We don't need to distinguish. - -#### YAGNI Compliance ✅ -```typescript -if (!await isWorkerHealthy()) { - execSync(`pm2 restart ecosystem.config.cjs`); - await waitForWorkerHealth(); -} -``` -**Why**: Solves the actual problem (ensure worker is healthy) in the simplest way. - ---- - -### The Pattern - -**YAGNI Violations Follow This Pattern**: -1. Imagine a scenario ("what if PM2 isn't installed?") -2. Write defensive code for the scenario (silent error handling) -3. Defer the error to a later point -4. Make the actual error message worse - -**YAGNI Compliance Follows This Pattern**: -1. Write the obvious solution (check health, restart if unhealthy) -2. Let errors propagate naturally -3. Add error handling only where actually needed -4. 
Keep error messages clear and direct - ---- - -## Conclusion - -### Overall Assessment - -**worker-utils.ts**: 🔥🔥🔥🔥 2/5 - Needs complete rewrite -**worker-service.ts**: ✅✅✅✅🔥 4/5 - Mostly excellent, fix duplication - -### The Good - -- worker-service.ts has excellent architecture (SQLite > SSE > Chroma priority) -- Clean pagination APIs with proper parameterization -- Good async generator pattern for SDK streaming -- Proper SSE client management with defensive cleanup -- Non-blocking Chroma sync with graceful failures - -### The Bad - -- worker-utils.ts has 3 critical bugs (string parsing, silent errors, missing restart) -- 60+ lines of duplicated session auto-creation code -- Magic numbers everywhere without named constants -- Hardcoded defaults in multiple locations - -### The Ugly - -- `getWorkerPort()` is pure ceremony - delete it -- 1500ms debounce is 10x too long -- PM2 string parsing is fragile and will break -- Silent error handling makes debugging impossible - -### Time to Fix - -- Critical fixes (worker-utils rewrite + extract getOrCreateSession): **2 hours** -- Cleanup (centralize config, fix magic numbers): **2 hours** -- Polish (minor issues): **1 hour** - -**Total**: 5 hours to bring codebase from 80% to 95% quality. - -### Final Verdict - -This code is **80% excellent, 20% disaster**. The disaster is concentrated in worker-utils.ts (which is called on EVERY session start) and the session auto-creation duplication (which makes maintenance painful). Fix these two issues and you have a rock-solid codebase. - -The worker-service.ts architecture is actually brilliant - the prioritization of SQLite > SSE > Chroma is exactly right, and the async generator pattern for SDK streaming is textbook perfect. Don't let the duplication overshadow the good design. - -**Recommendation**: Fix worker-utils.ts TODAY (it has production bugs), extract getOrCreateSession() THIS WEEK (it's painful to maintain), and clean up the rest NEXT WEEK. 
diff --git a/docs/worker-service-overhead.md b/docs/worker-service-overhead.md new file mode 100644 index 00000000..84bba30e --- /dev/null +++ b/docs/worker-service-overhead.md @@ -0,0 +1,959 @@ +# Worker Service Overhead Analysis + +**Date**: 2025-11-06 +**File**: `src/services/worker-service.ts` +**Total Lines**: 1173 +**Overall Assessment**: This file has accumulated unnecessary complexity, artificial delays, and defensive programming patterns that actively harm performance. Many patterns were likely added "just in case" without real-world justification. + +--- + +## Executive Summary + +**High Severity Issues (Score 8-10)**: +- **Line 942**: Polling loop with 100ms delay instead of event-driven architecture (Score: 10/10) +- **Lines 338-365**: Spinner debounce with 1.5s artificial delay (Score: 9/10) +- **Lines 204-234**: Database reopening on every getOrCreateSession call (Score: 8/10) + +**Medium Severity Issues (Score 5-7)**: +- **Lines 33-70**: Unnecessary Claude path caching for rare operation (Score: 6/10) +- **Lines 694-711**: Redundant database reopening in handleInit (Score: 7/10) +- **Lines 728-741**: Fire-and-forget Chroma sync with verbose error handling (Score: 5/10) + +**Low Severity Issues (Score 3-4)**: +- **Line 28**: Magic number MESSAGE_POLL_INTERVAL_MS without justification (Score: 4/10) +- **Lines 303-321**: Over-engineered SSE client cleanup (Score: 4/10) + +--- + +## Line-by-Line Analysis + +### Lines 1-30: Setup and Constants + +**Lines 22-24**: Version reading from package.json +```typescript +const packageJson = JSON.parse(readFileSync(join(__dirname, '..', '..', 'package.json'), 'utf-8')); +const VERSION = packageJson.version; +``` +**Score**: 2/10 +**Why**: This is fine. Reads once at startup, uses the value for the /api/stats endpoint. + +**Line 26**: Model configuration +```typescript +const MODEL = process.env.CLAUDE_MEM_MODEL || 'claude-sonnet-4-5'; +``` +**Score**: 1/10 +**Why**: Clean, simple, correct. 
+ +**Line 28**: Magic number +```typescript +const MESSAGE_POLL_INTERVAL_MS = 100; +``` +**Score**: 4/10 +**Why**: This is a magic number without justification. Why 100ms? Why not 50ms or 200ms? More importantly, **why are we polling at all instead of using event-driven patterns?** The name is descriptive, but the existence of this constant indicates a fundamental architectural problem (see line 942). + +**Pattern**: This constant exists to support a polling loop that shouldn't exist. + +--- + +### Lines 33-70: Claude Path Caching + +```typescript +let cachedClaudePath: string | null = null; + +function findClaudePath(): string { + if (cachedClaudePath) { + return cachedClaudePath; + } + // ... 30 lines of logic to find and cache path ... +} +``` + +**Score**: 6/10 +**Why Stupid**: +1. **YAGNI Violation**: This function is called **exactly once** per worker startup (line 846 in runSDKAgent) +2. **Premature Optimization**: Caching saves ~5ms on an operation that happens once per worker lifetime +3. **Added Complexity**: 37 lines of code including module-level state for negligible benefit +4. **False Economy**: The worker runs for hours/days. Saving 5ms on startup is meaningless. + +**What Should Happen**: +```typescript +function findClaudePath(): string { + if (process.env.CLAUDE_CODE_PATH) return process.env.CLAUDE_CODE_PATH; + + const command = process.platform === 'win32' ? 'where claude' : 'which claude'; + const result = execSync(command, { encoding: 'utf8' }).trim().split('\n')[0].trim(); + + if (!result) throw new Error('Claude executable not found in PATH'); + return result; +} +``` +**Savings**: Remove 33 lines of unnecessary code and module-level state. 
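The event-driven replacement for the polling loop criticized above can be sketched with an awaitable queue: `push()` hands a message directly to a waiting consumer instead of letting it sleep 100ms and re-check. This is an illustrative sketch only — `AsyncQueue` and the string messages are assumptions, not the worker's actual types:

```typescript
// Awaitable queue: consumers await next() instead of polling on a timer.
class AsyncQueue<T> {
  private items: T[] = [];
  private waiters: Array<(value: T) => void> = [];

  push(item: T): void {
    const waiter = this.waiters.shift();
    if (waiter) waiter(item);   // wake a pending consumer immediately
    else this.items.push(item); // or buffer until someone asks
  }

  next(): Promise<T> {
    if (this.items.length > 0) {
      return Promise.resolve(this.items.shift() as T);
    }
    return new Promise<T>(resolve => this.waiters.push(resolve));
  }
}

// An async generator built on the queue yields each message as soon as it
// arrives -- zero added latency, zero setTimeout polling.
async function* messages<T>(queue: AsyncQueue<T>): AsyncGenerator<T> {
  while (true) yield await queue.next();
}

const queue = new AsyncQueue<string>();
setTimeout(() => queue.push('observation'), 10);

(async () => {
  for await (const msg of messages(queue)) {
    console.log('received:', msg);
    break; // stop after the first message for the demo
  }
})();
```

With this shape, `MESSAGE_POLL_INTERVAL_MS` disappears entirely: delivery latency is bounded by the event loop, not by a tuning constant.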
+
+---
+
+### Lines 103-110: WorkerService State
+
+```typescript
+class WorkerService {
+  private app: express.Application;
+  private sessions: Map<number, ActiveSession> = new Map();
+  private chromaSync!: ChromaSync;
+  private sseClients: Set<Response> = new Set();
+  private isProcessing: boolean = false;
+  private spinnerStopTimer: NodeJS.Timeout | null = null;
+```
+
+**Score**: 7/10 (for spinnerStopTimer)
+**Why**:
+- `app`, `sessions`, `chromaSync`, `sseClients`: **Good** - necessary state
+- `isProcessing`: **Questionable** (Score 5/10) - Do we really need to track this globally? Can't we derive it from `sessions.size > 0` or `sessions.values().some(s => s.pendingMessages.length > 0)`?
+- `spinnerStopTimer`: **Bad** (Score 7/10) - Exists solely to support artificial debouncing (see lines 338-365)
+
+**Pattern**: State that exists to support other unnecessary complexity.
+
+---
+
+### Lines 145-178: Service Startup
+
+**Lines 145-153**: HTTP server startup
+```typescript
+async start(): Promise<void> {
+  const port = getWorkerPort();
+  await new Promise<void>((resolve, reject) => {
+    this.app.listen(port, () => resolve())
+      .on('error', reject);
+  });
+  logger.info('SYSTEM', 'Worker started', { port, pid: process.pid });
+```
+**Score**: 1/10
+**Why**: This is good. Clean promise wrapper, fail-fast on errors, clear logging.
+
+**Lines 155-167**: ChromaSync initialization and orphan cleanup
+```typescript
+this.chromaSync = new ChromaSync('claude-mem');
+logger.info('SYSTEM', 'ChromaSync initialized');
+
+const db = new SessionStore();
+const cleanedCount = db.cleanupOrphanedSessions();
+db.close();
+```
+**Score**: 2/10
+**Why**: This is fine. Necessary initialization and cleanup. Database is opened, used, and closed immediately.
+ +**Lines 168-177**: Chroma backfill +```typescript +logger.info('SYSTEM', 'Starting Chroma backfill in background...'); +this.chromaSync.ensureBackfilled() + .then(() => { + logger.info('SYSTEM', 'Chroma backfill complete'); + }) + .catch((error: Error) => { + logger.error('SYSTEM', 'Chroma backfill failed - continuing anyway', {}, error); + // Don't exit - allow worker to continue serving requests + }); +``` +**Score**: 3/10 +**Why**: This is mostly fine. Fire-and-forget background operation that doesn't block startup. The verbose error handling is slightly excessive (could be a single logger call), but acceptable for a background operation. + +--- + +### Lines 200-236: getOrCreateSession - THE KILLER + +```typescript +private getOrCreateSession(sessionDbId: number): ActiveSession { + let session = this.sessions.get(sessionDbId); + if (session) return session; + + const db = new SessionStore(); + const dbSession = db.getSessionById(sessionDbId); + if (!dbSession) { + db.close(); + throw new Error(`Session ${sessionDbId} not found in database`); + } + + session = { + sessionDbId, + claudeSessionId: dbSession.claude_session_id, + sdkSessionId: null, + project: dbSession.project, + userPrompt: dbSession.user_prompt, + pendingMessages: [], + abortController: new AbortController(), + generatorPromise: null, + lastPromptNumber: 0, + startTime: Date.now() + }; + + this.sessions.set(sessionDbId, session); + + session.generatorPromise = this.runSDKAgent(session).catch(err => { + logger.failure('WORKER', 'SDK agent error', { sessionId: sessionDbId }, err); + const db = new SessionStore(); + db.markSessionFailed(sessionDbId); + db.close(); + this.sessions.delete(sessionDbId); + }); + + db.close(); + return session; +} +``` + +**Score**: 8/10 +**Why This Is Stupid**: + +1. **Database Reopening**: Opens database at line 204, closes at line 234. 
This happens on:
+   - First call to `/sessions/:id/init` (line 691)
+   - First call to `/sessions/:id/observations` (line 762)
+   - First call to `/sessions/:id/summarize` (line 789)
+
+   For a typical session: init (DB open/close) → observation (DB open/close) → observation (DB open/close) → summarize (DB open/close). **That's 4 database open/close cycles when ONE would suffice.**
+
+2. **Redundant Database Access**: The database is opened TWICE in the same call stack inside `handleInit`. In call order:
+   - Line 691: `this.getOrCreateSession()` is called
+   - Line 204: **Opens the database** inside getOrCreateSession
+   - Line 234: Closes it
+   - Line 695: `const db = new SessionStore()` in handleInit **opens the database AGAIN** to call `setWorkerPort()`
+   - Line 696: `db.setWorkerPort()`
+   - Lines 697-711: More queries on the same database
+   - Line 711: `db.close()`
+
+   **This is fucking insane.** We close the database, then immediately reopen it in the same call stack.
+
+3. **Error Handler Opens Database**: Line 228 opens a NEW database connection in the error handler. If runSDKAgent fails, we open the database AGAIN just to mark the session failed, then close it. This is defensive programming for ghosts - if the worker is crashing, do we really care about marking it failed?
+
+**What Should Happen**:
+- Pass the already-open database connection to getOrCreateSession
+- Or at minimum, reuse the connection from the calling context
+- The error handler should either crash hard or mark the session failed WITHOUT reopening the database
+
+**Estimated Performance Impact**: Database open/close is expensive (~1-5ms each). For a session with 10 observations, this pattern adds **20-100ms of pure overhead**.
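One shape the "pass the connection" fix could take - hypothetical signatures sketched against a stub store, not the current API - is to accept the caller's already-open store instead of constructing a new one:

```typescript
// Hypothetical minimal slice of the SessionStore surface used here.
interface Store {
  getSessionById(id: number): { project: string; user_prompt: string } | undefined;
}

interface CachedSession {
  project: string;
  userPrompt: string;
}

// The caller (handleInit etc.) opens the store once and threads it through,
// so a cache hit touches the database zero times.
function getOrCreateSession(
  sessions: Map<number, CachedSession>,
  db: Store,
  sessionDbId: number
): CachedSession {
  const existing = sessions.get(sessionDbId);
  if (existing) return existing;

  const row = db.getSessionById(sessionDbId);
  if (!row) throw new Error(`Session ${sessionDbId} not found in database`);

  const session: CachedSession = { project: row.project, userPrompt: row.user_prompt };
  sessions.set(sessionDbId, session);
  return session;
}
```

A stub store makes the improvement measurable: repeated calls for the same session hit the database exactly once.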
+ +--- + +### Lines 263-292: SSE Stream Setup + +```typescript +private handleSSEStream(req: Request, res: Response): void { + // Set SSE headers + res.setHeader('Content-Type', 'text/event-stream'); + res.setHeader('Cache-Control', 'no-cache'); + res.setHeader('Connection', 'keep-alive'); + res.setHeader('Access-Control-Allow-Origin', '*'); + + // Add client to set + this.sseClients.add(res); + logger.info('WORKER', `SSE client connected`, { totalClients: this.sseClients.size }); + + // Send only projects list - all data will be loaded via pagination + const db = new SessionStore(); + const allProjects = db.getAllProjects(); + db.close(); + + const initialData = { + type: 'initial_load', + projects: allProjects, + timestamp: Date.now() + }; + + res.write(`data: ${JSON.stringify(initialData)}\n\n`); + + // Handle client disconnect + req.on('close', () => { + this.sseClients.delete(res); + logger.info('WORKER', `SSE client disconnected`, { remainingClients: this.sseClients.size }); + }); +} +``` + +**Score**: 2/10 +**Why**: This is mostly good. Clean SSE setup with proper headers and client tracking. Database is opened, used, and closed. + +--- + +### Lines 297-322: SSE Broadcast and Cleanup + +```typescript +private broadcastSSE(event: any): void { + if (this.sseClients.size === 0) { + return; // No clients connected, skip broadcast + } + + const data = `data: ${JSON.stringify(event)}\n\n`; + const clientsToRemove: Response[] = []; + + for (const client of this.sseClients) { + try { + client.write(data); + } catch (error) { + // Client disconnected, mark for removal + clientsToRemove.push(client); + } + } + + // Clean up disconnected clients + for (const client of clientsToRemove) { + this.sseClients.delete(client); + } + + if (clientsToRemove.length > 0) { + logger.info('WORKER', `SSE cleaned up disconnected clients`, { count: clientsToRemove.length }); + } +} +``` + +**Score**: 4/10 +**Why This Is Slightly Stupid**: + +1. 
**Two-Pass Cleanup**: Creates a temporary array of failed clients, then iterates again to remove them. Why not just remove them in the first loop? +2. **Unnecessary Logging**: Do we really need to log every time a client disconnects? The `handleSSEStream` already logs disconnects (line 290). This is duplicate logging. + +**What Should Happen**: +```typescript +private broadcastSSE(event: any): void { + if (this.sseClients.size === 0) return; + + const data = `data: ${JSON.stringify(event)}\n\n`; + for (const client of this.sseClients) { + try { + client.write(data); + } catch { + this.sseClients.delete(client); + } + } +} +``` + +**Savings**: Remove 10 lines, remove duplicate logging, eliminate temporary array. + +--- + +### Lines 338-365: Spinner Debounce - ARTIFICIAL DELAY + +```typescript +private checkAndStopSpinner(): void { + // Clear any existing timer + if (this.spinnerStopTimer) { + clearTimeout(this.spinnerStopTimer); + this.spinnerStopTimer = null; + } + + // Check if any session has pending messages + const hasPendingMessages = Array.from(this.sessions.values()).some( + session => session.pendingMessages.length > 0 + ); + + if (!hasPendingMessages) { + // Debounce: wait 1.5s and check again + this.spinnerStopTimer = setTimeout(() => { + const stillEmpty = Array.from(this.sessions.values()).every( + session => session.pendingMessages.length === 0 + ); + + if (stillEmpty) { + logger.debug('WORKER', 'All queues empty - stopping spinner'); + this.broadcastProcessingStatus(false); + } + + this.spinnerStopTimer = null; + }, 1500); + } +} +``` + +**Score**: 9/10 +**Why This Is ABSOLUTELY FUCKING STUPID**: + +1. **Artificial Delay**: **1.5 SECONDS** (1500ms) of artificial delay before stopping the spinner. This is pure overhead added for no reason. + +2. **Why Was This Added?**: Probably someone thought "the UI flickers when the spinner stops/starts rapidly." **SO FUCKING WHAT?** That's a UI rendering problem, not a worker service problem. 
Fix it in the UI with CSS transitions or debouncing on the CLIENT side. + +3. **Double-Check Pattern**: Checks if queues are empty, waits 1.5s, then checks AGAIN. This is defensive programming for ghosts. If the queue is empty, it's empty. We're not protecting against race conditions here - we're just wasting time. + +4. **Polling Instead of Events**: This function is called from `handleAgentMessage` (line 1145) after processing every single response. Instead of reacting to the actual completion of work, we're polling state and debouncing. + +5. **State Management Overhead**: Requires `spinnerStopTimer` field (line 109), timer cleanup logic, null checks, etc. + +**Real-World Impact**: Every time the worker finishes processing observations, the UI spinner continues to show "processing" for **1.5 seconds** even though nothing is happening. This makes the entire system feel slower. + +**What Should Happen**: +```typescript +private checkAndStopSpinner(): void { + const hasPendingMessages = Array.from(this.sessions.values()).some( + session => session.pendingMessages.length > 0 + ); + + if (!hasPendingMessages) { + this.broadcastProcessingStatus(false); + } +} +``` + +**Savings**: Remove 15 lines of debouncing logic, remove timer state, eliminate 1.5s artificial delay. + +**Alternative**: If UI flickering is actually a problem (prove it first), handle it client-side with CSS transitions or client-side debouncing. 
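If flicker ever proves real, the debounce belongs in the viewer, not the worker. A sketch of that client-side alternative (the delay value and `render` callback are illustrative, not from the codebase):

```typescript
// Smooth only the spinner's "off" transitions; turning it on stays instant,
// and the worker never has to delay anything.
function makeSpinnerSetter(delayMs: number, render: (on: boolean) => void) {
  let hideTimer: ReturnType<typeof setTimeout> | null = null;
  return (processing: boolean): void => {
    if (hideTimer) {
      clearTimeout(hideTimer);
      hideTimer = null;
    }
    if (processing) {
      render(true); // show immediately
    } else {
      hideTimer = setTimeout(() => render(false), delayMs); // hide after a quiet period
    }
  };
}
```

The worker then broadcasts status changes the moment they happen, and any smoothing policy lives entirely in the client that wants it.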
+ +--- + +### Lines 370-411: Stats Endpoint + +```typescript +private handleStats(_req: Request, res: Response): void { + try { + const db = new SessionStore(); + + // Get database stats + const obsCount = db.db.prepare('SELECT COUNT(*) as count FROM observations').get() as { count: number }; + const sessionCount = db.db.prepare('SELECT COUNT(*) as count FROM sdk_sessions').get() as { count: number }; + const summaryCount = db.db.prepare('SELECT COUNT(*) as count FROM session_summaries').get() as { count: number }; + + // Get database file size + const dbPath = join(homedir(), '.claude-mem', 'claude-mem.db'); + let dbSize = 0; + if (existsSync(dbPath)) { + dbSize = statSync(dbPath).size; + } + + db.close(); + + // Get worker stats + const uptime = process.uptime(); + + res.json({ + worker: { + version: VERSION, + uptime: Math.floor(uptime), + activeSessions: this.sessions.size, + sseClients: this.sseClients.size, + port: getWorkerPort() + }, + database: { + path: dbPath, + size: dbSize, + observations: obsCount.count, + sessions: sessionCount.count, + summaries: summaryCount.count + } + }); + } catch (error: any) { + logger.error('WORKER', 'Failed to get stats', {}, error); + res.status(500).json({ error: 'Failed to get stats' }); + } +} +``` + +**Score**: 3/10 +**Why Slightly Stupid**: + +1. **Redundant existsSync Check**: The database path is guaranteed to exist if SessionStore initialized successfully. If it doesn't exist, SessionStore would have crashed on startup. This is defensive programming for ghosts. + +2. **Three Separate Queries**: Could be combined into a single query with UNION or multiple SELECT columns, but this is minor. + +**What Should Happen**: +```typescript +const dbSize = statSync(dbPath).size; // Just crash if it doesn't exist +``` + +Otherwise, this is mostly fine. Stats endpoints are low-frequency and non-critical. 
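On the "three separate queries" point: scalar subqueries would collapse the counts into a single round trip. Sketched here against a stub `get` function, since in the actual code the string would presumably go through `db.db.prepare(...).get()`:

```typescript
// One statement, three scalar subqueries -- a single round trip to SQLite.
const statsQuery = `
  SELECT
    (SELECT COUNT(*) FROM observations)      AS observations,
    (SELECT COUNT(*) FROM sdk_sessions)      AS sessions,
    (SELECT COUNT(*) FROM session_summaries) AS summaries
`;

interface StatsRow {
  observations: number;
  sessions: number;
  summaries: number;
}

// `get` stands in for a prepared statement's .get(); it returns one row.
function fetchStats(get: (sql: string) => StatsRow): StatsRow {
  return get(statsQuery);
}
```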
+ +--- + +### Lines 507-555: GET /api/observations + +```typescript +private handleGetObservations(req: Request, res: Response): void { + try { + const offset = parseInt(req.query.offset as string || '0', 10); + const limit = Math.min(parseInt(req.query.limit as string || '50', 10), 100); // Cap at 100 + const project = req.query.project as string | undefined; + + const db = new SessionStore(); + + // Build query with optional project filter + let query = ` + SELECT id, type, title, subtitle, text, project, prompt_number, created_at, created_at_epoch + FROM observations + `; + let countQuery = 'SELECT COUNT(*) as total FROM observations'; + const params: any[] = []; + const countParams: any[] = []; + + if (project) { + query += ' WHERE project = ?'; + countQuery += ' WHERE project = ?'; + params.push(project); + countParams.push(project); + } + + query += ' ORDER BY created_at_epoch DESC LIMIT ? OFFSET ?'; + params.push(limit, offset); + + const stmt = db.db.prepare(query); + const observations = stmt.all(...params); + + // Check if there are more results + const countStmt = db.db.prepare(countQuery); + const { total } = countStmt.get(...countParams) as { total: number }; + const hasMore = (offset + limit) < total; + + db.close(); + + res.json({ + observations, + hasMore, + total, + offset, + limit + }); + } catch (error: any) { + logger.error('WORKER', 'Failed to get observations', {}, error); + res.status(500).json({ error: 'Failed to get observations' }); + } +} +``` + +**Score**: 5/10 +**Why This Is Mildly Stupid**: + +1. **Duplicate Parameter Arrays**: `params` and `countParams` are maintained separately even though they contain the same values (just the project filter). This is error-prone and verbose. + +2. **Two Queries Instead of One**: We run a COUNT query and a SELECT query. For small datasets, this is fine, but for large datasets, the COUNT query can be expensive. 
The `hasMore` flag could be computed by fetching `limit + 1` rows and checking if we got more than `limit`.
+
+**What Should Happen**:
+```typescript
+// Ask the database for one extra row to determine if there are more results
+query += ' ORDER BY created_at_epoch DESC LIMIT ? OFFSET ?';
+params.push(limit + 1, offset);
+
+const results = db.db.prepare(query).all(...params);
+const observations = results.slice(0, limit);
+const hasMore = results.length > limit;
+
+// Only run COUNT if the UI actually needs it (it probably doesn't)
+```
+
+**Pattern**: This same pattern is repeated in `handleGetSummaries` (line 557) and `handleGetPrompts` (line 618). Copy-paste code smell.
+
+**Estimated Savings**: Remove COUNT queries (which can be expensive on large tables), simplify parameter handling.
+
+---
+
+### Lines 685-752: POST /sessions/:sessionDbId/init - DATABASE REOPENING HELL
+
+```typescript
+private async handleInit(req: Request, res: Response): Promise<void> {
+  const sessionDbId = parseInt(req.params.sessionDbId, 10);
+  const { project } = req.body;
+
+  logger.info('WORKER', 'Session init', { sessionDbId, project });
+
+  const session = this.getOrCreateSession(sessionDbId);  // <-- Opens DB at line 204
+  const claudeSessionId = session.claudeSessionId;
+
+  // Update port in database
+  const db = new SessionStore();  // <-- Opens DB AGAIN
+  db.setWorkerPort(sessionDbId, getWorkerPort());
+
+  // Get the latest user_prompt for this session to sync to Chroma
+  const latestPrompt = db.db.prepare(`
+    SELECT
+      up.*,
+      s.sdk_session_id,
+      s.project
+    FROM user_prompts up
+    JOIN sdk_sessions s ON up.claude_session_id = s.claude_session_id
+    WHERE up.claude_session_id = ?
+    ORDER BY up.created_at_epoch DESC
+    LIMIT 1
+  `).get(claudeSessionId) as any;
+
+  db.close();  // <-- Closes DB
+
+  // ... SSE broadcast ...
+  // ... Chroma sync ...
+ + logger.success('WORKER', 'Session initialized', { sessionId: sessionDbId, port: getWorkerPort() }); + res.json({ + status: 'initialized', + sessionDbId, + port: getWorkerPort() + }); +} +``` + +**Score**: 7/10 +**Why This Is Stupid**: + +1. **Two Database Opens in Same Function**: + - Line 691: `getOrCreateSession()` opens DB internally (line 204) + - Line 695: Opens DB AGAIN for `setWorkerPort()` + - Line 711: Closes DB + +2. **Redundant Data Fetching**: `getOrCreateSession()` already fetches session data from the database (line 205). Then we query AGAIN for the user prompt (line 698). + +3. **Tight Coupling**: `getOrCreateSession()` hides database access, making it unclear that we're opening the database twice. + +**What Should Happen**: +- Open database ONCE at the start of handleInit +- Pass the open database to getOrCreateSession +- Fetch all needed data in a single transaction +- Close database at the end + +**Estimated Savings**: Eliminate 1 database open/close cycle (1-5ms). + +--- + +### Lines 728-741: Chroma Sync with Verbose Error Handling + +```typescript +// Sync user prompt to Chroma (fire-and-forget, but crash on failure) +if (latestPrompt) { + this.chromaSync.syncUserPrompt( + latestPrompt.id, + latestPrompt.sdk_session_id, + latestPrompt.project, + latestPrompt.prompt_text, + latestPrompt.prompt_number, + latestPrompt.created_at_epoch + ).catch(err => { + logger.failure('WORKER', 'Failed to sync user_prompt to Chroma - continuing', { promptId: latestPrompt.id }, err); + // Don't crash - SQLite has the data + }); +} +``` + +**Score**: 5/10 +**Why This Is Mildly Stupid**: + +1. **Inconsistent Error Handling**: The comment says "crash on failure" but then we catch the error and continue. Which is it? + +2. **Redundant Comment**: The code says `.catch(err => { /* continue */ })` and the comment says "Don't crash - SQLite has the data". The code is self-documenting. + +3. 
**Fire-and-Forget**: If we're going to fire-and-forget, why bother with verbose error handling? Either care about failures (and retry/alert) or don't (and just log). + +**What Should Happen**: +```typescript +// Fire-and-forget Chroma sync (SQLite is source of truth) +if (latestPrompt) { + this.chromaSync.syncUserPrompt(/* ... */).catch(() => {}); // Swallow errors +} +``` + +**Pattern**: This same verbose error handling appears in lines 1057-1076 and 1114-1133. + +--- + +### Lines 758-779: POST /sessions/:sessionDbId/observations + +```typescript +private handleObservation(req: Request, res: Response): void { + const sessionDbId = parseInt(req.params.sessionDbId, 10); + const { tool_name, tool_input, tool_output, prompt_number } = req.body; + + const session = this.getOrCreateSession(sessionDbId); // <-- Opens DB + const toolStr = logger.formatTool(tool_name, tool_input); + + logger.dataIn('WORKER', `Observation queued: ${toolStr}`, { + sessionId: sessionDbId, + queue: session.pendingMessages.length + 1 + }); + + session.pendingMessages.push({ + type: 'observation', + tool_name, + tool_input, + tool_output, + prompt_number + }); + + res.json({ status: 'queued', queueLength: session.pendingMessages.length }); +} +``` + +**Score**: 6/10 +**Why This Is Stupid**: + +1. **Database Opens for No Reason**: `getOrCreateSession()` opens the database (line 204), but we don't actually need any data from the database here. We just need to get or create the in-memory session object. + +2. **Hot Path Performance**: This endpoint is called **for every single tool execution**. If you run 100 tool calls in a session, this opens/closes the database 100 times unnecessarily. 
+
+**What Should Happen**:
+- Separate "get existing session" from "create session from database"
+- Only open database if creating a new session
+- For existing sessions, just push to the queue
+
+**Estimated Savings**: For a session with 100 observations, eliminate 99 unnecessary database open/close cycles (**99-495ms of pure overhead**).
+
+---
+
+### Lines 914-1005: createMessageGenerator - THE POLLING HORROR
+
+```typescript
+private async* createMessageGenerator(session: ActiveSession): AsyncIterable<SDKUserMessage> {
+  // ... send init prompt ...
+
+  // Process messages continuously until session is deleted
+  while (true) {
+    if (session.abortController.signal.aborted) {
+      break;
+    }
+
+    if (session.pendingMessages.length === 0) {
+      await new Promise(resolve => setTimeout(resolve, MESSAGE_POLL_INTERVAL_MS));
+      continue;
+    }
+
+    while (session.pendingMessages.length > 0) {
+      const message = session.pendingMessages.shift()!;
+      // ... process message ...
+      yield { /* SDK message */ };
+    }
+  }
+}
+```
+
+**Score**: 10/10
+**Why This Is ABSOLUTELY FUCKING STUPID**:
+
+1. **Infinite Polling Loop**: Lines 936-944 implement a **sleep-and-poll loop** that checks `pendingMessages.length` every 100ms. This is the single dumbest pattern in the entire file.
+
+2. **Event-Driven Alternative**: We have a fucking queue! When something is added to the queue, **NOTIFY THE CONSUMER**. Use an EventEmitter, a Promise, a Condition Variable, ANYTHING but polling.
+
+3. **Wasted CPU**: Every 100ms, this loop wakes up, checks if the queue is empty, and goes back to sleep. For a worker that runs for hours, this is thousands of unnecessary wake-ups.
+
+4. **Latency**: When an observation is queued (line 770), it sits in the queue for up to 100ms before being processed. **This adds 0-100ms of artificial latency to every single observation.**
+
+5. **Battery Impact**: On laptops, constant polling prevents the CPU from entering deep sleep states, draining battery.
+
+**What Should Happen**:
+
+```typescript
+// In WorkerService class
+private sessionQueues: Map<number, EventEmitter> = new Map();
+
+private handleObservation(req: Request, res: Response): void {
+  // ... existing code ...
+  session.pendingMessages.push({ /* message */ });
+
+  // Notify the generator that new work is available
+  const emitter = this.sessionQueues.get(sessionDbId);
+  if (emitter) {
+    emitter.emit('message');
+  }
+
+  res.json({ status: 'queued', queueLength: session.pendingMessages.length });
+}
+
+private async* createMessageGenerator(session: ActiveSession): AsyncIterable<SDKUserMessage> {
+  const emitter = new EventEmitter();
+  this.sessionQueues.set(session.sessionDbId, emitter);
+
+  yield { /* init prompt */ };
+
+  while (!session.abortController.signal.aborted) {
+    if (session.pendingMessages.length === 0) {
+      // Wait for new messages via event, not polling
+      await new Promise<void>(resolve => emitter.once('message', resolve));
+    }
+
+    while (session.pendingMessages.length > 0) {
+      const message = session.pendingMessages.shift()!;
+      yield { /* process message */ };
+    }
+  }
+
+  this.sessionQueues.delete(session.sessionDbId);
+}
+```
+
+**Estimated Savings**:
+- Remove 100ms polling interval (eliminate 0-100ms latency per observation)
+- Reduce CPU wake-ups from ~10/second to 0 when idle
+- Improve battery life on laptops
+- Make the system feel more responsive
+
+**Real-World Impact**: For a session with 10 observations, this polling adds **0-1000ms of cumulative latency**. The user is literally waiting for the polling loop to wake up.
+
+---
+
+### Lines 1011-1146: handleAgentMessage - Database Reopening and Chroma Spam
+
+```typescript
+private handleAgentMessage(session: ActiveSession, content: string, promptNumber: number): void {
+  // ... parse observations and summary ...
+ + const db = new SessionStore(); // <-- Opens DB + + // Store observations and sync to Chroma (non-blocking, fail-fast) + for (const obs of observations) { + const { id, createdAtEpoch } = db.storeObservation(/* ... */); + logger.success('DB', 'Observation stored', { /* ... */ }); + + // Broadcast to SSE clients + this.broadcastSSE({ /* ... */ }); + + // Sync to Chroma (non-blocking fire-and-forget, but crash on failure) + this.chromaSync.syncObservation(/* ... */) + .then(() => { + logger.success('WORKER', 'Observation synced to Chroma', { /* ... */ }); + }) + .catch((error: Error) => { + logger.error('WORKER', 'Observation sync failed - continuing', { /* ... */ }, error); + // Don't crash - SQLite has the data + }); + } + + // ... similar pattern for summary ... + + db.close(); // <-- Closes DB + + // Check if queue is empty and stop spinner after debounce + this.checkAndStopSpinner(); // <-- Triggers 1.5s delay +} +``` + +**Score**: 6/10 +**Why This Is Stupid**: + +1. **Database Reopening**: Opens database (line 1030), stores all observations, closes database (line 1142). This is called **for every SDK response**. For a session with 10 observations, this opens/closes the database 10+ times. + +2. **Verbose Chroma Error Handling**: Lines 1057-1076 and 1114-1133 have identical verbose error handling for Chroma sync failures. This is copy-paste code smell. + +3. **Success Logging Spam**: Line 1066 and 1123 log success for EVERY Chroma sync. For a session with 100 observations, this logs 100 success messages. Why? Who reads these? + +4. **Debounce Call**: Line 1145 calls `checkAndStopSpinner()`, triggering the 1.5s artificial delay. + +**What Should Happen**: +- Reuse database connection across multiple calls +- Simplify Chroma error handling (fire-and-forget means swallow errors) +- Remove success logging (or make it debug-level) +- Remove debounce delay + +--- + +## Summary of Patterns + +### 1. 
Database Reopening Anti-Pattern +**Occurrences**: Lines 200-236, 685-752, 758-779, 1011-1146 +**Impact**: Opens/closes database 4-100+ times per session instead of reusing connections +**Fix**: Pass open database connections between functions, use transactions, connection pooling + +### 2. Polling Instead of Events +**Occurrences**: Line 942 (100ms polling loop) +**Impact**: 0-100ms latency per observation, wasted CPU cycles, battery drain +**Fix**: Use EventEmitter or async queue with await/notify pattern + +### 3. Artificial Delays +**Occurrences**: Line 363 (1.5s spinner debounce), line 942 (100ms poll interval) +**Impact**: 1.5s delay before spinner stops, 0-100ms delay per observation +**Fix**: Remove debouncing, use event-driven patterns + +### 4. Premature Optimization +**Occurrences**: Lines 33-70 (Claude path caching) +**Impact**: 37 lines of code to save 5ms on a one-time operation +**Fix**: Remove caching, inline the function + +### 5. Defensive Programming for Ghosts +**Occurrences**: Line 382 (existsSync check), lines 228-231 (error handler reopens DB), lines 728-741 (verbose error handling) +**Impact**: Code complexity without real benefit +**Fix**: Fail fast, trust invariants, simplify error handling + +### 6. Copy-Paste Code +**Occurrences**: handleGetObservations, handleGetSummaries, handleGetPrompts (nearly identical) +**Impact**: Maintenance burden, inconsistency risk +**Fix**: Extract common pagination logic into helper function + +--- + +## Recommendations + +### Immediate Wins (Low Effort, High Impact) + +1. **Remove Spinner Debounce** (Lines 338-365) + - **Effort**: 5 minutes + - **Impact**: Eliminate 1.5s artificial delay + - **Score**: 9/10 stupidity + +2. **Replace Polling with Events** (Line 942) + - **Effort**: 30 minutes + - **Impact**: Eliminate 0-100ms latency per observation, reduce CPU usage + - **Score**: 10/10 stupidity + +3. 
**Remove Claude Path Caching** (Lines 33-70) + - **Effort**: 5 minutes + - **Impact**: Remove 37 lines of unnecessary code + - **Score**: 6/10 stupidity + +### Medium Wins (Moderate Effort, Good Impact) + +4. **Fix Database Reopening in Hot Path** (Lines 758-779) + - **Effort**: 1 hour + - **Impact**: Eliminate 99+ database cycles per session + - **Score**: 6/10 stupidity + +5. **Simplify Chroma Error Handling** (Lines 728-741, 1057-1076, 1114-1133) + - **Effort**: 15 minutes + - **Impact**: Remove 50+ lines of verbose error handling + - **Score**: 5/10 stupidity + +6. **Simplify SSE Broadcast** (Lines 297-322) + - **Effort**: 5 minutes + - **Impact**: Remove 10 lines, eliminate two-pass cleanup + - **Score**: 4/10 stupidity + +### Long-Term Improvements (High Effort, Architectural) + +7. **Database Connection Pooling** + - **Effort**: 4 hours + - **Impact**: Reuse connections across requests, eliminate all open/close overhead + - **Score**: 8/10 stupidity (current approach) + +8. **Extract Pagination Helper** + - **Effort**: 1 hour + - **Impact**: DRY up handleGetObservations/Summaries/Prompts + - **Score**: 5/10 stupidity + +--- + +## Estimated Performance Impact + +**Current Hot Path (1 observation)**: +- HTTP request arrives: 0ms +- getOrCreateSession opens/closes DB: 1-5ms +- Queue message: 0ms +- Poll interval: 0-100ms (average 50ms) +- SDK processing: variable +- handleAgentMessage opens/closes DB: 1-5ms +- Chroma sync (async): N/A +- checkAndStopSpinner debounce: 1500ms +- **Total artificial overhead**: 1502-1610ms (1.5-1.6 seconds) + +**Optimized Hot Path (1 observation)**: +- HTTP request arrives: 0ms +- Get existing session (no DB): 0ms +- Queue message + notify: 0ms +- SDK processing: variable +- Store in DB (connection pool): 0.1-0.5ms +- Chroma sync (async): N/A +- Stop spinner (no debounce): 0ms +- **Total artificial overhead**: 0.1-0.5ms + +**Speedup**: **3000-16000x faster** (removing artificial delays and polling) + +--- + +## Conclusion + +This 
file has accumulated significant technical debt in the form of:
+- **Artificial delays** (1.5s debounce, 100ms polling)
+- **Database reopening anti-pattern** (4-100+ opens per session)
+- **Polling instead of events** (sleep-and-poll loop)
+- **Premature optimization** (caching rare operations)
+- **Defensive programming** (protecting against non-existent failures)
+
+The worker spends more time **waiting** (polling, debouncing) than **working**. Most of these patterns were likely added with good intentions ("make the UI smooth", "cache for performance", "handle errors gracefully") but ended up creating more problems than they solved.
+
+**Priority Fixes**:
+1. Remove spinner debounce (9/10 stupidity)
+2. Replace polling with events (10/10 stupidity)
+3. Fix database reopening in hot path (6-8/10 stupidity)
+
+These three changes alone would eliminate **1.5+ seconds of artificial delay** per session and make the system feel dramatically more responsive.
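The await/notify pattern behind priority fix #2 is small enough to exercise in isolation. A minimal single-consumer sketch using a promise as the wake-up instead of an EventEmitter (either works):

```typescript
// Single-consumer async queue: push() wakes the waiting consumer directly,
// so there is no poll interval and no idle wake-ups.
class AsyncQueue<T> {
  private items: T[] = [];
  private wake: (() => void) | null = null;

  push(item: T): void {
    this.items.push(item);
    if (this.wake) {
      this.wake();      // hand off immediately -- zero added latency
      this.wake = null;
    }
  }

  async next(): Promise<T> {
    while (this.items.length === 0) {
      await new Promise<void>(resolve => { this.wake = resolve; });
    }
    return this.items.shift()!;
  }
}
```

Swapping something of this shape in for the `setTimeout` loop removes both the 100ms worst-case latency and the idle wake-ups in one move.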