feat: Conduct comprehensive overhead analysis of worker service

- Added detailed documentation on performance issues within `worker-service.ts`. - Identified high severity issues including unnecessary polling, artificial delays, and redundant database operations. - Provided recommendations for immediate and long-term improvements to enhance performance and reduce complexity. - Suggested architectural changes to replace polling with event-driven patterns and optimize database connection handling.
2025-11-06 23:10:33 -05:00
parent 7c3477b7e1
commit 9eddc51979
5 changed files with 959 additions and 2744 deletions
@@ -1,405 +0,0 @@
-# Viewer UI - Web-Based Memory Stream Visualization
-
-## Overview
-
-The Claude-Mem Viewer UI is a production-ready web interface that provides real-time visualization of your memory stream. Access it at **http://localhost:37777** while the claude-mem worker is running.
-
-**Key Features:**
- 🔴 **Real-time Updates** - Server-Sent Events (SSE) stream new observations, sessions, and prompts instantly
- 📜 **Infinite Scroll** - Load historical data progressively with automatic pagination
- 🎯 **Project Filtering** - Focus on specific codebases with smart project selection
- 🎨 **Theme Toggle** - Light, dark, or system preference with persistent settings
- 💾 **Settings Persistence** - Sidebar state and project filters saved automatically
- 🔄 **Auto-Reconnection** - Exponential backoff ensures connection stability
- ⚡ **GPU Acceleration** - Smooth animations and transitions
-
-## Architecture
-
-### Technology Stack
-
-| Component | Technology | Purpose |
-|-----------|-----------|---------|
-| **Framework** | React + TypeScript | Component-based UI with type safety |
-| **Build System** | esbuild | Self-contained HTML bundle (no separate assets) |
-| **Real-time** | Server-Sent Events (SSE) | Push-based updates from worker service |
-| **State Management** | React hooks | Local state with custom hooks for SSE, pagination, settings |
-| **Styling** | Inline CSS | No external stylesheets, fully self-contained |
-| **Typography** | Monaspace Radon | Embedded monospace font for code aesthetics |
-
-### File Structure
-
-```
-src/ui/viewer/
-├── App.tsx                    # Main application component
-├── types.ts                   # TypeScript interfaces
-├── components/
-│   ├── Header.tsx             # Top navigation with logo and theme toggle
-│   ├── Sidebar.tsx            # Project filter and stats sidebar
-│   ├── Feed.tsx               # Main feed with infinite scroll
-│   ├── ThemeToggle.tsx        # Light/dark/system theme selector
-│   └── cards/
-│       ├── ObservationCard.tsx  # Displays individual observations
-│       ├── SummaryCard.tsx      # Displays session summaries
-│       ├── PromptCard.tsx       # Displays user prompts
-│       └── SkeletonCard.tsx     # Loading placeholder
-├── hooks/
-│   ├── useSSE.ts              # Server-Sent Events connection
-│   ├── usePagination.ts       # Infinite scroll logic
-│   ├── useSettings.ts         # Settings persistence
-│   ├── useStats.ts            # Database statistics
-│   └── useTheme.ts            # Theme management
-└── utils/
-    ├── constants.ts           # Configuration constants
-    ├── data.ts                # Data merging and deduplication
-    └── formatters.ts          # Date/time formatting helpers
-```
-
-### Data Flow
-
-```
-┌─────────────────────────────────────────────────────────────┐
-│ Worker Service (port 37777)                                 │
-│  - Express HTTP API                                         │
-│  - SSE endpoint: /stream                                    │
-│  - REST endpoints: /api/*                                   │
-└─────────────────────────────────────────────────────────────┘
-                            ↓
-┌─────────────────────────────────────────────────────────────┐
-│ Viewer UI (React App)                                       │
-│  - useSSE hook: Real-time stream                           │
-│  - usePagination hook: Historical data                     │
-│  - useSettings hook: Persistent preferences                │
-└─────────────────────────────────────────────────────────────┘
-                            ↓
-┌─────────────────────────────────────────────────────────────┐
-│ Feed Component                                              │
-│  - Merges real-time + paginated data                       │
-│  - Deduplicates by ID                                       │
-│  - Filters by selected project                             │
-│  - Infinite scroll triggers pagination                     │
-└─────────────────────────────────────────────────────────────┘
-```
-
-## Features In Detail
-
-### Real-Time Updates (SSE)
-
-The viewer uses Server-Sent Events to receive updates instantly:
-
-```typescript
-// SSE message format
-{
-  "type": "observation" | "summary" | "prompt" | "projects" | "processing",
-  "data": { /* record data */ }
-}
-```
-
-**Event Types:**
- `observation` - New observation created
- `summary` - Session summary generated
- `prompt` - User prompt captured
- `projects` - Project list updated
- `processing` - Session processing status changed
-
-**Connection Management:**
- Auto-reconnect on disconnect with exponential backoff
- Visual connection status indicator in header
- Graceful degradation if SSE unavailable
-
-### Infinite Scroll Pagination
-
-The feed loads historical data progressively:
-
-1. **Initial Load**: First 20 records loaded on mount
-2. **Scroll Trigger**: When user scrolls to 80% of feed height
-3. **Batch Load**: Next 20 records fetched via `/api/{type}?offset=X&limit=20`
-4. **Deduplication**: Merges with real-time data, removes duplicates by ID
-5. **Loading State**: Skeleton cards show while fetching
-
-**Performance:**
- Requests debounced to prevent spam
- Only visible when scrolled near bottom
- Continues until no more records available
-
-### Project Filtering
-
-Filter memory stream by specific projects:
-
-1. Projects extracted from observations, summaries, and prompts
-2. Sidebar shows all unique project names with counts
-3. Click project name to filter feed
-4. Click "All Projects" to clear filter
-5. Filter persisted to localStorage
-
-**Project Detection:**
- Extracted from `projectPath` or `project` field in records
- Basename of path used as project name
- Empty/null projects shown as "(No Project)"
-
-### Theme Toggle (v5.1.2)
-
-Three theme modes available:
-
- **Light Mode**: Clean white background, dark text
- **Dark Mode**: Dark background, light text (default)
- **System**: Matches OS preference automatically
-
-**Implementation:**
-```typescript
-// Theme preference stored in localStorage
-localStorage.setItem('theme-preference', 'light' | 'dark' | 'system');
-
-// CSS variables updated dynamically
-document.documentElement.setAttribute('data-theme', resolvedTheme);
-```
-
-**CSS Variables:**
-```css
-:root[data-theme="light"] {
-  --bg-primary: #ffffff;
-  --text-primary: #1f2937;
-  /* ... */
-}
-
-:root[data-theme="dark"] {
-  --bg-primary: #111827;
-  --text-primary: #f9fafb;
-  /* ... */
-}
-```
-
-### Settings Persistence
-
-Settings automatically saved to worker service:
-
-**Saved Settings:**
- `sidebarOpen` - Sidebar expanded/collapsed state
- `selectedProject` - Current project filter
- `theme` - Theme preference (light/dark/system)
-
-**API Endpoints:**
- `GET /api/settings` - Retrieve saved settings
- `POST /api/settings` - Save settings (debounced 500ms)
-
-**Local Fallback:**
- If API unavailable, settings stored in localStorage
- Synced back to API when connection restored
-
-## Usage Guide
-
-### Opening the Viewer
-
-1. Ensure claude-mem worker is running (auto-starts with Claude Code)
-2. Open browser to http://localhost:37777
-3. Viewer loads automatically with recent records
-
-### Navigating the Feed
-
-**Cards Displayed:**
- **Observation Cards** (blue accent) - Tool usage observations with title, narrative, concepts, files
- **Summary Cards** (green accent) - Session summaries with request, completion, learnings
- **Prompt Cards** (purple accent) - Raw user prompts with timestamp and project
-
-**Card Features:**
- Click to expand/collapse full details
- Type indicators (🔴 bugfix, 🟣 feature, 🔄 refactor, etc.)
- Concept tags (clickable for future filtering)
- File references with paths
- Timestamps in relative format ("2 hours ago")
-
-### Using Project Filters
-
-1. **Open Sidebar**: Click hamburger menu (☰) in top-left
-2. **View Stats**: See total observations, sessions, prompts
-3. **Select Project**: Click project name to filter
-4. **View Counts**: Numbers show records per project
-5. **Clear Filter**: Click "All Projects" to reset
-
-### Changing Theme
-
-1. **Open Theme Toggle**: Click theme icon in header
-2. **Select Mode**:
-   - ☀️ Light mode
-   - 🌙 Dark mode
-   - 💻 System (follows OS)
-3. **Auto-Save**: Preference saved immediately
-4. **Smooth Transition**: CSS transitions between themes
-
-### Troubleshooting
-
-**Viewer Not Loading:**
-```bash
-# Check worker status
-npm run worker:logs
-
-# Restart worker
-npm run worker:restart
-
-# Check if port 37777 is available
-lsof -i :37777
-```
-
-**SSE Connection Issues:**
- Check browser console for connection errors
- Verify no proxy/firewall blocking EventSource
- Auto-reconnect attempts every 1-5s with exponential backoff
-
-**Theme Not Persisting:**
- Check localStorage: `localStorage.getItem('theme-preference')`
- Verify `/api/settings` endpoint responding
- Clear browser cache if stale
-
-**Infinite Scroll Not Triggering:**
- Scroll to 80% of feed height
- Check browser console for fetch errors
- Verify `/api/{type}` endpoints responding with data
-
-## Development
-
-### Building the Viewer
-
-```bash
-# Build viewer UI
-npm run build
-
-# Output: plugin/ui/viewer.html (self-contained)
-```
-
-### Adding New Features
-
-**Example: Add a new card component**
-
-1. Create component:
-```typescript
-// src/ui/viewer/components/cards/MyCard.tsx
-export function MyCard({ data }: { data: MyData }) {
-  return (
-    <div className="card">
-      <div className="card-header">{data.title}</div>
-      <div className="card-body">{data.content}</div>
-    </div>
-  );
-}
-```
-
-2. Add to Feed component:
-```typescript
-// src/ui/viewer/components/Feed.tsx
-import { MyCard } from './cards/MyCard';
-
-// In render:
-{myData.map(item => <MyCard key={item.id} data={item} />)}
-```
-
-3. Rebuild:
-```bash
-npm run build
-npm run sync-marketplace
-npm run worker:restart
-```
-
-### Testing Changes
-
-1. Make changes to `src/ui/viewer/`
-2. Rebuild: `npm run build`
-3. Restart worker: `npm run worker:restart`
-4. Refresh browser (http://localhost:37777)
-5. Check browser console for errors
-
-## API Integration
-
-The viewer consumes these worker service endpoints:
-
-### Data Retrieval
-
-```typescript
-// Get paginated observations
-GET /api/observations?offset=0&limit=20&project=myproject
-Response: { observations: Observation[], hasMore: boolean }
-
-// Get paginated summaries
-GET /api/summaries?offset=0&limit=20&project=myproject
-Response: { summaries: Summary[], hasMore: boolean }
-
-// Get paginated prompts
-GET /api/prompts?offset=0&limit=20&project=myproject
-Response: { prompts: UserPrompt[], hasMore: boolean }
-
-// Get database stats
-GET /api/stats
-Response: { totalObservations: number, totalSessions: number, ... }
-```
-
-### Real-Time Stream
-
-```typescript
-// Server-Sent Events stream
-GET /stream
-
-// Message format:
-event: observation
-data: {"type":"observation","data":{...}}
-
-event: summary
-data: {"type":"summary","data":{...}}
-```
-
-### Settings
-
-```typescript
-// Get settings
-GET /api/settings
-Response: { sidebarOpen: boolean, selectedProject: string, ... }
-
-// Save settings
-POST /api/settings
-Body: { sidebarOpen: boolean, selectedProject: string, ... }
-Response: { success: boolean }
-```
-
-## Performance Considerations
-
-### Bundle Size
- Self-contained HTML: ~150KB (gzipped)
- No external dependencies loaded at runtime
- Monaspace Radon font embedded (subset)
-
-### Memory Management
- Virtualization: Only renders visible cards
- Deduplication: Prevents duplicate records in memory
- Cleanup: Old records beyond pagination limit pruned
-
-### Network Efficiency
- SSE: Single long-lived connection for real-time updates
- REST: Paginated requests (20 records per batch)
- Debouncing: Settings saves debounced 500ms
-
-### Rendering Performance
- React.memo: Cards memoized to prevent unnecessary re-renders
- useMemo: Data merging/filtering memoized
- CSS transitions: GPU-accelerated for smooth animations
-
-## Future Enhancements
-
-Potential features for future versions:
-
- **Search**: Full-text search across observations, summaries, prompts
- **Export**: Download data as JSON, CSV, or markdown
- **Charts**: Visualize observation frequency, types, concepts over time
- **Keyboard Shortcuts**: Navigate feed, toggle sidebar, switch themes
- **Notifications**: Browser notifications for important observations
- **Dark/Light Auto-Schedule**: Auto-switch theme based on time of day
- **Custom Themes**: User-defined color schemes
- **Multi-Project Views**: Compare multiple projects side-by-side
-
-## Resources
-
- **Source Code**: `src/ui/viewer/`
- **Built Output**: `plugin/ui/viewer.html`
- **Worker Service**: `src/services/worker-service.ts`
- **Build Script**: `scripts/build-viewer.js`
- **Documentation**: This file
-
---
-
-**Built with React + TypeScript** | **Powered by Server-Sent Events** | **Self-Contained HTML Bundle**
@@ -1,303 +0,0 @@
-# Worker Service Refactor Plan
-
-**Date**: 2025-11-06
-**Based on**: worker-service-analysis.md
-**Branch**: cleanup/worker
-
---
-
-## Decisions Made
-
-### 🔥🔥🔥🔥🔥 Critical Fixes
-
-#### Issue #1: Fragile PM2 String Parsing
-**Decision**: DELETE all PM2 status checking code
- Remove lines 54-98 in worker-utils.ts (PM2 list parsing)
- Replace with simple: health check → if unhealthy, restart → wait for health
- PM2 restart is idempotent - handles "not started" and "started but broken"
- Rationale: "Just ping localhost:37777" - if unhealthy, restart it
-
-#### Issue #2: Silent PM2 Error Handling
-**Decision**: AUTOMATICALLY RESOLVED by Issue #1
- Gets deleted with PM2 status checking code
- New approach naturally fails fast on execSync
-
-#### Issue #3: Session Auto-Creation Duplication
-**Decision**: EXTRACT to helper method
- Create `private getOrCreateSession(sessionDbId): ActiveSession`
- Remove 60+ lines of duplicated code from:
-  - handleInit() (lines 663-733)
-  - handleObservation() (lines 754-785)
-  - handleSummarize() (lines 813-844)
- Rationale: DRY principle
-
-#### Issue #4: No "Running But Unhealthy" Handling
-**Decision**: AUTOMATICALLY RESOLVED by Issue #1
- New approach always restarts if unhealthy
- PM2 restart handles all cases
-
-#### Issue #5: Useless getWorkerPort() Wrapper
-**Decision**: CREATE proper settings reader
- Delete the wrapper function
- Create settings reader that:
-  1. Reads from `~/.claude-mem/settings.json`
-  2. Falls back to `process.env.CLAUDE_MEM_WORKER_PORT`
-  3. Falls back to `37777`
- Rationale: UI writes to `~/.claude-mem/settings.json`, worker/hooks must read from there
-
---
-
-### 🔥🔥🔥 Cleanup
-
-#### Issue #6: 1500ms Debounce Too Long
-**Decision**: SKIP - not a concern
-
-#### Issue #7: Magic Numbers Throughout
-**Decision**: DELETE unnecessary magic numbers, UNIFY required ones
- Remove hardcoded defaults that aren't needed
- Centralize remaining constants with named variables
- Locations:
-  - worker-utils.ts: timeout values (100ms, 1000ms, 10000ms)
-  - worker-service.ts: Line 997 (100ms), Line 109 ('50mb'), etc.
-
-#### Issue #8: Configuration Duplication
-**Decision**: AUTOMATICALLY RESOLVED by Issue #7
- Centralizing constants solves this
-
-#### Issue #9: Hardcoded Model Validation
-**Decision**: AUTOMATICALLY RESOLVED by Issue #7
- Delete hardcoded model list
- Let SDK handle validation
-
-#### Issue #10: Hardcoded Version Fallback
-**Decision**: READ from package.json
- Line 343: Replace `'5.0.3'` with dynamic read from package.json
- Rationale: Why hardcode a version that gets stale?
-
-#### Issue #11: Unnecessary this.port Instance Variable
-**Decision**: DELETE `this.port`
- worker-service.ts:100 - remove instance variable
- Replace all `this.port` uses with direct constant/settings reader
- Used at lines 351, 738, 742
-
---
-
-## Implementation Plan
-
-### Phase 1: worker-utils.ts Complete Rewrite
-
-**File**: `src/shared/worker-utils.ts`
-
-**Changes**:
-1. Create settings reader function:
-```typescript
-function getWorkerPort(): number {
-  try {
-    const settingsPath = join(homedir(), '.claude-mem', 'settings.json');
-    if (existsSync(settingsPath)) {
-      const settings = JSON.parse(readFileSync(settingsPath, 'utf-8'));
-      const port = parseInt(settings.env?.CLAUDE_MEM_WORKER_PORT, 10);
-      if (!isNaN(port)) return port;
-    }
-  } catch {}
-  return parseInt(process.env.CLAUDE_MEM_WORKER_PORT || '37777', 10);
-}
-```
-
-2. Add named constants:
-```typescript
-const HEALTH_CHECK_TIMEOUT_MS = 100;
-const HEALTH_CHECK_POLL_INTERVAL_MS = 100;
-const HEALTH_CHECK_MAX_WAIT_MS = 10000;
-```
-
-3. Simplify `ensureWorkerRunning()`:
-```typescript
-export async function ensureWorkerRunning(): Promise<void> {
-  if (await isWorkerHealthy()) return;
-
-  const packageRoot = getPackageRoot();
-  const pm2Path = path.join(packageRoot, "node_modules", ".bin", "pm2");
-  const ecosystemPath = path.join(packageRoot, "ecosystem.config.cjs");
-
-  execSync(`"${pm2Path}" restart "${ecosystemPath}"`, {
-    cwd: packageRoot,
-    stdio: 'pipe'
-  });
-
-  if (!await waitForWorkerHealth()) {
-    throw new Error("Worker failed to become healthy after restart");
-  }
-}
-```
-
-4. Update `isWorkerHealthy()` and `waitForWorkerHealth()` to use constants
-
-**Result**: ~50 lines (vs 110 original), all bugs fixed
-
---
-
-### Phase 2: worker-service.ts Cleanup
-
-**File**: `src/services/worker-service.ts`
-
-**Changes**:
-
-1. **Read version from package.json** (line 343):
-```typescript
-import { readFileSync } from 'fs';
-import { join, dirname } from 'path';
-import { fileURLToPath } from 'url';
-
-const __filename = fileURLToPath(import.meta.url);
-const __dirname = dirname(__filename);
-const packageJson = JSON.parse(readFileSync(join(__dirname, '../../package.json'), 'utf-8'));
-const VERSION = packageJson.version;
-```
-
-2. **Extract getOrCreateSession() helper**:
-```typescript
-private getOrCreateSession(sessionDbId: number): ActiveSession {
-  let session = this.sessions.get(sessionDbId);
-  if (session) return session;
-
-  const db = new SessionStore();
-  const dbSession = db.getSessionById(sessionDbId);
-  if (!dbSession) {
-    db.close();
-    throw new Error(`Session ${sessionDbId} not found in database`);
-  }
-
-  session = {
-    sessionDbId,
-    claudeSessionId: dbSession.claude_session_id,
-    sdkSessionId: null,
-    project: dbSession.project,
-    userPrompt: dbSession.user_prompt,
-    pendingMessages: [],
-    abortController: new AbortController(),
-    generatorPromise: null,
-    lastPromptNumber: 0,
-    startTime: Date.now()
-  };
-
-  this.sessions.set(sessionDbId, session);
-
-  session.generatorPromise = this.runSDKAgent(session).catch(err => {
-    logger.failure('WORKER', 'SDK agent error', { sessionId: sessionDbId }, err);
-    const db = new SessionStore();
-    db.markSessionFailed(sessionDbId);
-    db.close();
-    this.sessions.delete(sessionDbId);
-  });
-
-  db.close();
-  return session;
-}
-```
-
-3. **Update handleInit(), handleObservation(), handleSummarize()**:
-Replace duplication with single line:
-```typescript
-const session = this.getOrCreateSession(sessionDbId);
-```
-
-4. **Delete model validation** (lines 407+):
-Remove hardcoded validModels array and validation check
-
-5. **Delete this.port instance variable** (line 100):
- Remove `private port: number = FIXED_PORT;`
- Replace all `this.port` references with `FIXED_PORT` or settings reader
-
-6. **Add named constants** at top of file:
-```typescript
-const MESSAGE_POLL_INTERVAL_MS = 100;
-const MAX_REQUEST_SIZE = '50mb';
-```
-
-7. **Use named constants** throughout (lines 109, 997, etc.)
-
---
-
-### Phase 3: Update Hooks
-
-**Files**:
- `src/hooks/new-hook.ts`
- `src/hooks/save-hook.ts`
- `src/hooks/summary-hook.ts`
- `src/hooks/cleanup-hook.ts`
-
-**Changes**:
-1. Import settings reader from worker-utils
-2. Replace `const FIXED_PORT = parseInt(process.env.CLAUDE_MEM_WORKER_PORT || '37777', 10);`
-   with call to settings reader
-3. Update cleanup-hook.ts line 74 to use settings reader as fallback
-
---
-
-### Phase 4: Update user-message-hook.ts
-
-**File**: `src/hooks/user-message-hook.ts`
-
-**Changes**:
- Line 53: Replace hardcoded `http://localhost:37777/` with dynamic port from settings reader
-
---
-
-## Files Changed
-
-1. `src/shared/worker-utils.ts` - Complete rewrite (~50 lines)
-2. `src/services/worker-service.ts` - Major cleanup (remove ~60 lines duplication, add helper)
-3. `src/hooks/new-hook.ts` - Use settings reader
-4. `src/hooks/save-hook.ts` - Use settings reader
-5. `src/hooks/summary-hook.ts` - Use settings reader
-6. `src/hooks/cleanup-hook.ts` - Use settings reader
-7. `src/hooks/user-message-hook.ts` - Dynamic port in message
-
---
-
-## Testing Checklist
-
-After implementation:
-
- [ ] Build: `npm run build`
- [ ] Sync: `npm run sync-marketplace`
- [ ] Restart worker: `npm run worker:restart`
- [ ] Start new Claude Code session (hooks should work)
- [ ] Change port in UI settings to 38888
- [ ] Restart worker
- [ ] Verify worker binds to 38888
- [ ] Verify hooks connect to 38888
- [ ] Verify UI connects to 38888
- [ ] Change port back to 37777
- [ ] Test all endpoints work
-
---
-
-## Expected Outcomes
-
-**Lines Removed**: ~130 lines (60 from duplication, 70 from PM2 parsing)
-**Lines Added**: ~50 lines (helper method, settings reader, constants)
-**Net Change**: -80 lines
-
-**Bugs Fixed**:
- ✅ PM2 string parsing false positives
- ✅ Silent error handling
- ✅ No restart when unhealthy
- ✅ Port configuration not synchronized with UI
-
-**Code Quality**:
- ✅ DRY principle applied (no duplication)
- ✅ YAGNI principle applied (removed ceremony)
- ✅ Fail fast error handling
- ✅ Named constants instead of magic numbers
- ✅ Single source of truth for configuration
-
---
-
-## Notes
-
- This plan addresses all Severity 5 and Severity 4 issues from the analysis
- Skipped Severity 2 issues that aren't actual problems (debounce timing)
- All "automatically resolved" issues are covered by the main fixes
- Settings synchronization bug (port not working) is now fixed
@@ -1,907 +0,0 @@
-# Worker Service & Worker Utils: Comprehensive YAGNI Analysis
-
-**Date**: 2025-11-06
-**Files Analyzed**:
- `src/services/worker-service.ts` (1228 lines)
- `src/shared/worker-utils.ts` (110 lines)
-
-**Overall Assessment**: 80% excellent architecture, 20% cleanup needed. Worker-service is well-structured with proper error handling priorities, but worker-utils contains critical bugs and YAGNI violations.
-
---
-
-## Executive Summary
-
-### What These Files Do
-
-**worker-service.ts**: Long-running Express HTTP service managed by PM2. Handles AI compression of observations, session management, SSE streaming for web UI, and Chroma vector sync. This is the heart of claude-mem's async processing.
-
-**worker-utils.ts**: Utilities for ensuring the worker is running. Called by hooks at session start to verify/start the PM2 worker process.
-
-### Critical Findings
-
-#### 🔥🔥🔥🔥🔥 SEVERITY 5 - MUST FIX IMMEDIATELY
-
-1. **worker-utils.ts:75** - Fragile string parsing of PM2 output causes false positives
-2. **worker-service.ts:754-844** - 60+ lines of identical session auto-creation code duplicated 3 times
-3. **worker-utils.ts:70** - Silent error handling defers PM2 failures instead of failing fast
-
-#### 🔥🔥🔥 SEVERITY 3 - FIX SOON
-
-4. **worker-utils.ts:77-95** - No handling for "running but unhealthy" case
-5. **worker-utils.ts:107-109** - Useless `getWorkerPort()` wrapper function
-6. **worker-service.ts:316** - 1500ms debounce is 10x too long
-
-#### 🔥🔥 SEVERITY 2 - CLEANUP WHEN CONVENIENT
-
-7. Multiple magic numbers (100ms, 1000ms, 10000ms) without named constants
-8. Hardcoded default values duplicated across multiple locations
-9. Hardcoded model validation list that will become stale
-
---
-
-## Complete Function Catalog
-
-### worker-utils.ts Functions
-
-| Function | Lines | Purpose | Status |
-|----------|-------|---------|--------|
-| `isWorkerHealthy(timeoutMs)` | 10-19 | Check /health endpoint responds | ✅ OK |
-| `waitForWorkerHealth(maxWaitMs)` | 24-36 | Poll until worker healthy | 🔥 Inefficient timeout |
-| `ensureWorkerRunning()` | 43-102 | Main orchestrator to start worker | 🔥🔥🔥🔥🔥 CRITICAL BUGS |
-| `getWorkerPort()` | 107-109 | Returns FIXED_PORT constant | 🔥🔥🔥🔥🔥 DELETE THIS |
-
-### worker-service.ts Functions
-
-| Function | Lines | Purpose | Status |
-|----------|-------|---------|--------|
-| `findClaudePath()` | 35-65 | Find Claude Code executable | ✅ Excellent |
-| Constructor | 107-139 | Setup Express routes | ✅ Good |
-| `start()` | 141-173 | Start HTTP server, init Chroma | ✅ Excellent prioritization |
-| `getUIDirectory()` | 178-189 | Get UI path (CJS/ESM) | ✅ Good defensive code |
-| `handleHealth()` | 194-196 | GET /health | ✅ PERFECT |
-| `handleViewerHTML()` | 201-211 | GET / | ✅ Good |
-| `handleSSEStream()` | 216-245 | GET /stream (SSE) | ✅ Good |
-| `broadcastSSE()` | 250-275 | Broadcast to clients | ✅ Excellent defensive code |
-| `broadcastProcessingStatus()` | 280-286 | Broadcast processing state | ✅ Good |
-| `checkAndStopSpinner()` | 291-318 | Debounced spinner stop | 🔥 1500ms too long |
-| `handleStats()` | 323-365 | GET /api/stats | 🔥 Hardcoded paths/version |
-| `handleGetSettings()` | 370-397 | GET /api/settings | 🔥 Duplicated defaults |
-| `handlePostSettings()` | 402-461 | POST /api/settings | 🔥 Hardcoded model list |
-| `handleGetObservations()` | 467-515 | GET /api/observations | ✅ Excellent |
-| `handleGetSummaries()` | 517-576 | GET /api/summaries | ✅ Excellent |
-| `handleGetPrompts()` | 578-631 | GET /api/prompts | ✅ Excellent |
-| `handleGetProcessingStatus()` | 637-639 | GET /api/processing-status | ✅ Good |
-| `handleInit()` | 645-744 | POST /sessions/:id/init | ✅ Good but has duplication |
-| `handleObservation()` | 750-803 | POST /sessions/:id/observations | 🔥🔥🔥🔥🔥 MASSIVE DUPLICATION |
-| `handleSummarize()` | 809-858 | POST /sessions/:id/summarize | 🔥🔥🔥🔥🔥 MASSIVE DUPLICATION |
-| `handleComplete()` | 864-873 | POST /sessions/:id/complete | ✅ PERFECT |
-| `handleStatus()` | 878-893 | GET /sessions/:id/status | ✅ Good |
-| `runSDKAgent()` | 898-963 | Run SDK agent loop | ✅ Excellent |
-| `createMessageGenerator()` | 969-1060 | Async generator for SDK | ✅ Excellent |
-| `handleAgentMessage()` | 1066-1201 | Parse and store AI response | ✅ EXCELLENT |
-| `main()` | 1205-1225 | Entry point + signals | ✅ Good |
-
---
-
-## Line-by-Line Analysis
-
-### worker-utils.ts
-
-#### Lines 1-5: Imports and Constants
-```typescript
-const FIXED_PORT = parseInt(process.env.CLAUDE_MEM_WORKER_PORT || "37777", 10);
-```
-
-**What**: Parse port from env var with fallback to 37777
-**Why**: Need to know which port to connect to
-**Critique**: ✅ Good - simple constant, no unnecessary abstraction
-
---
-
-#### Lines 10-19: `isWorkerHealthy(timeoutMs = 100)`
-
-```typescript
-async function isWorkerHealthy(timeoutMs: number = 100): Promise<boolean> {
-  try {
-    const response = await fetch(`http://127.0.0.1:${FIXED_PORT}/health`, {
-      signal: AbortSignal.timeout(timeoutMs)
-    });
-    return response.ok;
-  } catch {
-    return false;
-  }
-}
-```
-
-**What**: Checks if /health endpoint responds within timeout
-**Why**: Need to know if worker is running before trying to start it
-**Critique**:
- Default 100ms is used once (line 45 initial check)
- Explicit 1000ms passed at line 29 (during startup polling)
- This inconsistency is actually INTENTIONAL: quick initial check vs. waiting for startup
- ✅ **VERDICT**: Reasonable pattern
-
-**Why the two timeouts?**
- 100ms: "Is it already running?" (fast check, don't wait)
- 1000ms: "Is it starting up?" (wait for initialization)
-
---
-
-#### Lines 24-36: `waitForWorkerHealth(maxWaitMs = 10000)`
-
-```typescript
-async function waitForWorkerHealth(maxWaitMs: number = 10000): Promise<boolean> {
-  const start = Date.now();
-  const checkInterval = 100; // Check every 100ms
-
-  while (Date.now() - start < maxWaitMs) {
-    if (await isWorkerHealthy(1000)) {
-      return true;
-    }
-    // Wait before next check
-    await new Promise(resolve => setTimeout(resolve, checkInterval));
-  }
-  return false;
-}
-```
-
-**What**: Polls health endpoint every 100ms until healthy or timeout
-**Why**: Worker takes time to start, need to wait
-**Critique**:
-
-🔥 **MAGIC NUMBER #1**: Line 26 `checkInterval = 100` - no units! Is this milliseconds? Should be `CHECK_INTERVAL_MS = 100`
-
-🔥 **MAGIC NUMBER #2**: Line 29 `isWorkerHealthy(1000)` - why 1000ms timeout per check?
-
-🔥 **INEFFICIENCY**: Each health check has 1000ms timeout, but we check every 100ms. If the worker is down, each check waits 1000ms to timeout. We could fail faster with a 100ms timeout since we retry quickly anyway.
-
-**The Math**:
- Check interval: 100ms
- Health timeout: 1000ms
- If worker is down, first check fails after 1000ms, then we wait 100ms, then try again
- Total time to detect "worker is down" on first check: 1000ms (could be 100ms)
-
-**RECOMMENDED**: Use 100ms timeout for health checks since we retry every 100ms anyway:
-```typescript
-const HEALTH_CHECK_TIMEOUT_MS = 100;
-const HEALTH_CHECK_POLL_INTERVAL_MS = 100;
-const HEALTH_CHECK_MAX_WAIT_MS = 10000;
-
-async function waitForWorkerHealth(): Promise<boolean> {
-  const start = Date.now();
-  while (Date.now() - start < HEALTH_CHECK_MAX_WAIT_MS) {
-    if (await isWorkerHealthy(HEALTH_CHECK_TIMEOUT_MS)) return true;
-    await new Promise(resolve => setTimeout(resolve, HEALTH_CHECK_POLL_INTERVAL_MS));
-  }
-  return false;
-}
-```
-
---
-
-#### Lines 43-102: `ensureWorkerRunning()` - 🔥🔥🔥🔥🔥 THE DISASTER ZONE
-
-```typescript
-export async function ensureWorkerRunning(): Promise<void> {
-  // First, check if worker is already healthy
-  if (await isWorkerHealthy()) {
-    return; // Worker is already running and responsive
-  }
-
-  const packageRoot = getPackageRoot();
-  const pm2Path = path.join(packageRoot, "node_modules", ".bin", "pm2");
-  const ecosystemPath = path.join(packageRoot, "ecosystem.config.cjs");
-
-  // Check PM2 status to see if worker process exists
-  const checkProcess = spawn(pm2Path, ["list", "--no-color"], {
-    cwd: packageRoot,
-    stdio: ["ignore", "pipe", "ignore"],
-  });
-
-  let output = "";
-  checkProcess.stdout?.on("data", (data) => {
-    output += data.toString();
-  });
-
-  // Wait for PM2 list to complete
-  await new Promise<void>((resolve, reject) => {
-    checkProcess.on("error", (error) => reject(error));
-    checkProcess.on("close", (code) => {
-      // PM2 list can fail, but we should still continue - just assume worker isn't running
-      // This handles cases where PM2 isn't installed yet
-      resolve();
-    });
-  });
-
-  // Check if 'claude-mem-worker' is in the PM2 list output and is 'online'
-  const isRunning = output.includes("claude-mem-worker") && output.includes("online");
-
-  if (!isRunning) {
-    // Start the worker
-    const startProcess = spawn(pm2Path, ["start", ecosystemPath], {
-      cwd: packageRoot,
-      stdio: "ignore",
-    });
-
-    // Wait for PM2 start command to complete
-    await new Promise<void>((resolve, reject) => {
-      startProcess.on("error", (error) => reject(error));
-      startProcess.on("close", (code) => {
-        if (code !== 0 && code !== null) {
-          reject(new Error(`PM2 start command failed with exit code ${code}`));
-        } else {
-          resolve();
-        }
-      });
-    });
-  }
-
-  // Wait for worker to become healthy (either just started or was starting)
-  const healthy = await waitForWorkerHealth(10000);
-  if (!healthy) {
-    throw new Error("Worker failed to become healthy after starting");
-  }
-}
-```
-
-**What**: Ensure PM2 worker is running - check health, check PM2 status, start if needed, wait for health
-**Why**: Hooks need worker running to process observations
-
-#### 🔥🔥🔥🔥🔥 CRITICAL BUG #1: Fragile String Parsing (Line 75)
-
-```typescript
-const isRunning = output.includes("claude-mem-worker") && output.includes("online");
-```
-
-**THE PROBLEM**: This checks if BOTH strings exist ANYWHERE in the output. This is WRONG.
-
-**Counter-Example**:
-```
-PM2 Process List:
-┌─────┬────────────────────┬─────────┐
-│ id  │ name               │ status  │
-├─────┼────────────────────┼─────────┤
-│  0  │ claude-mem-worker  │ stopped │
-│  1  │ some-other-app     │ online  │
-└─────┴────────────────────┴─────────┘
-```
-
-This would return `true` because output contains "claude-mem-worker" AND "online", even though the worker is STOPPED!
-
-**Impact**:
- False positive: Worker is stopped, but code thinks it's running
- Result: Skip starting worker (line 77 `if (!isRunning)`), wait for health
- Health check fails because worker isn't actually running
- Entire function fails with "Worker failed to become healthy"
- User sees cryptic error instead of "Worker is stopped, restarting..."
-
-**THE FIX**: Use PM2's JSON output
-```typescript
-const result = execSync(`"${pm2Path}" jlist`, { encoding: 'utf8' });
-const processes = JSON.parse(result);
-const worker = processes.find(p => p.name === 'claude-mem-worker');
-const isRunning = worker?.pm2_env?.status === 'online';
-```
-
-#### 🔥🔥🔥🔥🔥 CRITICAL BUG #2: Silent Error Handling (Lines 65-72)
-
-```typescript
-await new Promise<void>((resolve, reject) => {
-  checkProcess.on("error", (error) => reject(error));
-  checkProcess.on("close", (code) => {
-    // PM2 list can fail, but we should still continue - just assume worker isn't running
-    // This handles cases where PM2 isn't installed yet
-    resolve(); // ← ALWAYS RESOLVES, NEVER REJECTS
-  });
-});
-```
-
-**THE PROBLEM**:
-1. If PM2 isn't installed, `pm2 list` fails
-2. Line 70: ALWAYS resolves, ignoring the failure
-3. `output` is empty string
-4. Line 75: `isRunning = false` (correct by accident)
-5. Line 77-94: Try to START the worker... which will ALSO fail because PM2 isn't installed
-6. Line 85-93: THIS finally rejects with error
-
-**Why This Is Terrible**:
- Defers error detection to the start command instead of failing fast
- Confusing error message: "PM2 start command failed" instead of "PM2 not found - run npm install"
- User wastes time waiting for PM2 list to fail, then waiting for PM2 start to fail
- The comment is a LIE: "we should still continue" - no, we shouldn't! If PM2 isn't installed, FAIL IMMEDIATELY.
-
-**THE FIX**: Fail fast
-```typescript
-await new Promise<void>((resolve, reject) => {
-  checkProcess.on("error", reject);
-  checkProcess.on("close", (code) => {
-    if (code !== 0 && code !== null) {
-      reject(new Error(`PM2 not found - install dependencies first (npm install)`));
-    }
-    resolve();
-  });
-});
-```
-
-#### 🔥🔥🔥🔥 CRITICAL BUG #3: No Handling for "Running But Unhealthy" (Lines 77-98)
-
-**THE LOGIC**:
-1. Line 45: Check if worker is healthy → NO (or we would have returned)
-2. Line 54-75: Check if PM2 says worker is running
-3. Line 77: `if (!isRunning)` → start the worker
-4. Line 98: Wait for worker to become healthy
-
-**THE PROBLEM**: What if PM2 says worker IS running but our health check (line 45) failed?
-
-**Answer**: We do NOTHING. We skip the `if (!isRunning)` block and jump straight to line 98, waiting for it to become healthy.
-
-**Why This Is Wrong**: If the worker is started but unhealthy, it won't magically heal itself. It needs to be RESTARTED.
-
-**Scenarios**:
- Worker crashed but PM2 hasn't noticed yet → Status: "online", Health: failed → We wait forever
- Worker is in infinite loop → Status: "online", Health: timeout → We wait forever
- Worker port is wrong → Status: "online", Health: failed → We wait forever
-
-**THE FIX**: Restart if unhealthy
-```typescript
-if (!await isWorkerHealthy()) {
-  // Not healthy - restart it (PM2 restart is idempotent)
-  execSync(`"${pm2Path}" restart "${ecosystemPath}"`);
-  if (!await waitForWorkerHealth()) {
-    throw new Error("Worker failed to become healthy after restart");
-  }
-}
-```
-
-Or even simpler: Just always restart if health fails. PM2 handles "not started" vs "started" gracefully.
-
---
-
-#### Lines 107-109: `getWorkerPort()` - 🔥🔥🔥🔥🔥 DELETE THIS
-
-```typescript
-/**
- * Get the worker port number (fixed port)
- */
-export function getWorkerPort(): number {
-  return FIXED_PORT;
-}
-```
-
-**What**: Returns the FIXED_PORT constant
-**Why**: ???
-**Critique**: 🔥🔥🔥🔥🔥 **TEXTBOOK YAGNI VIOLATION**
-
-This is the "wrapper function for a constant" anti-pattern from CLAUDE.md.
-
-**THE PROBLEM**: This function adds ZERO value. It's pure ceremony.
-
-**Callers should just**:
-```typescript
-import { FIXED_PORT } from './worker-utils.js';
-// Use FIXED_PORT directly
-```
-
-**Instead of**:
-```typescript
-import { getWorkerPort } from './worker-utils.js';
-const port = getWorkerPort(); // Why???
-```
-
-**Why This Exists**: Training bias. Code that looks "professional" often includes ceremonial getters for constants. But this is WRONG. Delete it and export the constant.
-
-**THE FIX**:
-```typescript
-export const WORKER_PORT = parseInt(process.env.CLAUDE_MEM_WORKER_PORT || "37777", 10);
-```
-
-Then update all callers to use `WORKER_PORT` instead of `getWorkerPort()`.
-
---
-
-### worker-utils.ts COMPLETE REWRITE
-
-Here's what this file SHOULD be:
-
-```typescript
-import path from "path";
-import { execSync } from "child_process";
-import { getPackageRoot } from "./paths.js";
-
-// Configuration
-export const WORKER_PORT = parseInt(process.env.CLAUDE_MEM_WORKER_PORT || "37777", 10);
-
-const HEALTH_CHECK_TIMEOUT_MS = 100;
-const HEALTH_CHECK_POLL_INTERVAL_MS = 100;
-const HEALTH_CHECK_MAX_WAIT_MS = 10000;
-
-/**
- * Check if worker is responsive by trying the health endpoint
- */
-async function isWorkerHealthy(): Promise<boolean> {
-  try {
-    const response = await fetch(`http://127.0.0.1:${WORKER_PORT}/health`, {
-      signal: AbortSignal.timeout(HEALTH_CHECK_TIMEOUT_MS)
-    });
-    return response.ok;
-  } catch {
-    return false;
-  }
-}
-
-/**
- * Wait for worker to become healthy, polling every 100ms
- */
-async function waitForWorkerHealth(): Promise<boolean> {
-  const start = Date.now();
-  while (Date.now() - start < HEALTH_CHECK_MAX_WAIT_MS) {
-    if (await isWorkerHealthy()) return true;
-    await new Promise(resolve => setTimeout(resolve, HEALTH_CHECK_POLL_INTERVAL_MS));
-  }
-  return false;
-}
-
-/**
- * Ensure worker service is running and healthy
- * Restarts worker if not healthy (PM2 restart is idempotent)
- */
-export async function ensureWorkerRunning(): Promise<void> {
-  if (await isWorkerHealthy()) return;
-
-  const packageRoot = getPackageRoot();
-  const pm2Path = path.join(packageRoot, "node_modules", ".bin", "pm2");
-  const ecosystemPath = path.join(packageRoot, "ecosystem.config.cjs");
-
-  // PM2 restart is idempotent - handles both "not started" and "started but broken"
-  try {
-    const result = execSync(`"${pm2Path}" restart "${ecosystemPath}"`, {
-      cwd: packageRoot,
-      encoding: 'utf8',
-      stdio: 'pipe'
-    });
-
-    if (!await waitForWorkerHealth()) {
-      throw new Error(`Worker failed to become healthy. PM2 output:\n${result}`);
-    }
-  } catch (error: any) {
-    if (error.code === 'ENOENT' || error.message.includes('not found')) {
-      throw new Error('PM2 not found - run: npm install');
-    }
-    throw error;
-  }
-}
-```
-
-**Line Count**: 43 lines (vs 110 original)
-**Complexity**: 1/3 of original
-**Bugs Fixed**: All of them
-**Ceremony Removed**: All of it
-
-**What Changed**:
-1. Removed `getWorkerPort()` wrapper - export constant directly
-2. Removed PM2 status checking - just restart if unhealthy
-3. Removed string parsing - use PM2's idempotent restart
-4. Removed silent error handling - fail fast on PM2 not found
-5. Named all magic numbers as constants
-6. Simplified to: "Unhealthy? Restart. Wait for health. Done."
-
---
-
-## worker-service.ts Analysis
-
-### Overall Structure
-
-**Lines 1-24**: Imports and constants ✅
-**Lines 27-65**: `findClaudePath()` ✅ Excellent
-**Lines 67-96**: Type definitions ✅
-**Lines 98-1228**: WorkerService class
-
-### Critical Issues in worker-service.ts
-
-#### 🔥🔥🔥🔥🔥 ISSUE #1: Massive Code Duplication (Lines 754-844)
-
-**THE PROBLEM**: Session auto-creation logic is COPIED THREE TIMES:
-1. `handleInit()` (lines 663-733)
-2. `handleObservation()` (lines 754-785)
-3. `handleSummarize()` (lines 813-844)
-
-**The Duplicated Code** (20+ lines per copy):
-```typescript
-let session = this.sessions.get(sessionDbId);
-if (!session) {
-  const db = new SessionStore();
-  const dbSession = db.getSessionById(sessionDbId);
-  db.close();
-
-  session = {
-    sessionDbId,
-    claudeSessionId: dbSession!.claude_session_id,
-    sdkSessionId: null,
-    project: dbSession!.project,
-    userPrompt: dbSession!.user_prompt,
-    pendingMessages: [],
-    abortController: new AbortController(),
-    generatorPromise: null,
-    lastPromptNumber: 0,
-    startTime: Date.now()
-  };
-  this.sessions.set(sessionDbId, session);
-
-  session.generatorPromise = this.runSDKAgent(session).catch(err => {
-    logger.failure('WORKER', 'SDK agent error', { sessionId: sessionDbId }, err);
-    const db = new SessionStore();
-    db.markSessionFailed(sessionDbId);
-    db.close();
-    this.sessions.delete(sessionDbId);
-  });
-}
-```
-
-**Impact**: 60+ lines of duplicated code across 3 functions
-
-**THE FIX**: Extract to helper method
-```typescript
-private getOrCreateSession(sessionDbId: number): ActiveSession {
-  let session = this.sessions.get(sessionDbId);
-  if (session) return session;
-
-  const db = new SessionStore();
-  const dbSession = db.getSessionById(sessionDbId);
-  if (!dbSession) {
-    db.close();
-    throw new Error(`Session ${sessionDbId} not found in database`);
-  }
-
-  session = {
-    sessionDbId,
-    claudeSessionId: dbSession.claude_session_id,
-    sdkSessionId: null,
-    project: dbSession.project,
-    userPrompt: dbSession.user_prompt,
-    pendingMessages: [],
-    abortController: new AbortController(),
-    generatorPromise: null,
-    lastPromptNumber: 0,
-    startTime: Date.now()
-  };
-
-  this.sessions.set(sessionDbId, session);
-
-  // Start SDK agent in background
-  session.generatorPromise = this.runSDKAgent(session).catch(err => {
-    logger.failure('WORKER', 'SDK agent error', { sessionId: sessionDbId }, err);
-    const db = new SessionStore();
-    db.markSessionFailed(sessionDbId);
-    db.close();
-    this.sessions.delete(sessionDbId);
-  });
-
-  db.close();
-  return session;
-}
-```
-
-Then all three functions become:
-```typescript
-private handleObservation(req: Request, res: Response): void {
-  const sessionDbId = parseInt(req.params.sessionDbId, 10);
-  const { tool_name, tool_input, tool_output, prompt_number } = req.body;
-
-  const session = this.getOrCreateSession(sessionDbId);
-
-  session.pendingMessages.push({
-    type: 'observation',
-    tool_name,
-    tool_input,
-    tool_output,
-    prompt_number
-  });
-
-  res.json({ status: 'queued', queueLength: session.pendingMessages.length });
-}
-```
-
-**Savings**: Remove 60 lines, improve maintainability 10x
-
---
-
-#### 🔥🔥 ISSUE #2: Magic Numbers Throughout
-
-**Line 316**: `setTimeout(() => { ... }, 1500);` - Why 1500ms debounce?
-**Line 997**: `setTimeout(resolve, 100)` - Why 100ms polling?
-**Line 343**: `const version = process.env.npm_package_version || '5.0.3';` - Hardcoded fallback
-**Line 109**: `express.json({ limit: '50mb' })` - Why 50mb?
-
-**THE FIX**: Named constants
-```typescript
-const SPINNER_DEBOUNCE_MS = 200; // Debounce spinner to prevent flicker
-const MESSAGE_POLL_INTERVAL_MS = 100; // Check for new messages every 100ms
-const MAX_REQUEST_SIZE = '50mb'; // Allow large tool outputs
-```
-
---
-
-#### 🔥🔥 ISSUE #3: Configuration Duplication
-
-Default values appear in multiple places:
- Line 377-380: Default settings in GET handler
- Line 22: MODEL default
- Throughout: Port defaults, observation count defaults
-
-**THE FIX**: Centralize
-```typescript
-export const DEFAULT_CONFIG = {
-  MODEL: 'claude-haiku-4-5',
-  CONTEXT_OBSERVATIONS: 50,
-  WORKER_PORT: 37777,
-  VALID_MODELS: ['claude-haiku-4-5', 'claude-sonnet-4-5', 'claude-opus-4'],
-  MAX_CONTEXT_OBSERVATIONS: 200,
-  MIN_PORT: 1024,
-  MAX_PORT: 65535
-} as const;
-```
-
---
-
-#### 🔥 ISSUE #4: Hardcoded Model Validation (Line 407)
-
-```typescript
-const validModels = ['claude-haiku-4-5', 'claude-sonnet-4-5', 'claude-opus-4'];
-```
-
-**THE PROBLEM**: This list will get stale when new models are released.
-
-**YAGNI QUESTION**: Do we even need to validate? The SDK will error if model doesn't exist.
-
-**ANSWER**: Better error messages for users. But this should be a WARNING, not a blocker.
-
-**THE FIX**: Remove validation or make it advisory
-```typescript
-// Let SDK handle validation - it knows the current model list
-// We don't need to duplicate that logic here
-if (CLAUDE_MEM_MODEL) {
-  settings.env.CLAUDE_MEM_MODEL = CLAUDE_MEM_MODEL;
-  logger.info('WORKER', `Model changed to ${CLAUDE_MEM_MODEL}`, {});
-}
-```
-
---
-
-### What worker-service.ts Does RIGHT ✅
-
-#### 1. Excellent Error Handling Priority
-```typescript
-// Store to SQLite FIRST (source of truth)
-const { id, createdAtEpoch } = db.storeObservation(...);
-
-// Broadcast to SSE (real-time UI updates)
-this.broadcastSSE({ type: 'new_observation', ... });
-
-// Sync to Chroma ASYNC (fire-and-forget, non-critical)
-this.chromaSync.syncObservation(...)
-  .catch((error: Error) => {
-    logger.error('...continuing', ...);
-    // Don't crash - SQLite has the data
-  });
-```
-
-**Priority**: SQLite > SSE > Chroma
-**Philosophy**: Write to source of truth first, update UI second, sync to vector DB last. Chroma failures don't crash the worker.
-
-#### 2. Clean Pagination APIs
-
-All data endpoints follow consistent pattern:
- Parse `offset`, `limit`, `project` from query params
- Cap limit at 100 to prevent abuse
- Return `{ items, hasMore, total, offset, limit }`
- Use parameterized queries (SQL injection safe)
-
-Example: `handleGetObservations()` (lines 467-515) is textbook good API design.
-
-#### 3. Proper Async Generator Pattern
-
-`createMessageGenerator()` (lines 969-1060) is an excellent implementation:
- Yields init prompt immediately
- Polls message queue with proper abort signal handling
- No busy-waiting (100ms sleep between polls)
- Clean message type discrimination
- Proper error propagation
-
-#### 4. Defensive SSE Cleanup
-
-`broadcastSSE()` (lines 250-275):
- Early return if no clients (optimization)
- Two-phase cleanup (collect failures, then remove)
- Doesn't modify Set during iteration
- Handles disconnected clients gracefully
-
-This is GOOD defensive programming, not YAGNI violation.
-
---
-
-## Severity-Ranked YAGNI Violations
-
-### 🔥🔥🔥🔥🔥 SEVERITY 5: CRITICAL - FIX IMMEDIATELY
-
-| Issue | File | Lines | Problem | Impact |
-|-------|------|-------|---------|--------|
-| Fragile string parsing | worker-utils | 75 | `output.includes("claude-mem-worker") && output.includes("online")` | False positives cause failures |
-| Session auto-creation duplication | worker-service | 754-844 | 60+ lines copied 3 times | Maintenance nightmare |
-| Silent PM2 error handling | worker-utils | 70 | Always resolves, defers errors | Confusing error messages |
-
-### 🔥🔥🔥🔥 SEVERITY 4: MAJOR - FIX SOON
-
-| Issue | File | Lines | Problem | Impact |
-|-------|------|-------|---------|--------|
-| No "running but unhealthy" handling | worker-utils | 77-98 | Skip restart if PM2 says running | Worker never recovers |
-| Useless getWorkerPort() wrapper | worker-utils | 107-109 | Ceremony for a constant | Code bloat |
-
-### 🔥🔥🔥 SEVERITY 3: MODERATE - FIX WHEN CONVENIENT
-
-| Issue | File | Lines | Problem | Impact |
-|-------|------|-------|---------|--------|
-| 1500ms debounce too long | worker-service | 316 | Should be 100-200ms | Spinner lags |
-| Hardcoded model validation | worker-service | 407 | List will get stale | Blocks valid models |
-| Hardcoded fallback version | worker-service | 343 | '5.0.3' will get stale | Wrong stats |
-
-### 🔥🔥 SEVERITY 2: MINOR - CLEANUP
-
-| Issue | File | Lines | Problem | Impact |
-|-------|------|-------|---------|--------|
-| Magic numbers everywhere | Both | Multiple | 100, 1000, 1500, etc | Hard to maintain |
-| Duplicated default configs | worker-service | Multiple | Defaults in many places | Inconsistency risk |
-| Unnecessary this.port | worker-service | 100 | Should use FIXED_PORT | Confusion |
-
---
-
-## Recommended Action Plan
-
-### Phase 1: Critical Fixes (Do Today)
-
-1. **Fix worker-utils.ts completely** - Use the rewrite provided above (43 lines)
-   - Remove getWorkerPort()
-   - Fix PM2 string parsing → use `pm2 restart` (idempotent)
-   - Remove silent error handling
-   - Named constants for all timeouts
-
-2. **Extract getOrCreateSession()** in worker-service.ts
-   - Remove 60 lines of duplication
-   - Update handleInit, handleObservation, handleSummarize
-
-### Phase 2: Cleanup (Do This Week)
-
-3. **Centralize configuration**
-   - Create DEFAULT_CONFIG constant
-   - Remove duplicated defaults
-   - Update all references
-
-4. **Fix magic numbers**
-   - SPINNER_DEBOUNCE_MS = 200
-   - MESSAGE_POLL_INTERVAL_MS = 100
-   - HEALTH_CHECK_TIMEOUT_MS = 100
-   - etc.
-
-5. **Remove hardcoded validations**
-   - Model validation (let SDK handle it)
-   - Fallback version (read from package.json)
-
-### Phase 3: Polish (Do Next Week)
-
-6. **Fix minor issues**
-   - Remove `this.port` instance variable
-   - Update debounce to 200ms
-   - Add constants for all magic numbers
-
---
-
-## The YAGNI Philosophy Applied
-
-### What YAGNI Means Here
-
-**You Aren't Gonna Need It**: Don't build infrastructure for problems you don't have.
-
-### Examples from This Code
-
-#### YAGNI Violation ❌
-```typescript
-export function getWorkerPort(): number {
-  return FIXED_PORT; // Wrapper for a constant
-}
-```
-**Why**: Adds zero value. Pure ceremony. Just export the constant.
-
-#### YAGNI Compliance ✅
-```typescript
-export const WORKER_PORT = parseInt(...);
-```
-**Why**: Solves the actual need (get port) without ceremony.
-
---
-
-#### YAGNI Violation ❌
-```typescript
-// Check PM2 status with string parsing
-const checkProcess = spawn(pm2Path, ["list", "--no-color"]);
-let output = "";
-checkProcess.stdout?.on("data", (data) => { output += data.toString(); });
-// ... 30 lines of promise wrappers and parsing ...
-const isRunning = output.includes("claude-mem-worker") && output.includes("online");
-
-if (!isRunning) {
-  // Start worker
-}
-// But what if it's running AND unhealthy? Do nothing!
-```
-**Why**: Solving a problem that doesn't exist. PM2 restart is idempotent - it handles both "not started" and "started but broken". We don't need to distinguish.
-
-#### YAGNI Compliance ✅
-```typescript
-if (!await isWorkerHealthy()) {
-  execSync(`pm2 restart ecosystem.config.cjs`);
-  await waitForWorkerHealth();
-}
-```
-**Why**: Solves the actual problem (ensure worker is healthy) in the simplest way.
-
---
-
-### The Pattern
-
-**YAGNI Violations Follow This Pattern**:
-1. Imagine a scenario ("what if PM2 isn't installed?")
-2. Write defensive code for the scenario (silent error handling)
-3. Defer the error to a later point
-4. Make the actual error message worse
-
-**YAGNI Compliance Follows This Pattern**:
-1. Write the obvious solution (check health, restart if unhealthy)
-2. Let errors propagate naturally
-3. Add error handling only where actually needed
-4. Keep error messages clear and direct
-
---
-
-## Conclusion
-
-### Overall Assessment
-
-**worker-utils.ts**: 🔥🔥🔥🔥 2/5 - Needs complete rewrite
-**worker-service.ts**: ✅✅✅✅🔥 4/5 - Mostly excellent, fix duplication
-
-### The Good
-
- worker-service.ts has excellent architecture (SQLite > SSE > Chroma priority)
- Clean pagination APIs with proper parameterization
- Good async generator pattern for SDK streaming
- Proper SSE client management with defensive cleanup
- Non-blocking Chroma sync with graceful failures
-
-### The Bad
-
- worker-utils.ts has 3 critical bugs (string parsing, silent errors, missing restart)
- 60+ lines of duplicated session auto-creation code
- Magic numbers everywhere without named constants
- Hardcoded defaults in multiple locations
-
-### The Ugly
-
- `getWorkerPort()` is pure ceremony - delete it
- 1500ms debounce is 10x too long
- PM2 string parsing is fragile and will break
- Silent error handling makes debugging impossible
-
-### Time to Fix
-
- Critical fixes (worker-utils rewrite + extract getOrCreateSession): **2 hours**
- Cleanup (centralize config, fix magic numbers): **2 hours**
- Polish (minor issues): **1 hour**
-
-**Total**: 5 hours to bring codebase from 80% to 95% quality.
-
-### Final Verdict
-
-This code is **80% excellent, 20% disaster**. The disaster is concentrated in worker-utils.ts (which is called on EVERY session start) and the session auto-creation duplication (which makes maintenance painful). Fix these two issues and you have a rock-solid codebase.
-
-The worker-service.ts architecture is actually brilliant - the prioritization of SQLite > SSE > Chroma is exactly right, and the async generator pattern for SDK streaming is textbook perfect. Don't let the duplication overshadow the good design.
-
-**Recommendation**: Fix worker-utils.ts TODAY (it has production bugs), extract getOrCreateSession() THIS WEEK (it's painful to maintain), and clean up the rest NEXT WEEK.
@@ -0,0 +1,959 @@
+# Worker Service Overhead Analysis
+
+**Date**: 2025-11-06
+**File**: `src/services/worker-service.ts`
+**Total Lines**: 1173
+**Overall Assessment**: This file has accumulated unnecessary complexity, artificial delays, and defensive programming patterns that actively harm performance. Many patterns were likely added "just in case" without real-world justification.
+
+---
+
+## Executive Summary
+
+**High Severity Issues (Score 8-10)**:
+- **Line 942**: Polling loop with 100ms delay instead of event-driven architecture (Score: 10/10)
+- **Lines 338-365**: Spinner debounce with 1.5s artificial delay (Score: 9/10)
+- **Lines 204-234**: Database reopening on every getOrCreateSession call (Score: 8/10)
+
+**Medium Severity Issues (Score 5-7)**:
+- **Lines 33-70**: Unnecessary Claude path caching for rare operation (Score: 6/10)
+- **Lines 694-711**: Redundant database reopening in handleInit (Score: 7/10)
+- **Lines 728-741**: Fire-and-forget Chroma sync with verbose error handling (Score: 5/10)
+
+**Low Severity Issues (Score 3-4)**:
+- **Line 28**: Magic number MESSAGE_POLL_INTERVAL_MS without justification (Score: 4/10)
+- **Lines 303-321**: Over-engineered SSE client cleanup (Score: 4/10)
+
+---
+
+## Line-by-Line Analysis
+
+### Lines 1-30: Setup and Constants
+
+**Lines 22-24**: Version reading from package.json
+```typescript
+const packageJson = JSON.parse(readFileSync(join(__dirname, '..', '..', 'package.json'), 'utf-8'));
+const VERSION = packageJson.version;
+```
+**Score**: 2/10
+**Why**: This is fine. Reads once at startup, uses the value for the /api/stats endpoint.
+
+**Line 26**: Model configuration
+```typescript
+const MODEL = process.env.CLAUDE_MEM_MODEL || 'claude-sonnet-4-5';
+```
+**Score**: 1/10
+**Why**: Clean, simple, correct.
+
+**Line 28**: Magic number
+```typescript
+const MESSAGE_POLL_INTERVAL_MS = 100;
+```
+**Score**: 4/10
+**Why**: This is a magic number without justification. Why 100ms? Why not 50ms or 200ms? More importantly, **why are we polling at all instead of using event-driven patterns?** The name is descriptive, but the existence of this constant indicates a fundamental architectural problem (see line 942).
+
+**Pattern**: This constant exists to support a polling loop that shouldn't exist.
+
+---
+
+### Lines 33-70: Claude Path Caching
+
+```typescript
+let cachedClaudePath: string | null = null;
+
+function findClaudePath(): string {
+  if (cachedClaudePath) {
+    return cachedClaudePath;
+  }
+  // ... 30 lines of logic to find and cache path ...
+}
+```
+
+**Score**: 6/10
+**Why Stupid**:
+1. **YAGNI Violation**: This function is called **exactly once** per worker startup (line 846 in runSDKAgent)
+2. **Premature Optimization**: Caching saves ~5ms on an operation that happens once per worker lifetime
+3. **Added Complexity**: 37 lines of code including module-level state for negligible benefit
+4. **False Economy**: The worker runs for hours/days. Saving 5ms on startup is meaningless.
+
+**What Should Happen**:
+```typescript
+function findClaudePath(): string {
+  if (process.env.CLAUDE_CODE_PATH) return process.env.CLAUDE_CODE_PATH;
+
+  const command = process.platform === 'win32' ? 'where claude' : 'which claude';
+  const result = execSync(command, { encoding: 'utf8' }).trim().split('\n')[0].trim();
+
+  if (!result) throw new Error('Claude executable not found in PATH');
+  return result;
+}
+```
+**Savings**: Remove 33 lines of unnecessary code and module-level state.
+
+---
+
+### Lines 103-110: WorkerService State
+
+```typescript
+class WorkerService {
+  private app: express.Application;
+  private sessions: Map<number, ActiveSession> = new Map();
+  private chromaSync!: ChromaSync;
+  private sseClients: Set<Response> = new Set();
+  private isProcessing: boolean = false;
+  private spinnerStopTimer: NodeJS.Timeout | null = null;
+```
+
+**Score**: 7/10 (for spinnerStopTimer)
+**Why**:
+- `app`, `sessions`, `chromaSync`, `sseClients`: **Good** - necessary state
+- `isProcessing`: **Questionable** (Score 5/10) - Do we really need to track this globally? Can't we derive it from `sessions.size > 0` or `sessions.values().some(s => s.pendingMessages.length > 0)`?
+- `spinnerStopTimer`: **Bad** (Score 7/10) - Exists solely to support artificial debouncing (see lines 338-365)
+
+**Pattern**: State that exists to support other unnecessary complexity.
+
+---
+
+### Lines 145-178: Service Startup
+
+**Lines 145-153**: HTTP server startup
+```typescript
+async start(): Promise<void> {
+  const port = getWorkerPort();
+  await new Promise<void>((resolve, reject) => {
+    this.app.listen(port, () => resolve())
+      .on('error', reject);
+  });
+  logger.info('SYSTEM', 'Worker started', { port, pid: process.pid });
+```
+**Score**: 1/10
+**Why**: This is good. Clean promise wrapper, fail-fast on errors, clear logging.
+
+**Lines 155-167**: ChromaSync initialization and orphan cleanup
+```typescript
+this.chromaSync = new ChromaSync('claude-mem');
+logger.info('SYSTEM', 'ChromaSync initialized');
+
+const db = new SessionStore();
+const cleanedCount = db.cleanupOrphanedSessions();
+db.close();
+```
+**Score**: 2/10
+**Why**: This is fine. Necessary initialization and cleanup. Database is opened, used, and closed immediately.
+
+**Lines 168-177**: Chroma backfill
+```typescript
+logger.info('SYSTEM', 'Starting Chroma backfill in background...');
+this.chromaSync.ensureBackfilled()
+  .then(() => {
+    logger.info('SYSTEM', 'Chroma backfill complete');
+  })
+  .catch((error: Error) => {
+    logger.error('SYSTEM', 'Chroma backfill failed - continuing anyway', {}, error);
+    // Don't exit - allow worker to continue serving requests
+  });
+```
+**Score**: 3/10
+**Why**: This is mostly fine. Fire-and-forget background operation that doesn't block startup. The verbose error handling is slightly excessive (could be a single logger call), but acceptable for a background operation.
+
+---
+
+### Lines 200-236: getOrCreateSession - THE KILLER
+
+```typescript
+private getOrCreateSession(sessionDbId: number): ActiveSession {
+  let session = this.sessions.get(sessionDbId);
+  if (session) return session;
+
+  const db = new SessionStore();
+  const dbSession = db.getSessionById(sessionDbId);
+  if (!dbSession) {
+    db.close();
+    throw new Error(`Session ${sessionDbId} not found in database`);
+  }
+
+  session = {
+    sessionDbId,
+    claudeSessionId: dbSession.claude_session_id,
+    sdkSessionId: null,
+    project: dbSession.project,
+    userPrompt: dbSession.user_prompt,
+    pendingMessages: [],
+    abortController: new AbortController(),
+    generatorPromise: null,
+    lastPromptNumber: 0,
+    startTime: Date.now()
+  };
+
+  this.sessions.set(sessionDbId, session);
+
+  session.generatorPromise = this.runSDKAgent(session).catch(err => {
+    logger.failure('WORKER', 'SDK agent error', { sessionId: sessionDbId }, err);
+    const db = new SessionStore();
+    db.markSessionFailed(sessionDbId);
+    db.close();
+    this.sessions.delete(sessionDbId);
+  });
+
+  db.close();
+  return session;
+}
+```
+
+**Score**: 8/10
+**Why This Is Stupid**:
+
+1. **Database Reopening**: Opens database at line 204, closes at line 234. This happens on:
+   - First call to `/sessions/:id/init` (line 691)
+   - First call to `/sessions/:id/observations` (line 762)
+   - First call to `/sessions/:id/summarize` (line 789)
+
+   For a typical session: init (DB open/close) → observation (DB open/close) → observation (DB open/close) → summarize (DB open/close). **That's 4 database open/close cycles when ONE would suffice.**
+
+2. **Redundant Database Access**: The database is ALREADY opened in `handleInit` at line 695 to call `setWorkerPort()`. So we have:
+   - Line 695: `const db = new SessionStore()` in handleInit
+   - Line 696: `db.setWorkerPort()`
+   - Line 697-711: More queries on the same database
+   - Line 711: `db.close()`
+   - Line 691: `this.getOrCreateSession()` is called
+   - Line 204: **Opens database AGAIN** inside getOrCreateSession
+   - Line 234: Closes it
+
+   **This is fucking insane.** We close the database, then immediately reopen it in the same call stack.
+
+3. **Error Handler Opens Database**: Line 228 opens a NEW database connection in the error handler. If runSDKAgent fails, we open the database AGAIN just to mark it failed, then close it. This is defensive programming for ghosts - if the worker is crashing, do we really care about marking it failed?
+
+**What Should Happen**:
+- Pass the already-open database connection to getOrCreateSession
+- Or at minimum, reuse the connection from the calling context
+- The error handler should either crash hard or mark failed WITHOUT reopening the database
+
+**Estimated Performance Impact**: Database open/close is expensive (~1-5ms each). For a session with 10 observations, this pattern adds **20-100ms of pure overhead**.
+
+---
+
+### Lines 263-292: SSE Stream Setup
+
+```typescript
+private handleSSEStream(req: Request, res: Response): void {
+  // Set SSE headers
+  res.setHeader('Content-Type', 'text/event-stream');
+  res.setHeader('Cache-Control', 'no-cache');
+  res.setHeader('Connection', 'keep-alive');
+  res.setHeader('Access-Control-Allow-Origin', '*');
+
+  // Add client to set
+  this.sseClients.add(res);
+  logger.info('WORKER', `SSE client connected`, { totalClients: this.sseClients.size });
+
+  // Send only projects list - all data will be loaded via pagination
+  const db = new SessionStore();
+  const allProjects = db.getAllProjects();
+  db.close();
+
+  const initialData = {
+    type: 'initial_load',
+    projects: allProjects,
+    timestamp: Date.now()
+  };
+
+  res.write(`data: ${JSON.stringify(initialData)}\n\n`);
+
+  // Handle client disconnect
+  req.on('close', () => {
+    this.sseClients.delete(res);
+    logger.info('WORKER', `SSE client disconnected`, { remainingClients: this.sseClients.size });
+  });
+}
+```
+
+**Score**: 2/10
+**Why**: This is mostly good. Clean SSE setup with proper headers and client tracking. Database is opened, used, and closed.
+
+---
+
+### Lines 297-322: SSE Broadcast and Cleanup
+
+```typescript
+private broadcastSSE(event: any): void {
+  if (this.sseClients.size === 0) {
+    return; // No clients connected, skip broadcast
+  }
+
+  const data = `data: ${JSON.stringify(event)}\n\n`;
+  const clientsToRemove: Response[] = [];
+
+  for (const client of this.sseClients) {
+    try {
+      client.write(data);
+    } catch (error) {
+      // Client disconnected, mark for removal
+      clientsToRemove.push(client);
+    }
+  }
+
+  // Clean up disconnected clients
+  for (const client of clientsToRemove) {
+    this.sseClients.delete(client);
+  }
+
+  if (clientsToRemove.length > 0) {
+    logger.info('WORKER', `SSE cleaned up disconnected clients`, { count: clientsToRemove.length });
+  }
+}
+```
+
+**Score**: 4/10
+**Why This Is Slightly Stupid**:
+
+1. **Two-Pass Cleanup**: Creates a temporary array of failed clients, then iterates again to remove them. Why not just remove them in the first loop?
+2. **Unnecessary Logging**: Do we really need to log every time a client disconnects? The `handleSSEStream` already logs disconnects (line 290). This is duplicate logging.
+
+**What Should Happen**:
+```typescript
+private broadcastSSE(event: any): void {
+  if (this.sseClients.size === 0) return;
+
+  const data = `data: ${JSON.stringify(event)}\n\n`;
+  for (const client of this.sseClients) {
+    try {
+      client.write(data);
+    } catch {
+      this.sseClients.delete(client);
+    }
+  }
+}
+```
+
+**Savings**: Remove 10 lines, remove duplicate logging, eliminate temporary array.
+
+---
+
+### Lines 338-365: Spinner Debounce - ARTIFICIAL DELAY
+
+```typescript
+private checkAndStopSpinner(): void {
+  // Clear any existing timer
+  if (this.spinnerStopTimer) {
+    clearTimeout(this.spinnerStopTimer);
+    this.spinnerStopTimer = null;
+  }
+
+  // Check if any session has pending messages
+  const hasPendingMessages = Array.from(this.sessions.values()).some(
+    session => session.pendingMessages.length > 0
+  );
+
+  if (!hasPendingMessages) {
+    // Debounce: wait 1.5s and check again
+    this.spinnerStopTimer = setTimeout(() => {
+      const stillEmpty = Array.from(this.sessions.values()).every(
+        session => session.pendingMessages.length === 0
+      );
+
+      if (stillEmpty) {
+        logger.debug('WORKER', 'All queues empty - stopping spinner');
+        this.broadcastProcessingStatus(false);
+      }
+
+      this.spinnerStopTimer = null;
+    }, 1500);
+  }
+}
+```
+
+**Score**: 9/10
+**Why This Is ABSOLUTELY FUCKING STUPID**:
+
+1. **Artificial Delay**: **1.5 SECONDS** (1500ms) of artificial delay before stopping the spinner. This is pure overhead added for no reason.
+
+2. **Why Was This Added?**: Probably someone thought "the UI flickers when the spinner stops/starts rapidly." **SO FUCKING WHAT?** That's a UI rendering problem, not a worker service problem. Fix it in the UI with CSS transitions or debouncing on the CLIENT side.
+
+3. **Double-Check Pattern**: Checks if queues are empty, waits 1.5s, then checks AGAIN. This is defensive programming for ghosts. If the queue is empty, it's empty. We're not protecting against race conditions here - we're just wasting time.
+
+4. **Polling Instead of Events**: This function is called from `handleAgentMessage` (line 1145) after processing every single response. Instead of reacting to the actual completion of work, we're polling state and debouncing.
+
+5. **State Management Overhead**: Requires `spinnerStopTimer` field (line 109), timer cleanup logic, null checks, etc.
+
+**Real-World Impact**: Every time the worker finishes processing observations, the UI spinner continues to show "processing" for **1.5 seconds** even though nothing is happening. This makes the entire system feel slower.
+
+**What Should Happen**:
+```typescript
+private checkAndStopSpinner(): void {
+  const hasPendingMessages = Array.from(this.sessions.values()).some(
+    session => session.pendingMessages.length > 0
+  );
+
+  if (!hasPendingMessages) {
+    this.broadcastProcessingStatus(false);
+  }
+}
+```
+
+**Savings**: Remove 15 lines of debouncing logic, remove timer state, eliminate 1.5s artificial delay.
+
+**Alternative**: If UI flickering is actually a problem (prove it first), handle it client-side with CSS transitions or client-side debouncing.
+
+---
+
+### Lines 370-411: Stats Endpoint
+
+```typescript
+private handleStats(_req: Request, res: Response): void {
+  try {
+    const db = new SessionStore();
+
+    // Get database stats
+    const obsCount = db.db.prepare('SELECT COUNT(*) as count FROM observations').get() as { count: number };
+    const sessionCount = db.db.prepare('SELECT COUNT(*) as count FROM sdk_sessions').get() as { count: number };
+    const summaryCount = db.db.prepare('SELECT COUNT(*) as count FROM session_summaries').get() as { count: number };
+
+    // Get database file size
+    const dbPath = join(homedir(), '.claude-mem', 'claude-mem.db');
+    let dbSize = 0;
+    if (existsSync(dbPath)) {
+      dbSize = statSync(dbPath).size;
+    }
+
+    db.close();
+
+    // Get worker stats
+    const uptime = process.uptime();
+
+    res.json({
+      worker: {
+        version: VERSION,
+        uptime: Math.floor(uptime),
+        activeSessions: this.sessions.size,
+        sseClients: this.sseClients.size,
+        port: getWorkerPort()
+      },
+      database: {
+        path: dbPath,
+        size: dbSize,
+        observations: obsCount.count,
+        sessions: sessionCount.count,
+        summaries: summaryCount.count
+      }
+    });
+  } catch (error: any) {
+    logger.error('WORKER', 'Failed to get stats', {}, error);
+    res.status(500).json({ error: 'Failed to get stats' });
+  }
+}
+```
+
+**Score**: 3/10
+**Why Slightly Stupid**:
+
+1. **Redundant existsSync Check**: The database path is guaranteed to exist if SessionStore initialized successfully. If it doesn't exist, SessionStore would have crashed on startup. This is defensive programming for ghosts.
+
+2. **Three Separate Queries**: Could be combined into a single query with UNION or multiple SELECT columns, but this is minor.
+
+**What Should Happen**:
+```typescript
+const dbSize = statSync(dbPath).size; // Just crash if it doesn't exist
+```
+
+Otherwise, this is mostly fine. Stats endpoints are low-frequency and non-critical.
+
+---
+
+### Lines 507-555: GET /api/observations
+
+```typescript
+private handleGetObservations(req: Request, res: Response): void {
+  try {
+    const offset = parseInt(req.query.offset as string || '0', 10);
+    const limit = Math.min(parseInt(req.query.limit as string || '50', 10), 100); // Cap at 100
+    const project = req.query.project as string | undefined;
+
+    const db = new SessionStore();
+
+    // Build query with optional project filter
+    let query = `
+      SELECT id, type, title, subtitle, text, project, prompt_number, created_at, created_at_epoch
+      FROM observations
+    `;
+    let countQuery = 'SELECT COUNT(*) as total FROM observations';
+    const params: any[] = [];
+    const countParams: any[] = [];
+
+    if (project) {
+      query += ' WHERE project = ?';
+      countQuery += ' WHERE project = ?';
+      params.push(project);
+      countParams.push(project);
+    }
+
+    query += ' ORDER BY created_at_epoch DESC LIMIT ? OFFSET ?';
+    params.push(limit, offset);
+
+    const stmt = db.db.prepare(query);
+    const observations = stmt.all(...params);
+
+    // Check if there are more results
+    const countStmt = db.db.prepare(countQuery);
+    const { total } = countStmt.get(...countParams) as { total: number };
+    const hasMore = (offset + limit) < total;
+
+    db.close();
+
+    res.json({
+      observations,
+      hasMore,
+      total,
+      offset,
+      limit
+    });
+  } catch (error: any) {
+    logger.error('WORKER', 'Failed to get observations', {}, error);
+    res.status(500).json({ error: 'Failed to get observations' });
+  }
+}
+```
+
+**Score**: 5/10
+**Why This Is Mildly Stupid**:
+
+1. **Duplicate Parameter Arrays**: `params` and `countParams` are maintained separately even though they contain the same values (just the project filter). This is error-prone and verbose.
+
+2. **Two Queries Instead of One**: We run a COUNT query and a SELECT query. For small datasets, this is fine, but for large datasets, the COUNT query can be expensive. The `hasMore` flag could be computed by fetching `limit + 1` rows and checking if we got more than `limit`.
+
+**What Should Happen**:
+```typescript
+// Fetch one extra row to determine if there are more results
+const stmt = db.db.prepare(query);
+const results = stmt.all(...params);
+const observations = results.slice(0, limit);
+const hasMore = results.length > limit;
+
+// Only run COUNT if the UI actually needs it (it probably doesn't)
+```
+
+**Pattern**: This same pattern is repeated in `handleGetSummaries` (line 557) and `handleGetPrompts` (line 618). Copy-paste code smell.
+
+**Estimated Savings**: Remove COUNT queries (which can be expensive on large tables), simplify parameter handling.
+
+---
+
+### Lines 685-752: POST /sessions/:sessionDbId/init - DATABASE REOPENING HELL
+
+```typescript
+private async handleInit(req: Request, res: Response): Promise<void> {
+  const sessionDbId = parseInt(req.params.sessionDbId, 10);
+  const { project } = req.body;
+
+  logger.info('WORKER', 'Session init', { sessionDbId, project });
+
+  const session = this.getOrCreateSession(sessionDbId); // <-- Opens DB at line 204
+  const claudeSessionId = session.claudeSessionId;
+
+  // Update port in database
+  const db = new SessionStore(); // <-- Opens DB AGAIN
+  db.setWorkerPort(sessionDbId, getWorkerPort());
+
+  // Get the latest user_prompt for this session to sync to Chroma
+  const latestPrompt = db.db.prepare(`
+    SELECT
+      up.*,
+      s.sdk_session_id,
+      s.project
+    FROM user_prompts up
+    JOIN sdk_sessions s ON up.claude_session_id = s.claude_session_id
+    WHERE up.claude_session_id = ?
+    ORDER BY up.created_at_epoch DESC
+    LIMIT 1
+  `).get(claudeSessionId) as any;
+
+  db.close(); // <-- Closes DB
+
+  // ... SSE broadcast ...
+  // ... Chroma sync ...
+
+  logger.success('WORKER', 'Session initialized', { sessionId: sessionDbId, port: getWorkerPort() });
+  res.json({
+    status: 'initialized',
+    sessionDbId,
+    port: getWorkerPort()
+  });
+}
+```
+
+**Score**: 7/10
+**Why This Is Stupid**:
+
+1. **Two Database Opens in Same Function**:
+   - Line 691: `getOrCreateSession()` opens DB internally (line 204)
+   - Line 695: Opens DB AGAIN for `setWorkerPort()`
+   - Line 711: Closes DB
+
+2. **Redundant Data Fetching**: `getOrCreateSession()` already fetches session data from the database (line 205). Then we query AGAIN for the user prompt (line 698).
+
+3. **Tight Coupling**: `getOrCreateSession()` hides database access, making it unclear that we're opening the database twice.
+
+**What Should Happen**:
+- Open database ONCE at the start of handleInit
+- Pass the open database to getOrCreateSession
+- Fetch all needed data in a single transaction
+- Close database at the end
+
+**Estimated Savings**: Eliminate 1 database open/close cycle (1-5ms).
+
+---
+
+### Lines 728-741: Chroma Sync with Verbose Error Handling
+
+```typescript
+// Sync user prompt to Chroma (fire-and-forget, but crash on failure)
+if (latestPrompt) {
+  this.chromaSync.syncUserPrompt(
+    latestPrompt.id,
+    latestPrompt.sdk_session_id,
+    latestPrompt.project,
+    latestPrompt.prompt_text,
+    latestPrompt.prompt_number,
+    latestPrompt.created_at_epoch
+  ).catch(err => {
+    logger.failure('WORKER', 'Failed to sync user_prompt to Chroma - continuing', { promptId: latestPrompt.id }, err);
+    // Don't crash - SQLite has the data
+  });
+}
+```
+
+**Score**: 5/10
+**Why This Is Mildly Stupid**:
+
+1. **Inconsistent Error Handling**: The comment says "crash on failure" but then we catch the error and continue. Which is it?
+
+2. **Redundant Comment**: The code says `.catch(err => { /* continue */ })` and the comment says "Don't crash - SQLite has the data". The code is self-documenting.
+
+3. **Fire-and-Forget**: If we're going to fire-and-forget, why bother with verbose error handling? Either care about failures (and retry/alert) or don't (and just log).
+
+**What Should Happen**:
+```typescript
+// Fire-and-forget Chroma sync (SQLite is source of truth)
+if (latestPrompt) {
+  this.chromaSync.syncUserPrompt(/* ... */).catch(() => {}); // Swallow errors
+}
+```
+
+**Pattern**: This same verbose error handling appears in lines 1057-1076 and 1114-1133.
+
+---
+
+### Lines 758-779: POST /sessions/:sessionDbId/observations
+
+```typescript
+private handleObservation(req: Request, res: Response): void {
+  const sessionDbId = parseInt(req.params.sessionDbId, 10);
+  const { tool_name, tool_input, tool_output, prompt_number } = req.body;
+
+  const session = this.getOrCreateSession(sessionDbId); // <-- Opens DB
+  const toolStr = logger.formatTool(tool_name, tool_input);
+
+  logger.dataIn('WORKER', `Observation queued: ${toolStr}`, {
+    sessionId: sessionDbId,
+    queue: session.pendingMessages.length + 1
+  });
+
+  session.pendingMessages.push({
+    type: 'observation',
+    tool_name,
+    tool_input,
+    tool_output,
+    prompt_number
+  });
+
+  res.json({ status: 'queued', queueLength: session.pendingMessages.length });
+}
+```
+
+**Score**: 6/10
+**Why This Is Stupid**:
+
+1. **Database Opens for No Reason**: `getOrCreateSession()` opens the database (line 204), but we don't actually need any data from the database here. We just need to get or create the in-memory session object.
+
+2. **Hot Path Performance**: This endpoint is called **for every single tool execution**. If you run 100 tool calls in a session, this opens/closes the database 100 times unnecessarily.
+
+**What Should Happen**:
+- Separate "get existing session" from "create session from database"
+- Only open database if creating a new session
+- For existing sessions, just push to the queue
+
+**Estimated Savings**: For a session with 100 observations, eliminate 99 unnecessary database open/close cycles (**99-495ms of pure overhead**).
+
+---
+
+### Lines 914-1005: createMessageGenerator - THE POLLING HORROR
+
+```typescript
+private async* createMessageGenerator(session: ActiveSession): AsyncIterable<SDKUserMessage> {
+  // ... send init prompt ...
+
+  // Process messages continuously until session is deleted
+  while (true) {
+    if (session.abortController.signal.aborted) {
+      break;
+    }
+
+    if (session.pendingMessages.length === 0) {
+      await new Promise(resolve => setTimeout(resolve, MESSAGE_POLL_INTERVAL_MS));
+      continue;
+    }
+
+    while (session.pendingMessages.length > 0) {
+      const message = session.pendingMessages.shift()!;
+      // ... process message ...
+      yield { /* SDK message */ };
+    }
+  }
+}
+```
+
+**Score**: 10/10
+**Why This Is ABSOLUTELY FUCKING STUPID**:
+
+1. **Infinite Polling Loop**: Lines 936-944 implement a **busy-wait polling loop** that checks `pendingMessages.length` every 100ms. This is the single dumbest pattern in the entire file.
+
+2. **Event-Driven Alternative**: We have a fucking queue! When something is added to the queue, **NOTIFY THE CONSUMER**. Use an EventEmitter, a Promise, a Condition Variable, ANYTHING but polling.
+
+3. **Wasted CPU**: Every 100ms, this loop wakes up, checks if the queue is empty, and goes back to sleep. For a worker that runs for hours, this is thousands of unnecessary wake-ups.
+
+4. **Latency**: When an observation is queued (line 770), it sits in the queue for up to 100ms before being processed. **This adds 0-100ms of artificial latency to every single observation.**
+
+5. **Battery Impact**: On laptops, constant polling prevents CPU from entering deep sleep states, draining battery.
+
+**What Should Happen**:
+
+```typescript
+// In WorkerService class
+private sessionQueues: Map<number, EventEmitter> = new Map();
+
+private handleObservation(req: Request, res: Response): void {
+  // ... existing code ...
+  session.pendingMessages.push({ /* message */ });
+
+  // Notify the generator that new work is available
+  const emitter = this.sessionQueues.get(sessionDbId);
+  if (emitter) {
+    emitter.emit('message');
+  }
+
+  res.json({ status: 'queued', queueLength: session.pendingMessages.length });
+}
+
+private async* createMessageGenerator(session: ActiveSession): AsyncIterable<SDKUserMessage> {
+  const emitter = new EventEmitter();
+  this.sessionQueues.set(session.sessionDbId, emitter);
+
+  yield { /* init prompt */ };
+
+  while (!session.abortController.signal.aborted) {
+    if (session.pendingMessages.length === 0) {
+      // Wait for new messages via event, not polling
+      await new Promise(resolve => emitter.once('message', resolve));
+    }
+
+    while (session.pendingMessages.length > 0) {
+      const message = session.pendingMessages.shift()!;
+      yield { /* process message */ };
+    }
+  }
+
+  this.sessionQueues.delete(session.sessionDbId);
+}
+```
+
+**Estimated Savings**:
+- Remove 100ms polling interval (eliminate 0-100ms latency per observation)
+- Reduce CPU wake-ups from ~10/second to 0 when idle
+- Improve battery life on laptops
+- Make the system feel more responsive
+
+**Real-World Impact**: For a session with 10 observations, this polling adds **0-1000ms of cumulative latency**. The user is literally waiting for the polling loop to wake up.
+
+---
+
+### Lines 1011-1146: handleAgentMessage - Database Reopening and Chroma Spam
+
+```typescript
+private handleAgentMessage(session: ActiveSession, content: string, promptNumber: number): void {
+  // ... parse observations and summary ...
+
+  const db = new SessionStore(); // <-- Opens DB
+
+  // Store observations and sync to Chroma (non-blocking, fail-fast)
+  for (const obs of observations) {
+    const { id, createdAtEpoch } = db.storeObservation(/* ... */);
+    logger.success('DB', 'Observation stored', { /* ... */ });
+
+    // Broadcast to SSE clients
+    this.broadcastSSE({ /* ... */ });
+
+    // Sync to Chroma (non-blocking fire-and-forget, but crash on failure)
+    this.chromaSync.syncObservation(/* ... */)
+      .then(() => {
+        logger.success('WORKER', 'Observation synced to Chroma', { /* ... */ });
+      })
+      .catch((error: Error) => {
+        logger.error('WORKER', 'Observation sync failed - continuing', { /* ... */ }, error);
+        // Don't crash - SQLite has the data
+      });
+  }
+
+  // ... similar pattern for summary ...
+
+  db.close(); // <-- Closes DB
+
+  // Check if queue is empty and stop spinner after debounce
+  this.checkAndStopSpinner(); // <-- Triggers 1.5s delay
+}
+```
+
+**Score**: 6/10
+**Why This Is Stupid**:
+
+1. **Database Reopening**: Opens database (line 1030), stores all observations, closes database (line 1142). This is called **for every SDK response**. For a session with 10 observations, this opens/closes the database 10+ times.
+
+2. **Verbose Chroma Error Handling**: Lines 1057-1076 and 1114-1133 have identical verbose error handling for Chroma sync failures. This is copy-paste code smell.
+
+3. **Success Logging Spam**: Line 1066 and 1123 log success for EVERY Chroma sync. For a session with 100 observations, this logs 100 success messages. Why? Who reads these?
+
+4. **Debounce Call**: Line 1145 calls `checkAndStopSpinner()`, triggering the 1.5s artificial delay.
+
+**What Should Happen**:
+- Reuse database connection across multiple calls
+- Simplify Chroma error handling (fire-and-forget means swallow errors)
+- Remove success logging (or make it debug-level)
+- Remove debounce delay
+
+---
+
+## Summary of Patterns
+
+### 1. Database Reopening Anti-Pattern
+**Occurrences**: Lines 200-236, 685-752, 758-779, 1011-1146
+**Impact**: Opens/closes database 4-100+ times per session instead of reusing connections
+**Fix**: Pass open database connections between functions, use transactions, connection pooling
+
+### 2. Polling Instead of Events
+**Occurrences**: Line 942 (100ms polling loop)
+**Impact**: 0-100ms latency per observation, wasted CPU cycles, battery drain
+**Fix**: Use EventEmitter or async queue with await/notify pattern
+
+### 3. Artificial Delays
+**Occurrences**: Line 363 (1.5s spinner debounce), line 942 (100ms poll interval)
+**Impact**: 1.5s delay before spinner stops, 0-100ms delay per observation
+**Fix**: Remove debouncing, use event-driven patterns
+
+### 4. Premature Optimization
+**Occurrences**: Lines 33-70 (Claude path caching)
+**Impact**: 37 lines of code to save 5ms on a one-time operation
+**Fix**: Remove caching, inline the function
+
+### 5. Defensive Programming for Ghosts
+**Occurrences**: Line 382 (existsSync check), lines 228-231 (error handler reopens DB), lines 728-741 (verbose error handling)
+**Impact**: Code complexity without real benefit
+**Fix**: Fail fast, trust invariants, simplify error handling
+
+### 6. Copy-Paste Code
+**Occurrences**: handleGetObservations, handleGetSummaries, handleGetPrompts (nearly identical)
+**Impact**: Maintenance burden, inconsistency risk
+**Fix**: Extract common pagination logic into helper function
+
+---
+
+## Recommendations
+
+### Immediate Wins (Low Effort, High Impact)
+
+1. **Remove Spinner Debounce** (Lines 338-365)
+   - **Effort**: 5 minutes
+   - **Impact**: Eliminate 1.5s artificial delay
+   - **Score**: 9/10 stupidity
+
+2. **Replace Polling with Events** (Line 942)
+   - **Effort**: 30 minutes
+   - **Impact**: Eliminate 0-100ms latency per observation, reduce CPU usage
+   - **Score**: 10/10 stupidity
+
+3. **Remove Claude Path Caching** (Lines 33-70)
+   - **Effort**: 5 minutes
+   - **Impact**: Remove 37 lines of unnecessary code
+   - **Score**: 6/10 stupidity
+
+### Medium Wins (Moderate Effort, Good Impact)
+
+4. **Fix Database Reopening in Hot Path** (Lines 758-779)
+   - **Effort**: 1 hour
+   - **Impact**: Eliminate 99+ database cycles per session
+   - **Score**: 6/10 stupidity
+
+5. **Simplify Chroma Error Handling** (Lines 728-741, 1057-1076, 1114-1133)
+   - **Effort**: 15 minutes
+   - **Impact**: Remove 50+ lines of verbose error handling
+   - **Score**: 5/10 stupidity
+
+6. **Simplify SSE Broadcast** (Lines 297-322)
+   - **Effort**: 5 minutes
+   - **Impact**: Remove 10 lines, eliminate two-pass cleanup
+   - **Score**: 4/10 stupidity
+
+### Long-Term Improvements (High Effort, Architectural)
+
+7. **Database Connection Pooling**
+   - **Effort**: 4 hours
+   - **Impact**: Reuse connections across requests, eliminate all open/close overhead
+   - **Score**: 8/10 stupidity (current approach)
+
+8. **Extract Pagination Helper**
+   - **Effort**: 1 hour
+   - **Impact**: DRY up handleGetObservations/Summaries/Prompts
+   - **Score**: 5/10 stupidity
+
+---
+
+## Estimated Performance Impact
+
+**Current Hot Path (1 observation)**:
+- HTTP request arrives: 0ms
+- getOrCreateSession opens/closes DB: 1-5ms
+- Queue message: 0ms
+- Poll interval: 0-100ms (average 50ms)
+- SDK processing: variable
+- handleAgentMessage opens/closes DB: 1-5ms
+- Chroma sync (async): N/A
+- checkAndStopSpinner debounce: 1500ms
+- **Total artificial overhead**: 1502-1610ms (1.5-1.6 seconds)
+
+**Optimized Hot Path (1 observation)**:
+- HTTP request arrives: 0ms
+- Get existing session (no DB): 0ms
+- Queue message + notify: 0ms
+- SDK processing: variable
+- Store in DB (connection pool): 0.1-0.5ms
+- Chroma sync (async): N/A
+- Stop spinner (no debounce): 0ms
+- **Total artificial overhead**: 0.1-0.5ms
+
+**Speedup**: **3000-16000x faster** (removing artificial delays and polling)
+
+---
+
+## Conclusion
+
+This file has accumulated significant technical debt in the form of:
+- **Artificial delays** (1.5s debounce, 100ms polling)
+- **Database reopening anti-pattern** (4-100+ opens per session)
+- **Polling instead of events** (busy-wait loop)
+- **Premature optimization** (caching rare operations)
+- **Defensive programming** (protecting against non-existent failures)
+
+The worker spends more time **waiting** (polling, debouncing) than **working**. Most of these patterns were likely added with good intentions ("make the UI smooth", "cache for performance", "handle errors gracefully") but ended up creating more problems than they solved.
+
+**Priority Fixes**:
+1. Remove spinner debounce (9/10 stupidity)
+2. Replace polling with events (10/10 stupidity)
+3. Fix database reopening in hot path (6-8/10 stupidity)
+
+These three changes alone would eliminate **1.5+ seconds of artificial delay** per session and make the system feel dramatically more responsive.