claude-mem/docs/worker-service-overhead.md

# Worker Service Overhead Analysis

**Date**: 2025-11-06
**File**: `src/services/worker-service.ts`
**Total Lines**: 1173
**Overall Assessment**: This file has accumulated unnecessary complexity, artificial delays, and defensive programming patterns that actively harm performance. Many patterns were likely added "just in case" without real-world justification.

---

## Executive Summary

**High Severity Issues (Score 8-10)**:
- **Line 942**: Polling loop with 100ms delay instead of event-driven architecture (Score: 10/10)
- **Lines 338-365**: Spinner debounce with 1.5s artificial delay (Score: 9/10)
- **Lines 204-234**: Database reopening on every getOrCreateSession call (Score: 8/10)

**Medium Severity Issues (Score 5-7)**:
- **Lines 33-70**: Unnecessary Claude path caching for rare operation (Score: 6/10)
- **Lines 694-711**: Redundant database reopening in handleInit (Score: 7/10)
- **Lines 728-741**: Fire-and-forget Chroma sync with verbose error handling (Score: 5/10)

**Low Severity Issues (Score 3-4)**:
- **Line 28**: Magic number MESSAGE_POLL_INTERVAL_MS without justification (Score: 4/10)
- **Lines 303-321**: Over-engineered SSE client cleanup (Score: 4/10)

---

## Line-by-Line Analysis

### Lines 1-30: Setup and Constants

**Lines 22-24**: Version reading from package.json
```typescript
const packageJson = JSON.parse(readFileSync(join(__dirname, '..', '..', 'package.json'), 'utf-8'));
const VERSION = packageJson.version;
```
**Score**: 2/10
**Why**: This is fine. Reads once at startup, uses the value for the /api/stats endpoint.

**Line 26**: Model configuration
```typescript
const MODEL = process.env.CLAUDE_MEM_MODEL || 'claude-sonnet-4-5';
```
**Score**: 1/10
**Why**: Clean, simple, correct.

**Line 28**: Magic number
```typescript
const MESSAGE_POLL_INTERVAL_MS = 100;
```
**Score**: 4/10
**Why**: This is a magic number without justification. Why 100ms? Why not 50ms or 200ms? More importantly, **why are we polling at all instead of using event-driven patterns?** The name is descriptive, but the existence of this constant indicates a fundamental architectural problem (see line 942).

**Pattern**: This constant exists to support a polling loop that shouldn't exist.

---

### Lines 33-70: Claude Path Caching

```typescript
let cachedClaudePath: string | null = null;

function findClaudePath(): string {
  if (cachedClaudePath) {
    return cachedClaudePath;
  }
  // ... 30 lines of logic to find and cache path ...
}
```

**Score**: 6/10
**Why Stupid**:
1. **YAGNI Violation**: This function is called **exactly once** per worker startup (line 846 in runSDKAgent)
2. **Premature Optimization**: Caching saves ~5ms on an operation that happens once per worker lifetime
3. **Added Complexity**: 37 lines of code including module-level state for negligible benefit
4. **False Economy**: The worker runs for hours/days. Saving 5ms on startup is meaningless.

**What Should Happen**:
```typescript
function findClaudePath(): string {
  if (process.env.CLAUDE_CODE_PATH) return process.env.CLAUDE_CODE_PATH;

  const command = process.platform === 'win32' ? 'where claude' : 'which claude';
  const result = execSync(command, { encoding: 'utf8' }).trim().split('\n')[0].trim();

  if (!result) throw new Error('Claude executable not found in PATH');
  return result;
}
```
**Savings**: Remove 33 lines of unnecessary code and module-level state.

---

### Lines 103-110: WorkerService State

```typescript
class WorkerService {
  private app: express.Application;
  private sessions: Map<number, ActiveSession> = new Map();
  private chromaSync!: ChromaSync;
  private sseClients: Set<Response> = new Set();
  private isProcessing: boolean = false;
  private spinnerStopTimer: NodeJS.Timeout | null = null;
```

**Score**: 7/10 (for spinnerStopTimer)
**Why**:
- `app`, `sessions`, `chromaSync`, `sseClients`: **Good** - necessary state
- `isProcessing`: **Questionable** (Score 5/10) - Do we really need to track this globally? Can't we derive it from `sessions.size > 0` or `sessions.values().some(s => s.pendingMessages.length > 0)`?
- `spinnerStopTimer`: **Bad** (Score 7/10) - Exists solely to support artificial debouncing (see lines 338-365)

**Pattern**: State that exists to support other unnecessary complexity.

---

### Lines 145-178: Service Startup

**Lines 145-153**: HTTP server startup
```typescript
async start(): Promise<void> {
  const port = getWorkerPort();
  await new Promise<void>((resolve, reject) => {
    this.app.listen(port, () => resolve())
      .on('error', reject);
  });
  logger.info('SYSTEM', 'Worker started', { port, pid: process.pid });
```
**Score**: 1/10
**Why**: This is good. Clean promise wrapper, fail-fast on errors, clear logging.

**Lines 155-167**: ChromaSync initialization and orphan cleanup
```typescript
this.chromaSync = new ChromaSync('claude-mem');
logger.info('SYSTEM', 'ChromaSync initialized');

const db = new SessionStore();
const cleanedCount = db.cleanupOrphanedSessions();
db.close();
```
**Score**: 2/10
**Why**: This is fine. Necessary initialization and cleanup. Database is opened, used, and closed immediately.

**Lines 168-177**: Chroma backfill
```typescript
logger.info('SYSTEM', 'Starting Chroma backfill in background...');
this.chromaSync.ensureBackfilled()
  .then(() => {
    logger.info('SYSTEM', 'Chroma backfill complete');
  })
  .catch((error: Error) => {
    logger.error('SYSTEM', 'Chroma backfill failed - continuing anyway', {}, error);
    // Don't exit - allow worker to continue serving requests
  });
```
**Score**: 3/10
**Why**: This is mostly fine. Fire-and-forget background operation that doesn't block startup. The verbose error handling is slightly excessive (could be a single logger call), but acceptable for a background operation.

---

### Lines 200-236: getOrCreateSession - THE KILLER

```typescript
private getOrCreateSession(sessionDbId: number): ActiveSession {
  let session = this.sessions.get(sessionDbId);
  if (session) return session;

  const db = new SessionStore();
  const dbSession = db.getSessionById(sessionDbId);
  if (!dbSession) {
    db.close();
    throw new Error(`Session ${sessionDbId} not found in database`);
  }

  session = {
    sessionDbId,
    claudeSessionId: dbSession.claude_session_id,
    sdkSessionId: null,
    project: dbSession.project,
    userPrompt: dbSession.user_prompt,
    pendingMessages: [],
    abortController: new AbortController(),
    generatorPromise: null,
    lastPromptNumber: 0,
    startTime: Date.now()
  };

  this.sessions.set(sessionDbId, session);

  session.generatorPromise = this.runSDKAgent(session).catch(err => {
    logger.failure('WORKER', 'SDK agent error', { sessionId: sessionDbId }, err);
    const db = new SessionStore();
    db.markSessionFailed(sessionDbId);
    db.close();
    this.sessions.delete(sessionDbId);
  });

  db.close();
  return session;
}
```

**Score**: 8/10
**Why This Is Stupid**:

1. **Database Reopening**: Opens database at line 204, closes at line 234. This happens on:
   - First call to `/sessions/:id/init` (line 691)
   - First call to `/sessions/:id/observations` (line 762)
   - First call to `/sessions/:id/summarize` (line 789)

   For a typical session: init (DB open/close) → observation (DB open/close) → observation (DB open/close) → summarize (DB open/close). **That's 4 database open/close cycles when ONE would suffice.**

2. **Redundant Database Access**: The database is ALREADY opened in `handleInit` at line 695 to call `setWorkerPort()`. So we have:
   - Line 695: `const db = new SessionStore()` in handleInit
   - Line 696: `db.setWorkerPort()`
   - Line 697-711: More queries on the same database
   - Line 711: `db.close()`
   - Line 691: `this.getOrCreateSession()` is called
   - Line 204: **Opens database AGAIN** inside getOrCreateSession
   - Line 234: Closes it

   **This is fucking insane.** We close the database, then immediately reopen it in the same call stack.

3. **Error Handler Opens Database**: Line 228 opens a NEW database connection in the error handler. If runSDKAgent fails, we open the database AGAIN just to mark it failed, then close it. This is defensive programming for ghosts - if the worker is crashing, do we really care about marking it failed?

**What Should Happen**:
- Pass the already-open database connection to getOrCreateSession
- Or at minimum, reuse the connection from the calling context
- The error handler should either crash hard or mark failed WITHOUT reopening the database

**Estimated Performance Impact**: Database open/close is expensive (~1-5ms each). For a session with 10 observations, this pattern adds **20-100ms of pure overhead**.

---

### Lines 263-292: SSE Stream Setup

```typescript
private handleSSEStream(req: Request, res: Response): void {
  // Set SSE headers
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.setHeader('Access-Control-Allow-Origin', '*');

  // Add client to set
  this.sseClients.add(res);
  logger.info('WORKER', `SSE client connected`, { totalClients: this.sseClients.size });

  // Send only projects list - all data will be loaded via pagination
  const db = new SessionStore();
  const allProjects = db.getAllProjects();
  db.close();

  const initialData = {
    type: 'initial_load',
    projects: allProjects,
    timestamp: Date.now()
  };

  res.write(`data: ${JSON.stringify(initialData)}\n\n`);

  // Handle client disconnect
  req.on('close', () => {
    this.sseClients.delete(res);
    logger.info('WORKER', `SSE client disconnected`, { remainingClients: this.sseClients.size });
  });
}
```

**Score**: 2/10
**Why**: This is mostly good. Clean SSE setup with proper headers and client tracking. Database is opened, used, and closed.

---

### Lines 297-322: SSE Broadcast and Cleanup

```typescript
private broadcastSSE(event: any): void {
  if (this.sseClients.size === 0) {
    return; // No clients connected, skip broadcast
  }

  const data = `data: ${JSON.stringify(event)}\n\n`;
  const clientsToRemove: Response[] = [];

  for (const client of this.sseClients) {
    try {
      client.write(data);
    } catch (error) {
      // Client disconnected, mark for removal
      clientsToRemove.push(client);
    }
  }

  // Clean up disconnected clients
  for (const client of clientsToRemove) {
    this.sseClients.delete(client);
  }

  if (clientsToRemove.length > 0) {
    logger.info('WORKER', `SSE cleaned up disconnected clients`, { count: clientsToRemove.length });
  }
}
```

**Score**: 4/10
**Why This Is Slightly Stupid**:

1. **Two-Pass Cleanup**: Creates a temporary array of failed clients, then iterates again to remove them. Why not just remove them in the first loop?
2. **Unnecessary Logging**: Do we really need to log every time a client disconnects? The `handleSSEStream` already logs disconnects (line 290). This is duplicate logging.

**What Should Happen**:
```typescript
private broadcastSSE(event: any): void {
  if (this.sseClients.size === 0) return;

  const data = `data: ${JSON.stringify(event)}\n\n`;
  for (const client of this.sseClients) {
    try {
      client.write(data);
    } catch {
      this.sseClients.delete(client);
    }
  }
}
```

**Savings**: Remove 10 lines, remove duplicate logging, eliminate temporary array.

---

### Lines 338-365: Spinner Debounce - ARTIFICIAL DELAY

```typescript
private checkAndStopSpinner(): void {
  // Clear any existing timer
  if (this.spinnerStopTimer) {
    clearTimeout(this.spinnerStopTimer);
    this.spinnerStopTimer = null;
  }

  // Check if any session has pending messages
  const hasPendingMessages = Array.from(this.sessions.values()).some(
    session => session.pendingMessages.length > 0
  );

  if (!hasPendingMessages) {
    // Debounce: wait 1.5s and check again
    this.spinnerStopTimer = setTimeout(() => {
      const stillEmpty = Array.from(this.sessions.values()).every(
        session => session.pendingMessages.length === 0
      );

      if (stillEmpty) {
        logger.debug('WORKER', 'All queues empty - stopping spinner');
        this.broadcastProcessingStatus(false);
      }

      this.spinnerStopTimer = null;
    }, 1500);
  }
}
```

**Score**: 9/10
**Why This Is ABSOLUTELY FUCKING STUPID**:

1. **Artificial Delay**: **1.5 SECONDS** (1500ms) of artificial delay before stopping the spinner. This is pure overhead added for no reason.

2. **Why Was This Added?**: Probably someone thought "the UI flickers when the spinner stops/starts rapidly." **SO FUCKING WHAT?** That's a UI rendering problem, not a worker service problem. Fix it in the UI with CSS transitions or debouncing on the CLIENT side.

3. **Double-Check Pattern**: Checks if queues are empty, waits 1.5s, then checks AGAIN. This is defensive programming for ghosts. If the queue is empty, it's empty. We're not protecting against race conditions here - we're just wasting time.

4. **Polling Instead of Events**: This function is called from `handleAgentMessage` (line 1145) after processing every single response. Instead of reacting to the actual completion of work, we're polling state and debouncing.

5. **State Management Overhead**: Requires `spinnerStopTimer` field (line 109), timer cleanup logic, null checks, etc.

**Real-World Impact**: Every time the worker finishes processing observations, the UI spinner continues to show "processing" for **1.5 seconds** even though nothing is happening. This makes the entire system feel slower.

**What Should Happen**:
```typescript
private checkAndStopSpinner(): void {
  const hasPendingMessages = Array.from(this.sessions.values()).some(
    session => session.pendingMessages.length > 0
  );

  if (!hasPendingMessages) {
    this.broadcastProcessingStatus(false);
  }
}
```

**Savings**: Remove 15 lines of debouncing logic, remove timer state, eliminate 1.5s artificial delay.

**Alternative**: If UI flickering is actually a problem (prove it first), handle it client-side with CSS transitions or client-side debouncing.

---

### Lines 370-411: Stats Endpoint

```typescript
private handleStats(_req: Request, res: Response): void {
  try {
    const db = new SessionStore();

    // Get database stats
    const obsCount = db.db.prepare('SELECT COUNT(*) as count FROM observations').get() as { count: number };
    const sessionCount = db.db.prepare('SELECT COUNT(*) as count FROM sdk_sessions').get() as { count: number };
    const summaryCount = db.db.prepare('SELECT COUNT(*) as count FROM session_summaries').get() as { count: number };

    // Get database file size
    const dbPath = join(homedir(), '.claude-mem', 'claude-mem.db');
    let dbSize = 0;
    if (existsSync(dbPath)) {
      dbSize = statSync(dbPath).size;
    }

    db.close();

    // Get worker stats
    const uptime = process.uptime();

    res.json({
      worker: {
        version: VERSION,
        uptime: Math.floor(uptime),
        activeSessions: this.sessions.size,
        sseClients: this.sseClients.size,
        port: getWorkerPort()
      },
      database: {
        path: dbPath,
        size: dbSize,
        observations: obsCount.count,
        sessions: sessionCount.count,
        summaries: summaryCount.count
      }
    });
  } catch (error: any) {
    logger.error('WORKER', 'Failed to get stats', {}, error);
    res.status(500).json({ error: 'Failed to get stats' });
  }
}
```

**Score**: 3/10
**Why Slightly Stupid**:

1. **Redundant existsSync Check**: The database path is guaranteed to exist if SessionStore initialized successfully. If it doesn't exist, SessionStore would have crashed on startup. This is defensive programming for ghosts.

2. **Three Separate Queries**: Could be combined into a single query with UNION or multiple SELECT columns, but this is minor.

**What Should Happen**:
```typescript
const dbSize = statSync(dbPath).size; // Just crash if it doesn't exist
```

Otherwise, this is mostly fine. Stats endpoints are low-frequency and non-critical.

---

### Lines 507-555: GET /api/observations

```typescript
private handleGetObservations(req: Request, res: Response): void {
  try {
    const offset = parseInt(req.query.offset as string || '0', 10);
    const limit = Math.min(parseInt(req.query.limit as string || '50', 10), 100); // Cap at 100
    const project = req.query.project as string | undefined;

    const db = new SessionStore();

    // Build query with optional project filter
    let query = `
      SELECT id, type, title, subtitle, text, project, prompt_number, created_at, created_at_epoch
      FROM observations
    `;
    let countQuery = 'SELECT COUNT(*) as total FROM observations';
    const params: any[] = [];
    const countParams: any[] = [];

    if (project) {
      query += ' WHERE project = ?';
      countQuery += ' WHERE project = ?';
      params.push(project);
      countParams.push(project);
    }

    query += ' ORDER BY created_at_epoch DESC LIMIT ? OFFSET ?';
    params.push(limit, offset);

    const stmt = db.db.prepare(query);
    const observations = stmt.all(...params);

    // Check if there are more results
    const countStmt = db.db.prepare(countQuery);
    const { total } = countStmt.get(...countParams) as { total: number };
    const hasMore = (offset + limit) < total;

    db.close();

    res.json({
      observations,
      hasMore,
      total,
      offset,
      limit
    });
  } catch (error: any) {
    logger.error('WORKER', 'Failed to get observations', {}, error);
    res.status(500).json({ error: 'Failed to get observations' });
  }
}
```

**Score**: 5/10
**Why This Is Mildly Stupid**:

1. **Duplicate Parameter Arrays**: `params` and `countParams` are maintained separately even though they contain the same values (just the project filter). This is error-prone and verbose.

2. **Two Queries Instead of One**: We run a COUNT query and a SELECT query. For small datasets, this is fine, but for large datasets, the COUNT query can be expensive. The `hasMore` flag could be computed by fetching `limit + 1` rows and checking if we got more than `limit`.

**What Should Happen**:
```typescript
// Fetch one extra row to determine if there are more results
const stmt = db.db.prepare(query);
const results = stmt.all(...params);
const observations = results.slice(0, limit);
const hasMore = results.length > limit;

// Only run COUNT if the UI actually needs it (it probably doesn't)
```

**Pattern**: This same pattern is repeated in `handleGetSummaries` (line 557) and `handleGetPrompts` (line 618). Copy-paste code smell.

**Estimated Savings**: Remove COUNT queries (which can be expensive on large tables), simplify parameter handling.

---

### Lines 685-752: POST /sessions/:sessionDbId/init - DATABASE REOPENING HELL

```typescript
private async handleInit(req: Request, res: Response): Promise<void> {
  const sessionDbId = parseInt(req.params.sessionDbId, 10);
  const { project } = req.body;

  logger.info('WORKER', 'Session init', { sessionDbId, project });

  const session = this.getOrCreateSession(sessionDbId); // <-- Opens DB at line 204
  const claudeSessionId = session.claudeSessionId;

  // Update port in database
  const db = new SessionStore(); // <-- Opens DB AGAIN
  db.setWorkerPort(sessionDbId, getWorkerPort());

  // Get the latest user_prompt for this session to sync to Chroma
  const latestPrompt = db.db.prepare(`
    SELECT
      up.*,
      s.sdk_session_id,
      s.project
    FROM user_prompts up
    JOIN sdk_sessions s ON up.claude_session_id = s.claude_session_id
    WHERE up.claude_session_id = ?
    ORDER BY up.created_at_epoch DESC
    LIMIT 1
  `).get(claudeSessionId) as any;

  db.close(); // <-- Closes DB

  // ... SSE broadcast ...
  // ... Chroma sync ...

  logger.success('WORKER', 'Session initialized', { sessionId: sessionDbId, port: getWorkerPort() });
  res.json({
    status: 'initialized',
    sessionDbId,
    port: getWorkerPort()
  });
}
```

**Score**: 7/10
**Why This Is Stupid**:

1. **Two Database Opens in Same Function**:
   - Line 691: `getOrCreateSession()` opens DB internally (line 204)
   - Line 695: Opens DB AGAIN for `setWorkerPort()`
   - Line 711: Closes DB

2. **Redundant Data Fetching**: `getOrCreateSession()` already fetches session data from the database (line 205). Then we query AGAIN for the user prompt (line 698).

3. **Tight Coupling**: `getOrCreateSession()` hides database access, making it unclear that we're opening the database twice.

**What Should Happen**:
- Open database ONCE at the start of handleInit
- Pass the open database to getOrCreateSession
- Fetch all needed data in a single transaction
- Close database at the end

**Estimated Savings**: Eliminate 1 database open/close cycle (1-5ms).

---

### Lines 728-741: Chroma Sync with Verbose Error Handling

```typescript
// Sync user prompt to Chroma (fire-and-forget, but crash on failure)
if (latestPrompt) {
  this.chromaSync.syncUserPrompt(
    latestPrompt.id,
    latestPrompt.sdk_session_id,
    latestPrompt.project,
    latestPrompt.prompt_text,
    latestPrompt.prompt_number,
    latestPrompt.created_at_epoch
  ).catch(err => {
    logger.failure('WORKER', 'Failed to sync user_prompt to Chroma - continuing', { promptId: latestPrompt.id }, err);
    // Don't crash - SQLite has the data
  });
}
```

**Score**: 5/10
**Why This Is Mildly Stupid**:

1. **Inconsistent Error Handling**: The comment says "crash on failure" but then we catch the error and continue. Which is it?

2. **Redundant Comment**: The code says `.catch(err => { /* continue */ })` and the comment says "Don't crash - SQLite has the data". The code is self-documenting.

3. **Fire-and-Forget**: If we're going to fire-and-forget, why bother with verbose error handling? Either care about failures (and retry/alert) or don't (and just log).

**What Should Happen**:
```typescript
// Fire-and-forget Chroma sync (SQLite is source of truth)
if (latestPrompt) {
  this.chromaSync.syncUserPrompt(/* ... */).catch(() => {}); // Swallow errors
}
```

**Pattern**: This same verbose error handling appears in lines 1057-1076 and 1114-1133.

---

### Lines 758-779: POST /sessions/:sessionDbId/observations

```typescript
private handleObservation(req: Request, res: Response): void {
  const sessionDbId = parseInt(req.params.sessionDbId, 10);
  const { tool_name, tool_input, tool_output, prompt_number } = req.body;

  const session = this.getOrCreateSession(sessionDbId); // <-- Opens DB
  const toolStr = logger.formatTool(tool_name, tool_input);

  logger.dataIn('WORKER', `Observation queued: ${toolStr}`, {
    sessionId: sessionDbId,
    queue: session.pendingMessages.length + 1
  });

  session.pendingMessages.push({
    type: 'observation',
    tool_name,
    tool_input,
    tool_output,
    prompt_number
  });

  res.json({ status: 'queued', queueLength: session.pendingMessages.length });
}
```

**Score**: 6/10
**Why This Is Stupid**:

1. **Database Opens for No Reason**: `getOrCreateSession()` opens the database (line 204), but we don't actually need any data from the database here. We just need to get or create the in-memory session object.

2. **Hot Path Performance**: This endpoint is called **for every single tool execution**. If you run 100 tool calls in a session, this opens/closes the database 100 times unnecessarily.

**What Should Happen**:
- Separate "get existing session" from "create session from database"
- Only open database if creating a new session
- For existing sessions, just push to the queue

**Estimated Savings**: For a session with 100 observations, eliminate 99 unnecessary database open/close cycles (**99-495ms of pure overhead**).

---

### Lines 914-1005: createMessageGenerator - THE POLLING HORROR

```typescript
private async* createMessageGenerator(session: ActiveSession): AsyncIterable<SDKUserMessage> {
  // ... send init prompt ...

  // Process messages continuously until session is deleted
  while (true) {
    if (session.abortController.signal.aborted) {
      break;
    }

    if (session.pendingMessages.length === 0) {
      await new Promise(resolve => setTimeout(resolve, MESSAGE_POLL_INTERVAL_MS));
      continue;
    }

    while (session.pendingMessages.length > 0) {
      const message = session.pendingMessages.shift()!;
      // ... process message ...
      yield { /* SDK message */ };
    }
  }
}
```

**Score**: 10/10
**Why This Is ABSOLUTELY FUCKING STUPID**:

1. **Infinite Polling Loop**: Lines 936-944 implement a **busy-wait polling loop** that checks `pendingMessages.length` every 100ms. This is the single dumbest pattern in the entire file.

2. **Event-Driven Alternative**: We have a fucking queue! When something is added to the queue, **NOTIFY THE CONSUMER**. Use an EventEmitter, a Promise, a Condition Variable, ANYTHING but polling.

3. **Wasted CPU**: Every 100ms, this loop wakes up, checks if the queue is empty, and goes back to sleep. For a worker that runs for hours, this is thousands of unnecessary wake-ups.

4. **Latency**: When an observation is queued (line 770), it sits in the queue for up to 100ms before being processed. **This adds 0-100ms of artificial latency to every single observation.**

5. **Battery Impact**: On laptops, constant polling prevents CPU from entering deep sleep states, draining battery.

**What Should Happen**:

```typescript
// In WorkerService class
private sessionQueues: Map<number, EventEmitter> = new Map();

private handleObservation(req: Request, res: Response): void {
  // ... existing code ...
  session.pendingMessages.push({ /* message */ });

  // Notify the generator that new work is available
  const emitter = this.sessionQueues.get(sessionDbId);
  if (emitter) {
    emitter.emit('message');
  }

  res.json({ status: 'queued', queueLength: session.pendingMessages.length });
}

private async* createMessageGenerator(session: ActiveSession): AsyncIterable<SDKUserMessage> {
  const emitter = new EventEmitter();
  this.sessionQueues.set(session.sessionDbId, emitter);

  yield { /* init prompt */ };

  while (!session.abortController.signal.aborted) {
    if (session.pendingMessages.length === 0) {
      // Wait for new messages via event, not polling
      await new Promise(resolve => emitter.once('message', resolve));
    }

    while (session.pendingMessages.length > 0) {
      const message = session.pendingMessages.shift()!;
      yield { /* process message */ };
    }
  }

  this.sessionQueues.delete(session.sessionDbId);
}
```

**Estimated Savings**:
- Remove 100ms polling interval (eliminate 0-100ms latency per observation)
- Reduce CPU wake-ups from ~10/second to 0 when idle
- Improve battery life on laptops
- Make the system feel more responsive

**Real-World Impact**: For a session with 10 observations, this polling adds **0-1000ms of cumulative latency**. The user is literally waiting for the polling loop to wake up.

---

### Lines 1011-1146: handleAgentMessage - Database Reopening and Chroma Spam

```typescript
private handleAgentMessage(session: ActiveSession, content: string, promptNumber: number): void {
  // ... parse observations and summary ...

  const db = new SessionStore(); // <-- Opens DB

  // Store observations and sync to Chroma (non-blocking, fail-fast)
  for (const obs of observations) {
    const { id, createdAtEpoch } = db.storeObservation(/* ... */);
    logger.success('DB', 'Observation stored', { /* ... */ });

    // Broadcast to SSE clients
    this.broadcastSSE({ /* ... */ });

    // Sync to Chroma (non-blocking fire-and-forget, but crash on failure)
    this.chromaSync.syncObservation(/* ... */)
      .then(() => {
        logger.success('WORKER', 'Observation synced to Chroma', { /* ... */ });
      })
      .catch((error: Error) => {
        logger.error('WORKER', 'Observation sync failed - continuing', { /* ... */ }, error);
        // Don't crash - SQLite has the data
      });
  }

  // ... similar pattern for summary ...

  db.close(); // <-- Closes DB

  // Check if queue is empty and stop spinner after debounce
  this.checkAndStopSpinner(); // <-- Triggers 1.5s delay
}
```

**Score**: 6/10
**Why This Is Stupid**:

1. **Database Reopening**: Opens database (line 1030), stores all observations, closes database (line 1142). This is called **for every SDK response**. For a session with 10 observations, this opens/closes the database 10+ times.

2. **Verbose Chroma Error Handling**: Lines 1057-1076 and 1114-1133 have identical verbose error handling for Chroma sync failures. This is copy-paste code smell.

3. **Success Logging Spam**: Line 1066 and 1123 log success for EVERY Chroma sync. For a session with 100 observations, this logs 100 success messages. Why? Who reads these?

4. **Debounce Call**: Line 1145 calls `checkAndStopSpinner()`, triggering the 1.5s artificial delay.

**What Should Happen**:
- Reuse database connection across multiple calls
- Simplify Chroma error handling (fire-and-forget means swallow errors)
- Remove success logging (or make it debug-level)
- Remove debounce delay

---

## Summary of Patterns

### 1. Database Reopening Anti-Pattern
**Occurrences**: Lines 200-236, 685-752, 758-779, 1011-1146
**Impact**: Opens/closes database 4-100+ times per session instead of reusing connections
**Fix**: Pass open database connections between functions, use transactions, connection pooling

### 2. Polling Instead of Events
**Occurrences**: Line 942 (100ms polling loop)
**Impact**: 0-100ms latency per observation, wasted CPU cycles, battery drain
**Fix**: Use EventEmitter or async queue with await/notify pattern

### 3. Artificial Delays
**Occurrences**: Line 363 (1.5s spinner debounce), line 942 (100ms poll interval)
**Impact**: 1.5s delay before spinner stops, 0-100ms delay per observation
**Fix**: Remove debouncing, use event-driven patterns

### 4. Premature Optimization
**Occurrences**: Lines 33-70 (Claude path caching)
**Impact**: 37 lines of code to save 5ms on a one-time operation
**Fix**: Remove caching, inline the function

### 5. Defensive Programming for Ghosts
**Occurrences**: Line 382 (existsSync check), lines 228-231 (error handler reopens DB), lines 728-741 (verbose error handling)
**Impact**: Code complexity without real benefit
**Fix**: Fail fast, trust invariants, simplify error handling

### 6. Copy-Paste Code
**Occurrences**: handleGetObservations, handleGetSummaries, handleGetPrompts (nearly identical)
**Impact**: Maintenance burden, inconsistency risk
**Fix**: Extract common pagination logic into helper function

---

## Recommendations

### Immediate Wins (Low Effort, High Impact)

1. **Remove Spinner Debounce** (Lines 338-365)
   - **Effort**: 5 minutes
   - **Impact**: Eliminate 1.5s artificial delay
   - **Score**: 9/10 stupidity

2. **Replace Polling with Events** (Line 942)
   - **Effort**: 30 minutes
   - **Impact**: Eliminate 0-100ms latency per observation, reduce CPU usage
   - **Score**: 10/10 stupidity

3. **Remove Claude Path Caching** (Lines 33-70)
   - **Effort**: 5 minutes
   - **Impact**: Remove 37 lines of unnecessary code
   - **Score**: 6/10 stupidity

### Medium Wins (Moderate Effort, Good Impact)

4. **Fix Database Reopening in Hot Path** (Lines 758-779)
   - **Effort**: 1 hour
   - **Impact**: Eliminate 99+ database cycles per session
   - **Score**: 6/10 stupidity

5. **Simplify Chroma Error Handling** (Lines 728-741, 1057-1076, 1114-1133)
   - **Effort**: 15 minutes
   - **Impact**: Remove 50+ lines of verbose error handling
   - **Score**: 5/10 stupidity

6. **Simplify SSE Broadcast** (Lines 297-322)
   - **Effort**: 5 minutes
   - **Impact**: Remove 10 lines, eliminate two-pass cleanup
   - **Score**: 4/10 stupidity

### Long-Term Improvements (High Effort, Architectural)

7. **Database Connection Pooling**
   - **Effort**: 4 hours
   - **Impact**: Reuse connections across requests, eliminate all open/close overhead
   - **Score**: 8/10 stupidity (current approach)

8. **Extract Pagination Helper**
   - **Effort**: 1 hour
   - **Impact**: DRY up handleGetObservations/Summaries/Prompts
   - **Score**: 5/10 stupidity

---

## Estimated Performance Impact

**Current Hot Path (1 observation)**:
- HTTP request arrives: 0ms
- getOrCreateSession opens/closes DB: 1-5ms
- Queue message: 0ms
- Poll interval: 0-100ms (average 50ms)
- SDK processing: variable
- handleAgentMessage opens/closes DB: 1-5ms
- Chroma sync (async): N/A
- checkAndStopSpinner debounce: 1500ms
- **Total artificial overhead**: 1502-1610ms (1.5-1.6 seconds)

**Optimized Hot Path (1 observation)**:
- HTTP request arrives: 0ms
- Get existing session (no DB): 0ms
- Queue message + notify: 0ms
- SDK processing: variable
- Store in DB (connection pool): 0.1-0.5ms
- Chroma sync (async): N/A
- Stop spinner (no debounce): 0ms
- **Total artificial overhead**: 0.1-0.5ms

**Speedup**: **3000-16000x faster** (removing artificial delays and polling)

---

## Conclusion

This file has accumulated significant technical debt in the form of:
- **Artificial delays** (1.5s debounce, 100ms polling)
- **Database reopening anti-pattern** (4-100+ opens per session)
- **Polling instead of events** (busy-wait loop)
- **Premature optimization** (caching rare operations)
- **Defensive programming** (protecting against non-existent failures)

The worker spends more time **waiting** (polling, debouncing) than **working**. Most of these patterns were likely added with good intentions ("make the UI smooth", "cache for performance", "handle errors gracefully") but ended up creating more problems than they solved.

**Priority Fixes**:
1. Remove spinner debounce (9/10 stupidity)
2. Replace polling with events (10/10 stupidity)
3. Fix database reopening in hot path (6-8/10 stupidity)

These three changes alone would eliminate **1.5+ seconds of artificial delay** per session and make the system feel dramatically more responsive.