Claude-Mem Architecture Refactor Plan

Core Purpose

Create a lightweight, hook-driven memory system that captures important context during Claude Code sessions and makes it available in future sessions.

Principles:

  • Hooks should be fast and non-blocking
  • The SDK agent synthesizes observations rather than just storing raw data
  • Storage should be simple and queryable
  • Users should never notice the memory system working

Understanding the Foundation

What Claude Code Hooks Actually Do

SessionStart Hook:

  • Runs when Claude Code starts or resumes
  • Can inject context via stdout (plain text) OR JSON additionalContext
  • This is how we show "What's new" to Claude
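As a hedged illustration of the JSON route, the sketch below builds the payload a SessionStart hook could print to stdout. The hookSpecificOutput / additionalContext shape reflects the Claude Code hooks documentation as understood at the time of writing; buildSessionStartOutput is a hypothetical helper name, not an existing claude-mem function.

```typescript
// Hypothetical helper: wraps context text in the JSON envelope for SessionStart.
interface SessionStartOutput {
  hookSpecificOutput: {
    hookEventName: "SessionStart";
    additionalContext: string;
  };
}

function buildSessionStartOutput(context: string): SessionStartOutput {
  return {
    hookSpecificOutput: {
      hookEventName: "SessionStart",
      additionalContext: context,
    },
  };
}

// The hook would print this single JSON line to stdout and exit with code 0.
const sessionStartPayload = JSON.stringify(
  buildSessionStartOutput("Recent work: added JWT refresh flow")
);
```

Plain-text stdout is simpler and is what the flow in this plan uses; the JSON form is only worth it if the hook needs to return other fields alongside the context.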

UserPromptSubmit Hook:

  • Runs BEFORE Claude processes the user's message
  • Can inject context via stdout OR JSON additionalContext
  • This is where we initialize per-session tracking

PostToolUse Hook:

  • Runs AFTER each tool completes successfully
  • Gets both tool input and output
  • Runs in PARALLEL with other matching hooks
  • This is where we observe what Claude is doing

Stop Hook:

  • Runs when main agent finishes (NOT on user interrupt)
  • This is where we finalize the session
  • The summary should be a structured response that answers the following:
    • What did user request?
    • What did you investigate?
    • What did you learn?
    • What did you do?
    • What's next?
    • Files read
    • Files edited
    • Notes

How SDK Streaming Actually Works

Streaming Input Mode (what we need):

  • Persistent session with AsyncGenerator
  • Can queue multiple messages
  • Supports interruption
  • Natural multi-turn conversations
  • The SDK maintains conversation state

Critical insight: We use "Streaming Input Mode" which creates ONE long-running SDK session per Claude Code session, not multiple short sessions.
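To make the "one long-running session" idea concrete, here is a minimal sketch of a message queue exposed as an AsyncGenerator. It does not call the SDK; MessageQueue and collectAll are hypothetical names used only for illustration, and the message shape simply mirrors the one used elsewhere in this plan.

```typescript
// Minimal async message queue: push() from anywhere, consume via for-await.
type UserMessage = { type: "user"; message: { role: "user"; content: string } };

class MessageQueue {
  private pending: UserMessage[] = [];
  private wake: (() => void) | null = null;
  private closed = false;

  push(content: string): void {
    this.pending.push({ type: "user", message: { role: "user", content } });
    this.notify();
  }

  close(): void {
    this.closed = true;
    this.notify();
  }

  private notify(): void {
    if (this.wake) {
      const wake = this.wake;
      this.wake = null;
      wake();
    }
  }

  // Yields queued messages in order; suspends while the queue is empty.
  async *stream(): AsyncGenerator<UserMessage> {
    while (true) {
      while (this.pending.length > 0) {
        yield this.pending.shift() as UserMessage;
      }
      if (this.closed) return;
      await new Promise<void>((resolve) => {
        this.wake = resolve;
      });
    }
  }
}

// Usage: a single generator feeds a single consumer for the whole session.
async function collectAll(queue: MessageQueue): Promise<string[]> {
  const seen: string[] = [];
  for await (const msg of queue.stream()) {
    seen.push(msg.message.content);
  }
  return seen;
}
```

In the real worker the consumer would be the SDK session rather than collectAll, and the pushes would come from the SQLite queue rather than from in-process calls.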


Architecture

What is the SDK agent's job?

The SDK agent is a synthesis engine, not a data collector.

It should:

  • Receive tool observations as they happen
  • Extract meaningful patterns and insights
  • Store atomic, searchable observations in SQLite
  • Synthesize a human-readable summary at the end

It should NOT:

  • Store raw tool outputs
  • Try to capture everything
  • Make decisions about what Claude Code should do
  • Block or slow down the main session

How hooks run in parallel

PostToolUse hooks run in parallel, so observations can arrive concurrently. To handle this:

  • Make SDK agent calls async and fire-and-forget
  • Use the observation_queue SQLite table to serialize observations
  • Have the SDK worker poll this queue and process observations sequentially
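The serialization step can be sketched without a database: whatever order the parallel hooks insert in, the autoincrement id gives a total order the worker can follow. The function below models that selection over in-memory rows; nextBatch and the QueueRow type are hypothetical, and the SQL in the comment is an assumed equivalent rather than a query taken from the codebase.

```typescript
interface QueueRow {
  id: number;
  tool_name: string;
  created_at_epoch: number;
  processed_at_epoch: number | null;
}

// Returns unprocessed rows in id order, regardless of the order they arrived in.
// Assumed SQL equivalent:
//   SELECT * FROM observation_queue
//   WHERE processed_at_epoch IS NULL ORDER BY id LIMIT ?
function nextBatch(rows: QueueRow[], limit: number): QueueRow[] {
  return rows
    .filter((row) => row.processed_at_epoch === null)
    .sort((a, b) => a.id - b.id)
    .slice(0, limit);
}
```

Ordering by id rather than created_at_epoch avoids ties when two hooks write within the same timestamp tick.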

What if the user interrupts Claude Code?

The Stop hook doesn't run on interrupts, so:

  • Observations stay in the queue
  • The next session continues where the previous one left off
  • Mark the session as 'interrupted' after 24h of inactivity
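The 24h rule reduces to a small predicate. The sketch below assumes epochs are stored in milliseconds and that "inactivity" is measured from the most recent queue insert or the session start, whichever is later; neither assumption is stated in this plan, and shouldMarkInterrupted is a hypothetical name.

```typescript
const INTERRUPT_AFTER_MS = 24 * 60 * 60 * 1000; // 24 hours

interface SessionActivity {
  status: string;
  lastActivityEpochMs: number; // assumed: latest queue insert or session start
}

// True when an active session has been idle for at least 24 hours.
function shouldMarkInterrupted(session: SessionActivity, nowEpochMs: number): boolean {
  return (
    session.status === "active" &&
    nowEpochMs - session.lastActivityEpochMs >= INTERRUPT_AFTER_MS
  );
}
```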

Database Schema

-- Tracks SDK streaming sessions
CREATE TABLE sdk_sessions (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  claude_session_id TEXT UNIQUE NOT NULL,
  sdk_session_id TEXT UNIQUE NOT NULL,
  project TEXT NOT NULL,
  user_prompt TEXT,
  started_at TEXT NOT NULL,
  started_at_epoch INTEGER NOT NULL,
  completed_at TEXT,
  completed_at_epoch INTEGER,
  status TEXT CHECK(status IN ('active', 'completed', 'failed', 'interrupted'))
);

-- Tracks pending observations (message queue)
CREATE TABLE observation_queue (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  sdk_session_id TEXT NOT NULL,
  tool_name TEXT NOT NULL,
  tool_input TEXT NOT NULL,  -- JSON
  tool_output TEXT NOT NULL, -- JSON
  created_at_epoch INTEGER NOT NULL,
  processed_at_epoch INTEGER,
  FOREIGN KEY(sdk_session_id) REFERENCES sdk_sessions(sdk_session_id)
);

-- Stores extracted observations (what SDK decides is important)
CREATE TABLE observations (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  sdk_session_id TEXT NOT NULL,
  project TEXT NOT NULL,
  text TEXT NOT NULL,
  type TEXT NOT NULL, -- 'decision' | 'bugfix' | 'feature' | 'refactor' | 'discovery'
  created_at TEXT NOT NULL,
  created_at_epoch INTEGER NOT NULL,
  FOREIGN KEY(sdk_session_id) REFERENCES sdk_sessions(sdk_session_id)
);

CREATE INDEX idx_observations_project ON observations(project);
CREATE INDEX idx_observations_created ON observations(created_at_epoch DESC);

-- Stores session summaries
CREATE TABLE session_summaries (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  sdk_session_id TEXT UNIQUE NOT NULL,
  project TEXT NOT NULL,
  summary TEXT NOT NULL,
  created_at TEXT NOT NULL,
  created_at_epoch INTEGER NOT NULL,
  FOREIGN KEY(sdk_session_id) REFERENCES sdk_sessions(sdk_session_id)
);

CREATE INDEX idx_summaries_project ON session_summaries(project);
CREATE INDEX idx_summaries_created ON session_summaries(created_at_epoch DESC);

Hook Implementation

1. SessionStart Hook

Purpose: Show user what happened in recent sessions

Hook config:

{
  "hooks": {
    "SessionStart": [{
      "matcher": "startup",
      "hooks": [{
        "type": "command",
        "command": "claude-mem context"
      }]
    }]
  }
}

Command: claude-mem context

Flow:

  1. Read stdin JSON (session_id, cwd, source, etc.)
  2. If source !== "startup", exit immediately
  3. Extract project from cwd basename
  4. Query SQLite for recent summaries:
    SELECT summary, created_at
    FROM session_summaries
    WHERE project = ?
    ORDER BY created_at_epoch DESC
    LIMIT 10
    
  5. Format results as human-readable text
  6. Output to stdout (Claude Code automatically injects this)
  7. Exit with code 0
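Steps 3 and 5 of this flow are pure functions and can be sketched on their own. The helper names projectFromCwd and formatRecentSummaries, and the exact output layout, are assumptions made for illustration; the plan does not prescribe a format.

```typescript
interface SummaryRow {
  summary: string;
  created_at: string;
}

// Project name is the last path segment of cwd.
// Handles trailing slashes and Windows-style separators.
function projectFromCwd(cwd: string): string {
  const parts = cwd.split(/[\\/]+/).filter((part) => part.length > 0);
  return parts.length > 0 ? parts[parts.length - 1] : "unknown";
}

// Renders rows (already ordered newest first by the query) as plain text.
// Returns an empty string when there is nothing to show, so the hook injects nothing.
function formatRecentSummaries(project: string, rows: SummaryRow[]): string {
  if (rows.length === 0) return "";
  const lines = ["Recent sessions for " + project + ":"];
  for (const row of rows) {
    lines.push("- [" + row.created_at + "] " + row.summary);
  }
  return lines.join("\n");
}
```

Returning an empty string for a project with no history keeps the first session in a new project free of filler text.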

2. UserPromptSubmit Hook

Purpose: Initialize SDK memory session in background

Hook config:

{
  "hooks": {
    "UserPromptSubmit": [{
      "hooks": [{
        "type": "command",
        "command": "claude-mem new"
      }]
    }]
  }
}

Command: claude-mem new

Flow:

  1. Read stdin JSON (session_id, prompt, cwd, etc.)
  2. Extract project from cwd
  3. Create SDK session record in database
  4. Start SDK session with initialization prompt in background process
  5. Save SDK session ID to database
  6. Output: {"continue": true, "suppressOutput": true}
  7. Exit immediately (SDK runs in background daemon/process)

The Background SDK Process:

The SDK session should run as a detached background process:

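// In claude-mem new
import { spawn } from 'child_process';

// Detach the worker so the hook can return without waiting for it
const child = spawn('claude-mem', ['sdk-worker', session_id], {
  detached: true,
  stdio: 'ignore'
});
child.unref(); // Allow the hook process to exit while the worker keeps running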

The SDK worker:

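// claude-mem sdk-worker <session_id>
import { query } from '@anthropic-ai/claude-agent-sdk'; // Assumed package name; older releases ship as '@anthropic-ai/claude-code'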
async function runSDKWorker(sessionId: string) {
  const session = await loadSessionFromDB(sessionId);

  async function* messageGenerator() {
    yield {
      type: "user",
      message: {
        role: "user",
        content: buildInitPrompt(session.project, session.claude_session_id, session.user_prompt ?? '')
      }
    };

    // Then listen for queued observations
    while (session.status === 'active') {
      const observations = await pollObservationQueue(session.sdk_session_id);

      for (const obs of observations) {
        yield {
          type: "user",
          message: {
            role: "user",
            content: buildObservationPrompt(obs)
          }
        };
        markObservationProcessed(obs.id);
      }

      await sleep(1000); // Poll every second
    }
  }

  // Run SDK session
  const response = query({
    prompt: messageGenerator(),
    options: {
      model: 'claude-haiku-4-5-20251001', // 3x faster than Sonnet 4.5, quality of Sonnet 4.0-4.1
      allowedTools: [], // No tools needed - agent outputs XML that we parse
      maxTurns: 1000,
      cwd: session.cwd
    }
  });

  // Consume responses and parse XML for observations/summaries
  for await (const msg of response) {
    // The SDK emits 'assistant' messages whose content is an array of blocks
    if (msg.type === 'assistant') {
      const text = msg.message.content
        .map((block) => (block.type === 'text' ? block.text : ''))
        .join('');

      // Use an XML parser library (e.g., fast-xml-parser or similar) to parse observations and summaries
      // Parse <observation> blocks and call storeObservation(session_id, project, type, text)
      // Parse <summary> blocks, extract all 8 fields, format and call storeSummary(session_id, project, text)
      parseAndStoreObservations(text, session);
      parseAndStoreSummary(text, session);
    }
  }
}
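A dependency-free sketch of what parseAndStoreObservations might do for the parsing half is shown below. It uses regular expressions as a stand-in for a real XML parser, so it does not decode XML entities or handle nested markup; parseObservations and the ParsedObservation type are hypothetical names.

```typescript
type ObservationType = "decision" | "bugfix" | "feature" | "refactor" | "discovery";

interface ParsedObservation {
  type: ObservationType;
  text: string;
}

const VALID_TYPES: ObservationType[] = ["decision", "bugfix", "feature", "refactor", "discovery"];

// Extracts every well-formed <observation> block from an agent response.
// Blocks with an unknown type or empty text are dropped rather than stored.
function parseObservations(response: string): ParsedObservation[] {
  const results: ParsedObservation[] = [];
  const blockPattern = /<observation>([\s\S]*?)<\/observation>/g;
  let block: RegExpExecArray | null;
  while ((block = blockPattern.exec(response)) !== null) {
    const typeMatch = /<type>\s*([a-z]+)\s*<\/type>/.exec(block[1]);
    const textMatch = /<text>([\s\S]*?)<\/text>/.exec(block[1]);
    if (!typeMatch || !textMatch) continue;
    const type = typeMatch[1] as ObservationType;
    const text = textMatch[1].trim();
    if (VALID_TYPES.indexOf(type) === -1 || text.length === 0) continue;
    results.push({ type, text });
  }
  return results;
}
```

Dropping invalid blocks silently keeps the worker running when the model drifts from the format; logging the rejected block would make such drift easier to spot.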

3. PostToolUse Hook

Purpose: Queue tool observations for SDK processing

Hook config:

{
  "hooks": {
    "PostToolUse": [{
      "matcher": "*",
      "hooks": [{
        "type": "command",
        "command": "claude-mem save"
      }]
    }]
  }
}

Command: claude-mem save

Flow:

  1. Read stdin JSON (tool_name, tool_input, tool_response, etc.); the tool_response field is what gets stored in the tool_output column
  2. Find active SDK session for this project
  3. Insert observation into queue:
    INSERT INTO observation_queue
    (sdk_session_id, tool_name, tool_input, tool_output, created_at_epoch)
    VALUES (?, ?, ?, ?, ?)
    
  4. Output: {"continue": true, "suppressOutput": true}
  5. Exit immediately
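Step 3 binds five parameters, two of which must be JSON text. A hedged sketch of that conversion is below; toQueueParams is a hypothetical helper, and the fallback to "null" is an assumption made so the NOT NULL columns are always satisfied.

```typescript
// Positional parameters for the INSERT in step 3:
// (sdk_session_id, tool_name, tool_input, tool_output, created_at_epoch)
type QueueParams = [string, string, string, string, number];

function toQueueParams(
  sdkSessionId: string,
  toolName: string,
  toolInput: unknown,
  toolOutput: unknown,
  nowEpoch: number
): QueueParams {
  // JSON.stringify(undefined) yields undefined, so fall back to "null"
  // to keep the NOT NULL columns valid.
  const encode = (value: unknown): string => {
    const json = JSON.stringify(value);
    return json === undefined ? "null" : json;
  };
  return [sdkSessionId, toolName, encode(toolInput), encode(toolOutput), nowEpoch];
}
```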

Observation Filtering:

Not every tool is worth observing. Filter in the hook:

const BORING_TOOLS = new Set(['Glob', 'Grep', 'ListMcpResourcesTool']);

if (BORING_TOOLS.has(tool_name)) {
  console.log('{"continue": true, "suppressOutput": true}');
  process.exit(0);
}

4. Stop Hook

Purpose: Signal SDK to finalize and generate summary

Hook config:

{
  "hooks": {
    "Stop": [{
      "hooks": [{
        "type": "command",
        "command": "claude-mem summary"
      }]
    }]
  }
}

Command: claude-mem summary

Flow:

  1. Read stdin JSON (session_id, cwd, etc.)
  2. Find active SDK session
  3. Insert special "FINALIZE" message into observation queue:
    INSERT INTO observation_queue
    (sdk_session_id, tool_name, tool_input, tool_output, created_at_epoch)
    VALUES (?, 'FINALIZE', '{}', '{}', ?)
    
  4. Output: {"continue": true, "suppressOutput": true}
  5. Exit immediately

SDK Worker Handling:

When SDK worker sees FINALIZE message:

if (obs.tool_name === 'FINALIZE') {
  yield {
    type: "user",
    message: {
      role: "user",
      content: buildFinalizePrompt(session)
    }
  };

  // Wait for SDK to finish processing
  await waitForSDKCompletion();

  // Update session status
  await markSessionCompleted(session.id);

  // Exit worker
  break;
}

SDK Agent Prompts

Initialization Prompt

function buildInitPrompt(project: string, sessionId: string, userPrompt: string): string {
  return `You are a memory assistant for the "${project}" project.

SESSION CONTEXT
---------------
Session ID: ${sessionId}
User's Goal: ${userPrompt}
Date: ${new Date().toISOString().split('T')[0]}

YOUR ROLE
---------
You will observe tool executions during this Claude Code session. Your job is to:

1. Extract meaningful insights (not just raw data)
2. Store atomic observations in SQLite
3. Focus on: key decisions, patterns discovered, problems solved, technical insights

WHAT TO CAPTURE
----------------
✓ Architecture decisions (e.g., "chose PostgreSQL over MongoDB for ACID guarantees")
✓ Bug fixes (e.g., "fixed race condition in auth middleware by adding mutex")
✓ New features (e.g., "implemented JWT refresh token flow")
✓ Refactorings (e.g., "extracted validation logic into separate service")
✓ Discoveries (e.g., "found that API rate limit is 100 req/min")

✗ NOT routine operations (reading files, listing directories)
✗ NOT work-in-progress (only completed work)
✗ NOT obvious facts (e.g., "TypeScript file has types")

HOW TO STORE OBSERVATIONS
--------------------------
When you identify something worth remembering, output your observation in this EXACT XML format:

\`\`\`xml
<observation>
  <type>feature</type>
  <text>Implemented JWT token refresh flow with 7-day expiry</text>
</observation>
\`\`\`

Valid types: decision, bugfix, feature, refactor, discovery

Structure requirements:
- <observation> is the root element
- <type> must be one of the 5 valid types (single word)
- <text> contains your concise observation (one sentence preferred)
- No additional fields or nesting

The SDK worker will parse all <observation> blocks from your response and store them in SQLite.

You can include your reasoning before or after the observation block, or just output the observation by itself.

EXAMPLE
-------
Bad: "Read src/auth.ts file"
Good: "Implemented JWT token refresh flow with 7-day expiry"

Wait for tool observations. Acknowledge this message briefly.`;
}

Observation Prompt

function buildObservationPrompt(obs: Observation): string {
  return `TOOL OBSERVATION
================
Tool: ${obs.tool_name}
Time: ${new Date(obs.created_at_epoch).toISOString()}

Input:
${JSON.stringify(JSON.parse(obs.tool_input), null, 2)}

Output:
${JSON.stringify(JSON.parse(obs.tool_output), null, 2)}

ANALYSIS TASK
-------------
1. Does this observation contain something worth remembering?
2. If YES: Output the observation in this EXACT XML format:

   \`\`\`xml
   <observation>
     <type>feature</type>
     <text>Your concise observation here</text>
   </observation>
   \`\`\`

   Requirements:
   - Use one of these types: decision, bugfix, feature, refactor, discovery
   - Keep text concise (one sentence preferred)
   - No markdown formatting inside <text>
   - No additional XML fields

3. If NO: Just acknowledge and wait for next observation

Remember: Quality over quantity. Only store meaningful insights.`;
}

Finalization Prompt

function buildFinalizePrompt(session: SDKSession): string {
  return `SESSION ENDING
==============
The Claude Code session is finishing.

FINAL TASK
----------
1. Review the observations you've stored this session
2. Generate a structured summary that answers these questions:
   - What did user request?
   - What did you investigate?
   - What did you learn?
   - What did you do?
   - What's next?
   - Files read
   - Files edited
   - Notes

3. Generate the structured summary and output it in this EXACT XML format:

\`\`\`xml
<summary>
  <request>Implement JWT authentication system</request>
  <investigated>Existing auth middleware, session management, token storage patterns</investigated>
  <learned>Current system uses session cookies; no JWT support; race condition in middleware</learned>
  <completed>Implemented JWT token + refresh flow with 7-day expiry; fixed race condition with mutex; added token validation middleware</completed>
  <next_steps>Add token revocation API endpoint; write integration tests</next_steps>
  <files_read>
    <file>src/auth.ts</file>
    <file>src/middleware/session.ts</file>
    <file>src/types/user.ts</file>
  </files_read>
  <files_edited>
    <file>src/auth.ts</file>
    <file>src/middleware/auth.ts</file>
    <file>src/routes/auth.ts</file>
  </files_edited>
  <notes>Token secret stored in .env; refresh tokens use rotation strategy</notes>
</summary>
\`\`\`

Structure requirements:
- <summary> is the root element
- All 8 child elements are REQUIRED: request, investigated, learned, completed, next_steps, files_read, files_edited, notes
- <files_read> and <files_edited> must contain <file> child elements (one per file)
- If no files were read/edited, use empty tags: <files_read></files_read>
- Text fields can be multiple sentences but avoid markdown formatting
- Use underscores in element names: next_steps, files_read, files_edited

The SDK worker will parse the <summary> block and extract all fields to store in SQLite.

Generate the summary now in the required XML format.`;
}
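For the summary side, a minimal sketch of extracting the 8 required fields is shown below. As with any regex-based approach it is a stand-in for a proper XML parser and does not decode entities; parseSummary, tagBody, and fileList are hypothetical names.

```typescript
interface ParsedSummary {
  request: string;
  investigated: string;
  learned: string;
  completed: string;
  next_steps: string;
  files_read: string[];
  files_edited: string[];
  notes: string;
}

// Returns the trimmed body of the first <tag>...</tag>, or null if absent.
function tagBody(xml: string, tag: string): string | null {
  const match = new RegExp("<" + tag + ">([\\s\\S]*?)</" + tag + ">").exec(xml);
  return match ? match[1].trim() : null;
}

// Returns the <file> entries inside <tag>, an empty array for empty tags,
// or null if the tag itself is missing.
function fileList(xml: string, tag: string): string[] | null {
  const body = tagBody(xml, tag);
  if (body === null) return null;
  const files: string[] = [];
  const pattern = /<file>([\s\S]*?)<\/file>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(body)) !== null) {
    files.push(match[1].trim());
  }
  return files;
}

// Returns null unless all 8 required child elements are present.
function parseSummary(response: string): ParsedSummary | null {
  const root = tagBody(response, "summary");
  if (root === null) return null;

  const request = tagBody(root, "request");
  const investigated = tagBody(root, "investigated");
  const learned = tagBody(root, "learned");
  const completed = tagBody(root, "completed");
  const next_steps = tagBody(root, "next_steps");
  const notes = tagBody(root, "notes");
  const files_read = fileList(root, "files_read");
  const files_edited = fileList(root, "files_edited");

  if (
    request === null || investigated === null || learned === null ||
    completed === null || next_steps === null || notes === null ||
    files_read === null || files_edited === null
  ) {
    return null;
  }
  return { request, investigated, learned, completed, next_steps, files_read, files_edited, notes };
}
```

Rejecting incomplete summaries outright matches the "all 8 child elements are REQUIRED" rule in the prompt; a more lenient worker could instead store partial summaries with empty fields.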

Hook Commands Architecture

All four hook commands (claude-mem context, claude-mem new, claude-mem save, claude-mem summary) are implemented as standalone TypeScript functions that:

  1. Use bun:sqlite directly - No child processes or CLI subcommands for database access
  2. Are self-contained - Each hook has all the logic it needs
  3. Share a common database layer - Import from shared db.ts module
  4. Never call other claude-mem commands - All functionality via direct library calls (the one exception is claude-mem new, which spawns the detached sdk-worker)

// Example structure
import { Database } from 'bun:sqlite';
import { homedir } from 'os';
import { join } from 'path';

// bun:sqlite does not expand "~", so resolve the home directory explicitly
const DB_PATH = join(homedir(), '.claude-mem', 'db.sqlite');

export function contextHook(stdin: HookInput) {
  const db = new Database(DB_PATH);
  // Query and return context directly
  const summaries = db.query('SELECT ...').all();
  console.log(formatContext(summaries));
  db.close();
}

export function saveHook(stdin: HookInput) {
  const db = new Database(DB_PATH);
  // Insert observation directly
  db.run('INSERT INTO observation_queue ...', params);
  db.close();
  console.log('{"continue": true, "suppressOutput": true}');
}

Key principle: Hooks are fast, synchronous database operations. The SDK worker process is where async/complex logic happens.


Background Process Management

The claude-mem save hook just queues observations - processing happens in the background SDK worker process that polls the queue continuously.

The SDK worker is spawned by claude-mem new as a detached process and runs for the duration of the Claude Code session.

Benefits:

  • Works on all platforms (no systemd/launchd needed)
  • Self-contained (spawned and managed by claude-mem itself)
  • Simple state management (all state in SQLite)

Error Handling

SDK worker failures:

  • Processing of each observation is atomic
  • Failed observations stay in the queue
  • The next worker run retries them
  • After 3 failures, mark the observation as skipped
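The retry rule can be sketched as a small decision function. It assumes the worker tracks a per-observation failure count somewhere (for example a hypothetical attempts column, which the schema in this plan does not yet define); decideAfterFailure is likewise a hypothetical name.

```typescript
const MAX_ATTEMPTS = 3;

type RetryDecision = "retry" | "skip";

// previousFailures is the number of failures recorded BEFORE the current one.
// The third failure (previousFailures === 2) results in "skip".
function decideAfterFailure(previousFailures: number): RetryDecision {
  return previousFailures + 1 >= MAX_ATTEMPTS ? "skip" : "retry";
}
```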

Database corruption:

  • SQLite with WAL mode (write-ahead logging)
  • Regular backups to ~/.claude-mem/backups/
  • Automatic recovery from backups

ChromaDB connection failures:

  • Graceful degradation (log error, continue)
  • Retry with exponential backoff
  • Don't block main Claude Code session
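A capped exponential backoff schedule is straightforward to express. The base delay and cap below are illustrative defaults chosen for this sketch, not values taken from the plan, and backoffDelayMs is a hypothetical helper.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs: number = 500, capMs: number = 30000): number {
  const exponent = Math.max(0, Math.floor(attempt));
  return Math.min(capMs, baseMs * Math.pow(2, exponent));
}
```

With these defaults the delays run 500 ms, 1 s, 2 s, 4 s, and so on up to the 30 s cap. Since retries happen in the background worker, the waiting never touches the main Claude Code session.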

Implementation Order

  1. Database setup - Create tables and migration scripts
  2. Hook commands - Implement the 4 hook commands (context, new, save, summary)
  3. SDK worker - Implement the background worker process with response parsing
  4. SDK prompts - Wire up the prompts and message generator
  5. Test end-to-end - Run a real Claude Code session and verify it works

Start simple. Get one hook working before moving to the next. Don't try to build everything at once.

Note: MCP is only used for retrieval (when Claude Code needs to access stored memories), not for storage. The SDK agent stores data by outputting specially formatted text that the SDK worker parses and writes to SQLite.