Claude-Mem Architecture Refactor Plan
Core Purpose
Create a lightweight, hook-driven memory system that captures important context during Claude Code sessions and makes it available in future sessions.
Principles:
- Hooks should be fast and non-blocking
- The SDK agent synthesizes observations rather than just storing raw data
- Storage should be simple and queryable
- Users should never notice the memory system working
Understanding the Foundation
What Claude Code Hooks Actually Do
SessionStart Hook:
- Runs when Claude Code starts or resumes
- Can inject context via stdout (plain text) OR JSON additionalContext
- This is how we show "What's new" to Claude
UserPromptSubmit Hook:
- Runs BEFORE Claude processes the user's message
- Can inject context via stdout OR JSON additionalContext
- This is where we initialize per-session tracking
PostToolUse Hook:
- Runs AFTER each tool completes successfully
- Gets both tool input and output
- Runs in PARALLEL with other matching hooks
- This is where we observe what Claude is doing
Stop Hook:
- Runs when main agent finishes (NOT on user interrupt)
- This is where we finalize the session
- The summary should be a structured response that answers the following:
- What did user request?
- What did you investigate?
- What did you learn?
- What did you do?
- What's next?
- Files read
- Files edited
- Notes
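The eight summary fields above map naturally onto a typed record. A minimal sketch, assuming hypothetical names (SessionSummary, formatSummary) that are not part of the plan:

```typescript
// Hypothetical shape for the structured summary; the field names are
// assumptions chosen to mirror the eight items listed above.
interface SessionSummary {
  request: string;
  investigated: string;
  learned: string;
  completed: string;
  nextSteps: string;
  filesRead: string[];
  filesEdited: string[];
  notes: string;
}

// Render a summary as plain text, one labelled line per field.
function formatSummary(s: SessionSummary): string {
  const files = (list: string[]): string =>
    list.length > 0 ? list.join(", ") : "(none)";
  return [
    `Request: ${s.request}`,
    `Investigated: ${s.investigated}`,
    `Learned: ${s.learned}`,
    `Completed: ${s.completed}`,
    `Next steps: ${s.nextSteps}`,
    `Files read: ${files(s.filesRead)}`,
    `Files edited: ${files(s.filesEdited)}`,
    `Notes: ${s.notes}`,
  ].join("\n");
}
```

Keeping the file lists as arrays rather than joined strings makes them easier to query later; the text rendering is only for display.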
How SDK Streaming Actually Works
Streaming Input Mode (what we need):
- Persistent session with AsyncGenerator
- Can queue multiple messages
- Supports interruption
- Natural multi-turn conversations
- The SDK maintains conversation state
Critical insight: We use "Streaming Input Mode" which creates ONE long-running SDK session per Claude Code session, not multiple short sessions.
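To illustrate the "one long-running session fed by an AsyncGenerator" idea without depending on the SDK, here is a small sketch of a pushable message stream. The MessageStream class and UserMessage type are hypothetical; in the plan, the generator passed to the SDK would play the same role.

```typescript
// A minimal pushable stream: producers call push(), a single consumer
// iterates with `for await`. All names here are illustrative assumptions.
type UserMessage = { type: "user"; message: { role: "user"; content: string } };

class MessageStream {
  private buffer: UserMessage[] = [];
  private waiting: (() => void) | null = null;
  private closed = false;

  // Queue a message; wakes the consumer if it is idle.
  push(content: string): void {
    this.buffer.push({ type: "user", message: { role: "user", content } });
    this.wake();
  }

  // Signal that no more messages will arrive.
  close(): void {
    this.closed = true;
    this.wake();
  }

  private wake(): void {
    if (this.waiting) {
      const resume = this.waiting;
      this.waiting = null;
      resume();
    }
  }

  async *[Symbol.asyncIterator](): AsyncGenerator<UserMessage> {
    while (true) {
      // Drain everything queued so far, in order.
      while (this.buffer.length > 0) {
        yield this.buffer.shift() as UserMessage;
      }
      if (this.closed) return;
      // Sleep until the next push() or close().
      await new Promise<void>((resolve) => {
        this.waiting = () => resolve();
      });
    }
  }
}
```

Because the consumer only ever sees one message at a time, multiple producers can queue messages without the session being restarted.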
Architecture
What is the SDK agent's job?
The SDK agent is a synthesis engine, not a data collector.
It should:
- Receive tool observations as they happen
- Extract meaningful patterns and insights
- Store atomic, searchable observations in SQLite
- Synthesize a human-readable summary at the end
It should NOT:
- Store raw tool outputs
- Try to capture everything
- Make decisions about what Claude Code should do
- Block or slow down the main session
How hooks run in parallel
PostToolUse hooks run in parallel. Handle this by:
- Make SDK agent calls async and fire-and-forget
- Use the observation_queue SQLite table to serialize observations
- SDK worker polls this queue and processes observations sequentially
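The serialization idea can be sketched with an in-memory stand-in for the observation_queue table. This is only a model of the ordering behavior (auto-incrementing ids give a total order even when hooks insert concurrently); the class and method names are hypothetical, and the real queue would live in SQLite.

```typescript
// In-memory model of the queue table; not the actual storage layer.
interface QueuedObservation {
  id: number;
  toolName: string;
  processed: boolean;
}

class InMemoryObservationQueue {
  private nextId = 1;
  private rows: QueuedObservation[] = [];

  // Mirrors an INSERT: each row gets the next auto-increment id.
  enqueue(toolName: string): number {
    const id = this.nextId++;
    this.rows.push({ id, toolName, processed: false });
    return id;
  }

  // Mirrors the worker's poll: unprocessed rows in id order.
  pending(): QueuedObservation[] {
    return this.rows
      .filter((row) => !row.processed)
      .sort((a, b) => a.id - b.id);
  }

  // Mirrors setting processed_at_epoch once the worker has handled a row.
  markProcessed(id: number): void {
    const row = this.rows.find((r) => r.id === id);
    if (row) row.processed = true;
  }
}
```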
What if the user interrupts Claude Code?
Stop hook doesn't run on interrupts. So:
- Observations stay in queue
- Next session continues where the previous one left off
- Mark session as 'interrupted' after 24h of inactivity
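The 24h rule can be expressed as a small pure function. A sketch, assuming timestamps are epoch milliseconds and that the session row records when it was last active; the helper name is hypothetical:

```typescript
// 24 hours in milliseconds, per the rule above.
const INTERRUPT_AFTER_MS = 24 * 60 * 60 * 1000;

// Decide whether a session should be flipped to 'interrupted'.
// Only sessions still marked 'active' are candidates.
function shouldMarkInterrupted(
  status: string,
  lastActivityEpochMs: number,
  nowEpochMs: number
): boolean {
  return (
    status === "active" &&
    nowEpochMs - lastActivityEpochMs >= INTERRUPT_AFTER_MS
  );
}
```

A check like this could run at the start of the next session, before any new observations are queued.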
Database Schema
-- Tracks SDK streaming sessions
CREATE TABLE sdk_sessions (
id INTEGER PRIMARY KEY AUTOINCREMENT,
claude_session_id TEXT UNIQUE NOT NULL,
sdk_session_id TEXT UNIQUE NOT NULL,
project TEXT NOT NULL,
user_prompt TEXT,
started_at TEXT NOT NULL,
started_at_epoch INTEGER NOT NULL,
completed_at TEXT,
completed_at_epoch INTEGER,
status TEXT CHECK(status IN ('active', 'completed', 'failed', 'interrupted'))
);
-- Tracks pending observations (message queue)
CREATE TABLE observation_queue (
id INTEGER PRIMARY KEY AUTOINCREMENT,
sdk_session_id TEXT NOT NULL,
tool_name TEXT NOT NULL,
tool_input TEXT NOT NULL, -- JSON
tool_output TEXT NOT NULL, -- JSON
created_at_epoch INTEGER NOT NULL,
processed_at_epoch INTEGER,
FOREIGN KEY(sdk_session_id) REFERENCES sdk_sessions(sdk_session_id)
);
-- Stores extracted observations (what SDK decides is important)
CREATE TABLE observations (
id INTEGER PRIMARY KEY AUTOINCREMENT,
sdk_session_id TEXT NOT NULL,
project TEXT NOT NULL,
text TEXT NOT NULL,
type TEXT NOT NULL, -- 'decision' | 'bugfix' | 'feature' | 'refactor' | 'discovery'
created_at TEXT NOT NULL,
created_at_epoch INTEGER NOT NULL,
FOREIGN KEY(sdk_session_id) REFERENCES sdk_sessions(sdk_session_id)
);
CREATE INDEX idx_observations_project ON observations(project);
CREATE INDEX idx_observations_created ON observations(created_at_epoch DESC);
-- Stores session summaries
CREATE TABLE session_summaries (
id INTEGER PRIMARY KEY AUTOINCREMENT,
sdk_session_id TEXT UNIQUE NOT NULL,
project TEXT NOT NULL,
summary TEXT NOT NULL,
created_at TEXT NOT NULL,
created_at_epoch INTEGER NOT NULL,
FOREIGN KEY(sdk_session_id) REFERENCES sdk_sessions(sdk_session_id)
);
CREATE INDEX idx_summaries_project ON session_summaries(project);
CREATE INDEX idx_summaries_created ON session_summaries(created_at_epoch DESC);
Hook Implementation
1. SessionStart Hook
Purpose: Show user what happened in recent sessions
Hook config:
{
"hooks": {
"SessionStart": [{
"matcher": "startup",
"hooks": [{
"type": "command",
"command": "claude-mem context"
}]
}]
}
}
Command: claude-mem context
Flow:
- Read stdin JSON (session_id, cwd, source, etc.)
- If source !== "startup", exit immediately
- Extract project from cwd basename
- Query SQLite for recent summaries:
SELECT summary, created_at FROM session_summaries WHERE project = ? ORDER BY created_at_epoch DESC LIMIT 10
- Format results as human-readable text
- Output to stdout (Claude Code automatically injects this)
- Exit with code 0
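Two of the steps in this flow are pure functions and easy to sketch: deriving the project name from cwd and formatting the query results. The names projectFromCwd, formatRecentSessions and SummaryRow are assumptions for illustration; the output format is likewise only one possibility.

```typescript
import * as path from "node:path";

// Shape of one row returned by the summaries query above.
interface SummaryRow {
  summary: string;
  created_at: string;
}

// The project name is the last path segment of the working directory.
function projectFromCwd(cwd: string): string {
  return path.basename(cwd);
}

// Turn the rows into plain text suitable for writing to stdout.
function formatRecentSessions(project: string, rows: SummaryRow[]): string {
  if (rows.length === 0) {
    return `No previous sessions recorded for ${project}.`;
  }
  const lines = rows.map((row) => `[${row.created_at}] ${row.summary}`);
  return [`Recent sessions for ${project}:`, ...lines].join("\n");
}
```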
2. UserPromptSubmit Hook
Purpose: Initialize SDK memory session in background
Hook config:
{
"hooks": {
"UserPromptSubmit": [{
"hooks": [{
"type": "command",
"command": "claude-mem new"
}]
}]
}
}
Command: claude-mem new
Flow:
- Read stdin JSON (session_id, prompt, cwd, etc.)
- Extract project from cwd
- Create SDK session record in database
- Start SDK session with initialization prompt in background process
- Save SDK session ID to database
- Output:
{"continue": true, "suppressOutput": true}
- Exit immediately (SDK runs in background daemon/process)
The Background SDK Process:
The SDK session should run as a detached background process:
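// In claude-mem new
import { spawn } from 'child_process';

// Spawn the worker detached so the hook can exit without waiting for it
const child = spawn('claude-mem', ['sdk-worker', session_id], {
  detached: true,
  stdio: 'ignore'
});
// Allow the hook process to exit while the worker keeps running
child.unref();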
The SDK worker:
// claude-mem sdk-worker <session_id>
// Assumes the Claude Agent SDK package; adjust the import to match the SDK version in use
import { query } from '@anthropic-ai/claude-agent-sdk';

async function runSDKWorker(sessionId: string) {
  const session = await loadSessionFromDB(sessionId);

  async function* messageGenerator() {
    yield {
      type: "user",
      message: {
        role: "user",
        content: buildInitPrompt(session.project, sessionId, session.user_prompt)
      }
    };
    // Then listen for queued observations.
    // Re-read the status on every pass; the copy loaded above is never updated.
    while ((await loadSessionFromDB(sessionId)).status === 'active') {
      const observations = await pollObservationQueue(session.sdk_session_id);
      for (const obs of observations) {
        yield {
          type: "user",
          message: {
            role: "user",
            content: buildObservationPrompt(obs)
          }
        };
        await markObservationProcessed(obs.id);
      }
      await sleep(1000); // Poll every second
    }
  }

  // Run SDK session
  const response = query({
    prompt: messageGenerator(),
    options: {
      model: 'claude-haiku-4-5-20251001', // 3x faster than Sonnet 4.5, quality of Sonnet 4.0-4.1
      allowedTools: [], // No tools needed - agent outputs XML that we parse
      maxTurns: 1000,
      cwd: session.cwd
    }
  });

  // Consume responses and parse XML for observations/summaries
  for await (const msg of response) {
    // The SDK emits 'assistant' messages whose content is an array of blocks; only text blocks carry XML
    if (msg.type !== 'assistant') continue;
    for (const block of msg.message.content) {
      if (block.type !== 'text') continue;
      // Use an XML parser library (e.g., fast-xml-parser or similar) to parse observations and summaries
      // Parse <observation> blocks and call storeObservation(session_id, project, type, text)
      // Parse <summary> blocks, extract all 8 fields, format and call storeSummary(session_id, project, text)
      parseAndStoreObservations(block.text, session);
      parseAndStoreSummary(block.text, session);
    }
  }
}
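The parsing step the worker relies on is not spelled out, so here is one possible sketch of extracting observation blocks from a response. It uses a regular expression, as the initialization prompt describes, rather than an XML parser library; parseObservations and the types are hypothetical names, and storing the results is left out.

```typescript
type ObservationType = "decision" | "bugfix" | "feature" | "refactor" | "discovery";

interface ParsedObservation {
  type: ObservationType;
  text: string;
}

const VALID_TYPES: readonly string[] = [
  "decision", "bugfix", "feature", "refactor", "discovery",
];

// Extract every well-formed <observation> block from a response.
// Blocks with an unknown type or empty text are dropped.
function parseObservations(response: string): ParsedObservation[] {
  const results: ParsedObservation[] = [];
  const blockPattern =
    /<observation>\s*<type>([\s\S]*?)<\/type>\s*<text>([\s\S]*?)<\/text>\s*<\/observation>/g;
  let match: RegExpExecArray | null;
  while ((match = blockPattern.exec(response)) !== null) {
    const type = match[1].trim();
    const text = match[2].trim();
    if (VALID_TYPES.includes(type) && text.length > 0) {
      results.push({ type: type as ObservationType, text });
    }
  }
  return results;
}
```

Surrounding reasoning text is ignored, which matches the prompt's allowance for commentary before or after the block.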
3. PostToolUse Hook
Purpose: Queue tool observations for SDK processing
Hook config:
{
"hooks": {
"PostToolUse": [{
"matcher": "*",
"hooks": [{
"type": "command",
"command": "claude-mem save"
}]
}]
}
}
Command: claude-mem save
Flow:
- Read stdin JSON (tool_name, tool_input, tool_response, etc.); the hook payload names the tool result tool_response, which is stored in the tool_output column
- Find active SDK session for this project
- Insert observation into queue:
INSERT INTO observation_queue (sdk_session_id, tool_name, tool_input, tool_output, created_at_epoch) VALUES (?, ?, ?, ?, ?)
- Output:
{"continue": true, "suppressOutput": true}
- Exit immediately
Observation Filtering:
Not every tool is worth observing. Filter in the hook:
const BORING_TOOLS = new Set(['Glob', 'Grep', 'ListMcpResourcesTool']);
if (BORING_TOOLS.has(tool_name)) {
console.log('{"continue": true, "suppressOutput": true}');
process.exit(0);
}
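Filtering and row construction can be combined into one pure function that returns either the INSERT parameters or null. This is a sketch under assumptions: the PostToolUseInput shape (in particular the tool_response field name) and the buildQueueRow helper are illustrative, not confirmed parts of the plan.

```typescript
// Assumed subset of the hook's stdin payload.
interface PostToolUseInput {
  tool_name: string;
  tool_input: unknown;
  tool_response: unknown;
}

// Parameters in the order expected by the observation_queue INSERT:
// sdk_session_id, tool_name, tool_input, tool_output, created_at_epoch
type QueueRow = [string, string, string, string, number];

const BORING_TOOLS = new Set(["Glob", "Grep", "ListMcpResourcesTool"]);

// Returns null for tools that should not be queued.
function buildQueueRow(
  input: PostToolUseInput,
  sdkSessionId: string,
  nowEpoch: number
): QueueRow | null {
  if (BORING_TOOLS.has(input.tool_name)) return null;
  return [
    sdkSessionId,
    input.tool_name,
    JSON.stringify(input.tool_input ?? {}), // stored as JSON text
    JSON.stringify(input.tool_response ?? {}), // stored as JSON text
    nowEpoch,
  ];
}
```

Keeping this logic free of I/O means the hook itself only has to read stdin, call the function, and run a single INSERT.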
4. Stop Hook
Purpose: Signal SDK to finalize and generate summary
Hook config:
{
"hooks": {
"Stop": [{
"hooks": [{
"type": "command",
"command": "claude-mem summary"
}]
}]
}
}
Command: claude-mem summary
Flow:
- Read stdin JSON (session_id, cwd, etc.)
- Find active SDK session
- Insert special "FINALIZE" message into observation queue:
INSERT INTO observation_queue (sdk_session_id, tool_name, tool_input, tool_output, created_at_epoch) VALUES (?, 'FINALIZE', '{}', '{}', ?)
- Output:
{"continue": true, "suppressOutput": true}
- Exit immediately
SDK Worker Handling:
When SDK worker sees FINALIZE message:
if (obs.tool_name === 'FINALIZE') {
yield {
type: "user",
message: {
role: "user",
content: buildFinalizePrompt(session)
}
};
// Wait for SDK to finish processing
await waitForSDKCompletion();
// Update session status
await markSessionCompleted(session.id);
// Exit worker
break;
}
SDK Agent Prompts
Initialization Prompt
function buildInitPrompt(project: string, sessionId: string, userPrompt: string): string {
return `You are a memory assistant for the "${project}" project.
SESSION CONTEXT
---------------
Session ID: ${sessionId}
User's Goal: ${userPrompt}
Date: ${new Date().toISOString().split('T')[0]}
YOUR ROLE
---------
You will observe tool executions during this Claude Code session. Your job is to:
1. Extract meaningful insights (not just raw data)
2. Store atomic observations in SQLite
3. Focus on: key decisions, patterns discovered, problems solved, technical insights
WHAT TO CAPTURE
----------------
✓ Architecture decisions (e.g., "chose PostgreSQL over MongoDB for ACID guarantees")
✓ Bug fixes (e.g., "fixed race condition in auth middleware by adding mutex")
✓ New features (e.g., "implemented JWT refresh token flow")
✓ Refactorings (e.g., "extracted validation logic into separate service")
✓ Discoveries (e.g., "found that API rate limit is 100 req/min")
✗ NOT routine operations (reading files, listing directories)
✗ NOT work-in-progress (only completed work)
✗ NOT obvious facts (e.g., "TypeScript file has types")
HOW TO STORE OBSERVATIONS
--------------------------
When you identify something worth remembering, output your observation in this EXACT XML format:
\`\`\`xml
<observation>
<type>feature</type>
<text>Implemented JWT token refresh flow with 7-day expiry</text>
</observation>
\`\`\`
Valid types: decision, bugfix, feature, refactor, discovery
Structure requirements:
- <observation> is the root element
- <type> must be one of the 5 valid types (single word)
- <text> contains your concise observation (one sentence preferred)
- No additional fields or nesting
The SDK worker will parse all <observation> blocks from your response using regex and store them in SQLite.
You can include your reasoning before or after the observation block, or just output the observation by itself.
EXAMPLE
-------
Bad: "Read src/auth.ts file"
Good: "Implemented JWT token refresh flow with 7-day expiry"
Wait for tool observations. Acknowledge this message briefly.`;
}
Observation Prompt
function buildObservationPrompt(obs: Observation): string {
return `TOOL OBSERVATION
================
Tool: ${obs.tool_name}
Time: ${new Date(obs.created_at_epoch).toISOString()}
Input:
${JSON.stringify(JSON.parse(obs.tool_input), null, 2)}
Output:
${JSON.stringify(JSON.parse(obs.tool_output), null, 2)}
ANALYSIS TASK
-------------
1. Does this observation contain something worth remembering?
2. If YES: Output the observation in this EXACT XML format:
\`\`\`xml
<observation>
<type>feature</type>
<text>Your concise observation here</text>
</observation>
\`\`\`
Requirements:
- Use one of these types: decision, bugfix, feature, refactor, discovery
- Keep text concise (one sentence preferred)
- No markdown formatting inside <text>
- No additional XML fields
3. If NO: Just acknowledge and wait for next observation
Remember: Quality over quantity. Only store meaningful insights.`;
}
Finalization Prompt
function buildFinalizePrompt(session: SDKSession): string {
return `SESSION ENDING
==============
The Claude Code session is finishing.
FINAL TASK
----------
1. Review the observations you've stored this session
2. Generate a structured summary that answers these questions:
- What did user request?
- What did you investigate?
- What did you learn?
- What did you do?
- What's next?
- Files read
- Files edited
- Notes
3. Generate the structured summary and output it in this EXACT XML format:
\`\`\`xml
<summary>
<request>Implement JWT authentication system</request>
<investigated>Existing auth middleware, session management, token storage patterns</investigated>
<learned>Current system uses session cookies; no JWT support; race condition in middleware</learned>
<completed>Implemented JWT token + refresh flow with 7-day expiry; fixed race condition with mutex; added token validation middleware</completed>
<next_steps>Add token revocation API endpoint; write integration tests</next_steps>
<files_read>
<file>src/auth.ts</file>
<file>src/middleware/session.ts</file>
<file>src/types/user.ts</file>
</files_read>
<files_edited>
<file>src/auth.ts</file>
<file>src/middleware/auth.ts</file>
<file>src/routes/auth.ts</file>
</files_edited>
<notes>Token secret stored in .env; refresh tokens use rotation strategy</notes>
</summary>
\`\`\`
Structure requirements:
- <summary> is the root element
- All 8 child elements are REQUIRED: request, investigated, learned, completed, next_steps, files_read, files_edited, notes
- <files_read> and <files_edited> must contain <file> child elements (one per file)
- If no files were read/edited, use empty tags: <files_read></files_read>
- Text fields can be multiple sentences but avoid markdown formatting
- Use underscores in element names: next_steps, files_read, files_edited
The SDK worker will parse the <summary> block and extract all fields to store in SQLite.
Generate the summary now in the required XML format.`;
}
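A sketch of how the worker might extract the eight summary fields from the agent's reply. It uses regular expressions for brevity; an XML parser library would be a reasonable alternative. parseSummary, innerText, fileList and ParsedSummary are hypothetical names.

```typescript
interface ParsedSummary {
  request: string;
  investigated: string;
  learned: string;
  completed: string;
  next_steps: string;
  files_read: string[];
  files_edited: string[];
  notes: string;
}

// Trimmed inner text of the first <tag>...</tag>, or null if the tag is absent.
function innerText(xml: string, tag: string): string | null {
  const match = new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`).exec(xml);
  return match ? match[1].trim() : null;
}

// All <file> entries inside <tag>, or null if <tag> itself is absent.
function fileList(xml: string, tag: string): string[] | null {
  const inner = innerText(xml, tag);
  if (inner === null) return null;
  const files: string[] = [];
  const pattern = /<file>([\s\S]*?)<\/file>/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(inner)) !== null) files.push(m[1].trim());
  return files;
}

// Returns null unless a <summary> block with all 8 required children is present.
function parseSummary(response: string): ParsedSummary | null {
  const body = innerText(response, "summary");
  if (body === null) return null;
  const request = innerText(body, "request");
  const investigated = innerText(body, "investigated");
  const learned = innerText(body, "learned");
  const completed = innerText(body, "completed");
  const next_steps = innerText(body, "next_steps");
  const files_read = fileList(body, "files_read");
  const files_edited = fileList(body, "files_edited");
  const notes = innerText(body, "notes");
  if (
    request === null || investigated === null || learned === null ||
    completed === null || next_steps === null || files_read === null ||
    files_edited === null || notes === null
  ) {
    return null;
  }
  return {
    request, investigated, learned, completed,
    next_steps, files_read, files_edited, notes,
  };
}
```

Rejecting incomplete summaries outright keeps the stored data consistent with the "all 8 child elements are REQUIRED" rule in the prompt.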
Hook Commands Architecture
All four hook commands (claude-mem context, claude-mem new, claude-mem save, claude-mem summary) are implemented as standalone TypeScript functions that:
- Use bun:sqlite directly - No spawning child processes or CLI subcommands
- Are self-contained - Each hook has all the logic it needs
- Share a common database layer - Import from shared db.ts module
- Never call other claude-mem commands - All functionality via direct library calls
// Example structure
import { Database } from 'bun:sqlite';
import { homedir } from 'os';
import { join } from 'path';

// "~" is not expanded in file paths, so build the absolute path explicitly
const DB_PATH = join(homedir(), '.claude-mem', 'db.sqlite');

export function contextHook(stdin: HookInput) {
  const db = new Database(DB_PATH);
  // Query and return context directly
  const summaries = db.query('SELECT ...').all();
  console.log(formatContext(summaries));
  db.close();
}

export function saveHook(stdin: HookInput) {
  const db = new Database(DB_PATH);
  // Insert observation directly
  db.run('INSERT INTO observation_queue ...', params);
  db.close();
  console.log('{"continue": true, "suppressOutput": true}');
}
Key principle: Hooks are fast, synchronous database operations. The SDK worker process is where async/complex logic happens.
Background Process Management
The claude-mem save hook just queues observations - processing happens in the background SDK worker process that polls the queue continuously.
The SDK worker is spawned by claude-mem new as a detached process and runs for the duration of the Claude Code session.
Benefits:
- Works on all platforms (no systemd/launchd needed)
- Self-contained (spawned and managed by claude-mem itself)
- Simple state management (all state in SQLite)
Error Handling
SDK worker failures:
- Each observation is processed atomically
- Failed observations stay in queue
- Next worker run retries
- After 3 failures, mark observation as skipped
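The retry rule above can be sketched as a pure state transition. Note the assumptions: the schema shown earlier has no attempt counter or 'skipped' status, so the attempts and status fields here are hypothetical additions, as is the recordAttempt helper.

```typescript
const MAX_ATTEMPTS = 3;

type QueueStatus = "pending" | "processed" | "skipped";

// Hypothetical per-observation bookkeeping (not in the schema above).
interface QueueEntry {
  id: number;
  attempts: number;
  status: QueueStatus;
}

// Return the entry's new state after one processing attempt.
function recordAttempt(entry: QueueEntry, succeeded: boolean): QueueEntry {
  if (succeeded) {
    return { ...entry, status: "processed" };
  }
  const attempts = entry.attempts + 1;
  // Failed entries stay pending until the attempt budget is used up.
  return {
    ...entry,
    attempts,
    status: attempts >= MAX_ATTEMPTS ? "skipped" : "pending",
  };
}
```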
Database corruption:
- SQLite with WAL mode (write-ahead logging)
- Regular backups to ~/.claude-mem/backups/
- Automatic recovery from backups
ChromaDB connection failures:
- Graceful degradation (log error, continue)
- Retry with exponential backoff
- Don't block main Claude Code session
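For the exponential backoff mentioned above, a minimal sketch of the delay schedule. The base delay, the cap, and the function name are assumptions; the plan does not specify them.

```typescript
// Delay before retry number `attempt` (0-based): base, 2x base, 4x base, ...
// capped so a long outage never produces unbounded waits.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 30000
): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

With these defaults the delays are 500 ms, 1 s, 2 s, 4 s and so on, levelling off at 30 s. Because retries happen in the background worker, none of this waiting affects the main Claude Code session.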
Implementation Order
- Database setup - Create tables and migration scripts
- Hook commands - Implement the 4 hook commands (context, new, save, summary)
- SDK worker - Implement the background worker process with response parsing
- SDK prompts - Wire up the prompts and message generator
- Test end-to-end - Run a real Claude Code session and verify it works
Start simple. Get one hook working before moving to the next. Don't try to build everything at once.
Note: MCP is only used for retrieval (when Claude Code needs to access stored memories), not for storage. The SDK agent stores data by outputting specially formatted text that the SDK worker parses and writes to SQLite.