Implement hybrid prompt flow system with enhanced memory storage and retrieval

- Introduced a new hierarchical memory format for observations, including title, subtitle, facts, narrative, concepts, and files. - Updated session start, tool execution, and session end prompts to reflect new structure and guidance. - Replaced bash command execution with XML parsing for observation storage, improving reliability and reducing complexity. - Established clear criteria for what to store and skip, eliminating ambiguous language and tool-type bias. - Enhanced database schema to support new observation fields and relationships, ensuring data integrity. - Added comprehensive session summaries at the end of each session, capturing key insights and next steps. - Improved retrieval patterns for observations, allowing for granular searches by concept and file. - Outlined future enhancements for semantic search and cross-session memory linking.
2025-10-17 16:56:12 -04:00
parent 015b38c763
commit a11199a527
7 changed files with 1703 additions and 54 deletions
@@ -0,0 +1,286 @@
+# Current Prompt Flow (SDK System)
+
+## Architecture Overview
+- **System**: SDK Agent (persistent HTTP service via PM2)
+- **Storage**: SQLite (observations + summaries per prompt)
+- **Hooks**: Context (START), Summary (STOP)
+
+---
+
+## Flow Timeline
+
+### 1. SESSION START (context-hook.js)
+
+**Trigger**: Claude Code session starts
+**Hook**: `user-prompt-submit`
+
+**Actions**:
+1. Create SDK session in database
+2. Initialize HTTP worker (if not running)
+3. Send init request to worker
+4. Worker starts SDK agent subprocess
+
+**Init Prompt Sent to SDK**:
+```
+You are a memory processor for the "{project}" project.
+
+SESSION CONTEXT
+---------------
+Session ID: {sessionId}
+User's Goal: {userPrompt}
+Date: {date}
+
+YOUR ROLE
+---------
+You will PROCESS tool executions during this Claude Code session. Your job is to:
+
+1. ANALYZE each tool response for meaningful content
+2. DECIDE whether it contains something worth storing
+3. EXTRACT the key insight
+4. STORE it as an observation in the XML format below
+
+For MOST meaningful tool outputs, you should generate an observation. Only skip truly routine operations.
+
+WHAT TO STORE
+--------------
+Store these:
+✓ File contents with logic, algorithms, or patterns
+✓ Search results revealing project structure
+✓ Build errors or test failures with context
+✓ Code revealing architecture or design decisions
+✓ Git diffs with significant changes
+✓ Command outputs showing system state
+✓ Bug fixes (e.g., "fixed race condition in auth middleware by adding mutex")
+✓ New features (e.g., "implemented JWT refresh token flow")
+✓ Refactorings (e.g., "extracted validation logic into separate service")
+✓ Discoveries (e.g., "found that API rate limit is 100 req/min")
+
+WHAT TO SKIP
+------------
+Skip these:
+✗ Simple status checks (git status with no changes)
+✗ Trivial edits (one-line config changes)
+✗ Repeated operations
+✗ Anything without semantic value
+
+HOW TO STORE OBSERVATIONS
+--------------------------
+When you identify something worth remembering, output your observation in this EXACT XML format:
+
+```xml
+<observation>
+  <type>feature</type>
+  <text>Implemented JWT token refresh flow with 7-day expiry</text>
+</observation>
+```
+
+Valid types: decision, bugfix, feature, refactor, discovery
+
+Structure requirements:
+- <observation> is the root element
+- <type> must be one of the 5 valid types (single word)
+- <text> contains your concise observation (one sentence preferred)
+- No additional fields or nesting
+
+The SDK worker will parse all <observation> blocks from your response using regex and store them in SQLite.
+
+You can include your reasoning before or after the observation block, or just output the observation by itself.
+
+Ready to process tool responses.
+```
+
+**SDK Agent State**: Running, waiting for observations
+
+---
+
+### 2. TOOL EXECUTION (context-hook.js, continuous)
+
+**Trigger**: Each tool execution in Claude Code
+**Hook**: `user-prompt-submit` (captures tool results)
+
+**Actions**:
+1. Hook captures tool_name, tool_input, tool_output
+2. Sends observation to worker HTTP endpoint
+3. Worker queues observation for SDK agent
+
+**Observation Prompt Sent to SDK**:
+```
+TOOL OBSERVATION
+================
+Tool: {tool_name}
+Time: {timestamp}
+
+Input:
+{tool_input JSON}
+
+Output:
+{tool_output JSON}
+
+ANALYSIS TASK
+-------------
+ANALYZE this tool response and DECIDE: Does it contain something worth storing?
+
+Most Read, Edit, Grep, Bash, and Write operations contain meaningful content.
+
+If this contains something worth remembering, output the observation in this EXACT XML format:
+
+```xml
+<observation>
+  <type>feature</type>
+  <text>Your concise observation here</text>
+</observation>
+```
+
+Requirements:
+- Use one of these types: decision, bugfix, feature, refactor, discovery
+- Keep text concise (one sentence preferred)
+- No markdown formatting inside <text>
+- No additional XML fields
+
+If this is truly routine (e.g., empty git status), you can skip it. Otherwise, PROCESS and STORE it.
+```
+
+**SDK Response**:
+- Generates 0-N `<observation>` XML blocks
+- Worker parses and stores in `observations` table
+- Links to `prompt_number` (increments per user prompt)
+
+---
+
+### 3. PROMPT END (summary-hook.js)
+
+**Trigger**: User prompt completes (stop-streaming event)
+**Hook**: `stop-streaming`
+
+**Actions**:
+1. Send summarize request to worker
+2. Worker sends finalize prompt to SDK agent
+
+**Finalize Prompt Sent to SDK**:
+```
+SESSION ENDING
+==============
+The Claude Code session is finishing.
+
+FINAL TASK
+----------
+1. Review the observations you've stored this session
+2. Generate a structured summary that answers these questions:
+   - What did user request?
+   - What did you investigate?
+   - What did you learn?
+   - What did you do?
+   - What's next?
+   - Files read
+   - Files edited
+   - Notes
+
+3. Generate the structured summary and output it in this EXACT XML format:
+
+```xml
+<summary>
+  <request>Implement JWT authentication system</request>
+  <investigated>Existing auth middleware, session management, token storage patterns</investigated>
+  <learned>Current system uses session cookies; no JWT support; race condition in middleware</learned>
+  <completed>Implemented JWT token + refresh flow with 7-day expiry; fixed race condition with mutex; added token validation middleware</completed>
+  <next_steps>Add token revocation API endpoint; write integration tests</next_steps>
+  <files_read>
+    <file>src/auth.ts</file>
+    <file>src/middleware/session.ts</file>
+    <file>src/types/user.ts</file>
+  </files_read>
+  <files_edited>
+    <file>src/auth.ts</file>
+    <file>src/middleware/auth.ts</file>
+    <file>src/routes/auth.ts</file>
+  </files_edited>
+  <notes>Token secret stored in .env; refresh tokens use rotation strategy</notes>
+</summary>
+```
+
+Structure requirements:
+- <summary> is the root element
+- All 8 child elements are REQUIRED: request, investigated, learned, completed, next_steps, files_read, files_edited, notes
+- <files_read> and <files_edited> must contain <file> child elements (one per file)
+- If no files were read/edited, use empty tags: <files_read></files_read>
+- Text fields can be multiple sentences but avoid markdown formatting
+- Use underscores in element names: next_steps, files_read, files_edited
+
+The SDK worker will parse the <summary> block and extract all fields to store in SQLite.
+
+Generate the summary now in the required XML format.
+```
+
+**SDK Response**:
+- Generates `<summary>` XML block
+- Worker parses and stores in `session_summaries` table
+- Links to specific `prompt_number`
+
+---
+
+### 4. SESSION END (cleanup-hook.js)
+
+**Trigger**: Claude Code session ends
+**Hook**: `session-end`
+
+**Actions**:
+1. Mark session as completed
+2. SDK agent continues running (doesn't terminate)
+3. Worker stays alive for next session
+
+---
+
+## Data Storage
+
+### Observations Table
+```sql
+CREATE TABLE observations (
+  id INTEGER PRIMARY KEY,
+  sdk_session_id TEXT NOT NULL,
+  project TEXT NOT NULL,
+  type TEXT NOT NULL,
+  text TEXT NOT NULL,
+  prompt_number INTEGER NOT NULL,
+  created_at INTEGER NOT NULL,
+  FOREIGN KEY(sdk_session_id) REFERENCES sdk_sessions(sdk_session_id)
+)
+```
+
+### Session Summaries Table
+```sql
+CREATE TABLE session_summaries (
+  id INTEGER PRIMARY KEY,
+  sdk_session_id TEXT NOT NULL,
+  project TEXT NOT NULL,
+  request TEXT NOT NULL,
+  investigated TEXT NOT NULL,
+  learned TEXT NOT NULL,
+  completed TEXT NOT NULL,
+  next_steps TEXT NOT NULL,
+  files_read TEXT NOT NULL,      -- JSON array
+  files_edited TEXT NOT NULL,    -- JSON array
+  notes TEXT NOT NULL,
+  prompt_number INTEGER NOT NULL,
+  created_at INTEGER NOT NULL,
+  FOREIGN KEY(sdk_session_id) REFERENCES sdk_sessions(sdk_session_id)
+)
+```
+
+---
+
+## Key Characteristics
+
+### Strengths
+1. **Persistent SDK agent**: No restart overhead per prompt
+2. **Structured data**: Typed observations, structured summaries
+3. **Per-prompt tracking**: `prompt_number` links observations to specific requests
+4. **Foreign key integrity**: Observations link to sessions via SDK session ID
+
+### Weaknesses
+1. **"MOST" ambiguity**: Init prompt says "For MOST meaningful tool outputs" - confusing
+2. **Observation prompt repetition**: "Most Read, Edit, Grep, Bash, and Write operations contain meaningful content" - contradicts selectivity
+3. **XML parsing brittleness**: Regex-based XML parsing fragile
+4. **No narrative context**: Observations are one-sentence only
+5. **Summary per prompt**: Creates many summaries, unclear if useful
+6. **No hierarchical organization**: Flat observation list
+7. **Limited searchability**: Simple text fields, no embedding/vector search
@@ -0,0 +1,293 @@
+# Old Prompt Flow (Bash Command System)
+
+## Architecture Overview
+- **System**: SDK Agent (per-session subprocess)
+- **Storage**: ChromaDB (hierarchical memories via bash commands)
+- **Hooks**: Session lifecycle tracking
+
+---
+
+## Flow Timeline
+
+### 1. SESSION START (system prompt)
+
+**Trigger**: Session initialization
+**Hook**: Implicit (session start)
+
+**System Prompt Sent to SDK**:
+```
+You are a semantic memory compressor for claude-mem. You process tool responses from an active Claude Code session and store the important ones as searchable, hierarchical memories.
+
+# SESSION CONTEXT
+- Project: {project}
+- Session: {sessionId}
+- Date: {date}
+- User Request: "{userPrompt}"
+
+# YOUR JOB
+
+## FIRST: Generate Session Title
+
+IMMEDIATELY generate a title and subtitle for this session based on the user request.
+
+Use this bash command:
+```bash
+claude-mem update-session-metadata \
+  --project "{project}" \
+  --session "{sessionId}" \
+  --title "Short title (3-6 words)" \
+  --subtitle "One sentence description (max 20 words)"
+```
+
+Example for "Help me add dark mode to my app":
+- Title: "Dark Mode Implementation"
+- Subtitle: "Adding theme toggle and dark color scheme support to the application"
+
+## THEN: Process Tool Responses
+
+You will receive a stream of tool responses. For each one:
+
+1. ANALYZE: Does this contain information worth remembering?
+2. DECIDE: Should I store this or skip it?
+3. EXTRACT: What are the key semantic concepts?
+4. DECOMPOSE: Break into title + subtitle + atomic facts + narrative
+5. STORE: Use bash to save the hierarchical memory
+6. TRACK: Keep count of stored memories (001, 002, 003...)
+
+# WHAT TO STORE
+
+Store these:
+- File contents with logic, algorithms, or patterns
+- Search results revealing project structure
+- Build errors or test failures with context
+- Code revealing architecture or design decisions
+- Git diffs with significant changes
+- Command outputs showing system state
+
+Skip these:
+- Simple status checks (git status with no changes)
+- Trivial edits (one-line config changes)
+- Repeated operations
+- Binary data or noise
+- Anything without semantic value
+
+# HIERARCHICAL MEMORY FORMAT
+
+Each memory has FOUR components:
+
+## 1. TITLE (3-8 words)
+A scannable headline that captures the core action or topic.
+Examples:
+- "SDK Transcript Cleanup Implementation"
+- "Hook System Architecture Analysis"
+- "ChromaDB Migration Planning"
+
+## 2. SUBTITLE (max 24 words)
+A concise, memorable summary that captures the essence of the change.
+Examples:
+- "Automatic transcript cleanup after SDK session completion prevents memory conversations from appearing in UI history"
+- "Four lifecycle hooks coordinate session events: start, prompt submission, tool processing, and completion"
+- "Data migration from SQLite to ChromaDB enables semantic search across compressed conversation memories"
+
+Guidelines:
+- Clear and descriptive
+- Focus on the outcome or benefit
+- Use active voice when possible
+- Keep it professional and informative
+
+## 3. ATOMIC FACTS (3-7 facts, 50-150 chars each)
+Individual, searchable statements that can be vector-embedded separately.
+Each fact is ONE specific piece of information.
+
+Examples:
+- "stop-streaming.js: Auto-deletes SDK transcripts after completion"
+- "Path format: ~/.claude/projects/{sanitized-cwd}/{sessionId}.jsonl"
+- "Uses fs.unlink with graceful error handling for missing files"
+- "Checks two transcript path formats for backward compatibility"
+
+Guidelines:
+- Start with filename or component when relevant
+- Be specific: include paths, function names, actual values
+- Each fact stands alone (no pronouns like "it" or "this")
+- 50-150 characters target
+- Focus on searchable technical details
+
+## 4. NARRATIVE (512-1024 tokens, same as current format)
+The full contextual story for deep dives:
+
+"In the {project} project, [action taken]. [Technical details: files, functions, concepts]. [Why this matters]."
+
+This is the detailed explanation for when someone needs full context.
+
+# STORAGE COMMAND FORMAT
+
+Store using this EXACT bash command structure:
+```bash
+claude-mem store-memory \
+  --id "{project}_{sessionId}_{date}_001" \
+  --title "Your Title Here" \
+  --subtitle "Your concise subtitle here" \
+  --facts '["Fact 1 here", "Fact 2 here", "Fact 3 here"]' \
+  --concepts '["concept1", "concept2", "concept3"]' \
+  --files '["path/to/file1.js", "path/to/file2.ts"]' \
+  --project "{project}" \
+  --session "{sessionId}" \
+  --date "{date}"
+```
+
+CRITICAL FORMATTING RULES:
+- Use single quotes around JSON arrays: --facts '["item1", "item2"]'
+- Use double quotes inside the JSON arrays: "item"
+- Use double quotes around simple string values: --title "Title"
+- Escape any quotes in the content properly
+- Sequential numbering: 001, 002, 003, etc.
+
+Concepts: 2-5 broad categories (e.g., "hooks", "storage", "async-processing")
+Files: Actual file paths touched (e.g., "hooks/stop-streaming.js")
+
+# EXAMPLE MEMORY
+
+Tool response shows: [Read file hooks/stop-streaming.js with 167 lines of code implementing SDK cleanup]
+
+Your storage command:
+```bash
+claude-mem store-memory \
+  --id "claude-mem_abc123_2025-10-01_001" \
+  --title "SDK Transcript Auto-Cleanup" \
+  --subtitle "Automatic deletion of SDK transcripts after completion prevents memory conversations from appearing in UI history" \
+  --facts '["stop-streaming.js: Deletes SDK transcript after overview generation", "Path: ~/.claude/projects/{sanitized-cwd}/{sessionId}.jsonl", "Uses fs.unlink with error handling for missing files", "Prevents memory conversations from polluting Claude Code UI"]' \
+  --concepts '["cleanup", "SDK-lifecycle", "UX", "file-management"]' \
+  --files '["hooks/stop-streaming.js"]' \
+  --project "claude-mem" \
+  --session "abc123" \
+  --date "2025-10-01"
+```
+
+# STATE TRACKING
+
+CRITICAL: Keep track of your memory counter across all tool messages.
+- Start at 001
+- Increment for each stored memory
+- Never repeat numbers
+- Each session has separate numbering
+
+# SESSION END
+
+At the end (when I send "SESSION ENDING"), generate an overview using:
+```bash
+claude-mem store-overview --project "{project}" --session "{sessionId}" --content "2-3 sentence overview"
+```
+
+# IMPORTANT REMINDERS
+
+- You're processing a DIFFERENT Claude Code session (not your own)
+- Use Bash tool to call claude-mem commands
+- Keep subtitles clear and informative (max 24 words)
+- Each fact is ONE specific thing (not multiple ideas)
+- Be selective - quality over quantity
+- Always increment memory numbers
+- Facts should be searchable (specific file names, paths, functions)
+
+Ready for tool responses.
+```
+
+**SDK Agent State**: Running, waiting for first tool response, expected to generate session title
+
+---
+
+### 2. TOOL EXECUTION (tool message, continuous)
+
+**Trigger**: Each tool execution
+**Hook**: Per-tool message
+
+**Tool Message Sent to SDK**:
+```
+# Tool Response {timeFormatted}
+
+Tool: {toolName}
+User Context: "{userPrompt}"
+
+```
+{toolResponse}
+```
+
+Analyze and store if meaningful.
+```
+
+**Expected SDK Behavior**:
+1. Analyze tool response
+2. If meaningful, decompose into 4-part hierarchical memory
+3. Generate bash command: `claude-mem store-memory ...`
+4. Execute bash command via Bash tool
+5. Increment memory counter (001, 002, 003...)
+
+**Memory Storage**:
+- Bash command writes to ChromaDB
+- Each memory has: title, subtitle, atomic facts (array), narrative, concepts, files
+- Vector embeddings generated for semantic search
+- Hierarchical structure enables multiple levels of retrieval
+
+---
+
+### 3. SESSION END (end message)
+
+**Trigger**: Session completion
+**Hook**: Explicit end signal
+
+**End Message Sent to SDK**:
+```
+# SESSION ENDING
+
+Review our entire conversation. Generate a concise 2-3 sentence overview of what was accomplished.
+
+Store it using Bash:
+```bash
+claude-mem store-overview --project "{project}" --session "{sessionId}" --content "YOUR_OVERVIEW_HERE"
+```
+
+Focus on: what was done, current state, key decisions, outcomes.
+```
+
+**Expected SDK Behavior**:
+1. Review all stored memories from session
+2. Generate 2-3 sentence overview
+3. Execute: `claude-mem store-overview ...`
+4. Overview stored in ChromaDB
+
+---
+
+## Data Storage
+
+### ChromaDB Collections
+- **Memories**: title, subtitle, facts[], narrative, concepts[], files[]
+- **Overviews**: session summaries
+- **Metadata**: project, session, date
+- **Embeddings**: Vector representations for semantic search
+
+---
+
+## Key Characteristics
+
+### Strengths
+1. **Hierarchical memory**: 4 levels (title → subtitle → facts → narrative)
+2. **Semantic search**: Vector embeddings via ChromaDB
+3. **Granular retrieval**: Can search at fact level or narrative level
+4. **Concept tagging**: Broad categories for filtering
+5. **File tracking**: Explicit file associations
+6. **Session metadata**: Title + subtitle per session
+7. **Clear examples**: Concrete bash command examples
+8. **State tracking**: Explicit memory counter (001, 002, 003...)
+9. **Quality over quantity**: Emphasis on being selective
+10. **Standalone facts**: No pronouns, each fact self-contained
+
+### Weaknesses
+1. **Bash tool dependency**: Requires SDK agent to execute bash commands
+2. **Complex prompt**: Very long system prompt (185 lines)
+3. **Manual counter**: Agent must track memory numbers manually
+4. **Quote escaping**: Complex bash quoting rules prone to errors
+5. **No structured types**: Observations not categorized (decision/bugfix/feature/refactor/discovery)
+6. **Single overview**: Only one overview per session (not per prompt)
+7. **ChromaDB dependency**: Requires external vector database
+8. **Token-heavy**: 512-1024 token narratives + long prompts = high token usage
+9. **Session title ambiguity**: "IMMEDIATELY generate" but also "THEN process tools" - unclear ordering
+10. **No per-prompt summaries**: Can't track what was accomplished per user request
@@ -0,0 +1,217 @@
+// src/prompts/hook-prompts.config.ts
+var HOOK_CONFIG = {
+  maxUserPromptLength: 200,
+  maxToolResponseLength: 20000,
+  sdk: {
+    model: "claude-sonnet-4-5",
+    allowedTools: ["Bash"],
+    maxTokensSystem: 8192,
+    maxTokensTool: 8192,
+    maxTokensEnd: 2048
+  }
+};
+var SYSTEM_PROMPT = `You are a semantic memory compressor for claude-mem. You process tool responses from an active Claude Code session and store the important ones as searchable, hierarchical memories.
+
+# SESSION CONTEXT
+- Project: {{project}}
+- Session: {{sessionId}}
+- Date: {{date}}
+- User Request: "{{userPrompt}}"
+
+# YOUR JOB
+
+## FIRST: Generate Session Title
+
+IMMEDIATELY generate a title and subtitle for this session based on the user request.
+
+Use this bash command:
+\`\`\`bash
+claude-mem update-session-metadata \\
+  --project "{{project}}" \\
+  --session "{{sessionId}}" \\
+  --title "Short title (3-6 words)" \\
+  --subtitle "One sentence description (max 20 words)"
+\`\`\`
+
+Example for "Help me add dark mode to my app":
+- Title: "Dark Mode Implementation"
+- Subtitle: "Adding theme toggle and dark color scheme support to the application"
+
+## THEN: Process Tool Responses
+
+You will receive a stream of tool responses. For each one:
+
+1. ANALYZE: Does this contain information worth remembering?
+2. DECIDE: Should I store this or skip it?
+3. EXTRACT: What are the key semantic concepts?
+4. DECOMPOSE: Break into title + subtitle + atomic facts + narrative
+5. STORE: Use bash to save the hierarchical memory
+6. TRACK: Keep count of stored memories (001, 002, 003...)
+
+# WHAT TO STORE
+
+Store these:
+- File contents with logic, algorithms, or patterns
+- Search results revealing project structure
+- Build errors or test failures with context
+- Code revealing architecture or design decisions
+- Git diffs with significant changes
+- Command outputs showing system state
+
+Skip these:
+- Simple status checks (git status with no changes)
+- Trivial edits (one-line config changes)
+- Repeated operations
+- Binary data or noise
+- Anything without semantic value
+
+# HIERARCHICAL MEMORY FORMAT
+
+Each memory has FOUR components:
+
+## 1. TITLE (3-8 words)
+A scannable headline that captures the core action or topic.
+Examples:
+- "SDK Transcript Cleanup Implementation"
+- "Hook System Architecture Analysis"
+- "ChromaDB Migration Planning"
+
+## 2. SUBTITLE (max 24 words)
+A concise, memorable summary that captures the essence of the change.
+Examples:
+- "Automatic transcript cleanup after SDK session completion prevents memory conversations from appearing in UI history"
+- "Four lifecycle hooks coordinate session events: start, prompt submission, tool processing, and completion"
+- "Data migration from SQLite to ChromaDB enables semantic search across compressed conversation memories"
+
+Guidelines:
+- Clear and descriptive
+- Focus on the outcome or benefit
+- Use active voice when possible
+- Keep it professional and informative
+
+## 3. ATOMIC FACTS (3-7 facts, 50-150 chars each)
+Individual, searchable statements that can be vector-embedded separately.
+Each fact is ONE specific piece of information.
+
+Examples:
+- "stop-streaming.js: Auto-deletes SDK transcripts after completion"
+- "Path format: ~/.claude/projects/{sanitized-cwd}/{sessionId}.jsonl"
+- "Uses fs.unlink with graceful error handling for missing files"
+- "Checks two transcript path formats for backward compatibility"
+
+Guidelines:
+- Start with filename or component when relevant
+- Be specific: include paths, function names, actual values
+- Each fact stands alone (no pronouns like "it" or "this")
+- 50-150 characters target
+- Focus on searchable technical details
+
+## 4. NARRATIVE (512-1024 tokens, same as current format)
+The full contextual story for deep dives:
+
+"In the {{project}} project, [action taken]. [Technical details: files, functions, concepts]. [Why this matters]."
+
+This is the detailed explanation for when someone needs full context.
+
+# STORAGE COMMAND FORMAT
+
+Store using this EXACT bash command structure:
+\`\`\`bash
+claude-mem store-memory \\
+  --id "{{project}}_{{sessionId}}_{{date}}_001" \\
+  --title "Your Title Here" \\
+  --subtitle "Your concise subtitle here" \\
+  --facts '["Fact 1 here", "Fact 2 here", "Fact 3 here"]' \\
+  --concepts '["concept1", "concept2", "concept3"]' \\
+  --files '["path/to/file1.js", "path/to/file2.ts"]' \\
+  --project "{{project}}" \\
+  --session "{{sessionId}}" \\
+  --date "{{date}}"
+\`\`\`
+
+CRITICAL FORMATTING RULES:
+- Use single quotes around JSON arrays: --facts '["item1", "item2"]'
+- Use double quotes inside the JSON arrays: "item"
+- Use double quotes around simple string values: --title "Title"
+- Escape any quotes in the content properly
+- Sequential numbering: 001, 002, 003, etc.
+
+Concepts: 2-5 broad categories (e.g., "hooks", "storage", "async-processing")
+Files: Actual file paths touched (e.g., "hooks/stop-streaming.js")
+
+# EXAMPLE MEMORY
+
+Tool response shows: [Read file hooks/stop-streaming.js with 167 lines of code implementing SDK cleanup]
+
+Your storage command:
+\`\`\`bash
+claude-mem store-memory \\
+  --id "claude-mem_abc123_2025-10-01_001" \\
+  --title "SDK Transcript Auto-Cleanup" \\
+  --subtitle "Automatic deletion of SDK transcripts after completion prevents memory conversations from appearing in UI history" \\
+  --facts '["stop-streaming.js: Deletes SDK transcript after overview generation", "Path: ~/.claude/projects/{sanitized-cwd}/{sessionId}.jsonl", "Uses fs.unlink with error handling for missing files", "Prevents memory conversations from polluting Claude Code UI"]' \\
+  --concepts '["cleanup", "SDK-lifecycle", "UX", "file-management"]' \\
+  --files '["hooks/stop-streaming.js"]' \\
+  --project "claude-mem" \\
+  --session "abc123" \\
+  --date "2025-10-01"
+\`\`\`
+
+# STATE TRACKING
+
+CRITICAL: Keep track of your memory counter across all tool messages.
+- Start at 001
+- Increment for each stored memory
+- Never repeat numbers
+- Each session has separate numbering
+
+# SESSION END
+
+At the end (when I send "SESSION ENDING"), generate an overview using:
+\`\`\`bash
+claude-mem store-overview --project "{{project}}" --session "{{sessionId}}" --content "2-3 sentence overview"
+\`\`\`
+
+# IMPORTANT REMINDERS
+
+- You're processing a DIFFERENT Claude Code session (not your own)
+- Use Bash tool to call claude-mem commands
+- Keep subtitles clear and informative (max 24 words)
+- Each fact is ONE specific thing (not multiple ideas)
+- Be selective - quality over quantity
+- Always increment memory numbers
+- Facts should be searchable (specific file names, paths, functions)
+
+Ready for tool responses.`;
+var TOOL_MESSAGE = `# Tool Response {{timeFormatted}}
+
+Tool: {{toolName}}
+User Context: "{{userPrompt}}"
+
+\`\`\`
+{{toolResponse}}
+\`\`\`
+
+Analyze and store if meaningful.`;
+var END_MESSAGE = `# SESSION ENDING
+
+Review our entire conversation. Generate a concise 2-3 sentence overview of what was accomplished.
+
+Store it using Bash:
+\`\`\`bash
+claude-mem store-overview --project "{{project}}" --session "{{sessionId}}" --content "YOUR_OVERVIEW_HERE"
+\`\`\`
+
+Focus on: what was done, current state, key decisions, outcomes.`;
+var PROMPTS = {
+  system: SYSTEM_PROMPT,
+  tool: TOOL_MESSAGE,
+  end: END_MESSAGE
+};
+export {
+  TOOL_MESSAGE,
+  SYSTEM_PROMPT,
+  PROMPTS,
+  HOOK_CONFIG,
+  END_MESSAGE
+};
@@ -0,0 +1,444 @@
+# Prompt Flow Analysis & Rankings
+
+## Rating System
+- ✅ **Smart**: Well-designed, clear purpose, effective
+- ⚠️ **Problematic**: Has issues but salvageable
+- ❌ **Stupid**: Poorly designed, confusing, or counterproductive
+- 🧠 **Context Poison**: Will confuse the AI or create inconsistent behavior
+- 🔍 **No Clear Purpose**: Exists but unclear why
+- 🎯 **Clarity Score**: 1-10 (10 = crystal clear, 1 = incomprehensible)
+
+---
+
+## Element-by-Element Comparison
+
+### INIT PROMPTS (Session Start)
+
+#### CURRENT: "You are a memory processor"
+```
+You will PROCESS tool executions during this Claude Code session. Your job is to:
+1. ANALYZE each tool response for meaningful content
+2. DECIDE whether it contains something worth storing
+3. EXTRACT the key insight
+4. STORE it as an observation in the XML format below
+
+For MOST meaningful tool outputs, you should generate an observation. Only skip truly routine operations.
+```
+
+**Rating**: ❌ **Stupid** + 🧠 **Context Poison**
+**Clarity**: 3/10
+
+**Issues**:
+1. "For MOST" is ambiguous - does that mean 51%? 80%? 95%?
+2. Creates bias toward over-storage (fear of missing things)
+3. Contradicts "Only skip truly routine operations" later in prompt
+4. No clear guidance on what "meaningful" actually means
+5. "Only skip truly routine" implies almost everything should be stored
+
+**Why Context Poison**:
+- Agent will second-guess every decision
+- Creates inconsistent thresholds across sessions
+- User gets frustrated with noise
+
+---
+
+#### OLD: "You are a semantic memory compressor"
+```
+## FIRST: Generate Session Title
+IMMEDIATELY generate a title and subtitle for this session based on the user request.
+
+## THEN: Process Tool Responses
+You will receive a stream of tool responses. For each one:
+1. ANALYZE: Does this contain information worth remembering?
+2. DECIDE: Should I store this or skip it?
+3. EXTRACT: What are the key semantic concepts?
+4. DECOMPOSE: Break into title + subtitle + atomic facts + narrative
+5. STORE: Use bash to save the hierarchical memory
+6. TRACK: Keep count of stored memories (001, 002, 003...)
+
+# IMPORTANT REMINDERS
+- Be selective - quality over quantity
+```
+
+**Rating**: ⚠️ **Problematic** but contains ✅ **Smart** elements
+**Clarity**: 6/10
+
+**Issues**:
+1. "IMMEDIATELY" vs "THEN" creates ordering confusion
+2. Session title generation is unclear when it should happen
+3. Bash tool dependency is fragile
+4. Manual counter tracking is error-prone
+
+**Smart Elements**:
+1. "Quality over quantity" is clear directive
+2. Hierarchical decomposition gives structure
+3. Explicit state tracking (counter)
+4. "Be selective" is unambiguous
+
+**Verdict**: The philosophy is better (selective, quality-focused), execution is messier (bash commands, ordering confusion)
+
+---
+
+### OBSERVATION PROMPTS
+
+#### CURRENT: buildObservationPrompt
+```
+ANALYSIS TASK
+-------------
+ANALYZE this tool response and DECIDE: Does it contain something worth storing?
+
+Most Read, Edit, Grep, Bash, and Write operations contain meaningful content.
+
+If this contains something worth remembering, output the observation...
+```
+
+**Rating**: ❌ **Stupid** + 🧠 **Context Poison**
+**Clarity**: 4/10
+
+**Issues**:
+1. "Most Read, Edit, Grep, Bash, and Write operations contain meaningful content" - Why repeat this per observation?
+2. Contradicts init prompt's "be selective" guidance
+3. Creates bias: agent thinks "well, this is a Read, so I should probably store it"
+4. No guidance on WHAT to extract from the content
+5. One-sentence observations lose context
+
+**Why Context Poison**:
+- Tool-type bias ("It's a Read, so store it") instead of content-based evaluation
+- Encourages lazy pattern matching instead of semantic analysis
+- Results in garbage like "Read package.json file" with no insight
+
+---
+
+#### OLD: Tool Message
+```
+# Tool Response {timeFormatted}
+
+Tool: {toolName}
+User Context: "{userPrompt}"
+
+```
+{toolResponse}
+```
+
+Analyze and store if meaningful.
+```
+
+**Rating**: ✅ **Smart** (minimal, non-leading)
+**Clarity**: 8/10
+
+**Smart Elements**:
+1. Doesn't tell agent what to think about tool types
+2. Trusts agent's judgment from system prompt
+3. Short and clear
+4. Includes user context for relevance filtering
+
+**Issue**:
+1. Maybe TOO minimal - no reminder of format requirements
+
+---
+
+### STORAGE FORMATS
+
+#### CURRENT: XML Observations
+```xml
+<observation>
+  <type>feature</type>
+  <text>Implemented JWT token refresh flow with 7-day expiry</text>
+</observation>
+```
+
+**Rating**: ⚠️ **Problematic**
+**Clarity**: 7/10
+
+**Issues**:
+1. One sentence only - loses narrative context
+2. Five types (decision, bugfix, feature, refactor, discovery) - are these actually useful categories?
+3. No file associations
+4. No concept tagging
+5. Flat structure - all observations equal weight
+
+**Smart Elements**:
+1. Simple to parse
+2. Structured typing
+3. Regex-parseable
+
+---
+
+#### OLD: Hierarchical Memory (4 levels)
+```bash
+--title "SDK Transcript Auto-Cleanup"
+--subtitle "Automatic deletion of SDK transcripts after completion prevents memory conversations from appearing in UI history"
+--facts '["stop-streaming.js: Deletes SDK transcript after overview generation", "Path: ~/.claude/projects/{sanitized-cwd}/{sessionId}.jsonl"]'
+--concepts '["cleanup", "SDK-lifecycle", "UX"]'
+--files '["hooks/stop-streaming.js"]'
+```
+
+**Rating**: ✅ **Smart** (structure) but ❌ **Stupid** (execution via bash)
+**Clarity**: 8/10 (concept), 3/10 (implementation)
+
+**Smart Elements**:
+1. Multiple levels of granularity (title → subtitle → facts → narrative)
+2. Atomic facts enable precise retrieval
+3. File associations explicit
+4. Concept tags for categorization
+5. Subtitle gives the "why it matters"
+
+**Stupid Elements**:
+1. Bash command execution is fragile
+2. Quote escaping nightmare
+3. Manual counter tracking
+4. JSON in bash arguments is error-prone
+
+**Verdict**: Great data model, terrible implementation
+
+---
+
+### SUMMARY/FINALIZE PROMPTS
+
+#### CURRENT: buildFinalizePrompt (per prompt)
+```xml
+<summary>
+  <request>Implement JWT authentication system</request>
+  <investigated>Existing auth middleware, session management</investigated>
+  <learned>Current system uses session cookies; no JWT support</learned>
+  <completed>Implemented JWT token + refresh flow</completed>
+  <next_steps>Add token revocation API endpoint</next_steps>
+  <files_read><file>src/auth.ts</file></files_read>
+  <files_edited><file>src/auth.ts</file></files_edited>
+  <notes>Token secret stored in .env</notes>
+</summary>
+```
+
+**Rating**: ✅ **Smart** (structure) but 🔍 **No Clear Purpose** (frequency)
+**Clarity**: 9/10
+
+**Smart Elements**:
+1. Structured format with clear fields
+2. Tracks what was learned (semantic value)
+3. Files read/edited tracked explicitly
+4. Next steps captured
+
+**Issues**:
+1. Generated PER PROMPT - is this too granular?
+2. Will create many summaries per session
+3. Unclear how these summaries are used
+4. No aggregation across prompts
+
+**Question**: Should this be per-session instead of per-prompt?
+
+---
+
+#### OLD: Session Overview (per session)
+```bash
+claude-mem store-overview --project "{project}" --session "{sessionId}" --content "2-3 sentence overview"
+```
+
+**Rating**: ⚠️ **Problematic**
+**Clarity**: 5/10
+
+**Issues**:
+1. Only 2-3 sentences - very lossy
+2. No structured fields
+3. Happens once at end - loses per-prompt context
+4. Relies on agent's memory of entire session
+
+**Smart Element**:
+1. One overview per session (not noisy)
+
+---
+
+### DECISION GUIDANCE
+
+#### CURRENT: What to Store/Skip
+```
+Store these:
+✓ File contents with logic, algorithms, or patterns
+✓ Search results revealing project structure
+✓ Build errors or test failures with context
+...
+
+Skip these:
+✗ Simple status checks (git status with no changes)
+✗ Trivial edits (one-line config changes)
+...
+```
+
+**Rating**: ✅ **Smart**
+**Clarity**: 8/10
+
+**Smart Elements**:
+1. Concrete examples
+2. Both positive and negative cases
+3. Action-oriented
+
+**Issue**:
+1. Contradicted by "For MOST" and "Most Read, Edit..." statements elsewhere
+
+---
+
+#### OLD: What to Store/Skip
+```
+Store these:
+- File contents with logic, algorithms, or patterns
+- Search results revealing project structure
+...
+
+Skip these:
+- Simple status checks (git status with no changes)
+- Trivial edits (one-line config changes)
+- Binary data or noise
+- Anything without semantic value
+```
+
+**Rating**: ✅ **Smart**
+**Clarity**: 8/10
+
+**Same as current**, which is good.
+
+---
+
+## CRITICAL ISSUES RANKED
+
+### 1. "For MOST meaningful tool outputs" - 🧠 **CONTEXT POISON #1**
+**Severity**: CRITICAL
+**Impact**: Destroys selectivity, fills DB with noise
+**Fix**: Remove entirely. Replace with: "Be selective. Only store if it reveals important information about the codebase."
+
+---
+
+### 2. "Most Read, Edit, Grep, Bash, and Write operations contain meaningful content" - 🧠 **CONTEXT POISON #2**
+**Severity**: CRITICAL
+**Impact**: Creates tool-type bias instead of content-based evaluation
+**Fix**: Remove entirely. It's redundant and harmful.
+
+---
+
+### 3. One-sentence observations lose context - ❌ **STUPID**
+**Severity**: HIGH
+**Impact**: Can't understand observation without narrative
+**Fix**: Add narrative field to observations (like old system)
+
+---
+
+### 4. No hierarchical structure in current system - ❌ **STUPID**
+**Severity**: HIGH
+**Impact**: Can't do granular retrieval (fact-level vs narrative-level)
+**Fix**: Adopt 4-level hierarchy from old system
+
+---
+
+### 5. Bash command execution in old system - ❌ **STUPID**
+**Severity**: HIGH
+**Impact**: Fragile, error-prone, quote-escaping nightmare
+**Fix**: Keep current approach (XML parsing + direct DB writes)
+
+---
+
+### 6. Manual memory counter in old system - ⚠️ **PROBLEMATIC**
+**Severity**: MEDIUM
+**Impact**: Agent forgets, skips numbers, duplicates
+**Fix**: Auto-increment in database (current approach)
+
+---
+
+### 7. Per-prompt summaries unclear purpose - 🔍 **NO CLEAR PURPOSE**
+**Severity**: MEDIUM
+**Impact**: Creates many summaries, unclear how they're used
+**Fix**: Decide: per-session summary only, or per-prompt with aggregation?
+
+---
+
+### 8. Five observation types unclear value - 🔍 **NO CLEAR PURPOSE**
+**Severity**: LOW
+**Impact**: Are these categories actually useful for retrieval?
+**Fix**: Evaluate if types should be: (1) kept as-is, (2) expanded, (3) removed
+
+---
+
+## BEST ELEMENTS FROM EACH SYSTEM
+
+### From OLD System (Keep These)
+1. ✅ 4-level hierarchy (title → subtitle → facts → narrative)
+2. ✅ "Be selective - quality over quantity"
+3. ✅ Atomic facts (50-150 char, self-contained, no pronouns)
+4. ✅ Concept tagging
+5. ✅ File associations
+6. ✅ Minimal observation prompts (don't bias agent)
+
+### From CURRENT System (Keep These)
+1. ✅ XML parsing (not bash commands)
+2. ✅ Auto-increment IDs (not manual counters)
+3. ✅ Structured summary format (8 fields)
+4. ✅ Per-prompt tracking
+5. ✅ Foreign key integrity
+6. ✅ Typed observations (decision/bugfix/feature/refactor/discovery)
+
+### From NEITHER System (Add These)
+1. Clear threshold guidance: "Only store if it reveals important information about the codebase"
+2. Explicit narrative field in observations
+3. Vector embeddings for semantic search (current stores in SQLite only)
+
+---
+
+## RECOMMENDED HYBRID SYSTEM
+
+### Storage Format: Hierarchical Observations (XML)
+```xml
+<observation>
+  <type>feature</type>
+  <title>JWT Token Refresh Implementation</title>
+  <subtitle>Added 7-day refresh token rotation with Redis storage</subtitle>
+  <facts>
+    <fact>src/auth.ts: refreshToken() generates new JWT with 7-day expiry</fact>
+    <fact>Redis key format: refresh:{userId}:{tokenId} with TTL 604800s</fact>
+    <fact>Old token invalidated on refresh to prevent replay attacks</fact>
+  </facts>
+  <narrative>Implemented JWT refresh token functionality in src/auth.ts. The refreshToken() function validates the old refresh token from Redis, generates a new JWT access token (7-day expiry) and new refresh token, stores the new refresh token in Redis with key format refresh:{userId}:{tokenId} and TTL of 604800 seconds (7 days), and invalidates the old refresh token to prevent replay attacks. This enables long-lived authenticated sessions without requiring users to re-login while maintaining security through token rotation.</narrative>
+  <concepts>
+    <concept>authentication</concept>
+    <concept>security</concept>
+    <concept>session-management</concept>
+  </concepts>
+  <files>
+    <file>src/auth.ts</file>
+    <file>src/middleware/auth.ts</file>
+  </files>
+</observation>
+```
+
+### Guidance: Clear and Unambiguous
+```
+Be selective. Only store observations when the tool output reveals important information about:
+- Architecture or design patterns
+- Implementation details of features or bug fixes
+- System state or configuration
+- Business logic or algorithms
+
+Skip routine operations like empty git status, simple npm installs, or trivial config changes.
+
+Each observation should be self-contained and searchable.
+```
+
+### Summary: Per-Session (Not Per-Prompt)
+- Generate ONE summary when session ends
+- Aggregate all observations from session
+- Use current structured format (request, investigated, learned, completed, next_steps, files_read, files_edited, notes)
+
+---
+
+## FINAL VERDICT
+
+| Element | Current | Old | Winner |
+|---------|---------|-----|--------|
+| **Storage Structure** | Flat one-sentence | 4-level hierarchy | **OLD** |
+| **Storage Implementation** | XML parsing | Bash commands | **CURRENT** |
+| **Decision Guidance** | Contradictory | Clear | **OLD** |
+| **Session Metadata** | None | Title + subtitle | **OLD** |
+| **Per-Prompt Tracking** | Yes (summaries) | No | **CURRENT** |
+| **Semantic Search** | No | Yes (ChromaDB) | **OLD** |
+| **Observation Prompts** | Biased, repetitive | Minimal, clear | **OLD** |
+| **Auto-Increment IDs** | Yes | No (manual) | **CURRENT** |
+| **File Associations** | No | Yes | **OLD** |
+| **Concept Tagging** | No | Yes | **OLD** |
+
+**Optimal System**: Hybrid - Old system's data model + Current system's implementation approach
@@ -0,0 +1,389 @@
+# Recommended Prompt Flow (Hybrid System)
+
+## Design Principles
+
+1. **Be Selective**: Quality over quantity - only store meaningful insights
+2. **Hierarchical Storage**: Multiple levels for granular retrieval
+3. **Clear Guidance**: No ambiguous language like "MOST"
+4. **Structured Data**: XML format with clear schema
+5. **Session Tracking**: Title + subtitle per session
+6. **Per-Prompt Context**: Track which observations came from which user request
+
+---
+
+## Flow Timeline
+
+### 1. SESSION START
+
+**Trigger**: Claude Code session starts
+**Hook**: `user-prompt-submit` (context-hook.js)
+
+**Init Prompt Sent to SDK**:
+```
+You are a memory processor for the "{project}" project.
+
+SESSION CONTEXT
+---------------
+Session ID: {sessionId}
+User's Goal: {userPrompt}
+Date: {date}
+
+YOUR ROLE
+---------
+Process tool executions from this Claude Code session and store important observations.
+
+Be selective. Only store observations when the tool output reveals important information about:
+- Architecture or design patterns
+- Implementation details of features or bug fixes
+- System state or configuration
+- Business logic or algorithms
+
+Skip routine operations like:
+- Empty git status checks
+- Simple npm install output
+- Trivial config changes
+- Repetitive operations
+
+OBSERVATION FORMAT
+------------------
+When you identify something worth remembering, output this XML structure:
+
+```xml
+<observation>
+  <type>feature</type>
+  <title>Short Title (3-8 words)</title>
+  <subtitle>Concise summary explaining the significance (max 24 words)</subtitle>
+  <facts>
+    <fact>Specific detail 1 (50-150 chars, self-contained)</fact>
+    <fact>Specific detail 2 (50-150 chars, self-contained)</fact>
+    <fact>Specific detail 3 (50-150 chars, self-contained)</fact>
+  </facts>
+  <narrative>Full context: what was done, why it matters, how it works. (200-400 words)</narrative>
+  <concepts>
+    <concept>broad-category-1</concept>
+    <concept>broad-category-2</concept>
+  </concepts>
+  <files>
+    <file>path/to/file1.ts</file>
+    <file>path/to/file2.ts</file>
+  </files>
+</observation>
+```
+
+FIELD REQUIREMENTS
+------------------
+
+**Type**: One of: decision, bugfix, feature, refactor, discovery
+
+**Title**: 3-8 words capturing the core action
+- Examples: "JWT Refresh Token Implementation", "Race Condition Fix in Auth Middleware"
+
+**Subtitle**: Max 24 words explaining the significance
+- Focus on outcome or benefit
+- Examples: "Added 7-day refresh token rotation with Redis storage for secure long-lived sessions"
+
+**Facts**: 3-7 atomic facts (50-150 chars each)
+- Each fact is ONE specific piece of information
+- Include filename/component when relevant
+- No pronouns - each fact stands alone
+- Examples:
+  - "src/auth.ts: refreshToken() generates new JWT with 7-day expiry"
+  - "Redis key format: refresh:{userId}:{tokenId} with TTL 604800s"
+  - "Old token invalidated on refresh to prevent replay attacks"
+
+**Narrative**: Full story (200-400 words)
+- What was done
+- Technical details (files, functions, implementation)
+- Why it matters
+
+**Concepts**: 2-5 broad categories for filtering
+- Examples: "authentication", "caching", "error-handling"
+
+**Files**: All files touched
+- Full relative paths from project root
+
+Ready to process tool executions.
+```
+
+---
+
+### 2. TOOL EXECUTION
+
+**Trigger**: Each tool execution
+**Hook**: `user-prompt-submit` (context-hook.js)
+
+**Observation Prompt Sent to SDK**:
+```
+TOOL OBSERVATION
+================
+Tool: {tool_name}
+Time: {timestamp}
+Prompt: {prompt_number}
+
+Input:
+{tool_input JSON}
+
+Output:
+{tool_output JSON}
+
+Analyze this tool output. If it reveals important information about the codebase, generate an observation using the XML format from the init prompt.
+```
+
+**SDK Response Processing**:
+1. SDK agent analyzes output
+2. If meaningful, generates `<observation>` XML block
+3. Worker parses XML and stores in SQLite
+4. Links to `prompt_number` for per-request tracking
+
+**Database Schema**:
+```sql
+CREATE TABLE observations (
+  id INTEGER PRIMARY KEY AUTOINCREMENT,
+  sdk_session_id TEXT NOT NULL,
+  project TEXT NOT NULL,
+  prompt_number INTEGER NOT NULL,
+
+  -- Core fields
+  type TEXT NOT NULL,
+  title TEXT NOT NULL,
+  subtitle TEXT NOT NULL,
+  narrative TEXT NOT NULL,
+
+  -- Arrays (stored as JSON)
+  facts TEXT NOT NULL,      -- JSON array of strings
+  concepts TEXT NOT NULL,   -- JSON array of strings
+  files TEXT NOT NULL,      -- JSON array of strings
+
+  created_at INTEGER NOT NULL,
+
+  FOREIGN KEY(sdk_session_id) REFERENCES sdk_sessions(sdk_session_id) ON DELETE CASCADE
+);
+
+-- Indexes for fast retrieval
+CREATE INDEX idx_observations_session ON observations(sdk_session_id);
+CREATE INDEX idx_observations_type ON observations(type);
+CREATE INDEX idx_observations_prompt ON observations(prompt_number);
+```
+
+---
+
+### 3. SESSION END
+
+**Trigger**: Claude Code session ends
+**Hook**: `session-end` (cleanup-hook.js)
+
+**Finalize Prompt Sent to SDK**:
+```
+SESSION ENDING
+==============
+The Claude Code session is completing.
+
+FINAL TASK
+----------
+Review all observations you've generated and create a session summary.
+
+Output this XML structure:
+
+```xml
+<summary>
+  <request>What did the user request?</request>
+  <investigated>What code/systems did you explore?</investigated>
+  <learned>What did you learn about the codebase?</learned>
+  <completed>What was accomplished?</completed>
+  <next_steps>What should happen next?</next_steps>
+  <files_read>
+    <file>path/to/file1.ts</file>
+    <file>path/to/file2.ts</file>
+  </files_read>
+  <files_edited>
+    <file>path/to/file3.ts</file>
+  </files_edited>
+  <notes>Additional context or insights</notes>
+</summary>
+```
+
+Be concise but comprehensive. Focus on semantic insights, not mechanical details.
+```
+
+**Database Schema**:
+```sql
+CREATE TABLE session_summaries (
+  id INTEGER PRIMARY KEY AUTOINCREMENT,
+  sdk_session_id TEXT NOT NULL,
+  project TEXT NOT NULL,
+
+  request TEXT NOT NULL,
+  investigated TEXT NOT NULL,
+  learned TEXT NOT NULL,
+  completed TEXT NOT NULL,
+  next_steps TEXT NOT NULL,
+  files_read TEXT NOT NULL,   -- JSON array
+  files_edited TEXT NOT NULL, -- JSON array
+  notes TEXT NOT NULL,
+
+  created_at INTEGER NOT NULL,
+
+  FOREIGN KEY(sdk_session_id) REFERENCES sdk_sessions(sdk_session_id) ON DELETE CASCADE
+);
+```
+
+---
+
+## Data Retrieval Patterns
+
+### Level 1: Session Titles (High-Level Browsing)
+```sql
+SELECT
+  sdk_session_id,
+  user_prompt as title,
+  created_at
+FROM sdk_sessions
+WHERE project = ?
+ORDER BY created_at DESC;
+```
+
+### Level 2: Session Summaries (Session Overview)
+```sql
+SELECT
+  request,
+  completed,
+  next_steps
+FROM session_summaries
+WHERE sdk_session_id = ?;
+```
+
+### Level 3: Observation Titles (Scannable List)
+```sql
+SELECT
+  type,
+  title,
+  subtitle
+FROM observations
+WHERE sdk_session_id = ?
+ORDER BY id;
+```
+
+### Level 4: Atomic Facts (Precise Search)
+```sql
+SELECT
+  title,
+  facts
+FROM observations
+WHERE
+  sdk_session_id = ?
+  AND facts LIKE '%keyword%';
+```
+
+### Level 5: Full Narrative (Deep Dive)
+```sql
+SELECT
+  title,
+  subtitle,
+  facts,
+  narrative,
+  files
+FROM observations
+WHERE id = ?;
+```
+
+### By Concept (Category Filter)
+```sql
+SELECT
+  title,
+  subtitle,
+  concepts
+FROM observations
+WHERE concepts LIKE '%"authentication"%';
+```
+
+### By File (File-Based Search)
+```sql
+SELECT
+  title,
+  subtitle,
+  files
+FROM observations
+WHERE files LIKE '%src/auth.ts%';
+```
+
+---
+
+## Future Enhancements
+
+### Phase 2: Semantic Search
+- Add vector embeddings for facts and narratives
+- Store in ChromaDB or similar
+- Enable similarity search: "Find observations about authentication patterns"
+
+### Phase 3: Cross-Session Memory
+- Link related observations across sessions
+- "Show all JWT-related observations from past 30 days"
+
+### Phase 4: Session Metadata
+- Add title/subtitle to sdk_sessions table
+- Auto-generate from user_prompt or first summary
+
+---
+
+## Migration from Current System
+
+### Step 1: Update Database Schema
+```sql
+-- Add new columns to observations table
+ALTER TABLE observations ADD COLUMN title TEXT;
+ALTER TABLE observations ADD COLUMN subtitle TEXT;
+ALTER TABLE observations ADD COLUMN narrative TEXT;
+ALTER TABLE observations ADD COLUMN facts TEXT;
+ALTER TABLE observations ADD COLUMN concepts TEXT;
+ALTER TABLE observations ADD COLUMN files TEXT;
+
+-- Migrate existing observations (best-effort)
+UPDATE observations
+SET
+  title = type || ' - ' || substr(text, 1, 50),
+  subtitle = text,
+  narrative = text,
+  facts = '[]',
+  concepts = '[]',
+  files = '[]'
+WHERE title IS NULL;
+```
+
+### Step 2: Update Prompts
+- Replace `buildInitPrompt()` with new version (no "MOST")
+- Replace `buildObservationPrompt()` with new version (no tool-type bias)
+- Keep `buildFinalizePrompt()` mostly as-is
+
+### Step 3: Update Parser
+- Extend `parseObservations()` to extract all new fields
+- Add `extractFactArray()`, `extractConceptArray()`, `extractFileArray()` helpers
+- Keep backward compatibility with old one-sentence format
+
+### Step 4: Update Storage
+- Modify `HooksDatabase.storeObservation()` to accept all fields
+- Store arrays as JSON strings
+
+---
+
+## Key Improvements Over Current System
+
+1. ✅ **No "MOST" ambiguity** - Clear "be selective" guidance
+2. ✅ **No tool-type bias** - Observation prompt doesn't mention tool names
+3. ✅ **Hierarchical storage** - Title → Subtitle → Facts → Narrative
+4. ✅ **Atomic facts** - Precise, searchable details
+5. ✅ **File associations** - Track which files each observation relates to
+6. ✅ **Concept tagging** - Categorical organization
+7. ✅ **Rich narratives** - Full context for deep dives
+8. ✅ **Multiple retrieval levels** - Can search at any granularity
+
+---
+
+## Key Improvements Over Old System
+
+1. ✅ **No bash commands** - XML parsing instead of shell execution
+2. ✅ **Auto-increment IDs** - No manual counter tracking
+3. ✅ **Per-prompt tracking** - `prompt_number` links observations to requests
+4. ✅ **Foreign key integrity** - Automatic cascade deletes
+5. ✅ **No quote escaping hell** - JSON arrays instead of bash arguments
+6. ✅ **Structured typing** - Typed observations (decision/bugfix/feature/refactor/discovery)
+7. ✅ **Session summary at end** - Not just 2-3 sentences, but full structured summary
@@ -380,22 +380,7 @@ class WorkerService {
          db.close();

          if (dbSession) {
-            const summarizePrompt = `You have been processing tool observations for this session. Please generate a summary of what you've learned from prompt #${message.prompt_number}.
-
-Use this XML format:
-
-<session_summary>
-  <request>What was the user trying to accomplish in this prompt?</request>
-  <investigated>What code/systems did you explore?</investigated>
-  <learned>What did you learn about the codebase?</learned>
-  <completed>What was done or determined?</completed>
-  <next_steps>What should happen next?</next_steps>
-  <files_read>["file1.ts", "file2.ts"]</files_read>
-  <files_edited>["file3.ts"]</files_edited>
-  <notes>Any additional context or insights</notes>
-</session_summary>
-
-Respond ONLY with the XML block. Be concise and specific.`;
+            const summarizePrompt = buildFinalizePrompt(dbSession);

            console.log(`[WorkerService] Yielding summarize prompt:\n${summarizePrompt}`);