# Progressive Disclosure: Claude-Mem's Context Priming Philosophy

## Core Principle

**Show what exists and its retrieval cost first. Let the agent decide what to fetch based on relevance and need.**

---

## What is Progressive Disclosure?

Progressive disclosure is an information architecture pattern where you reveal complexity gradually rather than all at once. In the context of AI agents, it means:

1. **Layer 1 (Index)**: Show lightweight metadata (titles, dates, types, token counts)
2. **Layer 2 (Details)**: Fetch full content only when needed
3. **Layer 3 (Deep Dive)**: Read original source files if required

This mirrors how humans work: we scan headlines before reading articles, review a table of contents before diving into chapters, and check file names before opening files.

---

## The Problem: Context Pollution

Traditional RAG (Retrieval-Augmented Generation) systems fetch everything upfront:

```
❌ Traditional Approach:
┌─────────────────────────────────────┐
│ Session Start                       │
│                                     │
│ [15,000 tokens of past sessions]    │
│ [8,000 tokens of observations]      │
│ [12,000 tokens of file summaries]   │
│                                     │
│ Total: 35,000 tokens                │
│ Relevant: ~2,000 tokens (6%)        │
└─────────────────────────────────────┘
```

**Problems:**
- Wastes 94% of the attention budget on irrelevant context
- The user prompt gets buried under a mountain of history
- The agent must process everything before understanding the task
- No way to know what's actually useful until after reading

---

## Claude-Mem's Solution: Progressive Disclosure

```
✅ Progressive Disclosure Approach:
┌──────────────────────────────────────┐
│ Session Start                        │
│                                      │
│ Index of 50 observations: ~800 tokens│
│         ↓                            │
│ Agent sees: "🔴 Hook timeout issue"  │
│ Agent decides: "Relevant!"           │
│         ↓                            │
│ Fetch observation #2543: ~120 tokens │
│                                      │
│ Total: 920 tokens                    │
│ Relevant: 920 tokens (100%)          │
└──────────────────────────────────────┘
```

**Benefits:**
- Agent controls its own context consumption
- Fetched content is directly relevant to the current task
- Can fetch more if needed
- Can skip everything if not relevant
- Clear cost/benefit for each retrieval decision

---

## How It Works in Claude-Mem

### The Index Format

Every SessionStart hook provides a compact index:

```markdown
### Oct 26, 2025

**General**

| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2586 | 12:58 AM | 🔵 | Context hook file exists but is empty | ~51 |
| #2587 | ″ | 🔵 | Context hook script file is empty | ~46 |
| #2589 | ″ | 🟡 | Investigated hook debug output docs | ~105 |

**src/hooks/context-hook.ts**

| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2592 | 1:16 AM | ⚖️ | Web UI strategy redesigned | ~193 |
```

**What the agent sees:**
- **What exists**: Observation titles give semantic meaning
- **When it happened**: Timestamps for temporal context
- **What type**: Icons indicate observation category
- **Retrieval cost**: Token counts for informed decisions
- **Where to get it**: MCP search tools referenced at the bottom

### The Legend System

```
🎯 session-request  - User's original goal
🔴 gotcha           - Critical edge case or pitfall
🟡 problem-solution - Bug fix or workaround
🔵 how-it-works     - Technical explanation
🟢 what-changed     - Code/architecture change
🟣 discovery        - Learning or insight
🟠 why-it-exists    - Design rationale
🟤 decision         - Architecture decision
⚖️ trade-off        - Deliberate compromise
```

**Purpose:**
- Visual scanning (humans and AI both benefit)
- Semantic categorization
- Priority signaling (🔴 gotchas are more critical)
- Pattern recognition across sessions

### Progressive Disclosure Instructions

The index includes usage guidance:

```markdown
💡 **Progressive Disclosure:** This index shows WHAT exists and retrieval COST.
- Use MCP search tools to fetch full observation details on-demand
- Prefer searching observations over re-reading code for past decisions
- Critical types (🔴 gotcha, 🟤 decision, ⚖️ trade-off) often worth fetching immediately
```

**What this does:**
- Teaches the agent the pattern
- Suggests when to fetch (critical types)
- Recommends search over code re-reading (efficiency)
- Makes the system self-documenting

---

## The Philosophy: Context as Currency

### Mental Model: Token Budget as Money

Think of the context window as a bank account:

| Approach | Metaphor | Outcome |
|----------|----------|---------|
| **Dump everything** | Spending your entire paycheck on groceries you might need someday | Waste, clutter, can't afford what you actually need |
| **Fetch nothing** | Refusing to spend any money | Starvation, can't accomplish tasks |
| **Progressive disclosure** | Check your pantry, make a shopping list, buy only what you need | Efficiency, room for unexpected needs |

### The Attention Budget

LLMs have finite attention:
- Every token attends to every other token (n² relationships)
- A 100,000-token window ≠ 100,000 tokens of useful attention
- Context "rot" happens as the window fills
- Later tokens get less attention than earlier ones

**Claude-Mem's approach:**
- Start with ~1,000 tokens of index
- Agent has 99,000 tokens free for the task
- Agent fetches ~200 tokens when needed
- Final budget: ~98,000 tokens for actual work

### Design for Autonomy

> "As models improve, let them act intelligently"

Progressive disclosure treats the agent as an **intelligent information forager**, not a passive recipient of pre-selected context.

**Traditional RAG:**
```
System → [Decides relevance] → Agent
              ↑
        Hope this helps!
```

**Progressive Disclosure:**
```
System → [Shows index] → Agent → [Decides relevance] → [Fetches details]
                           ↑
                     You know best!
```

The agent knows:
- The current task context
- What information would help
- How much budget to spend
- When to stop searching

We don't.

---

## Implementation Principles

### 1. Make Costs Visible

Every item in the index shows a token count:

```
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
                                                      ^^^^
                                            Retrieval cost
```

**Why:**
- Agent can make informed ROI decisions
- Small observations (~50 tokens) are "cheap" to fetch
- Large observations (~500 tokens) require stronger justification
- Matches how humans think about effort

### 2. Use Semantic Compression

Titles compress full observations into ~10 words:

**Bad title:**
```
Observation about a thing
```

**Good title:**
```
🔴 Hook timeout issue: 60s default too short for npm install
```

**What makes a good title:**
- Specific: Identifies the exact issue
- Actionable: Clear what to do
- Self-contained: Doesn't require reading the observation
- Searchable: Contains key terms (hook, timeout, npm)
- Categorized: Icon indicates type

### 3. Group by Context

Observations are grouped by:
- **Date**: Temporal context
- **File path**: Spatial context (work on specific files)
- **Project**: Logical context

```markdown
**src/hooks/context-hook.ts**

| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2594 | 1:17 AM | 🟠 | Removed stderr section from docs | ~93 |
```

**Benefit:** If the agent is working on `src/hooks/context-hook.ts`, related observations are already grouped together.

### 4. Provide Retrieval Tools

The index is useless without retrieval mechanisms:

```markdown
*Use claude-mem MCP search to access records with the given ID*
```

**Available tools:**
- `search_observations` - Full-text search
- `find_by_concept` - Concept-based retrieval
- `find_by_file` - File-based retrieval
- `find_by_type` - Type-based retrieval
- `get_recent_context` - Recent session summaries

Each tool supports `format: "index"` (default) and `format: "full"`.
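The scan-then-fetch loop these tools enable can be sketched in TypeScript. This is a hypothetical illustration, not Claude-Mem's actual implementation: the `IndexEntry` shape, the `shouldFetch`/`planFetches` helpers, and the budget heuristic are all invented here to show how an agent might turn the index into a fetch plan.

```typescript
// One row of the SessionStart index (hypothetical shape).
type ObservationType =
  | "session-request" | "gotcha" | "problem-solution" | "how-it-works"
  | "what-changed" | "discovery" | "why-it-exists" | "decision" | "trade-off";

interface IndexEntry {
  id: number;
  type: ObservationType;
  title: string;
  tokens: number; // approximate retrieval cost shown in the index
}

// Types the index guidance flags as often worth fetching immediately.
const CRITICAL: ObservationType[] = ["gotcha", "decision", "trade-off"];

// Decide whether one entry is worth fetching given the remaining budget.
// Heuristic only: critical types always qualify if affordable; everything
// else must also match the current task's keywords.
function shouldFetch(entry: IndexEntry, taskKeywords: string[], budget: number): boolean {
  if (entry.tokens > budget) return false;
  if (CRITICAL.includes(entry.type)) return true;
  const title = entry.title.toLowerCase();
  return taskKeywords.some((kw) => title.includes(kw.toLowerCase()));
}

// Tier 1: scan the whole index. Tier 2: return the IDs to fetch in full
// format, spending down the token budget as entries are selected.
function planFetches(index: IndexEntry[], taskKeywords: string[], budget: number): number[] {
  const ids: number[] = [];
  let remaining = budget;
  for (const entry of index) {
    if (shouldFetch(entry, taskKeywords, remaining)) {
      ids.push(entry.id);
      remaining -= entry.tokens;
    }
  }
  return ids;
}
```

Given the example index above and a task about timeouts, `planFetches` would select the 🔴 and ⚖️ entries while skipping unrelated 🔵 rows, leaving the rest of the budget for actual work.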
---

## Real-World Example

### Scenario: Agent asked to fix a bug in hooks

**Without progressive disclosure:**
```
SessionStart injects 25,000 tokens of past context
Agent reads everything
Agent finds 1 relevant observation (buried in the middle)
Total tokens consumed: 25,000
Relevant tokens: ~200
Efficiency: 0.8%
```

**With progressive disclosure:**
```
SessionStart shows index: ~800 tokens
Agent sees title: "🔴 Hook timeout issue: 60s too short"
Agent thinks: "This looks relevant to my bug!"
Agent fetches observation #2543: ~155 tokens
Total tokens consumed: 955
Relevant tokens: 955
Efficiency: 100%
```

### The Index Entry

```markdown
| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
```

**What the agent learns WITHOUT fetching:**
- There's a known gotcha (🔴) about hook timeouts
- It's related to npm install taking too long
- Full details are ~155 tokens (cheap)
- It happened at 2:14 PM (recent)

**Decision tree:**
```
Is my task related to hooks?    → YES
Is my task related to timeouts? → YES
Is my task related to npm?      → YES
155 tokens is cheap             → FETCH IT
```

---

## The Two-Tier Search Strategy

Claude-Mem implements progressive disclosure in search results too:

### Tier 1: Index Format (Default)

```typescript
search_observations({
  query: "hook timeout",
  format: "index"  // Default
})
```

**Returns:**
```
Found 3 observations matching "hook timeout":

| ID | Date | Type | Title | Tokens |
|----|------|------|-------|--------|
| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short | ~155 |
| #2891 | Oct 25 | how-it-works | Hook timeout configuration | ~203 |
| #2102 | Oct 20 | problem-solution | Fixed timeout in CI | ~89 |
```

**Cost:** ~100 tokens for 3 results
**Value:** Agent can scan and decide which to fetch

### Tier 2: Full Format (On-Demand)

```typescript
search_observations({
  query: "hook timeout",
  format: "full",
  limit: 1  // Fetch just the most relevant
})
```

**Returns:**
```
#2543 🔴 Hook timeout: 60s too short for npm install
─────────────────────────────────────────────────
Date: Oct 26, 2025 2:14 PM
Type: gotcha
Project: claude-mem

Narrative:
Discovered that the default 60-second hook timeout is insufficient
for npm install operations, especially with large dependency trees
or slow network conditions. This causes the SessionStart hook to
fail silently, preventing context injection.

Facts:
- Default timeout: 60 seconds
- npm install with cold cache: ~90 seconds
- Configured timeout: 120 seconds in plugin/hooks/hooks.json:25

Files Modified:
- plugin/hooks/hooks.json

Concepts: hooks, timeout, npm, configuration
```

**Cost:** ~155 tokens for full details
**Value:** Complete understanding of the issue

---

## Cognitive Load Theory

Progressive disclosure is grounded in **Cognitive Load Theory**:

### Intrinsic Load

The inherent difficulty of the task itself.

**Example:** "Fix authentication bug"
- Must understand the auth system
- Must understand the bug
- Must write the fix

This load is unavoidable.

### Extraneous Load

The cognitive burden of poorly presented information.
**Traditional RAG adds extraneous load:**
- Scanning irrelevant observations
- Filtering out noise
- Remembering what to ignore
- Re-contextualizing after each section

**Progressive disclosure minimizes extraneous load:**
- Scan titles (low effort)
- Fetch only what's relevant (targeted effort)
- Full attention on the current task

### Germane Load

The effort of building mental models and schemas.

**Progressive disclosure supports germane load:**
- Consistent structure (legend, grouping)
- Clear categorization (types, icons)
- Semantic compression (good titles)
- Explicit costs (token counts)

---

## Anti-Patterns to Avoid

### ❌ Verbose Titles

**Bad:**
```
| #2543 | 2:14 PM | 🔴 | Investigation into the issue where hooks time out | ~155 |
```

**Good:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
```

### ❌ Hiding Costs

**Bad:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout issue |
```

**Good:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout issue | ~155 |
```

### ❌ No Retrieval Path

**Bad:**
```
Here are 10 observations.
[No instructions on how to get full details]
```

**Good:**
```
Here are 10 observations.
*Use MCP search tools to fetch full observation details on-demand*
```

### ❌ Defaulting to Full Format

**Bad:**
```typescript
search_observations({
  query: "hooks",
  format: "full"  // Fetches everything
})
```

**Good:**
```typescript
search_observations({
  query: "hooks",
  format: "index",  // Scan first
  limit: 20
})

// Then, if needed:
search_observations({
  query: "hooks",
  format: "full",
  limit: 1  // Just the most relevant
})
```

---

## Key Design Decisions

### Why Token Counts?

**Decision:** Show approximate token counts (~155, ~203) rather than exact counts.

**Rationale:**
- Communicates scale (50 vs 500) without false precision
- Maps to human intuition (small/medium/large)
- Allows the agent to budget attention
- Encourages cost-conscious retrieval

### Why Icons Instead of Text Labels?

**Decision:** Use emoji icons (🔴, 🟡, 🔵) rather than text labels (GOTCHA, PROBLEM, HOWTO).

**Rationale:**
- Visual scanning (pattern recognition)
- Token efficient (1 char vs 10 chars)
- Language-agnostic
- Aesthetically distinct
- Works for both humans and AI

### Why Index-First, Not Smart Pre-Fetch?

**Decision:** Always show the index first, even if we "know" what's relevant.

**Rationale:**
- We can't know what's relevant better than the agent
- Pre-fetching assumes we understand the task
- The agent knows the current context; we don't
- Respects agent autonomy
- Fails gracefully (the agent can always fetch more)

### Why Group by File Path?

**Decision:** Group observations by file path in addition to date.

**Rationale:**
- Spatial locality: Work on file X likely needs context about file X
- Reduces scanning effort
- Matches how developers think
- Clear semantic boundaries

---

## Measuring Success

Progressive disclosure is working when:

### ✅ Low Waste Ratio

```
Relevant Tokens / Total Context Tokens > 80%
```

Most of the context consumed is actually useful.

### ✅ Selective Fetching

```
Index Shown: 50 observations
Details Fetched: 2-3 observations
```

The agent is being selective, not fetching everything.

### ✅ Fast Task Completion

```
Session with index: 30 seconds to find relevant context
Session without: 90 seconds scanning all context
```

Time-to-relevant-information is faster.

### ✅ Appropriate Depth

```
Simple task: Only index needed
Medium task: 1-2 observations fetched
Complex task: 5-10 observations + code reads
```

Depth scales with task complexity.
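The waste-ratio check above can be made concrete with a small sketch. The `SessionStats` shape and the function names are hypothetical; only the formula (relevant tokens over total context tokens) and the 80% threshold come from the text.

```typescript
// Hypothetical per-session accounting for the "Low Waste Ratio" metric.
interface SessionStats {
  indexTokens: number;    // cost of the SessionStart index
  fetchedTokens: number;  // observations fetched in full format
  relevantTokens: number; // portion of that context actually used
}

// Relevant Tokens / Total Context Tokens, per the formula above.
function relevanceRatio(stats: SessionStats): number {
  const total = stats.indexTokens + stats.fetchedTokens;
  return total === 0 ? 0 : stats.relevantTokens / total;
}

// Progressive disclosure is "working" when the ratio exceeds 80%.
function isLowWaste(stats: SessionStats): boolean {
  return relevanceRatio(stats) > 0.8;
}
```

Plugging in the real-world example: an ~800-token index plus one ~155-token fetch, all of it used, gives a ratio of 1.0, while the traditional 25,000-token dump with ~200 relevant tokens scores 0.008.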
---

## Future Enhancements

### Adaptive Index Size

```typescript
// Vary index size based on session type
SessionStart({ source: "startup" }):
  → Show last 10 sessions (small index)

SessionStart({ source: "resume" }):
  → Show only current session (micro index)

SessionStart({ source: "compact" }):
  → Show last 20 sessions (larger index)
```

### Relevance Scoring

```typescript
// Use embeddings to pre-sort index by relevance
search_observations({
  query: "authentication bug",
  format: "index",
  sort: "relevance"  // Based on semantic similarity
})
```

### Cost Forecasting

```markdown
💡 **Budget Estimate:**
- Fetching all 🔴 gotchas: ~450 tokens
- Fetching all file-related: ~1,200 tokens
- Fetching everything: ~8,500 tokens
```

### Progressive Detail Levels

```
Layer 1: Index (titles only)
Layer 2: Summaries (2-3 sentences)
Layer 3: Full details (complete observation)
Layer 4: Source files (referenced code)
```

---

## Key Takeaways

1. **Show, don't tell**: The index reveals what exists without forcing consumption
2. **Cost-conscious**: Make retrieval costs visible for informed decisions
3. **Agent autonomy**: Let the agent decide what's relevant
4. **Semantic compression**: Good titles make or break the system
5. **Consistent structure**: Patterns reduce cognitive load
6. **Two-tier everything**: Index first, details on-demand
7. **Context as currency**: Spend wisely on high-value information

---

## Remember

> "The best interface is one that disappears when not needed, and appears exactly when it is."

Progressive disclosure respects the agent's intelligence and autonomy. We provide the map; the agent chooses the path.
---

## Further Reading

- [Context Engineering for AI Agents](context-engineering) - Foundational principles
- [Claude-Mem Architecture](architecture/overview) - How it all fits together
- Cognitive Load Theory (Sweller, 1988)
- Information Foraging Theory (Pirolli & Card, 1999)
- Progressive Disclosure (Nielsen Norman Group)

---

*This philosophy emerged from real-world usage of Claude-Mem across hundreds of coding sessions. The pattern works because it aligns with both human cognition and LLM attention mechanics.*