# Progressive Disclosure: Claude-Mem's Context Priming Philosophy

## Core Principle

**Show what exists and its retrieval cost first. Let the agent decide what to fetch based on relevance and need.**

---

## What is Progressive Disclosure?

Progressive disclosure is an information architecture pattern that reveals complexity gradually rather than all at once. In the context of AI agents, it means:

1. **Layer 1 (Index)**: Show lightweight metadata (titles, dates, types, token counts)
2. **Layer 2 (Details)**: Fetch full content only when needed
3. **Layer 3 (Deep Dive)**: Read original source files if required

This mirrors how humans work: we scan headlines before reading articles, review a table of contents before diving into chapters, and check file names before opening files.

---
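The layered model above can be sketched as a small retrieval contract. The type and function names here are illustrative, not claude-mem's actual API:

```typescript
// Layer 1: lightweight metadata the agent always sees.
interface IndexEntry {
  id: number;
  title: string;
  type: string;   // e.g. "gotcha", "decision"
  tokens: number; // approximate retrieval cost
}

// Layer 2: full content, fetched only on demand.
// (Layer 3 is reading the referenced source files directly.)
interface Observation extends IndexEntry {
  narrative: string;
  files: string[];
}

// The agent scans the cheap index and pays for details selectively.
function pickRelevant(index: IndexEntry[], keywords: string[]): IndexEntry[] {
  return index.filter((entry) =>
    keywords.some((k) => entry.title.toLowerCase().includes(k.toLowerCase()))
  );
}
```

The key property is that `pickRelevant` operates on titles alone, so deciding *what* to fetch never costs more than the index itself.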
## The Problem: Context Pollution

Traditional RAG (Retrieval-Augmented Generation) systems fetch everything upfront:

```
❌ Traditional Approach:
┌─────────────────────────────────────┐
│ Session Start                       │
│                                     │
│ [15,000 tokens of past sessions]    │
│ [8,000 tokens of observations]      │
│ [12,000 tokens of file summaries]   │
│                                     │
│ Total: 35,000 tokens                │
│ Relevant: ~2,000 tokens (6%)        │
└─────────────────────────────────────┘
```

**Problems:**
- Wastes 94% of the attention budget on irrelevant context
- The user prompt gets buried under a mountain of history
- The agent must process everything before understanding the task
- There is no way to know what's actually useful until after reading

---
## Claude-Mem's Solution: Progressive Disclosure

```
✅ Progressive Disclosure Approach:
┌─────────────────────────────────────┐
│ Session Start                       │
│                                     │
│ Index of 50 observations: ~800 tokens│
│   ↓                                 │
│ Agent sees: "🔴 Hook timeout issue" │
│ Agent decides: "Relevant!"          │
│   ↓                                 │
│ Fetch observation #2543: ~155 tokens│
│                                     │
│ Total: 955 tokens                   │
│ Relevant: 955 tokens (100%)         │
└─────────────────────────────────────┘
```

**Benefits:**
- The agent controls its own context consumption
- Retrieved context is directly relevant to the current task
- The agent can fetch more if needed
- The agent can skip everything if nothing is relevant
- Each retrieval decision has a clear cost/benefit

---
## How It Works in Claude-Mem

### The Index Format

Every SessionStart hook provides a compact index:

```markdown
### Oct 26, 2025

**General**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2586 | 12:58 AM | 🔵 | Context hook file exists but is empty | ~51 |
| #2587 | ″ | 🔵 | Context hook script file is empty | ~46 |
| #2589 | ″ | 🟡 | Investigated hook debug output docs | ~105 |

**src/hooks/context-hook.ts**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2592 | 1:16 AM | ⚖️ | Web UI strategy redesigned | ~193 |
```

**What the agent sees:**
- **What exists**: Observation titles give semantic meaning
- **When it happened**: Timestamps for temporal context
- **What type**: Icons indicate observation category
- **Retrieval cost**: Token counts for informed decisions
- **Where to get it**: MCP search tools referenced at the bottom
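Rendering one of those table rows from observation metadata is a one-liner. The field names here are assumptions for illustration, not claude-mem's actual schema:

```typescript
interface ObservationMeta {
  id: number;
  time: string;   // pre-formatted, e.g. "12:58 AM"
  icon: string;   // type icon from the legend
  title: string;
  tokens: number; // approximate token count
}

// Render one markdown table row in the index format shown above.
function renderIndexRow(o: ObservationMeta): string {
  return `| #${o.id} | ${o.time} | ${o.icon} | ${o.title} | ~${o.tokens} |`;
}
```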
### The Legend System

```
🎯 session-request - User's original goal
🔴 gotcha - Critical edge case or pitfall
🟡 problem-solution - Bug fix or workaround
🔵 how-it-works - Technical explanation
🟢 what-changed - Code/architecture change
🟣 discovery - Learning or insight
🟠 why-it-exists - Design rationale
🟤 decision - Architecture decision
⚖️ trade-off - Deliberate compromise
```

**Purpose:**
- Visual scanning (humans and AI both benefit)
- Semantic categorization
- Priority signaling (🔴 gotchas are more critical)
- Pattern recognition across sessions
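The legend doubles as a lookup table in code. A minimal sketch (the constant names are illustrative):

```typescript
// The legend above as a type-to-icon map.
const TYPE_ICONS: Record<string, string> = {
  "session-request": "🎯",
  "gotcha": "🔴",
  "problem-solution": "🟡",
  "how-it-works": "🔵",
  "what-changed": "🟢",
  "discovery": "🟣",
  "why-it-exists": "🟠",
  "decision": "🟤",
  "trade-off": "⚖️",
};

// Types that signal high priority and are often worth fetching immediately.
const CRITICAL_TYPES = new Set(["gotcha", "decision", "trade-off"]);
```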
### Progressive Disclosure Instructions

The index includes usage guidance:

```markdown
💡 **Progressive Disclosure:** This index shows WHAT exists and retrieval COST.
- Use MCP search tools to fetch full observation details on-demand
- Prefer searching observations over re-reading code for past decisions
- Critical types (🔴 gotcha, 🟤 decision, ⚖️ trade-off) often worth fetching immediately
```

**What this does:**
- Teaches the agent the pattern
- Suggests when to fetch (critical types)
- Recommends search over code re-reading (efficiency)
- Makes the system self-documenting

---
## The Philosophy: Context as Currency

### Mental Model: Token Budget as Money

Think of the context window as a bank account:

| Approach | Metaphor | Outcome |
|----------|----------|---------|
| **Dump everything** | Spending your entire paycheck on groceries you might need someday | Waste, clutter, can't afford what you actually need |
| **Fetch nothing** | Refusing to spend any money | Starvation, can't accomplish tasks |
| **Progressive disclosure** | Check your pantry, make a shopping list, buy only what you need | Efficiency, room for unexpected needs |
### The Attention Budget

LLMs have finite attention:
- Every token attends to every other token (n² relationships)
- A 100,000-token window ≠ 100,000 tokens of useful attention
- Context "rot" sets in as the window fills
- Later tokens get less attention than earlier ones

**Claude-Mem's approach:**
- Start with ~1,000 tokens of index
- The agent has ~99,000 tokens free for the task
- The agent fetches ~200 tokens when needed
- Final budget: ~98,800 tokens for actual work
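The budget arithmetic above can be expressed as a trivial tracker; a sketch, with numbers taken from the example:

```typescript
// Track how much of the context window has been spent.
class ContextBudget {
  private spent = 0;
  constructor(private readonly windowSize: number) {}

  spend(tokens: number): void {
    this.spent += tokens;
  }

  remaining(): number {
    return this.windowSize - this.spent;
  }
}

const budget = new ContextBudget(100_000);
budget.spend(1_000); // the index at session start
budget.spend(200);   // one fetched observation
// budget.remaining() → 98,800 tokens free for the actual task
```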
### Design for Autonomy

> "As models improve, let them act intelligently"

Progressive disclosure treats the agent as an **intelligent information forager**, not a passive recipient of pre-selected context.

**Traditional RAG:**
```
System → [Decides relevance] → Agent
              ↑
       Hope this helps!
```

**Progressive Disclosure:**
```
System → [Shows index] → Agent → [Decides relevance] → [Fetches details]
                           ↑
                    You know best!
```

The agent knows:
- The current task context
- What information would help
- How much budget to spend
- When to stop searching

We don't.

---
## Implementation Principles

### 1. Make Costs Visible

Every item in the index shows a token count:

```
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
                                                      ^^^^
                                             Retrieval cost
```

**Why:**
- The agent can make informed ROI decisions
- Small observations (~50 tokens) are "cheap" to fetch
- Large observations (~500 tokens) require stronger justification
- Matches how humans think about effort
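The approximate counts could come from a cheap heuristic such as the common chars-per-token estimate (~4 characters per token for English text). This is an assumption for illustration, not claude-mem's actual tokenizer:

```typescript
// Rough token estimate: ~4 characters per token for English text (heuristic).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Prefixed with "~" for display, since an exact count would be false precision.
function displayCost(text: string): string {
  return `~${estimateTokens(text)}`;
}
```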
### 2. Use Semantic Compression

Titles compress full observations into ~10 words:

**Bad title:**
```
Observation about a thing
```

**Good title:**
```
🔴 Hook timeout issue: 60s default too short for npm install
```

**What makes a good title:**
- Specific: Identifies the exact issue
- Actionable: Clear what to do
- Self-contained: Doesn't require reading the observation
- Searchable: Contains key terms (hook, timeout, npm)
- Categorized: Icon indicates type
### 3. Group by Context

Observations are grouped by:
- **Date**: Temporal context
- **File path**: Spatial context (work on specific files)
- **Project**: Logical context

```markdown
**src/hooks/context-hook.ts**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2594 | 1:17 AM | 🟠 | Removed stderr section from docs | ~93 |
```

**Benefit:** If the agent is working on `src/hooks/context-hook.ts`, related observations are already grouped together.
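The file-path grouping is a plain bucketing operation. A sketch, assuming observations carry an optional file path and path-less ones fall into the "General" bucket shown in the index above:

```typescript
interface GroupedObs {
  id: number;
  file: string | null; // null → "General" bucket
  title: string;
}

// Bucket observations by file path, preserving insertion order within groups.
function groupByFile(observations: GroupedObs[]): Map<string, GroupedObs[]> {
  const groups = new Map<string, GroupedObs[]>();
  for (const o of observations) {
    const key = o.file ?? "General";
    const bucket = groups.get(key) ?? [];
    bucket.push(o);
    groups.set(key, bucket);
  }
  return groups;
}
```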
### 4. Provide Retrieval Tools

The index is useless without retrieval mechanisms:

```markdown
*Use claude-mem MCP search to access records with the given ID*
```

**Available tools:**
- `search_observations` - Full-text search
- `find_by_concept` - Concept-based retrieval
- `find_by_file` - File-based retrieval
- `find_by_type` - Type-based retrieval
- `get_recent_context` - Recent session summaries

Each tool supports `format: "index"` (default) and `format: "full"`.

---
## Real-World Example

### Scenario: Agent asked to fix a bug in hooks

**Without progressive disclosure:**
```
SessionStart injects 25,000 tokens of past context
Agent reads everything
Agent finds 1 relevant observation (buried in the middle)
Total tokens consumed: 25,000
Relevant tokens: ~200
Efficiency: 0.8%
```

**With progressive disclosure:**
```
SessionStart shows index: ~800 tokens
Agent sees title: "🔴 Hook timeout issue: 60s too short"
Agent thinks: "This looks relevant to my bug!"
Agent fetches observation #2543: ~155 tokens
Total tokens consumed: 955
Relevant tokens: 955
Efficiency: 100%
```

### The Index Entry

```markdown
| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
```

**What the agent learns WITHOUT fetching:**
- There's a known gotcha (🔴) about hook timeouts
- It's related to npm install taking too long
- Full details are ~155 tokens (cheap)
- It happened at 2:14 PM (recent)

**Decision tree:**
```
Is my task related to hooks?    → YES
Is my task related to timeouts? → YES
Is my task related to npm?      → YES
155 tokens is cheap             → FETCH IT
```

---
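That decision tree can be approximated mechanically. The threshold values below (two keyword hits, 300-token ceiling) are invented for illustration, not tuned claude-mem defaults:

```typescript
interface FetchCandidate {
  title: string;
  tokens: number;
}

// Fetch when the title overlaps the task keywords enough and the cost is modest.
function shouldFetch(
  candidate: FetchCandidate,
  taskKeywords: string[],
  maxTokens = 300
): boolean {
  const title = candidate.title.toLowerCase();
  const hits = taskKeywords.filter((k) => title.includes(k.toLowerCase())).length;
  return hits >= 2 && candidate.tokens <= maxTokens;
}
```

In practice the agent applies this judgment itself rather than a fixed rule, which is exactly the point of handing it the index.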
## The Two-Tier Search Strategy

Claude-Mem implements progressive disclosure in search results too:

### Tier 1: Index Format (Default)

```typescript
search_observations({
  query: "hook timeout",
  format: "index" // Default
})
```

**Returns:**
```
Found 3 observations matching "hook timeout":

| ID | Date | Type | Title | Tokens |
|----|------|------|-------|--------|
| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short | ~155 |
| #2891 | Oct 25 | how-it-works | Hook timeout configuration | ~203 |
| #2102 | Oct 20 | problem-solution | Fixed timeout in CI | ~89 |
```

**Cost:** ~100 tokens for 3 results
**Value:** Agent can scan and decide which to fetch
### Tier 2: Full Format (On-Demand)

```typescript
search_observations({
  query: "hook timeout",
  format: "full",
  limit: 1 // Fetch just the most relevant
})
```

**Returns:**
```
#2543 🔴 Hook timeout: 60s too short for npm install
─────────────────────────────────────────────────
Date: Oct 26, 2025 2:14 PM
Type: gotcha
Project: claude-mem

Narrative:
Discovered that the default 60-second hook timeout is insufficient
for npm install operations, especially with large dependency trees
or slow network conditions. This causes SessionStart hook to fail
silently, preventing context injection.

Facts:
- Default timeout: 60 seconds
- npm install with cold cache: ~90 seconds
- Configured timeout: 120 seconds in plugin/hooks/hooks.json:25

Files Modified:
- plugin/hooks/hooks.json

Concepts: hooks, timeout, npm, configuration
```

**Cost:** ~155 tokens for full details
**Value:** Complete understanding of the issue

---
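A sketch of how a server might render the same hit in either tier; the `SearchHit` fields are assumptions, not claude-mem's actual response shape:

```typescript
interface SearchHit {
  id: number;
  date: string;
  type: string;
  title: string;
  tokens: number;
  narrative?: string; // only populated when format is "full"
}

// Tier 1 emits one cheap table row per hit; Tier 2 appends the full narrative.
function formatHit(hit: SearchHit, format: "index" | "full"): string {
  const row = `| #${hit.id} | ${hit.date} | ${hit.type} | ${hit.title} | ~${hit.tokens} |`;
  if (format === "index" || !hit.narrative) return row;
  return `${row}\n${hit.narrative}`;
}
```

The asymmetry is deliberate: the index row is always available, while the narrative is an opt-in cost.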
## Cognitive Load Theory

Progressive disclosure is grounded in **Cognitive Load Theory**:

### Intrinsic Load
The inherent difficulty of the task itself.

**Example:** "Fix authentication bug"
- Must understand the auth system
- Must understand the bug
- Must write the fix

This load is unavoidable.

### Extraneous Load
The cognitive burden of poorly presented information.

**Traditional RAG adds extraneous load:**
- Scanning irrelevant observations
- Filtering out noise
- Remembering what to ignore
- Re-contextualizing after each section

**Progressive disclosure minimizes extraneous load:**
- Scan titles (low effort)
- Fetch only relevant items (targeted effort)
- Full attention on the current task

### Germane Load
The effort of building mental models and schemas.

**Progressive disclosure supports germane load:**
- Consistent structure (legend, grouping)
- Clear categorization (types, icons)
- Semantic compression (good titles)
- Explicit costs (token counts)

---
## Anti-Patterns to Avoid

### ❌ Verbose Titles

**Bad:**
```
| #2543 | 2:14 PM | 🔴 | Investigation into the issue where hooks time out | ~155 |
```

**Good:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
```

### ❌ Hiding Costs

**Bad:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout issue |
```

**Good:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout issue | ~155 |
```

### ❌ No Retrieval Path

**Bad:**
```
Here are 10 observations. [No instructions on how to get full details]
```

**Good:**
```
Here are 10 observations.
*Use MCP search tools to fetch full observation details on-demand*
```

### ❌ Defaulting to Full Format

**Bad:**
```typescript
search_observations({
  query: "hooks",
  format: "full" // Fetches everything
})
```

**Good:**
```typescript
search_observations({
  query: "hooks",
  format: "index", // Scan first
  limit: 20
})

// Then, if needed:
search_observations({
  query: "hooks",
  format: "full",
  limit: 1 // Just the most relevant
})
```

---
## Key Design Decisions

### Why Token Counts?

**Decision:** Show approximate token counts (~155, ~203) rather than exact counts.

**Rationale:**
- Communicates scale (50 vs 500) without false precision
- Maps to human intuition (small/medium/large)
- Allows the agent to budget attention
- Encourages cost-conscious retrieval

### Why Icons Instead of Text Labels?

**Decision:** Use emoji icons (🔴, 🟡, 🔵) rather than text labels (GOTCHA, PROBLEM, HOWTO).

**Rationale:**
- Visual scanning (pattern recognition)
- Token-efficient (1 character vs 10)
- Language-agnostic
- Aesthetically distinct
- Works for both humans and AI

### Why Index-First, Not Smart Pre-Fetch?

**Decision:** Always show the index first, even if we "know" what's relevant.

**Rationale:**
- We can't judge relevance better than the agent can
- Pre-fetching assumes we understand the task
- The agent knows the current context; we don't
- Respects agent autonomy
- Fails gracefully (the agent can always fetch more)

### Why Group by File Path?

**Decision:** Group observations by file path in addition to date.

**Rationale:**
- Spatial locality: work on file X likely needs context about file X
- Reduces scanning effort
- Matches how developers think
- Clear semantic boundaries

---
## Measuring Success

Progressive disclosure is working when:

### ✅ Low Waste Ratio
```
Relevant Tokens / Total Context Tokens > 80%
```

Most of the context consumed is actually useful.

### ✅ Selective Fetching
```
Index Shown: 50 observations
Details Fetched: 2-3 observations
```

The agent is being selective, not fetching everything.

### ✅ Fast Task Completion
```
Session with index: 30 seconds to find relevant context
Session without: 90 seconds scanning all context
```

Time-to-relevant-information is faster.

### ✅ Appropriate Depth
```
Simple task: Only index needed
Medium task: 1-2 observations fetched
Complex task: 5-10 observations + code reads
```

Depth scales with task complexity.

---
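The waste-ratio metric above is straightforward to compute. A minimal sketch (the function names are illustrative):

```typescript
// Fraction of consumed context that was actually useful.
function relevanceRatio(relevantTokens: number, totalTokens: number): number {
  if (totalTokens === 0) return 0;
  return relevantTokens / totalTokens;
}

// The success threshold from above: > 80% of consumed context is relevant.
function meetsTarget(relevantTokens: number, totalTokens: number): boolean {
  return relevanceRatio(relevantTokens, totalTokens) > 0.8;
}
```

Plugging in the earlier example: the progressive-disclosure session (955 relevant of 955 consumed) passes, while the traditional session (~200 relevant of 25,000 consumed) fails by a wide margin.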
## Future Enhancements

### Adaptive Index Size

Vary the index size based on how the session started:

```typescript
// Sketch: choose how many sessions to index by SessionStart source
function indexSizeFor(source: "startup" | "resume" | "compact"): number {
  switch (source) {
    case "startup": return 10; // last 10 sessions (small index)
    case "resume":  return 1;  // only the current session (micro index)
    case "compact": return 20; // last 20 sessions (larger index)
  }
}
```

### Relevance Scoring

```typescript
// Use embeddings to pre-sort the index by relevance
search_observations({
  query: "authentication bug",
  format: "index",
  sort: "relevance" // Based on semantic similarity
})
```

### Cost Forecasting

```markdown
💡 **Budget Estimate:**
- Fetching all 🔴 gotchas: ~450 tokens
- Fetching all file-related: ~1,200 tokens
- Fetching everything: ~8,500 tokens
```

### Progressive Detail Levels

```
Layer 1: Index (titles only)
Layer 2: Summaries (2-3 sentences)
Layer 3: Full details (complete observation)
Layer 4: Source files (referenced code)
```

---
## Key Takeaways

1. **Show, don't tell**: The index reveals what exists without forcing consumption
2. **Cost-conscious**: Make retrieval costs visible for informed decisions
3. **Agent autonomy**: Let the agent decide what's relevant
4. **Semantic compression**: Good titles make or break the system
5. **Consistent structure**: Patterns reduce cognitive load
6. **Two-tier everything**: Index first, details on-demand
7. **Context as currency**: Spend wisely on high-value information

---

## Remember

> "The best interface is one that disappears when not needed, and appears exactly when it is."

Progressive disclosure respects the agent's intelligence and autonomy. We provide the map; the agent chooses the path.

---

## Further Reading

- [Context Engineering for AI Agents](context-engineering) - Foundational principles
- [Claude-Mem Architecture](architecture/overview) - How it all fits together
- Cognitive Load Theory (Sweller, 1988)
- Information Foraging Theory (Pirolli & Card, 1999)
- Progressive Disclosure (Nielsen Norman Group)

---

*This philosophy emerged from real-world usage of Claude-Mem across hundreds of coding sessions. The pattern works because it aligns with both human cognition and LLM attention mechanics.*