# Progressive Disclosure: Claude-Mem's Context Priming Philosophy

## Core Principle

**Show what exists and its retrieval cost first. Let the agent decide what to fetch based on relevance and need.**

---

## What is Progressive Disclosure?

Progressive disclosure is an information architecture pattern that reveals complexity gradually rather than all at once. In the context of AI agents, it means:

1. **Layer 1 (Index)**: Show lightweight metadata (titles, dates, types, token counts)
2. **Layer 2 (Details)**: Fetch full content only when needed
3. **Layer 3 (Deep Dive)**: Read original source files if required

This mirrors how humans work: we scan headlines before reading articles, review a table of contents before diving into chapters, and check file names before opening files.

---
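The layered model above can be sketched as a small retrieval contract. The type and function names here are illustrative, not claude-mem's actual API:

```typescript
// Layer 1: lightweight metadata the agent always sees.
interface IndexEntry {
  id: number;
  title: string;
  type: string;   // e.g. "gotcha", "decision"
  tokens: number; // approximate retrieval cost
}

// Layer 2: full content, fetched only on demand.
// (Layer 3 is reading the referenced source files directly.)
interface Observation extends IndexEntry {
  narrative: string;
  files: string[];
}

// The agent scans the cheap index and pays for details selectively.
function pickRelevant(index: IndexEntry[], keywords: string[]): IndexEntry[] {
  return index.filter((entry) =>
    keywords.some((k) => entry.title.toLowerCase().includes(k.toLowerCase()))
  );
}
```

The key property is that `pickRelevant` operates on titles alone, so deciding *what* to fetch never costs more than the index itself.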
## The Problem: Context Pollution

Traditional RAG (Retrieval-Augmented Generation) systems fetch everything upfront:

```
❌ Traditional Approach:
┌─────────────────────────────────────┐
│ Session Start                       │
│                                     │
│ [15,000 tokens of past sessions]    │
│ [8,000 tokens of observations]      │
│ [12,000 tokens of file summaries]   │
│                                     │
│ Total: 35,000 tokens                │
│ Relevant: ~2,000 tokens (6%)        │
└─────────────────────────────────────┘
```

**Problems:**
- Wastes 94% of the attention budget on irrelevant context
- The user prompt gets buried under a mountain of history
- The agent must process everything before understanding the task
- There is no way to know what's actually useful until after reading

---
## Claude-Mem's Solution: Progressive Disclosure

```
✅ Progressive Disclosure Approach:
┌─────────────────────────────────────┐
│ Session Start                       │
│                                     │
│ Index of 50 observations: ~800 tokens│
│   ↓                                 │
│ Agent sees: "🔴 Hook timeout issue" │
│ Agent decides: "Relevant!"          │
│   ↓                                 │
│ Fetch observation #2543: ~155 tokens│
│                                     │
│ Total: 955 tokens                   │
│ Relevant: 955 tokens (100%)         │
└─────────────────────────────────────┘
```

**Benefits:**
- The agent controls its own context consumption
- Retrieved context is directly relevant to the current task
- The agent can fetch more if needed
- The agent can skip everything if nothing is relevant
- Each retrieval decision has a clear cost/benefit

---
## How It Works in Claude-Mem

### The Index Format

Every SessionStart hook provides a compact index:

```markdown
### Oct 26, 2025

**General**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2586 | 12:58 AM | 🔵 | Context hook file exists but is empty | ~51 |
| #2587 | ″ | 🔵 | Context hook script file is empty | ~46 |
| #2589 | ″ | 🟡 | Investigated hook debug output docs | ~105 |

**src/hooks/context-hook.ts**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2592 | 1:16 AM | ⚖️ | Web UI strategy redesigned | ~193 |
```

**What the agent sees:**
- **What exists**: Observation titles give semantic meaning
- **When it happened**: Timestamps for temporal context
- **What type**: Icons indicate observation category
- **Retrieval cost**: Token counts for informed decisions
- **Where to get it**: MCP search tools referenced at the bottom
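Rendering one of those table rows from observation metadata is a one-liner. The field names here are assumptions for illustration, not claude-mem's actual schema:

```typescript
interface ObservationMeta {
  id: number;
  time: string;   // pre-formatted, e.g. "12:58 AM"
  icon: string;   // type icon from the legend
  title: string;
  tokens: number; // approximate token count
}

// Render one markdown table row in the index format shown above.
function renderIndexRow(o: ObservationMeta): string {
  return `| #${o.id} | ${o.time} | ${o.icon} | ${o.title} | ~${o.tokens} |`;
}
```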
### The Legend System

```
🎯 session-request - User's original goal
🔴 gotcha - Critical edge case or pitfall
🟡 problem-solution - Bug fix or workaround
🔵 how-it-works - Technical explanation
🟢 what-changed - Code/architecture change
🟣 discovery - Learning or insight
🟠 why-it-exists - Design rationale
🟤 decision - Architecture decision
⚖️ trade-off - Deliberate compromise
```

**Purpose:**
- Visual scanning (humans and AI both benefit)
- Semantic categorization
- Priority signaling (🔴 gotchas are more critical)
- Pattern recognition across sessions
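The legend doubles as a lookup table in code. A minimal sketch (the constant names are illustrative):

```typescript
// The legend above as a type-to-icon map.
const TYPE_ICONS: Record<string, string> = {
  "session-request": "🎯",
  "gotcha": "🔴",
  "problem-solution": "🟡",
  "how-it-works": "🔵",
  "what-changed": "🟢",
  "discovery": "🟣",
  "why-it-exists": "🟠",
  "decision": "🟤",
  "trade-off": "⚖️",
};

// Types that signal high priority and are often worth fetching immediately.
const CRITICAL_TYPES = new Set(["gotcha", "decision", "trade-off"]);
```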
### Progressive Disclosure Instructions

The index includes usage guidance:

```markdown
💡 **Progressive Disclosure:** This index shows WHAT exists and retrieval COST.
- Use MCP search tools to fetch full observation details on-demand
- Prefer searching observations over re-reading code for past decisions
- Critical types (🔴 gotcha, 🟤 decision, ⚖️ trade-off) often worth fetching immediately
```

**What this does:**
- Teaches the agent the pattern
- Suggests when to fetch (critical types)
- Recommends search over code re-reading (efficiency)
- Makes the system self-documenting

---
## The Philosophy: Context as Currency

### Mental Model: Token Budget as Money

Think of the context window as a bank account:

| Approach | Metaphor | Outcome |
|----------|----------|---------|
| **Dump everything** | Spending your entire paycheck on groceries you might need someday | Waste, clutter, can't afford what you actually need |
| **Fetch nothing** | Refusing to spend any money | Starvation, can't accomplish tasks |
| **Progressive disclosure** | Check your pantry, make a shopping list, buy only what you need | Efficiency, room for unexpected needs |
### The Attention Budget

LLMs have finite attention:
- Every token attends to every other token (n² relationships)
- A 100,000-token window ≠ 100,000 tokens of useful attention
- Context "rot" sets in as the window fills
- Later tokens get less attention than earlier ones

**Claude-Mem's approach:**
- Start with ~1,000 tokens of index
- The agent has ~99,000 tokens free for the task
- The agent fetches ~200 tokens when needed
- Final budget: ~98,800 tokens for actual work
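The budget arithmetic above can be expressed as a trivial tracker; a sketch, with numbers taken from the example:

```typescript
// Track how much of the context window has been spent.
class ContextBudget {
  private spent = 0;
  constructor(private readonly windowSize: number) {}

  spend(tokens: number): void {
    this.spent += tokens;
  }

  remaining(): number {
    return this.windowSize - this.spent;
  }
}

const budget = new ContextBudget(100_000);
budget.spend(1_000); // the index at session start
budget.spend(200);   // one fetched observation
// budget.remaining() → 98,800 tokens free for the actual task
```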
### Design for Autonomy

> "As models improve, let them act intelligently"

Progressive disclosure treats the agent as an **intelligent information forager**, not a passive recipient of pre-selected context.

**Traditional RAG:**
```
System → [Decides relevance] → Agent
              ↑
       Hope this helps!
```

**Progressive Disclosure:**
```
System → [Shows index] → Agent → [Decides relevance] → [Fetches details]
                           ↑
                    You know best!
```

The agent knows:
- The current task context
- What information would help
- How much budget to spend
- When to stop searching

We don't.

---
## Implementation Principles

### 1. Make Costs Visible

Every item in the index shows a token count:

```
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
                                                      ^^^^
                                             Retrieval cost
```

**Why:**
- The agent can make informed ROI decisions
- Small observations (~50 tokens) are "cheap" to fetch
- Large observations (~500 tokens) require stronger justification
- Matches how humans think about effort
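The approximate counts could come from a cheap heuristic such as the common chars-per-token estimate (~4 characters per token for English text). This is an assumption for illustration, not claude-mem's actual tokenizer:

```typescript
// Rough token estimate: ~4 characters per token for English text (heuristic).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Prefixed with "~" for display, since an exact count would be false precision.
function displayCost(text: string): string {
  return `~${estimateTokens(text)}`;
}
```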
### 2. Use Semantic Compression

Titles compress full observations into ~10 words:

**Bad title:**
```
Observation about a thing
```

**Good title:**
```
🔴 Hook timeout issue: 60s default too short for npm install
```

**What makes a good title:**
- Specific: Identifies the exact issue
- Actionable: Clear what to do
- Self-contained: Doesn't require reading the observation
- Searchable: Contains key terms (hook, timeout, npm)
- Categorized: Icon indicates type
### 3. Group by Context

Observations are grouped by:
- **Date**: Temporal context
- **File path**: Spatial context (work on specific files)
- **Project**: Logical context

```markdown
**src/hooks/context-hook.ts**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2594 | 1:17 AM | 🟠 | Removed stderr section from docs | ~93 |
```

**Benefit:** If the agent is working on `src/hooks/context-hook.ts`, related observations are already grouped together.
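The file-path grouping is a plain bucketing operation. A sketch, assuming observations carry an optional file path and path-less ones fall into the "General" bucket shown in the index above:

```typescript
interface GroupedObs {
  id: number;
  file: string | null; // null → "General" bucket
  title: string;
}

// Bucket observations by file path, preserving insertion order within groups.
function groupByFile(observations: GroupedObs[]): Map<string, GroupedObs[]> {
  const groups = new Map<string, GroupedObs[]>();
  for (const o of observations) {
    const key = o.file ?? "General";
    const bucket = groups.get(key) ?? [];
    bucket.push(o);
    groups.set(key, bucket);
  }
  return groups;
}
```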
### 4. Provide Retrieval Tools

The index is useless without retrieval mechanisms:

```markdown
*Use claude-mem MCP search to access records with the given ID*
```

**Available tools:**
- `search_observations` - Full-text search
- `find_by_concept` - Concept-based retrieval
- `find_by_file` - File-based retrieval
- `find_by_type` - Type-based retrieval
- `get_recent_context` - Recent session summaries

Each tool supports `format: "index"` (default) and `format: "full"`.

---
## Real-World Example

### Scenario: Agent asked to fix a bug in hooks

**Without progressive disclosure:**
```
SessionStart injects 25,000 tokens of past context
Agent reads everything
Agent finds 1 relevant observation (buried in the middle)
Total tokens consumed: 25,000
Relevant tokens: ~200
Efficiency: 0.8%
```

**With progressive disclosure:**
```
SessionStart shows index: ~800 tokens
Agent sees title: "🔴 Hook timeout issue: 60s too short"
Agent thinks: "This looks relevant to my bug!"
Agent fetches observation #2543: ~155 tokens
Total tokens consumed: 955
Relevant tokens: 955
Efficiency: 100%
```

### The Index Entry

```markdown
| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
```

**What the agent learns WITHOUT fetching:**
- There's a known gotcha (🔴) about hook timeouts
- It's related to npm install taking too long
- Full details are ~155 tokens (cheap)
- It happened at 2:14 PM (recent)

**Decision tree:**
```
Is my task related to hooks?    → YES
Is my task related to timeouts? → YES
Is my task related to npm?      → YES
155 tokens is cheap             → FETCH IT
```

---
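That decision tree can be approximated mechanically. The threshold values below (two keyword hits, 300-token ceiling) are invented for illustration, not tuned claude-mem defaults:

```typescript
interface FetchCandidate {
  title: string;
  tokens: number;
}

// Fetch when the title overlaps the task keywords enough and the cost is modest.
function shouldFetch(
  candidate: FetchCandidate,
  taskKeywords: string[],
  maxTokens = 300
): boolean {
  const title = candidate.title.toLowerCase();
  const hits = taskKeywords.filter((k) => title.includes(k.toLowerCase())).length;
  return hits >= 2 && candidate.tokens <= maxTokens;
}
```

In practice the agent applies this judgment itself rather than a fixed rule, which is exactly the point of handing it the index.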
## The Two-Tier Search Strategy

Claude-Mem implements progressive disclosure in search results too:

### Tier 1: Index Format (Default)

```typescript
search_observations({
  query: "hook timeout",
  format: "index" // Default
})
```

**Returns:**
```
Found 3 observations matching "hook timeout":

| ID | Date | Type | Title | Tokens |
|----|------|------|-------|--------|
| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short | ~155 |
| #2891 | Oct 25 | how-it-works | Hook timeout configuration | ~203 |
| #2102 | Oct 20 | problem-solution | Fixed timeout in CI | ~89 |
```

**Cost:** ~100 tokens for 3 results
**Value:** Agent can scan and decide which to fetch
### Tier 2: Full Format (On-Demand)

```typescript
search_observations({
  query: "hook timeout",
  format: "full",
  limit: 1 // Fetch just the most relevant
})
```

**Returns:**
```
#2543 🔴 Hook timeout: 60s too short for npm install
─────────────────────────────────────────────────
Date: Oct 26, 2025 2:14 PM
Type: gotcha
Project: claude-mem

Narrative:
Discovered that the default 60-second hook timeout is insufficient
for npm install operations, especially with large dependency trees
or slow network conditions. This causes SessionStart hook to fail
silently, preventing context injection.

Facts:
- Default timeout: 60 seconds
- npm install with cold cache: ~90 seconds
- Configured timeout: 120 seconds in plugin/hooks/hooks.json:25

Files Modified:
- plugin/hooks/hooks.json

Concepts: hooks, timeout, npm, configuration
```

**Cost:** ~155 tokens for full details
**Value:** Complete understanding of the issue

---
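A sketch of how a server might render the same hit in either tier; the `SearchHit` fields are assumptions, not claude-mem's actual response shape:

```typescript
interface SearchHit {
  id: number;
  date: string;
  type: string;
  title: string;
  tokens: number;
  narrative?: string; // only populated when format is "full"
}

// Tier 1 emits one cheap table row per hit; Tier 2 appends the full narrative.
function formatHit(hit: SearchHit, format: "index" | "full"): string {
  const row = `| #${hit.id} | ${hit.date} | ${hit.type} | ${hit.title} | ~${hit.tokens} |`;
  if (format === "index" || !hit.narrative) return row;
  return `${row}\n${hit.narrative}`;
}
```

The asymmetry is deliberate: the index row is always available, while the narrative is an opt-in cost.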
## Cognitive Load Theory

Progressive disclosure is grounded in **Cognitive Load Theory**:

### Intrinsic Load
The inherent difficulty of the task itself.

**Example:** "Fix authentication bug"
- Must understand the auth system
- Must understand the bug
- Must write the fix

This load is unavoidable.

### Extraneous Load
The cognitive burden of poorly presented information.

**Traditional RAG adds extraneous load:**
- Scanning irrelevant observations
- Filtering out noise
- Remembering what to ignore
- Re-contextualizing after each section

**Progressive disclosure minimizes extraneous load:**
- Scan titles (low effort)
- Fetch only relevant items (targeted effort)
- Full attention on the current task

### Germane Load
The effort of building mental models and schemas.

**Progressive disclosure supports germane load:**
- Consistent structure (legend, grouping)
- Clear categorization (types, icons)
- Semantic compression (good titles)
- Explicit costs (token counts)

---
## Anti-Patterns to Avoid

### ❌ Verbose Titles

**Bad:**
```
| #2543 | 2:14 PM | 🔴 | Investigation into the issue where hooks time out | ~155 |
```

**Good:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
```

### ❌ Hiding Costs

**Bad:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout issue |
```

**Good:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout issue | ~155 |
```

### ❌ No Retrieval Path

**Bad:**
```
Here are 10 observations. [No instructions on how to get full details]
```

**Good:**
```
Here are 10 observations.
*Use MCP search tools to fetch full observation details on-demand*
```

### ❌ Defaulting to Full Format

**Bad:**
```typescript
search_observations({
  query: "hooks",
  format: "full" // Fetches everything
})
```

**Good:**
```typescript
search_observations({
  query: "hooks",
  format: "index", // Scan first
  limit: 20
})

// Then, if needed:
search_observations({
  query: "hooks",
  format: "full",
  limit: 1 // Just the most relevant
})
```

---
## Key Design Decisions

### Why Token Counts?

**Decision:** Show approximate token counts (~155, ~203) rather than exact counts.

**Rationale:**
- Communicates scale (50 vs 500) without false precision
- Maps to human intuition (small/medium/large)
- Allows the agent to budget attention
- Encourages cost-conscious retrieval

### Why Icons Instead of Text Labels?

**Decision:** Use emoji icons (🔴, 🟡, 🔵) rather than text labels (GOTCHA, PROBLEM, HOWTO).

**Rationale:**
- Visual scanning (pattern recognition)
- Token-efficient (1 character vs 10)
- Language-agnostic
- Aesthetically distinct
- Works for both humans and AI

### Why Index-First, Not Smart Pre-Fetch?

**Decision:** Always show the index first, even if we "know" what's relevant.

**Rationale:**
- We can't judge relevance better than the agent can
- Pre-fetching assumes we understand the task
- The agent knows the current context; we don't
- Respects agent autonomy
- Fails gracefully (the agent can always fetch more)

### Why Group by File Path?

**Decision:** Group observations by file path in addition to date.

**Rationale:**
- Spatial locality: work on file X likely needs context about file X
- Reduces scanning effort
- Matches how developers think
- Clear semantic boundaries

---
## Measuring Success

Progressive disclosure is working when:

### ✅ Low Waste Ratio
```
Relevant Tokens / Total Context Tokens > 80%
```

Most of the context consumed is actually useful.

### ✅ Selective Fetching
```
Index Shown: 50 observations
Details Fetched: 2-3 observations
```

The agent is being selective, not fetching everything.

### ✅ Fast Task Completion
```
Session with index: 30 seconds to find relevant context
Session without: 90 seconds scanning all context
```

Time-to-relevant-information is faster.

### ✅ Appropriate Depth
```
Simple task: Only index needed
Medium task: 1-2 observations fetched
Complex task: 5-10 observations + code reads
```

Depth scales with task complexity.

---
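The waste-ratio metric above is straightforward to compute. A minimal sketch (the function names are illustrative):

```typescript
// Fraction of consumed context that was actually useful.
function relevanceRatio(relevantTokens: number, totalTokens: number): number {
  if (totalTokens === 0) return 0;
  return relevantTokens / totalTokens;
}

// The success threshold from above: > 80% of consumed context is relevant.
function meetsTarget(relevantTokens: number, totalTokens: number): boolean {
  return relevanceRatio(relevantTokens, totalTokens) > 0.8;
}
```

Plugging in the earlier example: the progressive-disclosure session (955 relevant of 955 consumed) passes, while the traditional session (~200 relevant of 25,000 consumed) fails by a wide margin.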
## Future Enhancements

### Adaptive Index Size

Vary the index size based on how the session started:

```typescript
// Sketch: choose how many sessions to index by SessionStart source
function indexSizeFor(source: "startup" | "resume" | "compact"): number {
  switch (source) {
    case "startup": return 10; // last 10 sessions (small index)
    case "resume":  return 1;  // only the current session (micro index)
    case "compact": return 20; // last 20 sessions (larger index)
  }
}
```

### Relevance Scoring

```typescript
// Use embeddings to pre-sort the index by relevance
search_observations({
  query: "authentication bug",
  format: "index",
  sort: "relevance" // Based on semantic similarity
})
```

### Cost Forecasting

```markdown
💡 **Budget Estimate:**
- Fetching all 🔴 gotchas: ~450 tokens
- Fetching all file-related: ~1,200 tokens
- Fetching everything: ~8,500 tokens
```

### Progressive Detail Levels

```
Layer 1: Index (titles only)
Layer 2: Summaries (2-3 sentences)
Layer 3: Full details (complete observation)
Layer 4: Source files (referenced code)
```

---
## Key Takeaways

1. **Show, don't tell**: The index reveals what exists without forcing consumption
2. **Cost-conscious**: Make retrieval costs visible for informed decisions
3. **Agent autonomy**: Let the agent decide what's relevant
4. **Semantic compression**: Good titles make or break the system
5. **Consistent structure**: Patterns reduce cognitive load
6. **Two-tier everything**: Index first, details on-demand
7. **Context as currency**: Spend wisely on high-value information

---

## Remember

> "The best interface is one that disappears when not needed, and appears exactly when it is."

Progressive disclosure respects the agent's intelligence and autonomy. We provide the map; the agent chooses the path.

---

## Further Reading

- [Context Engineering for AI Agents](context-engineering) - Foundational principles
- [Claude-Mem Architecture](architecture/overview) - How it all fits together
- Cognitive Load Theory (Sweller, 1988)
- Information Foraging Theory (Pirolli & Card, 1999)
- Progressive Disclosure (Nielsen Norman Group)

---

*This philosophy emerged from real-world usage of Claude-Mem across hundreds of coding sessions. The pattern works because it aligns with both human cognition and LLM attention mechanics.*