# Progressive Disclosure: Claude-Mem's Context Priming Philosophy

## Core Principle

**Show what exists and its retrieval cost first. Let the agent decide what to fetch based on relevance and need.**

---

## What is Progressive Disclosure?

Progressive disclosure is an information architecture pattern where you reveal complexity gradually rather than all at once. In the context of AI agents, it means:

1. **Layer 1 (Index)**: Show lightweight metadata (titles, dates, types, token counts)
2. **Layer 2 (Details)**: Fetch full content only when needed
3. **Layer 3 (Deep Dive)**: Read original source files if required

This mirrors how humans work: we scan headlines before reading articles, review a table of contents before diving into chapters, and check file names before opening files.

---

## The Problem: Context Pollution

Traditional RAG (Retrieval-Augmented Generation) systems fetch everything upfront:

```
❌ Traditional Approach:
┌─────────────────────────────────────┐
│ Session Start                       │
│                                     │
│ [15,000 tokens of past sessions]    │
│ [8,000 tokens of observations]      │
│ [12,000 tokens of file summaries]   │
│                                     │
│ Total: 35,000 tokens                │
│ Relevant: ~2,000 tokens (6%)        │
└─────────────────────────────────────┘
```

**Problems:**
- Wastes 94% of the attention budget on irrelevant context
- User prompt gets buried under a mountain of history
- Agent must process everything before understanding the task
- No way to know what's actually useful until after reading

---

## Claude-Mem's Solution: Progressive Disclosure

```
✅ Progressive Disclosure Approach:
┌───────────────────────────────────────┐
│ Session Start                         │
│                                       │
│ Index of 50 observations: ~800 tokens │
│                   ↓                   │
│ Agent sees: "🔴 Hook timeout issue"   │
│ Agent decides: "Relevant!"            │
│                   ↓                   │
│ Fetch observation #2543: ~120 tokens  │
│                                       │
│ Total: 920 tokens                     │
│ Relevant: 920 tokens (100%)           │
└───────────────────────────────────────┘
```

**Benefits:**
- Agent controls its own context consumption
- Fetched content is directly relevant to the current task
- Can fetch more if needed
- Can skip everything if not relevant
- Clear cost/benefit for each retrieval decision

---

## How It Works in Claude-Mem

### The Index Format

Every SessionStart hook provides a compact index:

```markdown
### Oct 26, 2025

**General**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2586 | 12:58 AM | 🔵 | Context hook file exists but is empty | ~51 |
| #2587 | ″ | 🔵 | Context hook script file is empty | ~46 |
| #2589 | ″ | 🟡 | Investigated hook debug output docs | ~105 |

**src/hooks/context-hook.ts**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2592 | 1:16 AM | ⚖️ | Web UI strategy redesigned | ~193 |
```

**What the agent sees:**
- **What exists**: Observation titles give semantic meaning
- **When it happened**: Timestamps for temporal context
- **What type**: Icons indicate observation category
- **Retrieval cost**: Token counts for informed decisions
- **Where to get it**: MCP search tools referenced at bottom
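
Generating a row in this format is mechanical. A minimal sketch (the `Observation` shape here is illustrative, not Claude-Mem's actual schema):

```typescript
// Hypothetical shape of a stored observation (illustrative, not the real schema).
interface Observation {
  id: number;
  time: string;
  icon: string;
  title: string;
  tokens: number;
}

// Render one markdown table row in the index format shown above.
function indexRow(o: Observation): string {
  return `| #${o.id} | ${o.time} | ${o.icon} | ${o.title} | ~${o.tokens} |`;
}

console.log(indexRow({ id: 2591, time: "1:15 AM", icon: "⚖️", title: "Stderr messaging abandoned", tokens: 155 }));
// → | #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
```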

### The Legend System

```
🎯 session-request   - User's original goal
🔴 gotcha            - Critical edge case or pitfall
🟡 problem-solution  - Bug fix or workaround
🔵 how-it-works      - Technical explanation
🟢 what-changed      - Code/architecture change
🟣 discovery         - Learning or insight
🟠 why-it-exists     - Design rationale
🟤 decision          - Architecture decision
⚖️ trade-off         - Deliberate compromise
```

**Purpose:**
- Visual scanning (humans and AI both benefit)
- Semantic categorization
- Priority signaling (🔴 gotchas are more critical)
- Pattern recognition across sessions
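
The type-to-icon mapping above can live in a single lookup. A minimal sketch, assuming the type names from the legend (the real lookup may differ):

```typescript
// Map observation type to its legend icon.
// Type names are taken from the legend above; the fallback marker is an assumption.
const TYPE_ICONS: Record<string, string> = {
  "session-request": "🎯",
  "gotcha": "🔴",
  "problem-solution": "🟡",
  "how-it-works": "🔵",
  "what-changed": "🟢",
  "discovery": "🟣",
  "why-it-exists": "🟠",
  "decision": "🟤",
  "trade-off": "⚖️",
};

// Fall back to a neutral marker for unknown types.
function iconFor(type: string): string {
  return TYPE_ICONS[type] ?? "▫️";
}
```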

### Progressive Disclosure Instructions

The index includes usage guidance:

```markdown
💡 **Progressive Disclosure:** This index shows WHAT exists and retrieval COST.
- Use MCP search tools to fetch full observation details on-demand
- Prefer searching observations over re-reading code for past decisions
- Critical types (🔴 gotcha, 🟤 decision, ⚖️ trade-off) often worth fetching immediately
```

**What this does:**
- Teaches the agent the pattern
- Suggests when to fetch (critical types)
- Recommends search over code re-reading (efficiency)
- Makes the system self-documenting

---

## The Philosophy: Context as Currency

### Mental Model: Token Budget as Money

Think of the context window as a bank account:

| Approach | Metaphor | Outcome |
|----------|----------|---------|
| **Dump everything** | Spending your entire paycheck on groceries you might need someday | Waste, clutter, can't afford what you actually need |
| **Fetch nothing** | Refusing to spend any money | Starvation, can't accomplish tasks |
| **Progressive disclosure** | Check your pantry, make a shopping list, buy only what you need | Efficiency, room for unexpected needs |

### The Attention Budget

LLMs have finite attention:
- Every token attends to every other token (n² relationships)
- A 100,000-token window ≠ 100,000 tokens of useful attention
- Context "rot" happens as the window fills
- Later tokens get less attention than earlier ones

**Claude-Mem's approach:**
- Start with ~1,000 tokens of index
- Agent has ~99,000 tokens free for the task
- Agent fetches ~200 tokens when needed
- Final budget: ~98,000 tokens for actual work
### Design for Autonomy

> "As models improve, let them act intelligently"

Progressive disclosure treats the agent as an **intelligent information forager**, not a passive recipient of pre-selected context.

**Traditional RAG:**
```
System → [Decides relevance] → Agent
              ↑
        Hope this helps!
```

**Progressive Disclosure:**
```
System → [Shows index] → Agent → [Decides relevance] → [Fetches details]
                           ↑
                     You know best!
```

The agent knows:
- The current task context
- What information would help
- How much budget to spend
- When to stop searching

We don't.

---

## Implementation Principles

### 1. Make Costs Visible

Every item in the index shows a token count:

```
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
                                                      ^^^^
                                                 Retrieval cost
```

**Why:**
- Agent can make informed ROI decisions
- Small observations (~50 tokens) are "cheap" to fetch
- Large observations (~500 tokens) require stronger justification
- Matches how humans think about effort
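
The approximate counts don't need a real tokenizer. A minimal sketch, assuming the common ~4 characters-per-token rule of thumb (the actual estimator may differ):

```typescript
// Rough token estimate using the ~4 chars/token heuristic.
// Deliberately cheap: the index only needs order-of-magnitude costs.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Format the approximate cost for display in an index row (~51, ~155, ...).
function displayCost(text: string): string {
  return `~${estimateTokens(text)}`;
}
```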

### 2. Use Semantic Compression

Titles compress full observations into ~10 words:

**Bad title:**
```
Observation about a thing
```

**Good title:**
```
🔴 Hook timeout issue: 60s default too short for npm install
```

**What makes a good title:**
- Specific: Identifies the exact issue
- Actionable: Clear what to do
- Self-contained: Doesn't require reading the observation
- Searchable: Contains key terms (hook, timeout, npm)
- Categorized: Icon indicates type
### 3. Group by Context

Observations are grouped by:
- **Date**: Temporal context
- **File path**: Spatial context (work on specific files)
- **Project**: Logical context

```markdown
**src/hooks/context-hook.ts**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2594 | 1:17 AM | 🟠 | Removed stderr section from docs | ~93 |
```

**Benefit:** If the agent is working on `src/hooks/context-hook.ts`, related observations are already grouped together.
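
The file grouping is a straightforward bucketing pass. A minimal sketch (`filePath` is an assumed field name; observations without one fall into the "General" bucket, matching the index above):

```typescript
// Group observations by file path so spatially related entries
// land under one heading. "filePath" is an assumed field name.
interface Obs {
  id: number;
  filePath?: string;
  title: string;
}

function groupByFile(observations: Obs[]): Map<string, Obs[]> {
  const groups = new Map<string, Obs[]>();
  for (const o of observations) {
    const key = o.filePath ?? "General"; // no file → "General" section
    const bucket = groups.get(key) ?? [];
    bucket.push(o);
    groups.set(key, bucket);
  }
  return groups;
}
```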

### 4. Provide Retrieval Tools

The index is useless without retrieval mechanisms:

```markdown
*Use claude-mem MCP search to access records with the given ID*
```

**Available tools:**
- `search_observations` - Full-text search
- `find_by_concept` - Concept-based retrieval
- `find_by_file` - File-based retrieval
- `find_by_type` - Type-based retrieval
- `get_recent_context` - Recent session summaries

Each tool supports `format: "index"` (default) and `format: "full"`.

---

## Real-World Example

### Scenario: Agent asked to fix a bug in hooks

**Without progressive disclosure:**
```
SessionStart injects 25,000 tokens of past context
Agent reads everything
Agent finds 1 relevant observation (buried in the middle)
Total tokens consumed: 25,000
Relevant tokens: ~200
Efficiency: 0.8%
```

**With progressive disclosure:**
```
SessionStart shows index: ~800 tokens
Agent sees title: "🔴 Hook timeout issue: 60s too short"
Agent thinks: "This looks relevant to my bug!"
Agent fetches observation #2543: ~155 tokens
Total tokens consumed: 955
Relevant tokens: 955
Efficiency: 100%
```

### The Index Entry

```markdown
| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
```

**What the agent learns WITHOUT fetching:**
- There's a known gotcha (🔴) about hook timeouts
- It's related to npm install taking too long
- Full details are ~155 tokens (cheap)
- Happened at 2:14 PM (recent)

**Decision tree:**
```
Is my task related to hooks?    → YES
Is my task related to timeouts? → YES
Is my task related to npm?      → YES
155 tokens is cheap             → FETCH IT
```
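
That decision can be sketched as a keyword-overlap-plus-cost check. The scoring and threshold here are illustrative, not Claude-Mem's actual logic (in practice the agent reasons about relevance rather than running a fixed rule):

```typescript
// Decide whether an index entry is worth fetching: count keyword
// overlap between the task and the title, then weigh the token cost.
// Scoring and thresholds are illustrative only.
function shouldFetch(taskKeywords: string[], title: string, tokens: number): boolean {
  const lower = title.toLowerCase();
  const hits = taskKeywords.filter((k) => lower.includes(k.toLowerCase())).length;
  const cheap = tokens <= 200;
  return hits >= 2 || (hits >= 1 && cheap);
}

shouldFetch(["hook", "timeout", "npm"], "Hook timeout: 60s too short for npm install", 155);
// → true
```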

---

## The Two-Tier Search Strategy

Claude-Mem implements progressive disclosure in search results too:

### Tier 1: Index Format (Default)

```typescript
search_observations({
  query: "hook timeout",
  format: "index" // Default
})
```

**Returns:**
```
Found 3 observations matching "hook timeout":

| ID | Date | Type | Title | Tokens |
|----|------|------|-------|--------|
| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short | ~155 |
| #2891 | Oct 25 | how-it-works | Hook timeout configuration | ~203 |
| #2102 | Oct 20 | problem-solution | Fixed timeout in CI | ~89 |
```

**Cost:** ~100 tokens for 3 results
**Value:** Agent can scan and decide which to fetch

### Tier 2: Full Format (On-Demand)

```typescript
search_observations({
  query: "hook timeout",
  format: "full",
  limit: 1 // Fetch just the most relevant
})
```

**Returns:**
```
#2543 🔴 Hook timeout: 60s too short for npm install
─────────────────────────────────────────────────
Date: Oct 26, 2025 2:14 PM
Type: gotcha
Project: claude-mem

Narrative:
Discovered that the default 60-second hook timeout is insufficient
for npm install operations, especially with large dependency trees
or slow network conditions. This causes the SessionStart hook to fail
silently, preventing context injection.

Facts:
- Default timeout: 60 seconds
- npm install with cold cache: ~90 seconds
- Configured timeout: 120 seconds in plugin/hooks/hooks.json:25

Files Modified:
- plugin/hooks/hooks.json

Concepts: hooks, timeout, npm, configuration
```

**Cost:** ~155 tokens for full details
**Value:** Complete understanding of the issue
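
The two tiers compose into a scan-then-fetch loop. A minimal sketch, with `searchObservations` standing in for the MCP tool (the real call goes over the MCP protocol, not a local function, and the budget logic is illustrative):

```typescript
// Stand-in types for the MCP search tool; illustrative only.
type IndexEntry = { id: number; title: string; tokens: number };

interface SearchClient {
  searchObservations(query: string, format: "index" | "full", limit?: number): Promise<IndexEntry[] | string>;
}

// Scan-then-fetch: get the cheap index first, fetch full details
// only for entries that fit within the remaining token budget.
async function scanThenFetch(client: SearchClient, query: string, budget: number): Promise<string[]> {
  const index = (await client.searchObservations(query, "index")) as IndexEntry[];
  const details: string[] = [];
  let spent = 0;
  for (const entry of index) {
    if (spent + entry.tokens > budget) continue; // too expensive, skip
    details.push((await client.searchObservations(`#${entry.id}`, "full", 1)) as string);
    spent += entry.tokens;
  }
  return details;
}
```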

---

## Cognitive Load Theory

Progressive disclosure is grounded in **Cognitive Load Theory**:

### Intrinsic Load
The inherent difficulty of the task itself.

**Example:** "Fix authentication bug"
- Must understand the auth system
- Must understand the bug
- Must write the fix

This load is unavoidable.

### Extraneous Load
The cognitive burden of poorly presented information.

**Traditional RAG adds extraneous load:**
- Scanning irrelevant observations
- Filtering out noise
- Remembering what to ignore
- Re-contextualizing after each section

**Progressive disclosure minimizes extraneous load:**
- Scan titles (low effort)
- Fetch only what's relevant (targeted effort)
- Full attention on the current task

### Germane Load
The effort of building mental models and schemas.

**Progressive disclosure supports germane load:**
- Consistent structure (legend, grouping)
- Clear categorization (types, icons)
- Semantic compression (good titles)
- Explicit costs (token counts)

---

## Anti-Patterns to Avoid

### ❌ Verbose Titles

**Bad:**
```
| #2543 | 2:14 PM | 🔴 | Investigation into the issue where hooks time out | ~155 |
```

**Good:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
```

### ❌ Hiding Costs

**Bad:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout issue |
```

**Good:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout issue | ~155 |
```

### ❌ No Retrieval Path

**Bad:**
```
Here are 10 observations. [No instructions on how to get full details]
```

**Good:**
```
Here are 10 observations.
*Use MCP search tools to fetch full observation details on-demand*
```

### ❌ Defaulting to Full Format

**Bad:**
```typescript
search_observations({
  query: "hooks",
  format: "full" // Fetches everything
})
```

**Good:**
```typescript
search_observations({
  query: "hooks",
  format: "index", // Scan first
  limit: 20
})

// Then, if needed:
search_observations({
  query: "hooks",
  format: "full",
  limit: 1 // Just the most relevant
})
```

---

## Key Design Decisions

### Why Token Counts?

**Decision:** Show approximate token counts (~155, ~203) rather than exact counts.

**Rationale:**
- Communicates scale (50 vs 500) without false precision
- Maps to human intuition (small/medium/large)
- Allows the agent to budget attention
- Encourages cost-conscious retrieval

### Why Icons Instead of Text Labels?

**Decision:** Use emoji icons (🔴, 🟡, 🔵) rather than text labels (GOTCHA, PROBLEM, HOWTO).

**Rationale:**
- Visual scanning (pattern recognition)
- Token efficient (1 char vs 10 chars)
- Language-agnostic
- Aesthetically distinct
- Works for both humans and AI

### Why Index-First, Not Smart Pre-Fetch?

**Decision:** Always show the index first, even if we "know" what's relevant.

**Rationale:**
- We can't know what's relevant better than the agent does
- Pre-fetching assumes we understand the task
- The agent knows the current context; we don't
- Respects agent autonomy
- Fails gracefully (the agent can always fetch more)

### Why Group by File Path?

**Decision:** Group observations by file path in addition to date.

**Rationale:**
- Spatial locality: Work on file X likely needs context about file X
- Reduces scanning effort
- Matches how developers think
- Clear semantic boundaries

---

## Measuring Success

Progressive disclosure is working when:

### ✅ Low Waste Ratio
```
Relevant Tokens / Total Context Tokens > 80%
```

Most of the context consumed is actually useful.
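
The metric itself is a simple ratio; the token counts would have to come from session telemetry (assumed inputs):

```typescript
// Compute the relevance ratio from the formula above.
// Inputs are assumed to come from session telemetry.
function relevanceRatio(relevantTokens: number, totalTokens: number): number {
  if (totalTokens === 0) return 0; // no context consumed at all
  return relevantTokens / totalTokens;
}

relevanceRatio(920, 920);    // → 1 (the progressive-disclosure example)
relevanceRatio(2000, 35000); // ≈ 0.057 (the traditional-dump example)
```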

### ✅ Selective Fetching
```
Index Shown: 50 observations
Details Fetched: 2-3 observations
```

The agent is being selective, not fetching everything.

### ✅ Fast Task Completion
```
Session with index: 30 seconds to find relevant context
Session without: 90 seconds scanning all context
```

Time-to-relevant-information is faster.

### ✅ Appropriate Depth
```
Simple task: Only index needed
Medium task: 1-2 observations fetched
Complex task: 5-10 observations + code reads
```

Depth scales with task complexity.

---

## Future Enhancements

### Adaptive Index Size

```typescript
// Vary index size based on session type
// (sketch; the per-source session counts are illustrative)
function adaptiveIndexSize(source: "startup" | "resume" | "compact"): number {
  switch (source) {
    case "startup": return 10; // show last 10 sessions (small index)
    case "resume":  return 1;  // show only current session (micro index)
    case "compact": return 20; // show last 20 sessions (larger index)
  }
}
```

### Relevance Scoring

```typescript
// Use embeddings to pre-sort the index by relevance
search_observations({
  query: "authentication bug",
  format: "index",
  sort: "relevance" // Based on semantic similarity
})
```

### Cost Forecasting

```markdown
💡 **Budget Estimate:**
- Fetching all 🔴 gotchas: ~450 tokens
- Fetching all file-related: ~1,200 tokens
- Fetching everything: ~8,500 tokens
```
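
A budget estimate like this is just a filtered sum over the index. A minimal sketch (the entry shape and sample values are illustrative):

```typescript
// Forecast retrieval cost for a filtered subset of index entries.
// Entry shape and sample values are assumptions for illustration.
type Entry = { type: string; tokens: number };

function forecast(entries: Entry[], filter?: (e: Entry) => boolean): number {
  return entries.filter(filter ?? (() => true)).reduce((sum, e) => sum + e.tokens, 0);
}

const sample: Entry[] = [
  { type: "gotcha", tokens: 155 },
  { type: "gotcha", tokens: 295 },
  { type: "decision", tokens: 120 },
];

forecast(sample, (e) => e.type === "gotcha"); // → 450
forecast(sample);                             // → 570
```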

### Progressive Detail Levels

```
Layer 1: Index (titles only)
Layer 2: Summaries (2-3 sentences)
Layer 3: Full details (complete observation)
Layer 4: Source files (referenced code)
```

---

## Key Takeaways

1. **Show, don't tell**: The index reveals what exists without forcing consumption
2. **Cost-conscious**: Make retrieval costs visible for informed decisions
3. **Agent autonomy**: Let the agent decide what's relevant
4. **Semantic compression**: Good titles make or break the system
5. **Consistent structure**: Patterns reduce cognitive load
6. **Two-tier everything**: Index first, details on-demand
7. **Context as currency**: Spend wisely on high-value information

---

## Remember

> "The best interface is one that disappears when not needed, and appears exactly when it is."

Progressive disclosure respects the agent's intelligence and autonomy. We provide the map; the agent chooses the path.

---

## Further Reading

- [Context Engineering for AI Agents](/docs/context-engineering) - Foundational principles
- [Claude-Mem Architecture](/docs/architecture) - How it all fits together
- Cognitive Load Theory (Sweller, 1988)
- Information Foraging Theory (Pirolli & Card, 1999)
- Progressive Disclosure (Nielsen Norman Group)

---

*This philosophy emerged from real-world usage of Claude-Mem across hundreds of coding sessions. The pattern works because it aligns with both human cognition and LLM attention mechanics.*