# Claude Code Transcript Data Discovery
## Executive Summary
This document details findings from implementing a validated transcript parser for Claude Code JSONL transcripts. The parser enables extraction of rich contextual data that can optimize prompt generation and track token usage for ROI metrics.
## Transcript Structure
### File Location
```
~/.claude/projects/<encoded-project-path>/<session-id>.jsonl
```
Example:
```
~/.claude/projects/-Users-alexnewman-Scripts-claude-mem/2933cff9-f0a7-4f0b-8296-0a030e7658a6.jsonl
```
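Judging from the example, the encoding appears to replace each path separator with a dash. A minimal sketch under that assumption (the actual encoding rules are not documented here):

```typescript
// Sketch: derive the transcript directory name for a project path.
// Assumption: Claude Code encodes the absolute path by replacing
// every '/' with '-' (so the leading '/' becomes a leading '-').
function encodeProjectPath(projectPath: string): string {
  return projectPath.replace(/\//g, '-');
}

const dir = encodeProjectPath('/Users/alexnewman/Scripts/claude-mem');
// => '-Users-alexnewman-Scripts-claude-mem'
```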
### Entry Types
Discovered 6 transcript entry types:
1. **`file-history-snapshot`** (NEW - not in Python model)
- Purpose: Track file state snapshots
- Frequency: ~10 entries per session
2. **`user`** - User messages and tool results
- Contains actual user text messages OR tool result data
- Can have string content or array of ContentItems
3. **`assistant`** - Assistant responses and tool uses
- Contains text responses, tool uses, and thinking blocks
- **Critical**: Contains usage data with token counts
4. **`summary`** (not yet observed in test data)
- Session summaries
5. **`system`** (not yet observed in test data)
- System messages/warnings
6. **`queue-operation`** (not yet observed in test data)
- Queue tracking for message flow
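These entry types lend themselves to a discriminated union on `type`. A simplified sketch of how `src/types/transcript.ts` might model them (fields beyond `type` are illustrative assumptions, not the actual definitions):

```typescript
// Simplified sketch of the transcript entry model. The real
// src/types/transcript.ts is more detailed; fields other than
// `type` here are illustrative assumptions.
type TranscriptEntry =
  | { type: 'file-history-snapshot'; timestamp?: string }
  | { type: 'user'; message?: { content: string | unknown[] } }
  | { type: 'assistant'; message?: { content: unknown[]; usage?: Record<string, number> } }
  | { type: 'summary'; summary?: string }
  | { type: 'system'; content?: string }
  | { type: 'queue-operation'; operation?: string };

// Narrowing on `type` gives type-safe access to each variant.
function isAssistant(e: TranscriptEntry): boolean {
  return e.type === 'assistant';
}
```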
## Key Findings
### 1. Message Extraction Complexity
**Problem**: Naively getting the "last" entry doesn't work because:
- Last user entry might be a tool result, not a text message
- Last assistant entry might only contain tool uses, no text
**Solution**: Iterate backward through entries to find the last entry with actual text content.
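The backward scan can be sketched as follows (entry shapes are simplified assumptions, and the helper name is illustrative rather than the actual `TranscriptParser` API):

```typescript
// Sketch: walk backward to find the last entry of a given role that
// carries real text, skipping tool results and tool-use-only entries.
interface Entry {
  type: string;
  message?: { content: string | Array<{ type: string; text?: string }> };
}

function lastTextMessage(entries: Entry[], role: 'user' | 'assistant'): string | null {
  for (let i = entries.length - 1; i >= 0; i--) {
    const e = entries[i];
    if (e.type !== role || !e.message) continue;
    const c = e.message.content;
    if (typeof c === 'string' && c.trim()) return c; // plain text message
    if (Array.isArray(c)) {
      const text = c.find((item) => item.type === 'text' && item.text?.trim());
      if (text?.text) return text.text; // text block inside a content array
    }
    // Otherwise: tool result or tool-use-only entry — keep scanning.
  }
  return null;
}
```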
### 2. Tool Use Tracking
**Discovery**: Tool uses are in **assistant** messages, not user messages.
**Data Available**:
```typescript
{
  name: string;       // Tool name (e.g., "Bash", "Read", "TodoWrite")
  timestamp: string;  // When the tool was used
  input: any;         // Full tool input parameters
}
```
**Test Session Results** (168 entries):
- 42 tool uses across 7 different tool types
- Most used: Bash (24x), TodoWrite (5x), Edit (4x)
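Aggregating those counts from assistant entries can be sketched as (the content-item shape is a simplified assumption):

```typescript
// Sketch: count tool uses per tool name across assistant entries.
// Tool-use items live in assistant message content arrays.
interface AssistantEntry {
  type: string;
  message?: { content: Array<{ type: string; name?: string }> };
}

function countToolUses(entries: AssistantEntry[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const e of entries) {
    if (e.type !== 'assistant' || !e.message) continue;
    for (const item of e.message.content) {
      if (item.type === 'tool_use' && item.name) {
        counts[item.name] = (counts[item.name] ?? 0) + 1;
      }
    }
  }
  return counts;
}
```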
### 3. Token Usage Data (ROI Foundation)
**Critical Discovery**: Every assistant message contains complete token usage data:
```typescript
interface UsageInfo {
  input_tokens?: number;                 // Total input tokens (includes context)
  cache_creation_input_tokens?: number;  // Tokens used to create cache
  cache_read_input_tokens?: number;      // Cached tokens read (discounted cost)
  output_tokens?: number;                // Model output tokens
}
```
**Test Session Token Analysis**:
```
Input tokens: 858
Output tokens: 44,165
Cache creation tokens: 469,650
Cache read tokens: 5,294,101 ← 5.29M tokens saved by caching!
Total tokens (input + output): 45,023
```
**ROI Implication**: This validates our ROI implementation plan. We can track:
- Discovery cost = sum of all input + output tokens across session
- Context savings = cache_read_input_tokens (tokens NOT paid for in full)
- ROI = Discovery cost / Context savings
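With the `UsageInfo` fields above, these quantities can be computed per session. A sketch using the test-session numbers (the aggregation helper here is hypothetical, not the actual `getTotalTokenUsage` API):

```typescript
// Sketch: aggregate usage across a session's assistant messages and
// compute the ROI ratio as defined above. `sumUsage` is a
// hypothetical helper, not part of the real parser.
interface UsageInfo {
  input_tokens?: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
  output_tokens?: number;
}

function sumUsage(usages: UsageInfo[]) {
  let input = 0, output = 0, cacheRead = 0;
  for (const u of usages) {
    input += u.input_tokens ?? 0;
    output += u.output_tokens ?? 0;
    cacheRead += u.cache_read_input_tokens ?? 0;
  }
  const discoveryCost = input + output;  // tokens paid for in full
  const contextSavings = cacheRead;      // tokens read from cache at a discount
  return { discoveryCost, contextSavings, roi: discoveryCost / contextSavings };
}

// Test-session totals from above:
const { discoveryCost, contextSavings } = sumUsage([
  { input_tokens: 858, output_tokens: 44165, cache_read_input_tokens: 5294101 },
]);
// discoveryCost = 45023, contextSavings = 5294101
```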
### 4. Parse Reliability
**Result**: 0.00% parse failure rate on production transcript with 168 entries.
**Conclusion**: The JSONL format is stable and well-formed. Extensive defensive error handling is unnecessary, though the parser still records any parse failures rather than dropping them silently.
## Implementation Files
### Created Files
1. **`src/types/transcript.ts`** - TypeScript types matching Python Pydantic model
- All entry types, content types, usage info
- Drop-in compatible with Python model structure
2. **`src/utils/transcript-parser.ts`** - Robust transcript parsing class
- Handles all entry types
- Smart message extraction (finds last text message, not just last entry)
- Tool use history extraction
- Token usage aggregation
- Parse statistics and error tracking
3. **`scripts/test-transcript-parser.ts`** - Validation script
- Tests all extraction methods
- Reports parse statistics
- Shows token usage breakdown
- Lists tool use history
### Usage Example
```typescript
import { TranscriptParser } from '../src/utils/transcript-parser.js';

const parser = new TranscriptParser('/path/to/transcript.jsonl');

// Extract messages
const lastUserMsg = parser.getLastUserMessage();
const lastAssistantMsg = parser.getLastAssistantMessage();

// Get tool history
const tools = parser.getToolUseHistory();
// => [{name: 'Bash', timestamp: '...', input: {...}}, ...]

// Get token usage
const tokens = parser.getTotalTokenUsage();
// => {inputTokens: 858, outputTokens: 44165, cacheReadTokens: 5294101, ...}

// Parse statistics
const stats = parser.getParseStats();
// => {totalLines: 168, parsedEntries: 168, failedLines: 0, ...}
```
## Next Steps for PR Review
### Addressing "Drops Unknown Lines" Concern
**Original Issue**: Summary hook silently skipped malformed lines without visibility.
**Root Cause**: We didn't understand the full transcript model. The "skip malformed lines" was a band-aid.
**Solution**: Replace ad-hoc parsing in `summary-hook.ts` with validated `TranscriptParser` class:
**Before** (summary-hook.ts:38-117):
```typescript
// Manually parsing with try/catch, no type safety
for (let i = lines.length - 1; i >= 0; i--) {
  try {
    const line = JSON.parse(lines[i]);
    if (line.type === 'user' && line.message?.content) {
      // ... extraction logic
    }
  } catch (parseError) {
    // Skip malformed lines ← BLACK HOLE
    continue;
  }
}
```
**After** (using TranscriptParser):
```typescript
import { TranscriptParser } from '../utils/transcript-parser.js';
const parser = new TranscriptParser(transcriptPath);
const lastUserMessage = parser.getLastUserMessage();
const lastAssistantMessage = parser.getLastAssistantMessage();
// Parse errors are tracked in parser.getParseErrors()
```
**Benefits**:
1. ✅ Type-safe extraction based on validated model
2. ✅ No silent failures - parse errors are tracked
3. ✅ Smart extraction (finds last TEXT message, not last entry)
4. ✅ Reusable across all hooks and scripts
5. ✅ Enables token usage tracking (ROI metrics)
6. ✅ Enables tool use tracking (prompt optimization)
## Prompt Optimization Opportunities
With rich transcript data available, we can enhance prompts with:
### 1. Tool Use Patterns
- "In this session you've used: Bash (24x), TodoWrite (5x), Edit (4x)"
- Helps Claude understand what kind of work is being done
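Rendering the tool-use history into such a line is straightforward; a sketch (the input shape follows `getToolUseHistory`'s output shown earlier, and the function name is illustrative):

```typescript
// Sketch: render tool-use history into a prompt-friendly line,
// most-used first, e.g. "Bash (24x), TodoWrite (5x), Edit (4x)".
function formatToolUsage(tools: Array<{ name: string }>): string {
  const counts = new Map<string, number>();
  for (const t of tools) counts.set(t.name, (counts.get(t.name) ?? 0) + 1);
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([name, n]) => `${name} (${n}x)`)
    .join(', ');
}

formatToolUsage([{ name: 'Bash' }, { name: 'Bash' }, { name: 'Edit' }]);
// => 'Bash (2x), Edit (1x)'
```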
### 2. Token Economics Awareness
- "Cache read tokens: 5.29M (context savings)"
- Reinforces value of memory system
### 3. Session Flow Understanding
- Number of user/assistant exchanges
- Tools used per exchange
- Session complexity metrics
### 4. File History Snapshots
- Track which files were modified during session
- Provide file change context to summaries
## Testing
Run the validation script:
```bash
# Find your current session transcript
ls -lt ~/.claude/projects/-Users-alexnewman-Scripts-claude-mem/*.jsonl | head -1
# Test the parser
npx tsx scripts/test-transcript-parser.ts <path-to-transcript.jsonl>
```
## Conclusion
The transcript parser implementation:
1. ✅ Addresses PR review concern about dropped lines
2. ✅ Validates the ROI metrics implementation plan
3. ✅ Enables prompt optimization with rich context
4. ✅ Provides foundation for future enhancements
**Recommendation**: Replace ad-hoc transcript parsing in hooks with `TranscriptParser` class for improved reliability and feature richness.