Performance improvements: Token reduction and enhanced summaries (#101)

* refactor: Trim continuation prompt by ~95 lines to reduce token usage

Removed redundant instructions from continuation prompt that were originally
added to mitigate a session continuity issue. That issue has since been
resolved, making these detailed instructions unnecessary on every continuation.

Changes:
- Reduced continuation prompt from ~106 lines to ~11 lines (~95 line reduction)
- Changed "User's Goal:" to "Next Prompt in Session:" (more accurate framing)
- Removed redundant WHAT TO RECORD, WHEN TO SKIP, and OUTPUT FORMAT sections
- Kept concise reminder: "Continue generating observations and progress summaries..."
- Initial prompt still contains all detailed instructions

Impact:
- Significant token savings on every continuation prompt
- Faster context injection with no loss of functionality
- Instructions remain comprehensive in initial prompt

Files modified:
- src/sdk/prompts.ts (buildContinuationPrompt function)
- plugin/scripts/worker-service.cjs (compiled output)
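
The commit names `buildContinuationPrompt` in src/sdk/prompts.ts as the trimmed function. A minimal sketch of the shortened shape described above (the actual prompt wording and signature are not in this commit message, so both are illustrative):

```typescript
// Illustrative sketch only: the real text lives in src/sdk/prompts.ts.
// The point is the structure: a short header, the next user prompt, and a
// one-line reminder, instead of the former ~106-line instruction block.
function buildContinuationPrompt(nextUserPrompt: string, promptNumber: number): string {
  return [
    `Next Prompt in Session (#${promptNumber}):`,
    nextUserPrompt,
    '',
    'Continue generating observations and progress summaries as instructed',
    'in the initial prompt.',
  ].join('\n');
}
```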

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor: Enhance observation and summary prompts for clarity and token efficiency

* Enhance prompt clarity and instructions in prompts.ts

- Added a reminder to think about instructions before starting work.
- Simplified the continuation prompt instruction by removing "for this ongoing session."

* feat: Enhance settings.json with permissions and deny access to sensitive files

refactor: Remove PLAN-full-observation-display.md and PR_SUMMARY.md as they are no longer needed

chore: Delete SECURITY_SUMMARY.md since it is redundant after recent changes

fix: Update worker-service.cjs to streamline observation generation instructions

cleanup: Remove src-analysis.md and src-tree.md for a cleaner codebase

refactor: Modify prompts.ts to clarify instructions for memory processing

* refactor: Remove legacy worker service implementation

* feat: Enhance summary hook to extract last assistant message and improve logging

- Added function to extract the last assistant message from the transcript.
- Updated summary hook to include last assistant message in the summary request.
- Modified SDKSession interface to store last assistant message.
- Adjusted buildSummaryPrompt to utilize last assistant message for generating summaries.
- Updated worker service and session manager to handle last assistant message in summarize requests.
- Introduced silentDebug utility for improved logging and diagnostics throughout the summary process.

* docs: Add comprehensive implementation plan for ROI metrics feature

Added detailed implementation plan covering:
- Token usage capture from Agent SDK
- Database schema changes (migration #8)
- Discovery cost tracking per observation
- Context hook display with ROI metrics
- Testing and rollout strategy

Timeline: ~20 hours over 4 days
Goal: Empirical data for YC application amendment
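
The plan's core arithmetic can be sketched with hypothetical names (none of these types exist in the codebase yet; migration #8 would add the columns backing them):

```typescript
// Hypothetical ROI model from the plan: a fact costs tokens to discover
// once, then saves tokens every time it is injected from memory instead
// of being rediscovered.
interface ObservationCost {
  discoveryTokens: number;  // tokens spent when the observation was created
  injectionTokens: number;  // tokens the observation occupies when injected
  timesInjected: number;    // reuse count across later sessions
}

function roi(obs: ObservationCost): number {
  if (obs.discoveryTokens === 0) return 0;
  const saved = obs.timesInjected * (obs.discoveryTokens - obs.injectionTokens);
  return saved / obs.discoveryTokens;
}
```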

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* feat: Add transcript processing scripts for analysis and formatting

- Implemented `dump-transcript-readable.ts` to generate a readable markdown dump of transcripts, excluding certain entry types.
- Created `extract-rich-context-examples.ts` to extract and showcase rich context examples from transcripts, highlighting user requests and assistant reasoning.
- Developed `format-transcript-context.ts` to format transcript context into a structured markdown format for improved observation generation.
- Added `test-transcript-parser.ts` for validating data extraction from transcript JSONL files, including statistics and error reporting.
- Introduced `transcript-to-markdown.ts` for a complete representation of transcript data in markdown format, showing all context data.
- Enhanced type definitions in `transcript.ts` to support new features and ensure type safety.
- Built `transcript-parser.ts` to handle parsing of transcript JSONL files, including error handling and data extraction methods.

* Refactor hooks and SDKAgent for improved observation handling

- Updated `new-hook.ts` to clean user prompts by stripping leading slashes for better semantic clarity.
- Enhanced `save-hook.ts` to include additional tools in the SKIP_TOOLS set, preventing unnecessary observations from certain command invocations.
- Modified `prompts.ts` to change the structure of observation prompts, emphasizing the observational role and providing a detailed XML output format for observations.
- Adjusted `SDKAgent.ts` to enforce stricter tool usage restrictions, ensuring the memory agent operates solely as an observer without any tool access.
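
The skip logic described for save-hook.ts might look like the following sketch (the specific tool names in SKIP_TOOLS here are placeholders, not the actual set):

```typescript
// Sketch of the save-hook filter: bookkeeping tools and MCP tool
// invocations should not produce observations. Actual contents of
// SKIP_TOOLS in save-hook.ts differ.
const SKIP_TOOLS = new Set(['TodoWrite', 'AskUserQuestion']);

function shouldObserve(toolName: string): boolean {
  return !SKIP_TOOLS.has(toolName) && !toolName.startsWith('mcp__');
}
```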

* feat: Enhance session initialization to accept user prompts and prompt numbers

- Updated `handleSessionInit` in `worker-service.ts` to extract `userPrompt` and `promptNumber` from the request body and pass them to `initializeSession`.
- Modified `initializeSession` in `SessionManager.ts` to handle optional `currentUserPrompt` and `promptNumber` parameters.
- Added logic to update the existing session's `userPrompt` and `lastPromptNumber` if a `currentUserPrompt` is provided.
- Implemented debug logging for session initialization and updates to track user prompts and prompt numbers.
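
Taken together, the bullets above imply a flow like this sketch (the Session shape and in-memory storage are simplified assumptions; the real SessionManager differs):

```typescript
// Simplified sketch of initializeSession with optional prompt data:
// reuse an existing session and update its prompt fields when provided.
interface Session {
  id: string;
  userPrompt?: string;
  lastPromptNumber?: number;
}

const sessions = new Map<string, Session>();

function initializeSession(id: string, currentUserPrompt?: string, promptNumber?: number): Session {
  let session = sessions.get(id);
  if (!session) {
    session = { id };
    sessions.set(id, session);
  }
  if (currentUserPrompt !== undefined) {
    session.userPrompt = currentUserPrompt;
    session.lastPromptNumber = promptNumber;
  }
  return session;
}
```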

---------

Co-authored-by: Claude <noreply@anthropic.com>
Author: Alex Newman
Date: 2025-11-13 18:22:44 -05:00
Committed by: GitHub
Parent: ab5d78717f
Commit: 68290a9121
39 changed files with 4584 additions and 2809 deletions
scripts/debug-transcript-structure.ts (+113)
@@ -0,0 +1,113 @@
#!/usr/bin/env tsx
/**
* Debug Transcript Structure
* Examines the first few entries to understand the conversation flow
*/
import { TranscriptParser } from '../src/utils/transcript-parser.js';
const transcriptPath = process.argv[2];
if (!transcriptPath) {
console.error('Usage: tsx scripts/debug-transcript-structure.ts <path-to-transcript.jsonl>');
process.exit(1);
}
const parser = new TranscriptParser(transcriptPath);
const entries = parser.getAllEntries();
console.log(`Total entries: ${entries.length}\n`);
// Count entry types
const typeCounts: Record<string, number> = {};
for (const entry of entries) {
typeCounts[entry.type] = (typeCounts[entry.type] || 0) + 1;
}
console.log('Entry types:');
for (const [type, count] of Object.entries(typeCounts)) {
console.log(` ${type}: ${count}`);
}
// Find first user and assistant entries
const firstUser = entries.find(e => e.type === 'user');
const firstAssistant = entries.find(e => e.type === 'assistant');
if (firstUser) {
const userIndex = entries.indexOf(firstUser);
console.log(`\n\n=== First User Entry (index ${userIndex}) ===`);
console.log(`Timestamp: ${firstUser.timestamp}`);
if (typeof firstUser.message.content === 'string') {
console.log(`Content (string): ${firstUser.message.content.substring(0, 200)}...`);
} else if (Array.isArray(firstUser.message.content)) {
console.log(`Content blocks: ${firstUser.message.content.length}`);
for (const block of firstUser.message.content) {
if (block.type === 'text') {
console.log(` - text: ${(block as any).text?.substring(0, 200)}...`);
} else {
console.log(` - ${block.type}`);
}
}
}
}
if (firstAssistant) {
const assistantIndex = entries.indexOf(firstAssistant);
console.log(`\n\n=== First Assistant Entry (index ${assistantIndex}) ===`);
console.log(`Timestamp: ${firstAssistant.timestamp}`);
if (Array.isArray(firstAssistant.message.content)) {
console.log(`Content blocks: ${firstAssistant.message.content.length}`);
for (const block of firstAssistant.message.content) {
if (block.type === 'text') {
console.log(` - text: ${(block as any).text?.substring(0, 200)}...`);
} else if (block.type === 'thinking') {
console.log(` - thinking: ${(block as any).thinking?.substring(0, 200)}...`);
} else if (block.type === 'tool_use') {
console.log(` - tool_use: ${(block as any).name}`);
}
}
}
}
// Find a few more user/assistant pairs
console.log('\n\n=== First 3 Conversation Exchanges ===\n');
let userCount = 0;
let assistantCount = 0;
let exchangeNum = 0;
for (const entry of entries) {
if (entry.type === 'user') {
userCount++;
if (userCount <= 3) {
exchangeNum++;
console.log(`\n--- Exchange ${exchangeNum}: USER ---`);
if (typeof entry.message.content === 'string') {
console.log(entry.message.content.substring(0, 150) + (entry.message.content.length > 150 ? '...' : ''));
} else if (Array.isArray(entry.message.content)) {
const textBlock = entry.message.content.find((b: any) => b.type === 'text');
if (textBlock) {
const text = (textBlock as any).text || '';
console.log(text.substring(0, 150) + (text.length > 150 ? '...' : ''));
}
}
}
} else if (entry.type === 'assistant' && userCount <= 3) {
assistantCount++;
if (Array.isArray(entry.message.content)) {
const textBlock = entry.message.content.find((b: any) => b.type === 'text');
const toolUses = entry.message.content.filter((b: any) => b.type === 'tool_use');
console.log(`\n--- Exchange ${exchangeNum}: ASSISTANT ---`);
if (textBlock) {
const text = (textBlock as any).text || '';
console.log(text.substring(0, 150) + (text.length > 150 ? '...' : ''));
}
if (toolUses.length > 0) {
console.log(`\nTools used: ${toolUses.map((t: any) => t.name).join(', ')}`);
}
}
}
if (userCount >= 3 && assistantCount >= 3) break;
}
scripts/dump-transcript-readable.ts (+99)
@@ -0,0 +1,99 @@
#!/usr/bin/env tsx
/**
* Simple 1:1 transcript dump in readable markdown format
* Shows exactly what's in the transcript, chronologically
*/
import { TranscriptParser } from '../src/utils/transcript-parser.js';
import { writeFileSync } from 'fs';
const transcriptPath = process.argv[2];
if (!transcriptPath) {
console.error('Usage: tsx scripts/dump-transcript-readable.ts <path-to-transcript.jsonl>');
process.exit(1);
}
const parser = new TranscriptParser(transcriptPath);
const entries = parser.getAllEntries();
let output = '# Transcript Dump\n\n';
output += `Total entries: ${entries.length}\n\n`;
output += '---\n\n';
let entryNum = 0;
for (const entry of entries) {
// Skip file-history-snapshot and summary entries before counting,
// so entry numbering and the 20-entry limit track conversation entries only
if (entry.type === 'file-history-snapshot' || entry.type === 'summary') continue;
entryNum++;
output += `## Entry ${entryNum}: ${entry.type.toUpperCase()}\n`;
output += `**Timestamp:** ${entry.timestamp}\n\n`;
if (entry.type === 'user') {
const content = entry.message.content;
if (typeof content === 'string') {
output += `**Content:**\n\`\`\`\n${content}\n\`\`\`\n\n`;
} else if (Array.isArray(content)) {
for (const block of content) {
if (block.type === 'text') {
output += `**Text:**\n\`\`\`\n${(block as any).text}\n\`\`\`\n\n`;
} else if (block.type === 'tool_result') {
output += `**Tool Result (${(block as any).tool_use_id}):**\n`;
const resultContent = (block as any).content;
if (typeof resultContent === 'string') {
const preview = resultContent.substring(0, 500);
output += `\`\`\`\n${preview}${resultContent.length > 500 ? '\n...(truncated)' : ''}\n\`\`\`\n\n`;
} else {
output += `\`\`\`json\n${JSON.stringify(resultContent, null, 2).substring(0, 500)}\n\`\`\`\n\n`;
}
}
}
}
}
if (entry.type === 'assistant') {
const content = entry.message.content;
if (Array.isArray(content)) {
for (const block of content) {
if (block.type === 'text') {
output += `**Text:**\n\`\`\`\n${(block as any).text}\n\`\`\`\n\n`;
} else if (block.type === 'thinking') {
output += `**Thinking:**\n\`\`\`\n${(block as any).thinking}\n\`\`\`\n\n`;
} else if (block.type === 'tool_use') {
const tool = block as any;
output += `**Tool Use: ${tool.name}**\n`;
output += `\`\`\`json\n${JSON.stringify(tool.input, null, 2)}\n\`\`\`\n\n`;
}
}
}
// Show token usage if available
const usage = entry.message.usage;
if (usage) {
output += `**Usage:**\n`;
output += `- Input: ${usage.input_tokens || 0}\n`;
output += `- Output: ${usage.output_tokens || 0}\n`;
output += `- Cache creation: ${usage.cache_creation_input_tokens || 0}\n`;
output += `- Cache read: ${usage.cache_read_input_tokens || 0}\n\n`;
}
}
output += '---\n\n';
// Limit to first 20 entries to keep file manageable
if (entryNum >= 20) {
output += `\n_Remaining ${entries.length - 20} entries omitted for brevity_\n`;
break;
}
}
const outputPath = '/Users/alexnewman/Scripts/claude-mem/docs/context/transcript-dump.md';
writeFileSync(outputPath, output, 'utf-8');
console.log(`\nTranscript dumped to: ${outputPath}`);
console.log(`Showing first 20 conversation entries (skipped file-history-snapshot and summary types)\n`);
scripts/extract-rich-context-examples.ts (+177)
@@ -0,0 +1,177 @@
#!/usr/bin/env tsx
/**
* Extract Rich Context Examples
* Shows what data we have available for memory worker using TranscriptParser API
*/
import { TranscriptParser } from '../src/utils/transcript-parser.js';
import { writeFileSync } from 'fs';
import type { AssistantTranscriptEntry, UserTranscriptEntry } from '../src/types/transcript.js';
const transcriptPath = process.argv[2];
if (!transcriptPath) {
console.error('Usage: tsx scripts/extract-rich-context-examples.ts <path-to-transcript.jsonl>');
process.exit(1);
}
const parser = new TranscriptParser(transcriptPath);
let output = '# Rich Context Examples\n\n';
output += 'This document shows what contextual data is available in transcripts\n';
output += 'that could improve observation generation quality.\n\n';
// Get stats using parser API
const stats = parser.getParseStats();
const tokens = parser.getTotalTokenUsage();
output += `## Statistics\n\n`;
output += `- Total entries: ${stats.parsedEntries}\n`;
output += `- User messages: ${stats.entriesByType['user'] || 0}\n`;
output += `- Assistant messages: ${stats.entriesByType['assistant'] || 0}\n`;
output += `- Token usage: ${(tokens.inputTokens + tokens.outputTokens).toLocaleString()} total\n`;
output += `- Cache efficiency: ${tokens.cacheReadTokens.toLocaleString()} tokens read from cache\n\n`;
// Extract conversation pairs with tool uses
const assistantEntries = parser.getAssistantEntries();
const userEntries = parser.getUserEntries();
output += `## Conversation Flow\n\n`;
output += `This shows how user requests, assistant reasoning, and tool executions flow together.\n`;
output += `This is the rich context currently missing from individual tool observations.\n\n`;
let examplesFound = 0;
const maxExamples = 5;
// Match assistant entries with their preceding user message
for (let i = 0; i < assistantEntries.length && examplesFound < maxExamples; i++) {
const assistantEntry = assistantEntries[i];
const content = assistantEntry.message.content;
if (!Array.isArray(content)) continue;
// Extract components from assistant message
const textBlocks = content.filter((c: any) => c.type === 'text');
const thinkingBlocks = content.filter((c: any) => c.type === 'thinking');
const toolUseBlocks = content.filter((c: any) => c.type === 'tool_use');
// Skip if no tools or only MCP tools
const regularTools = toolUseBlocks.filter((t: any) =>
!t.name.startsWith('mcp__')
);
if (regularTools.length === 0) continue;
// Find the user message that preceded this assistant response
let userMessage = '';
const assistantTimestamp = new Date(assistantEntry.timestamp).getTime();
for (const userEntry of userEntries) {
const userTimestamp = new Date(userEntry.timestamp).getTime();
if (userTimestamp < assistantTimestamp) {
// Extract user text using parser's helper
const extractText = (content: any): string => {
if (typeof content === 'string') return content;
if (Array.isArray(content)) {
return content
.filter((c: any) => c.type === 'text')
.map((c: any) => c.text)
.join('\n');
}
return '';
};
const text = extractText(userEntry.message.content);
if (text.trim()) {
userMessage = text;
}
}
}
examplesFound++;
output += `---\n\n`;
output += `### Example ${examplesFound}\n\n`;
// 1. User Request
if (userMessage) {
output += `#### 👤 User Request\n`;
const preview = userMessage.substring(0, 400);
output += `\`\`\`\n${preview}${userMessage.length > 400 ? '\n...(truncated)' : ''}\n\`\`\`\n\n`;
}
// 2. Assistant's Explanation (what it plans to do)
if (textBlocks.length > 0) {
const text = textBlocks.map((b: any) => b.text).join('\n');
output += `#### 🤖 Assistant's Plan\n`;
const preview = text.substring(0, 400);
output += `\`\`\`\n${preview}${text.length > 400 ? '\n...(truncated)' : ''}\n\`\`\`\n\n`;
}
// 3. Internal Reasoning (thinking)
if (thinkingBlocks.length > 0) {
const thinking = thinkingBlocks.map((b: any) => b.thinking).join('\n');
output += `#### 💭 Internal Reasoning\n`;
const preview = thinking.substring(0, 300);
output += `\`\`\`\n${preview}${thinking.length > 300 ? '\n...(truncated)' : ''}\n\`\`\`\n\n`;
}
// 4. Tool Executions
output += `#### 🔧 Tools Executed (${regularTools.length})\n\n`;
for (const tool of regularTools) {
const toolData = tool as any;
output += `**${toolData.name}**\n`;
// Show relevant input fields
const input = toolData.input;
if (toolData.name === 'Read') {
output += `- Reading: \`${input.file_path}\`\n`;
} else if (toolData.name === 'Write') {
output += `- Writing: \`${input.file_path}\` (${input.content?.length || 0} chars)\n`;
} else if (toolData.name === 'Edit') {
output += `- Editing: \`${input.file_path}\`\n`;
} else if (toolData.name === 'Bash') {
output += `- Command: \`${input.command}\`\n`;
} else if (toolData.name === 'Glob') {
output += `- Pattern: \`${input.pattern}\`\n`;
} else if (toolData.name === 'Grep') {
output += `- Searching for: \`${input.pattern}\`\n`;
} else {
output += `\`\`\`json\n${JSON.stringify(input, null, 2).substring(0, 200)}\n\`\`\`\n`;
}
}
output += `\n`;
// Summary of what data is available
output += `**📊 Data Available for This Exchange:**\n`;
output += `- User intent: ✅ (${userMessage.length} chars)\n`;
output += `- Assistant reasoning: ✅ (${textBlocks.reduce((sum, b: any) => sum + b.text.length, 0)} chars)\n`;
output += `- Thinking process: ${thinkingBlocks.length > 0 ? '✅' : '❌'} ${thinkingBlocks.length > 0 ? `(${thinkingBlocks.reduce((sum, b: any) => sum + b.thinking.length, 0)} chars)` : ''}\n`;
output += `- Tool executions: ✅ (${regularTools.length} tools)\n`;
output += `- **Currently sent to memory worker:** Tool inputs/outputs only (no context!) ❌\n\n`;
}
output += `\n---\n\n`;
output += `## Key Insight\n\n`;
output += `Currently, the memory worker receives **isolated tool executions** via save-hook:\n`;
output += `- tool_name: "Read"\n`;
output += `- tool_input: {"file_path": "src/foo.ts"}\n`;
output += `- tool_output: {file contents}\n\n`;
output += `But the transcript contains **rich contextual data**:\n`;
output += `- WHY the tool was used (user's request)\n`;
output += `- WHAT the assistant planned to accomplish\n`;
output += `- HOW it fits into the broader task\n`;
output += `- The assistant's reasoning/thinking\n`;
output += `- Multiple related tools used together\n\n`;
output += `This context would help the memory worker:\n`;
output += `1. Understand if a tool use is meaningful or routine\n`;
output += `2. Generate observations that capture WHY, not just WHAT\n`;
output += `3. Group related tools into coherent actions\n`;
output += `4. Avoid "investigating" - the context is already present\n\n`;
// Write to file
const outputPath = '/Users/alexnewman/Scripts/claude-mem/docs/context/rich-context-examples.md';
writeFileSync(outputPath, output, 'utf-8');
console.log(`\nExtracted ${examplesFound} examples with rich context`);
console.log(`Written to: ${outputPath}\n`);
console.log(`This shows the gap between what's available (rich context) and what's sent (isolated tools)\n`);
scripts/format-transcript-context.ts (+240)
@@ -0,0 +1,240 @@
#!/usr/bin/env tsx
/**
* Format Transcript Context
*
* Parses a Claude Code transcript and formats it to show rich contextual data
* that could be used for improved observation generation.
*/
import { TranscriptParser } from '../src/utils/transcript-parser.js';
import { writeFileSync } from 'fs';
import { basename } from 'path';
interface ConversationTurn {
turnNumber: number;
userMessage?: {
content: string;
timestamp: string;
};
assistantMessage?: {
textContent: string;
thinkingContent?: string;
toolUses: Array<{
name: string;
input: any;
timestamp: string;
}>;
timestamp: string;
};
toolResults?: Array<{
toolName: string;
result: any;
timestamp: string;
}>;
}
function extractConversationTurns(parser: TranscriptParser): ConversationTurn[] {
const entries = parser.getAllEntries();
const turns: ConversationTurn[] = [];
let currentTurn: ConversationTurn | null = null;
let turnNumber = 0;
for (const entry of entries) {
// User messages start a new turn
if (entry.type === 'user') {
// If previous turn exists, push it
if (currentTurn) {
turns.push(currentTurn);
}
// Start new turn
turnNumber++;
currentTurn = {
turnNumber,
toolResults: []
};
// Extract user text (skip tool results)
if (typeof entry.message.content === 'string') {
currentTurn.userMessage = {
content: entry.message.content,
timestamp: entry.timestamp
};
} else if (Array.isArray(entry.message.content)) {
const textContent = entry.message.content
.filter((c: any) => c.type === 'text')
.map((c: any) => c.text)
.join('\n');
if (textContent.trim()) {
currentTurn.userMessage = {
content: textContent,
timestamp: entry.timestamp
};
}
// Extract tool results
const toolResults = entry.message.content.filter((c: any) => c.type === 'tool_result');
for (const result of toolResults) {
currentTurn.toolResults!.push({
toolName: result.tool_use_id || 'unknown',
result: result.content,
timestamp: entry.timestamp
});
}
}
}
// Assistant messages
if (entry.type === 'assistant' && currentTurn) {
if (!Array.isArray(entry.message.content)) continue;
const textBlocks = entry.message.content.filter((c: any) => c.type === 'text');
const thinkingBlocks = entry.message.content.filter((c: any) => c.type === 'thinking');
const toolUseBlocks = entry.message.content.filter((c: any) => c.type === 'tool_use');
currentTurn.assistantMessage = {
textContent: textBlocks.map((c: any) => c.text).join('\n'),
thinkingContent: thinkingBlocks.map((c: any) => c.thinking).join('\n'),
toolUses: toolUseBlocks.map((t: any) => ({
name: t.name,
input: t.input,
timestamp: entry.timestamp
})),
timestamp: entry.timestamp
};
}
}
// Push last turn
if (currentTurn) {
turns.push(currentTurn);
}
return turns;
}
function formatTurnToMarkdown(turn: ConversationTurn): string {
let md = '';
md += `## Turn ${turn.turnNumber}\n\n`;
// User message
if (turn.userMessage) {
md += `### 👤 User Request\n`;
md += `**Time:** ${new Date(turn.userMessage.timestamp).toLocaleString()}\n\n`;
md += '```\n';
md += turn.userMessage.content.substring(0, 500);
if (turn.userMessage.content.length > 500) {
md += '\n... (truncated)';
}
md += '\n```\n\n';
}
// Assistant response
if (turn.assistantMessage) {
md += `### 🤖 Assistant Response\n`;
md += `**Time:** ${new Date(turn.assistantMessage.timestamp).toLocaleString()}\n\n`;
// Text content
if (turn.assistantMessage.textContent.trim()) {
md += '**Response:**\n```\n';
md += turn.assistantMessage.textContent.substring(0, 500);
if (turn.assistantMessage.textContent.length > 500) {
md += '\n... (truncated)';
}
md += '\n```\n\n';
}
// Thinking
if (turn.assistantMessage.thinkingContent?.trim()) {
md += '**Thinking:**\n```\n';
md += turn.assistantMessage.thinkingContent.substring(0, 300);
if (turn.assistantMessage.thinkingContent.length > 300) {
md += '\n... (truncated)';
}
md += '\n```\n\n';
}
// Tool uses
if (turn.assistantMessage.toolUses.length > 0) {
md += `**Tools Used:** ${turn.assistantMessage.toolUses.length}\n\n`;
for (const tool of turn.assistantMessage.toolUses) {
md += `- **${tool.name}**\n`;
md += ` \`\`\`json\n`;
const inputStr = JSON.stringify(tool.input, null, 2);
md += inputStr.substring(0, 200);
if (inputStr.length > 200) {
md += '\n ... (truncated)';
}
md += '\n ```\n';
}
md += '\n';
}
}
// Tool results summary
if (turn.toolResults && turn.toolResults.length > 0) {
md += `**Tool Results:** ${turn.toolResults.length} results received\n\n`;
}
md += '---\n\n';
return md;
}
function formatTranscriptToMarkdown(transcriptPath: string): string {
const parser = new TranscriptParser(transcriptPath);
const turns = extractConversationTurns(parser);
const stats = parser.getParseStats();
const tokens = parser.getTotalTokenUsage();
let md = `# Transcript Context Analysis\n\n`;
md += `**File:** ${basename(transcriptPath)}\n`;
md += `**Parsed:** ${new Date().toLocaleString()}\n\n`;
md += `## Statistics\n\n`;
md += `- Total entries: ${stats.totalLines}\n`;
md += `- Successfully parsed: ${stats.parsedEntries}\n`;
md += `- Failed lines: ${stats.failedLines}\n`;
md += `- Conversation turns: ${turns.length}\n\n`;
md += `## Token Usage\n\n`;
md += `- Input tokens: ${tokens.inputTokens.toLocaleString()}\n`;
md += `- Output tokens: ${tokens.outputTokens.toLocaleString()}\n`;
md += `- Cache creation: ${tokens.cacheCreationTokens.toLocaleString()}\n`;
md += `- Cache read: ${tokens.cacheReadTokens.toLocaleString()}\n`;
const totalTokens = tokens.inputTokens + tokens.outputTokens;
md += `- Total: ${totalTokens.toLocaleString()}\n\n`;
md += `---\n\n`;
md += `# Conversation Turns\n\n`;
// Format each turn
for (const turn of turns.slice(0, 20)) { // Limit to first 20 turns for readability
md += formatTurnToMarkdown(turn);
}
if (turns.length > 20) {
md += `\n_... ${turns.length - 20} more turns omitted for brevity_\n`;
}
return md;
}
// Main execution
const transcriptPath = process.argv[2];
if (!transcriptPath) {
console.error('Usage: tsx scripts/format-transcript-context.ts <path-to-transcript.jsonl>');
process.exit(1);
}
console.log(`Parsing transcript: ${transcriptPath}`);
const markdown = formatTranscriptToMarkdown(transcriptPath);
const outputPath = transcriptPath.replace('.jsonl', '-formatted.md');
writeFileSync(outputPath, markdown, 'utf-8');
console.log(`\nFormatted transcript written to: ${outputPath}`);
console.log(`\nOpen with: cat "${outputPath}"\n`);
scripts/test-transcript-parser.ts (+167)
@@ -0,0 +1,167 @@
#!/usr/bin/env tsx
/**
* Test script for TranscriptParser
* Validates data extraction from Claude Code transcript JSONL files
*
* Usage: npx tsx scripts/test-transcript-parser.ts <path-to-transcript.jsonl>
*/
import { TranscriptParser } from '../src/utils/transcript-parser.js';
import { existsSync } from 'fs';
import { resolve } from 'path';
function formatTokens(num: number): string {
return num.toLocaleString();
}
function formatPercentage(num: number): string {
return `${(num * 100).toFixed(2)}%`;
}
function main() {
const args = process.argv.slice(2);
if (args.length === 0) {
console.error('Usage: npx tsx scripts/test-transcript-parser.ts <path-to-transcript.jsonl>');
console.error('\nExample: npx tsx scripts/test-transcript-parser.ts ~/.cache/claude-code/transcripts/latest.jsonl');
process.exit(1);
}
const transcriptPath = resolve(args[0]);
if (!existsSync(transcriptPath)) {
console.error(`Error: Transcript file not found: ${transcriptPath}`);
process.exit(1);
}
console.log(`\n🔍 Parsing transcript: ${transcriptPath}\n`);
try {
const parser = new TranscriptParser(transcriptPath);
// Get parse statistics
const stats = parser.getParseStats();
console.log('📊 Parse Statistics:');
console.log('─'.repeat(60));
console.log(`Total lines: ${stats.totalLines}`);
console.log(`Parsed entries: ${stats.parsedEntries}`);
console.log(`Failed lines: ${stats.failedLines}`);
console.log(`Failure rate: ${formatPercentage(stats.failureRate)}`);
console.log();
console.log('📋 Entries by Type:');
console.log('─'.repeat(60));
for (const [type, count] of Object.entries(stats.entriesByType)) {
console.log(` ${type.padEnd(20)} ${count}`);
}
console.log();
// Show parse errors if any
if (stats.failedLines > 0) {
console.log('❌ Parse Errors:');
console.log('─'.repeat(60));
const errors = parser.getParseErrors();
errors.slice(0, 5).forEach(err => {
console.log(` Line ${err.lineNumber}: ${err.error}`);
});
if (errors.length > 5) {
console.log(` ... and ${errors.length - 5} more errors`);
}
console.log();
}
// Test data extraction methods
console.log('💬 Message Extraction:');
console.log('─'.repeat(60));
const lastUserMessage = parser.getLastUserMessage();
console.log(`Last user message: ${lastUserMessage ? `"${lastUserMessage.substring(0, 100)}..."` : '(none)'}`);
console.log();
const lastAssistantMessage = parser.getLastAssistantMessage();
console.log(`Last assistant message: ${lastAssistantMessage ? `"${lastAssistantMessage.substring(0, 100)}..."` : '(none)'}`);
console.log();
// Token usage
const tokenUsage = parser.getTotalTokenUsage();
console.log('💰 Token Usage:');
console.log('─'.repeat(60));
console.log(`Input tokens: ${formatTokens(tokenUsage.inputTokens)}`);
console.log(`Output tokens: ${formatTokens(tokenUsage.outputTokens)}`);
console.log(`Cache creation tokens: ${formatTokens(tokenUsage.cacheCreationTokens)}`);
console.log(`Cache read tokens: ${formatTokens(tokenUsage.cacheReadTokens)}`);
console.log(`Total tokens: ${formatTokens(tokenUsage.inputTokens + tokenUsage.outputTokens)}`);
console.log();
// Tool use history
const toolUses = parser.getToolUseHistory();
console.log('🔧 Tool Use History:');
console.log('─'.repeat(60));
if (toolUses.length > 0) {
console.log(`Total tool uses: ${toolUses.length}\n`);
// Group by tool name
const toolCounts = toolUses.reduce((acc, tool) => {
acc[tool.name] = (acc[tool.name] || 0) + 1;
return acc;
}, {} as Record<string, number>);
console.log('Tools used:');
for (const [name, count] of Object.entries(toolCounts).sort((a, b) => b[1] - a[1])) {
console.log(` ${name.padEnd(30)} ${count}x`);
}
} else {
console.log('(no tool uses found)');
}
console.log();
// System entries
const systemEntries = parser.getSystemEntries();
if (systemEntries.length > 0) {
console.log('⚠️ System Entries:');
console.log('─'.repeat(60));
console.log(`Found ${systemEntries.length} system entries`);
systemEntries.slice(0, 3).forEach(entry => {
console.log(` [${entry.level || 'info'}] ${entry.content.substring(0, 80)}...`);
});
if (systemEntries.length > 3) {
console.log(` ... and ${systemEntries.length - 3} more`);
}
console.log();
}
// Summary entries
const summaryEntries = parser.getSummaryEntries();
if (summaryEntries.length > 0) {
console.log('📝 Summary Entries:');
console.log('─'.repeat(60));
console.log(`Found ${summaryEntries.length} summary entries`);
summaryEntries.forEach((entry, i) => {
console.log(`\nSummary ${i + 1}:`);
console.log(entry.summary.substring(0, 200) + '...');
});
console.log();
}
// Queue operations
const queueOps = parser.getQueueOperationEntries();
if (queueOps.length > 0) {
console.log('🔄 Queue Operations:');
console.log('─'.repeat(60));
const enqueues = queueOps.filter(op => op.operation === 'enqueue').length;
const dequeues = queueOps.filter(op => op.operation === 'dequeue').length;
console.log(`Enqueue operations: ${enqueues}`);
console.log(`Dequeue operations: ${dequeues}`);
console.log();
}
console.log('✅ Validation complete!\n');
} catch (error) {
console.error('❌ Error parsing transcript:', error);
process.exit(1);
}
}
main();
scripts/transcript-to-markdown.ts (+209)
@@ -0,0 +1,209 @@
#!/usr/bin/env tsx
/**
* Transcript to Markdown - Complete 1:1 representation
* Shows ALL available context data from a Claude Code transcript
*/
import { TranscriptParser } from '../src/utils/transcript-parser.js';
import type { UserTranscriptEntry, AssistantTranscriptEntry, ToolResultContent } from '../src/types/transcript.js';
import { writeFileSync } from 'fs';
import { basename } from 'path';
const transcriptPath = process.argv[2];
const maxTurns = process.argv[3] ? parseInt(process.argv[3], 10) : 20;
if (!transcriptPath) {
console.error('Usage: tsx scripts/transcript-to-markdown.ts <path-to-transcript.jsonl> [max-turns]');
process.exit(1);
}
/**
* Truncate string to max length, adding ellipsis if needed
*/
function truncate(str: string, maxLen: number = 500): string {
if (str.length <= maxLen) return str;
return str.substring(0, maxLen) + '\n... [truncated]';
}
/**
* Format tool result content for display
*/
function formatToolResult(result: ToolResultContent): string {
if (typeof result.content === 'string') {
// Try to parse as JSON for better formatting
try {
const parsed = JSON.parse(result.content);
return JSON.stringify(parsed, null, 2);
} catch {
return truncate(result.content);
}
}
if (Array.isArray(result.content)) {
// Handle array of content items - extract text and parse if JSON
const formatted = result.content.map((item: any) => {
if (item.type === 'text' && item.text) {
try {
const parsed = JSON.parse(item.text);
return JSON.stringify(parsed, null, 2);
} catch {
return item.text;
}
}
return JSON.stringify(item, null, 2);
}).join('\n\n');
return formatted;
}
return '[unknown result type]';
}
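The try/catch fallback in `formatToolResult` (pretty-print when the payload parses as JSON, otherwise return the raw text) can be sketched in isolation; `prettyOrRaw` is a hypothetical helper name, not part of this script:

```typescript
// Pretty-print a string if it is valid JSON, otherwise return it unchanged.
// Mirrors the JSON.parse/JSON.stringify fallback used in formatToolResult.
function prettyOrRaw(text: string): string {
  try {
    return JSON.stringify(JSON.parse(text), null, 2);
  } catch {
    return text;
  }
}

console.log(prettyOrRaw('{"ok":true}')); // re-indented JSON object
console.log(prettyOrRaw('plain text')); // returned as-is
```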
const parser = new TranscriptParser(transcriptPath);
const entries = parser.getAllEntries();
const stats = parser.getParseStats();
let output = `# Transcript: ${basename(transcriptPath)}\n\n`;
output += `**Generated:** ${new Date().toLocaleString()}\n`;
output += `**Total Entries:** ${stats.parsedEntries}\n`;
output += `**Entry Types:** ${JSON.stringify(stats.entriesByType, null, 2)}\n`;
output += `**Showing:** First ${maxTurns} conversation turns\n\n`;
output += `---\n\n`;
let turnNumber = 0;
let inTurn = false;
for (const entry of entries) {
// Skip summary and file-history-snapshot entries
if (entry.type === 'summary' || entry.type === 'file-history-snapshot') continue;
// USER MESSAGE
if (entry.type === 'user') {
const userEntry = entry as UserTranscriptEntry;
turnNumber++;
if (turnNumber > maxTurns) break;
inTurn = true;
output += `## Turn ${turnNumber}\n\n`;
output += `### 👤 User\n`;
output += `**Timestamp:** ${userEntry.timestamp}\n`;
output += `**UUID:** ${userEntry.uuid}\n`;
output += `**Session ID:** ${userEntry.sessionId}\n`;
output += `**CWD:** ${userEntry.cwd}\n\n`;
// Extract user message text
if (typeof userEntry.message.content === 'string') {
output += userEntry.message.content + '\n\n';
} else if (Array.isArray(userEntry.message.content)) {
const textBlocks = userEntry.message.content.filter((c) => c.type === 'text');
if (textBlocks.length > 0) {
const text = textBlocks.map((b: any) => b.text).join('\n');
output += text + '\n\n';
}
// Show ACTUAL tool results with their data
const toolResults = userEntry.message.content.filter((c): c is ToolResultContent => c.type === 'tool_result');
if (toolResults.length > 0) {
output += `**Tool Results Submitted (${toolResults.length}):**\n\n`;
for (const result of toolResults) {
output += `- **Tool Use ID:** \`${result.tool_use_id}\`\n`;
if (result.is_error) {
output += ` **ERROR:**\n`;
}
output += ` \`\`\`json\n`;
output += ` ${formatToolResult(result)}\n`;
output += ` \`\`\`\n\n`;
}
}
}
}
// ASSISTANT MESSAGE
if (entry.type === 'assistant' && inTurn) {
const assistantEntry = entry as AssistantTranscriptEntry;
output += `### 🤖 Assistant\n`;
output += `**Timestamp:** ${assistantEntry.timestamp}\n`;
output += `**UUID:** ${assistantEntry.uuid}\n`;
output += `**Model:** ${assistantEntry.message.model}\n`;
output += `**Stop Reason:** ${assistantEntry.message.stop_reason || 'N/A'}\n\n`;
if (!Array.isArray(assistantEntry.message.content)) {
output += `*[No content]*\n\n`;
continue;
}
const content = assistantEntry.message.content;
// 1. Thinking blocks (show first, as they happen first in reasoning)
const thinkingBlocks = content.filter((c) => c.type === 'thinking');
if (thinkingBlocks.length > 0) {
output += `**💭 Thinking:**\n\n`;
for (const block of thinkingBlocks) {
const thinking = (block as any).thinking;
// Format thinking with proper line breaks and indentation
const formattedThinking = thinking
.split('\n')
.map((line: string) => line.trimEnd())
.join('\n');
output += '> ';
output += formattedThinking.replace(/\n/g, '\n> ');
output += '\n\n';
}
}
// 2. Text responses
const textBlocks = content.filter((c) => c.type === 'text');
if (textBlocks.length > 0) {
output += `**Response:**\n\n`;
for (const block of textBlocks) {
output += (block as any).text + '\n\n';
}
}
// 3. Tool uses - show complete input
const toolUseBlocks = content.filter((c) => c.type === 'tool_use');
if (toolUseBlocks.length > 0) {
output += `**🔧 Tools Used (${toolUseBlocks.length}):**\n\n`;
for (const tool of toolUseBlocks) {
const t = tool as any;
output += `- **${t.name}** (ID: \`${t.id}\`)\n`;
output += ` \`\`\`json\n`;
output += ` ${JSON.stringify(t.input, null, 2)}\n`;
output += ` \`\`\`\n\n`;
}
}
// 4. Token usage
if (assistantEntry.message.usage) {
const usage = assistantEntry.message.usage;
output += `**📊 Token Usage:**\n`;
output += `- Input: ${usage.input_tokens || 0}\n`;
output += `- Output: ${usage.output_tokens || 0}\n`;
if (usage.cache_creation_input_tokens) {
output += `- Cache creation: ${usage.cache_creation_input_tokens}\n`;
}
if (usage.cache_read_input_tokens) {
output += `- Cache read: ${usage.cache_read_input_tokens}\n`;
}
output += '\n';
}
output += `---\n\n`;
inTurn = false;
}
}
if (turnNumber < (stats.entriesByType['user'] || 0)) {
output += `\n*... ${(stats.entriesByType['user'] || 0) - turnNumber} more turns not shown*\n`;
}
// Write output
const outputPath = transcriptPath.replace('.jsonl', '-complete.md');
writeFileSync(outputPath, output, 'utf-8');
console.log(`\nComplete transcript written to: ${outputPath}`);
console.log(`Turns shown: ${Math.min(turnNumber, maxTurns)} of ${stats.entriesByType['user'] || 0}\n`);
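The output path above is derived by swapping the `.jsonl` extension for `-complete.md`; a minimal standalone sketch (hypothetical `deriveOutputPath` name) of that one-liner:

```typescript
// Swap the .jsonl suffix for -complete.md.
// Note: String.replace with a string pattern substitutes only the first match.
function deriveOutputPath(transcriptPath: string): string {
  return transcriptPath.replace('.jsonl', '-complete.md');
}

console.log(deriveOutputPath('session.jsonl')); // session-complete.md
```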