Performance improvements: Token reduction and enhanced summaries (#101)

* refactor: Reduce continuation prompt token usage by 95 lines

Removed redundant instructions from continuation prompt that were originally
added to mitigate a session continuity issue. That issue has since been
resolved, making these detailed instructions unnecessary on every continuation.

Changes:
- Reduced continuation prompt from ~106 lines to ~11 lines (~95 line reduction)
- Changed "User's Goal:" to "Next Prompt in Session:" (more accurate framing)
- Removed redundant WHAT TO RECORD, WHEN TO SKIP, and OUTPUT FORMAT sections
- Kept concise reminder: "Continue generating observations and progress summaries..."
- Initial prompt still contains all detailed instructions

Impact:
- Significant token savings on every continuation prompt
- Faster context injection with no loss of functionality
- Instructions remain comprehensive in initial prompt

Files modified:
- src/sdk/prompts.ts (buildContinuationPrompt function)
- plugin/scripts/worker-service.cjs (compiled output)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor: Enhance observation and summary prompts for clarity and token efficiency

* Enhance prompt clarity and instructions in prompts.ts

- Added a reminder to think about instructions before starting work.
- Simplified the continuation prompt instruction by removing "for this ongoing session."

* feat: Enhance settings.json with permissions and deny access to sensitive files

refactor: Remove PLAN-full-observation-display.md and PR_SUMMARY.md as they are no longer needed

chore: Delete SECURITY_SUMMARY.md since it is redundant after recent changes

fix: Update worker-service.cjs to streamline observation generation instructions

cleanup: Remove src-analysis.md and src-tree.md for a cleaner codebase

refactor: Modify prompts.ts to clarify instructions for memory processing

* refactor: Remove legacy worker service implementation

* feat: Enhance summary hook to extract last assistant message and improve logging

- Added function to extract the last assistant message from the transcript.
- Updated summary hook to include last assistant message in the summary request.
- Modified SDKSession interface to store last assistant message.
- Adjusted buildSummaryPrompt to utilize last assistant message for generating summaries.
- Updated worker service and session manager to handle last assistant message in summarize requests.
- Introduced silentDebug utility for improved logging and diagnostics throughout the summary process.
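
The extraction step described above can be sketched as a pure function over the transcript text. This is a minimal illustration, not the project's implementation: the entry shape `{type: "assistant", message: {content}}` is taken from the comment in this commit's diff, while `lastAssistantText`, `safeParse`, and the type names are hypothetical.

```typescript
// Illustrative types; the real transcript schema may carry more fields.
type ContentPart = { type: string; text?: string };
type TranscriptEntry = {
  type?: string;
  message?: { content?: string | ContentPart[] };
};

// Parse one JSONL line, returning null for malformed input.
function safeParse(line: string): TranscriptEntry | null {
  try {
    return JSON.parse(line) as TranscriptEntry | null;
  } catch {
    return null;
  }
}

// Hypothetical helper: return the text of the most recent assistant entry.
function lastAssistantText(jsonl: string): string {
  const lines = jsonl.trim().split("\n");
  // Walk backwards so the most recent assistant entry wins.
  for (let i = lines.length - 1; i >= 0; i--) {
    const entry = safeParse(lines[i]);
    const content =
      entry?.type === "assistant" ? entry?.message?.content : undefined;
    if (!content) continue;

    // Content may be a plain string or an array of typed parts.
    const raw =
      typeof content === "string"
        ? content
        : content
            .filter((part) => part.type === "text")
            .map((part) => part.text ?? "")
            .join("\n");

    return raw
      .replace(/<system-reminder>[\s\S]*?<\/system-reminder>/g, "") // drop reminder blocks
      .replace(/\n{3,}/g, "\n\n") // collapse runs of blank lines
      .trim();
  }
  return "";
}
```

Keeping the function free of file I/O makes it easy to exercise with inline sample transcripts; the hook itself reads the file from `transcript_path` first.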

* docs: Add comprehensive implementation plan for ROI metrics feature

Added detailed implementation plan covering:
- Token usage capture from Agent SDK
- Database schema changes (migration #8)
- Discovery cost tracking per observation
- Context hook display with ROI metrics
- Testing and rollout strategy

Timeline: ~20 hours over 4 days
Goal: Empirical data for YC application amendment


* feat: Add transcript processing scripts for analysis and formatting

- Implemented `dump-transcript-readable.ts` to generate a readable markdown dump of transcripts, excluding certain entry types.
- Created `extract-rich-context-examples.ts` to extract and showcase rich context examples from transcripts, highlighting user requests and assistant reasoning.
- Developed `format-transcript-context.ts` to format transcript context into a structured markdown format for improved observation generation.
- Added `test-transcript-parser.ts` for validating data extraction from transcript JSONL files, including statistics and error reporting.
- Introduced `transcript-to-markdown.ts` for a complete representation of transcript data in markdown format, showing all context data.
- Enhanced type definitions in `transcript.ts` to support new features and ensure type safety.
- Built `transcript-parser.ts` to handle parsing of transcript JSONL files, including error handling and data extraction methods.
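
As a rough illustration of JSONL parsing with per-line error reporting, the sketch below collects parsed entries and records the line number of anything that fails to parse. The names `parseJsonl` and `ParseResult` are hypothetical and are not taken from `transcript-parser.ts`.

```typescript
// Hypothetical result shape: parsed entries plus a list of per-line errors.
type ParseResult = {
  entries: unknown[];
  errors: { line: number; message: string }[];
};

function parseJsonl(text: string): ParseResult {
  const result: ParseResult = { entries: [], errors: [] };
  text.split("\n").forEach((raw, index) => {
    const line = raw.trim();
    if (line === "") return; // ignore blank lines
    try {
      result.entries.push(JSON.parse(line));
    } catch (err) {
      // Record the 1-based line number so the bad line can be located.
      result.errors.push({
        line: index + 1,
        message: err instanceof Error ? err.message : String(err),
      });
    }
  });
  return result;
}
```

Collecting errors instead of throwing lets a single corrupt line be reported without discarding the rest of the transcript.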

* Refactor hooks and SDKAgent for improved observation handling

- Updated `new-hook.ts` to clean user prompts by stripping the leading slash from slash commands (e.g. `/review 101` becomes `review 101`), which reads more naturally in observations.
- Enhanced `save-hook.ts` to include additional tools in the SKIP_TOOLS set, preventing unnecessary observations from certain command invocations.
- Modified `prompts.ts` to change the structure of observation prompts, emphasizing the observational role and providing a detailed XML output format for observations.
- Adjusted `SDKAgent.ts` to enforce stricter tool usage restrictions, ensuring the memory agent operates solely as an observer without any tool access.
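
The two hook changes above can be sketched in isolation. The `SKIP_TOOLS` entries and the `startsWith('/')` check come from this commit's diff; the helper names `cleanPrompt` and `shouldObserve` are hypothetical wrappers added for illustration.

```typescript
// Tool names copied from the SKIP_TOOLS set in the diff.
const SKIP_TOOLS = new Set<string>([
  "ListMcpResourcesTool", // MCP infrastructure
  "SlashCommand",         // observe what the command produces, not the call
  "Skill",                // observe what the skill produces, not the call
  "TodoWrite",            // task management meta-tool
  "AskUserQuestion",      // user interaction, not substantive work
]);

// Hypothetical helper: strip a single leading slash from a prompt.
function cleanPrompt(prompt: string): string {
  return prompt.startsWith("/") ? prompt.substring(1) : prompt;
}

// Hypothetical helper: decide whether a tool call should produce an observation.
function shouldObserve(toolName: string): boolean {
  return !SKIP_TOOLS.has(toolName);
}
```

Note that only one slash is removed, so a prompt such as `//x` would become `/x` under this sketch.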

* feat: Enhance session initialization to accept user prompts and prompt numbers

- Updated `handleSessionInit` in `worker-service.ts` to extract `userPrompt` and `promptNumber` from the request body and pass them to `initializeSession`.
- Modified `initializeSession` in `SessionManager.ts` to handle optional `currentUserPrompt` and `promptNumber` parameters.
- Added logic to update the existing session's `userPrompt` and `lastPromptNumber` if a `currentUserPrompt` is provided.
- Implemented debug logging for session initialization and updates to track user prompts and prompt numbers.
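
The update logic described above might look roughly like the following. This is a sketch under assumptions: the parameter and field names (`currentUserPrompt`, `promptNumber`, `userPrompt`, `lastPromptNumber`) come from the bullets above, while `SessionRegistry`, `ActiveSession`, and the defaults used for a new session are hypothetical and do not reflect the actual `SessionManager.ts`.

```typescript
// Hypothetical in-memory session record.
interface ActiveSession {
  sessionDbId: number;
  userPrompt: string;
  lastPromptNumber: number;
}

// Hypothetical stand-in for the session manager.
class SessionRegistry {
  private sessions = new Map<number, ActiveSession>();

  initializeSession(
    sessionDbId: number,
    currentUserPrompt?: string,
    promptNumber?: number
  ): ActiveSession {
    const existing = this.sessions.get(sessionDbId);
    if (existing) {
      // Refresh the stored prompt only when a new one was supplied.
      if (currentUserPrompt) {
        existing.userPrompt = currentUserPrompt;
        existing.lastPromptNumber = promptNumber ?? existing.lastPromptNumber;
      }
      return existing;
    }

    // Assumed defaults for a brand-new session.
    const created: ActiveSession = {
      sessionDbId,
      userPrompt: currentUserPrompt ?? "",
      lastPromptNumber: promptNumber ?? 1,
    };
    this.sessions.set(sessionDbId, created);
    return created;
  }
}
```

Because both new parameters are optional, callers that do not pass a prompt keep the previous behavior of simply returning the existing session.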

---------

Co-authored-by: Claude <noreply@anthropic.com>
Alex Newman
2025-11-13 18:22:44 -05:00
committed by GitHub
parent ab5d78717f
commit 68290a9121
39 changed files with 4584 additions and 2809 deletions
+5 -1
@@ -77,12 +77,16 @@ async function newHook(input?: UserPromptSubmitInput): Promise<void> {
const port = getWorkerPort();
// Strip leading slash from commands for memory agent
// /review 101 → review 101 (more semantic for observations)
const cleanedPrompt = prompt.startsWith('/') ? prompt.substring(1) : prompt;
try {
// Initialize session via HTTP
const response = await fetch(`http://127.0.0.1:${port}/sessions/${sessionDbId}/init`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
- body: JSON.stringify({ project, userPrompt: prompt, promptNumber }),
+ body: JSON.stringify({ project, userPrompt: cleanedPrompt, promptNumber }),
signal: AbortSignal.timeout(5000)
});
+5 -1
@@ -20,7 +20,11 @@ export interface PostToolUseInput {
// Tools to skip (low value or too frequent)
const SKIP_TOOLS = new Set([
- 'ListMcpResourcesTool'
+ 'ListMcpResourcesTool', // MCP infrastructure
+ 'SlashCommand', // Command invocation (observe what it produces, not the call)
+ 'Skill', // Skill invocation (observe what it produces, not the call)
+ 'TodoWrite', // Task management meta-tool
+ 'AskUserQuestion' // User interaction, not substantive work
]);
/**
+103 -8
@@ -9,6 +9,7 @@ import { SessionStore } from '../services/sqlite/SessionStore.js';
import { createHookResponse } from './hook-response.js';
import { logger } from '../utils/logger.js';
import { ensureWorkerRunning, getWorkerPort } from '../shared/worker-utils.js';
import { silentDebug } from '../utils/silent-debug.js';
export interface StopInput {
session_id: string;
@@ -37,12 +38,16 @@ function extractLastUserMessage(transcriptPath: string): string {
for (let i = lines.length - 1; i >= 0; i--) {
try {
const line = JSON.parse(lines[i]);
- if (line.role === 'user' && line.content) {
+ // Claude Code transcript format: {type: "user", message: {role: "user", content: [...]}}
+ if (line.type === 'user' && line.message?.content) {
+ const content = line.message.content;
  // Extract text content (handle both string and array formats)
- if (typeof line.content === 'string') {
- return line.content;
- } else if (Array.isArray(line.content)) {
- const textParts = line.content
+ if (typeof content === 'string') {
+ return content;
+ } else if (Array.isArray(content)) {
+ const textParts = content
.filter((c: any) => c.type === 'text')
.map((c: any) => c.text);
return textParts.join('\n');
@@ -60,6 +65,63 @@ function extractLastUserMessage(transcriptPath: string): string {
return '';
}
/**
* Extract last assistant message from transcript JSONL file
* Filters out system-reminder tags to avoid polluting summaries
*/
function extractLastAssistantMessage(transcriptPath: string): string {
if (!transcriptPath || !existsSync(transcriptPath)) {
return '';
}
try {
const content = readFileSync(transcriptPath, 'utf-8').trim();
if (!content) {
return '';
}
const lines = content.split('\n');
// Parse JSONL and find last assistant message
for (let i = lines.length - 1; i >= 0; i--) {
try {
const line = JSON.parse(lines[i]);
// Claude Code transcript format: {type: "assistant", message: {role: "assistant", content: [...]}}
if (line.type === 'assistant' && line.message?.content) {
let text = '';
const content = line.message.content;
// Extract text content (handle both string and array formats)
if (typeof content === 'string') {
text = content;
} else if (Array.isArray(content)) {
const textParts = content
.filter((c: any) => c.type === 'text')
.map((c: any) => c.text);
text = textParts.join('\n');
}
// Filter out system-reminder tags and their content
text = text.replace(/<system-reminder>[\s\S]*?<\/system-reminder>/g, '');
// Clean up excessive whitespace
text = text.replace(/\n{3,}/g, '\n\n').trim();
return text;
}
} catch (parseError) {
// Skip malformed lines
continue;
}
}
} catch (error) {
logger.error('HOOK', 'Failed to read transcript', { transcriptPath }, error as Error);
}
return '';
}
/**
* Summary Hook Main Logic
*/
@@ -78,18 +140,50 @@ async function summaryHook(input?: StopInput): Promise<void> {
// Get or create session
const sessionDbId = db.createSDKSession(session_id, '', '');
const promptNumber = db.getPromptCounter(sessionDbId);
// DIAGNOSTIC: Check session and observations
const sessionInfo = db.db.prepare(`
SELECT id, claude_session_id, sdk_session_id, project
FROM sdk_sessions WHERE id = ?
`).get(sessionDbId) as any;
const obsCount = db.db.prepare(`
SELECT COUNT(*) as count
FROM observations
WHERE sdk_session_id = ?
`).get(sessionInfo?.sdk_session_id) as { count: number };
silentDebug('[summary-hook] Session diagnostics', {
claudeSessionId: session_id,
sessionDbId,
sdkSessionId: sessionInfo?.sdk_session_id,
project: sessionInfo?.project,
promptNumber,
observationCount: obsCount?.count || 0,
transcriptPath: input.transcript_path
});
db.close();
const port = getWorkerPort();
- // Extract last user message from transcript
+ // Extract last user AND assistant messages from transcript
const lastUserMessage = extractLastUserMessage(input.transcript_path || '');
const lastAssistantMessage = extractLastAssistantMessage(input.transcript_path || '');
silentDebug('[summary-hook] Extracted messages', {
hasLastUserMessage: !!lastUserMessage,
hasLastAssistantMessage: !!lastAssistantMessage,
lastAssistantPreview: lastAssistantMessage.substring(0, 200),
lastAssistantLength: lastAssistantMessage.length
});
logger.dataIn('HOOK', 'Stop: Requesting summary', {
sessionId: sessionDbId,
workerPort: port,
promptNumber,
- hasLastUserMessage: !!lastUserMessage
+ hasLastUserMessage: !!lastUserMessage,
+ hasLastAssistantMessage: !!lastAssistantMessage
});
try {
@@ -98,7 +192,8 @@ async function summaryHook(input?: StopInput): Promise<void> {
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
prompt_number: promptNumber,
- last_user_message: lastUserMessage
+ last_user_message: lastUserMessage,
+ last_assistant_message: lastAssistantMessage
}),
signal: AbortSignal.timeout(2000)
});