Performance improvements: Token reduction and enhanced summaries (#101)

* refactor: Reduce continuation prompt token usage by 95 lines

  Removed redundant instructions from continuation prompt that were originally added to mitigate a session continuity issue. That issue has since been resolved, making these detailed instructions unnecessary on every continuation.

  Changes:
  - Reduced continuation prompt from ~106 lines to ~11 lines (~95 line reduction)
  - Changed "User's Goal:" to "Next Prompt in Session:" (more accurate framing)
  - Removed redundant WHAT TO RECORD, WHEN TO SKIP, and OUTPUT FORMAT sections
  - Kept concise reminder: "Continue generating observations and progress summaries..."
  - Initial prompt still contains all detailed instructions

  Impact:
  - Significant token savings on every continuation prompt
  - Faster context injection with no loss of functionality
  - Instructions remain comprehensive in initial prompt

  Files modified:
  - src/sdk/prompts.ts (buildContinuationPrompt function)
  - plugin/scripts/worker-service.cjs (compiled output)

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* refactor: Enhance observation and summary prompts for clarity and token efficiency

* Enhance prompt clarity and instructions in prompts.ts

  - Added a reminder to think about instructions before starting work.
  - Simplified the continuation prompt instruction by removing "for this ongoing session."

* feat: Enhance settings.json with permissions and deny access to sensitive files

  refactor: Remove PLAN-full-observation-display.md and PR_SUMMARY.md as they are no longer needed
  chore: Delete SECURITY_SUMMARY.md since it is redundant after recent changes
  fix: Update worker-service.cjs to streamline observation generation instructions
  cleanup: Remove src-analysis.md and src-tree.md for a cleaner codebase
  refactor: Modify prompts.ts to clarify instructions for memory processing

* refactor: Remove legacy worker service implementation

* feat: Enhance summary hook to extract last assistant message and improve logging

  - Added function to extract the last assistant message from the transcript.
  - Updated summary hook to include last assistant message in the summary request.
  - Modified SDKSession interface to store last assistant message.
  - Adjusted buildSummaryPrompt to utilize last assistant message for generating summaries.
  - Updated worker service and session manager to handle last assistant message in summarize requests.
  - Introduced silentDebug utility for improved logging and diagnostics throughout the summary process.

* docs: Add comprehensive implementation plan for ROI metrics feature

  Added detailed implementation plan covering:
  - Token usage capture from Agent SDK
  - Database schema changes (migration #8)
  - Discovery cost tracking per observation
  - Context hook display with ROI metrics
  - Testing and rollout strategy

  Timeline: ~20 hours over 4 days
  Goal: Empirical data for YC application amendment

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* feat: Add transcript processing scripts for analysis and formatting

  - Implemented `dump-transcript-readable.ts` to generate a readable markdown dump of transcripts, excluding certain entry types.
  - Created `extract-rich-context-examples.ts` to extract and showcase rich context examples from transcripts, highlighting user requests and assistant reasoning.
  - Developed `format-transcript-context.ts` to format transcript context into a structured markdown format for improved observation generation.
  - Added `test-transcript-parser.ts` for validating data extraction from transcript JSONL files, including statistics and error reporting.
  - Introduced `transcript-to-markdown.ts` for a complete representation of transcript data in markdown format, showing all context data.
  - Enhanced type definitions in `transcript.ts` to support new features and ensure type safety.
  - Built `transcript-parser.ts` to handle parsing of transcript JSONL files, including error handling and data extraction methods.

* Refactor hooks and SDKAgent for improved observation handling

  - Updated `new-hook.ts` to clean user prompts by stripping leading slashes for better semantic clarity.
  - Enhanced `save-hook.ts` to include additional tools in the SKIP_TOOLS set, preventing unnecessary observations from certain command invocations.
  - Modified `prompts.ts` to change the structure of observation prompts, emphasizing the observational role and providing a detailed XML output format for observations.
  - Adjusted `SDKAgent.ts` to enforce stricter tool usage restrictions, ensuring the memory agent operates solely as an observer without any tool access.

* feat: Enhance session initialization to accept user prompts and prompt numbers

  - Updated `handleSessionInit` in `worker-service.ts` to extract `userPrompt` and `promptNumber` from the request body and pass them to `initializeSession`.
  - Modified `initializeSession` in `SessionManager.ts` to handle optional `currentUserPrompt` and `promptNumber` parameters.
  - Added logic to update the existing session's `userPrompt` and `lastPromptNumber` if a `currentUserPrompt` is provided.
  - Implemented debug logging for session initialization and updates to track user prompts and prompt numbers.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-13 18:22:44 -05:00
parent ab5d78717f
commit 68290a9121
39 changed files with 4584 additions and 2809 deletions
@@ -77,12 +77,16 @@ async function newHook(input?: UserPromptSubmitInput): Promise<void> {

  const port = getWorkerPort();

+  // Strip leading slash from commands for memory agent
+  // /review 101 → review 101 (more semantic for observations)
+  const cleanedPrompt = prompt.startsWith('/') ? prompt.substring(1) : prompt;
+
  try {
    // Initialize session via HTTP
    const response = await fetch(`http://127.0.0.1:${port}/sessions/${sessionDbId}/init`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
-      body: JSON.stringify({ project, userPrompt: prompt, promptNumber }),
+      body: JSON.stringify({ project, userPrompt: cleanedPrompt, promptNumber }),
      signal: AbortSignal.timeout(5000)
    });
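
Review note: the slash-stripping added in this hunk can be exercised in isolation. The sketch below pulls the same expression into a standalone function; `stripLeadingSlash` is a hypothetical name used only for illustration (the commit inlines the logic in `newHook`). It shows that only a single leading slash is removed, so a prompt starting with `//` keeps one slash.

```typescript
// Hypothetical helper mirroring the inlined expression in newHook.
// Removes at most one leading '/' so "/review 101" reads as "review 101".
function stripLeadingSlash(prompt: string): string {
  return prompt.startsWith('/') ? prompt.substring(1) : prompt;
}

console.log(stripLeadingSlash('/review 101')); // prints "review 101"
console.log(stripLeadingSlash('fix the bug')); // prints "fix the bug"
console.log(stripLeadingSlash('//double'));    // prints "/double"
```

If prompts that consist only of `/` matter, note that this expression turns them into an empty string.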

@@ -20,7 +20,11 @@ export interface PostToolUseInput {

 // Tools to skip (low value or too frequent)
 const SKIP_TOOLS = new Set([
-  'ListMcpResourcesTool'
+  'ListMcpResourcesTool',  // MCP infrastructure
+  'SlashCommand',          // Command invocation (observe what it produces, not the call)
+  'Skill',                 // Skill invocation (observe what it produces, not the call)
+  'TodoWrite',             // Task management meta-tool
+  'AskUserQuestion'        // User interaction, not substantive work
 ]);

 /**
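
Review note: a minimal sketch of how a skip set like the one in this hunk is typically consulted. The `shouldObserve` function is hypothetical and not taken from `save-hook.ts`; the diff shows only the set itself. `'Bash'` is used purely as an example of a tool name that is not in the set.

```typescript
// Same entries as the SKIP_TOOLS set in the hunk above.
const SKIP_TOOLS = new Set<string>([
  'ListMcpResourcesTool',
  'SlashCommand',
  'Skill',
  'TodoWrite',
  'AskUserQuestion'
]);

// Hypothetical guard: returns true when a tool call should produce an observation.
function shouldObserve(toolName: string): boolean {
  return !SKIP_TOOLS.has(toolName);
}

console.log(shouldObserve('TodoWrite')); // prints false
console.log(shouldObserve('Bash'));      // prints true
```

A `Set` keeps the lookup constant-time and makes the match exact and case-sensitive, so `'todowrite'` would not be skipped.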
@@ -9,6 +9,7 @@ import { SessionStore } from '../services/sqlite/SessionStore.js';
 import { createHookResponse } from './hook-response.js';
 import { logger } from '../utils/logger.js';
 import { ensureWorkerRunning, getWorkerPort } from '../shared/worker-utils.js';
+import { silentDebug } from '../utils/silent-debug.js';

 export interface StopInput {
  session_id: string;
@@ -37,12 +38,16 @@ function extractLastUserMessage(transcriptPath: string): string {
    for (let i = lines.length - 1; i >= 0; i--) {
      try {
        const line = JSON.parse(lines[i]);
-        if (line.role === 'user' && line.content) {
+
+        // Claude Code transcript format: {type: "user", message: {role: "user", content: [...]}}
+        if (line.type === 'user' && line.message?.content) {
+          const content = line.message.content;
+
          // Extract text content (handle both string and array formats)
-          if (typeof line.content === 'string') {
-            return line.content;
-          } else if (Array.isArray(line.content)) {
-            const textParts = line.content
+          if (typeof content === 'string') {
+            return content;
+          } else if (Array.isArray(content)) {
+            const textParts = content
              .filter((c: any) => c.type === 'text')
              .map((c: any) => c.text);
            return textParts.join('\n');
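
Review note: the string-or-array handling in this hunk relies on `any`. The sketch below shows one way the same extraction could be typed; `ContentBlock`, `MessageContent`, and `textFromContent` are hypothetical names, and the block shape is assumed from the `{type: 'text', text: ...}` pattern visible in the diff rather than from a published schema.

```typescript
// Assumed shape of a content block, inferred from the filter/map in the hunk.
type ContentBlock = { type: string; text?: string };
type MessageContent = string | ContentBlock[];

// Returns the plain text of a message: the string itself, or the
// newline-joined text of every block whose type is 'text'.
function textFromContent(content: MessageContent): string {
  if (typeof content === 'string') {
    return content;
  }
  return content
    .filter((block) => block.type === 'text' && typeof block.text === 'string')
    .map((block) => block.text as string)
    .join('\n');
}

console.log(textFromContent('plain string')); // prints "plain string"
console.log(
  textFromContent([
    { type: 'text', text: 'first' },
    { type: 'tool_use' },
    { type: 'text', text: 'second' }
  ])
); // prints "first" and "second" on separate lines
```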
@@ -60,6 +65,63 @@ function extractLastUserMessage(transcriptPath: string): string {
  return '';
 }

+/**
+ * Extract last assistant message from transcript JSONL file
+ * Filters out system-reminder tags to avoid polluting summaries
+ */
+function extractLastAssistantMessage(transcriptPath: string): string {
+  if (!transcriptPath || !existsSync(transcriptPath)) {
+    return '';
+  }
+
+  try {
+    const content = readFileSync(transcriptPath, 'utf-8').trim();
+    if (!content) {
+      return '';
+    }
+
+    const lines = content.split('\n');
+
+    // Parse JSONL and find last assistant message
+    for (let i = lines.length - 1; i >= 0; i--) {
+      try {
+        const line = JSON.parse(lines[i]);
+
+        // Claude Code transcript format: {type: "assistant", message: {role: "assistant", content: [...]}}
+        if (line.type === 'assistant' && line.message?.content) {
+          let text = '';
+          const content = line.message.content;
+
+          // Extract text content (handle both string and array formats)
+          if (typeof content === 'string') {
+            text = content;
+          } else if (Array.isArray(content)) {
+            const textParts = content
+              .filter((c: any) => c.type === 'text')
+              .map((c: any) => c.text);
+            text = textParts.join('\n');
+          }
+
+          // Filter out system-reminder tags and their content
+          text = text.replace(/<system-reminder>[\s\S]*?<\/system-reminder>/g, '');
+
+          // Clean up excessive whitespace
+          text = text.replace(/\n{3,}/g, '\n\n').trim();
+
+          return text;
+        }
+      } catch (parseError) {
+        // Skip malformed lines
+        continue;
+      }
+    }
+  } catch (error) {
+    logger.error('HOOK', 'Failed to read transcript', { transcriptPath }, error as Error);
+  }
+
+  return '';
+}
+
 /**
 * Summary Hook Main Logic
 */
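
Review note: the new `extractLastAssistantMessage` mixes file I/O with parsing, which makes it awkward to test. The sketch below reproduces only the parsing and cleanup steps against an in-memory JSONL string. `cleanAssistantText` and `lastAssistantText` are hypothetical names; the entry shape is the one described in the hunk's comment.

```typescript
// Strip <system-reminder> blocks, then collapse runs of 3+ newlines to 2.
function cleanAssistantText(text: string): string {
  return text
    .replace(/<system-reminder>[\s\S]*?<\/system-reminder>/g, '')
    .replace(/\n{3,}/g, '\n\n')
    .trim();
}

// Scan JSONL from the end and return the cleaned text of the last
// entry with type 'assistant'. Malformed lines are skipped.
function lastAssistantText(jsonl: string): string {
  const lines = jsonl.trim().split('\n');
  for (let i = lines.length - 1; i >= 0; i--) {
    let entry: any;
    try {
      entry = JSON.parse(lines[i]);
    } catch {
      continue; // not valid JSON, keep scanning backwards
    }
    if (entry?.type !== 'assistant' || !entry.message?.content) {
      continue;
    }
    const content = entry.message.content;
    let text = '';
    if (typeof content === 'string') {
      text = content;
    } else if (Array.isArray(content)) {
      text = content
        .filter((c: any) => c.type === 'text')
        .map((c: any) => c.text)
        .join('\n');
    }
    return cleanAssistantText(text);
  }
  return '';
}

const sample = [
  JSON.stringify({ type: 'user', message: { role: 'user', content: 'do X' } }),
  JSON.stringify({
    type: 'assistant',
    message: {
      role: 'assistant',
      content: [
        { type: 'text', text: 'Done.<system-reminder>hidden</system-reminder>\n\n\n\nNext steps.' }
      ]
    }
  }),
  'not json'
].join('\n');

console.log(JSON.stringify(lastAssistantText(sample))); // prints "Done.\n\nNext steps."
```

As in the hunk, the function returns as soon as it finds the last assistant entry, even if that entry has no text blocks (for example, a turn containing only tool calls). In that case the result is an empty string rather than the text of an earlier assistant turn.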
@@ -78,18 +140,50 @@ async function summaryHook(input?: StopInput): Promise<void> {
  // Get or create session
  const sessionDbId = db.createSDKSession(session_id, '', '');
  const promptNumber = db.getPromptCounter(sessionDbId);
+
+  // DIAGNOSTIC: Check session and observations
+  const sessionInfo = db.db.prepare(`
+    SELECT id, claude_session_id, sdk_session_id, project
+    FROM sdk_sessions WHERE id = ?
+  `).get(sessionDbId) as any;
+
+  const obsCount = db.db.prepare(`
+    SELECT COUNT(*) as count
+    FROM observations
+    WHERE sdk_session_id = ?
+  `).get(sessionInfo?.sdk_session_id) as { count: number };
+
+  silentDebug('[summary-hook] Session diagnostics', {
+    claudeSessionId: session_id,
+    sessionDbId,
+    sdkSessionId: sessionInfo?.sdk_session_id,
+    project: sessionInfo?.project,
+    promptNumber,
+    observationCount: obsCount?.count || 0,
+    transcriptPath: input.transcript_path
+  });
+
  db.close();

  const port = getWorkerPort();

-  // Extract last user message from transcript
+  // Extract last user AND assistant messages from transcript
  const lastUserMessage = extractLastUserMessage(input.transcript_path || '');
+  const lastAssistantMessage = extractLastAssistantMessage(input.transcript_path || '');
+
+  silentDebug('[summary-hook] Extracted messages', {
+    hasLastUserMessage: !!lastUserMessage,
+    hasLastAssistantMessage: !!lastAssistantMessage,
+    lastAssistantPreview: lastAssistantMessage.substring(0, 200),
+    lastAssistantLength: lastAssistantMessage.length
+  });

  logger.dataIn('HOOK', 'Stop: Requesting summary', {
    sessionId: sessionDbId,
    workerPort: port,
    promptNumber,
-    hasLastUserMessage: !!lastUserMessage
+    hasLastUserMessage: !!lastUserMessage,
+    hasLastAssistantMessage: !!lastAssistantMessage
  });

  try {
@@ -98,7 +192,8 @@ async function summaryHook(input?: StopInput): Promise<void> {
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        prompt_number: promptNumber,
-        last_user_message: lastUserMessage
+        last_user_message: lastUserMessage,
+        last_assistant_message: lastAssistantMessage
      }),
      signal: AbortSignal.timeout(2000)
    });
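
Review note: the request body now carries three fields. A small typed builder, sketched below, would keep the hook and the worker in agreement on field names. `SummarizeRequestBody` and `buildSummarizeBody` are hypothetical; only the three snake_case keys come from the diff.

```typescript
// Field names taken from the JSON.stringify call in the hunk above.
interface SummarizeRequestBody {
  prompt_number: number;
  last_user_message: string;
  last_assistant_message: string;
}

// Hypothetical builder: serializes the summarize payload in one place.
function buildSummarizeBody(
  promptNumber: number,
  lastUserMessage: string,
  lastAssistantMessage: string
): string {
  const body: SummarizeRequestBody = {
    prompt_number: promptNumber,
    last_user_message: lastUserMessage,
    last_assistant_message: lastAssistantMessage
  };
  return JSON.stringify(body);
}

console.log(buildSummarizeBody(3, 'review 101', 'Review complete.'));
```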