Performance improvements: Token reduction and enhanced summaries (#101)

* refactor: Reduce continuation prompt token usage by 95 lines

Removed redundant instructions from continuation prompt that were originally added to mitigate a session continuity issue. That issue has since been resolved, making these detailed instructions unnecessary on every continuation.

Changes:
- Reduced continuation prompt from ~106 lines to ~11 lines (~95 line reduction)
- Changed "User's Goal:" to "Next Prompt in Session:" (more accurate framing)
- Removed redundant WHAT TO RECORD, WHEN TO SKIP, and OUTPUT FORMAT sections
- Kept concise reminder: "Continue generating observations and progress summaries..."
- Initial prompt still contains all detailed instructions

Impact:
- Significant token savings on every continuation prompt
- Faster context injection with no loss of functionality
- Instructions remain comprehensive in initial prompt

Files modified:
- src/sdk/prompts.ts (buildContinuationPrompt function)
- plugin/scripts/worker-service.cjs (compiled output)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor: Enhance observation and summary prompts for clarity and token efficiency

* Enhance prompt clarity and instructions in prompts.ts

- Added a reminder to think about instructions before starting work.
- Simplified the continuation prompt instruction by removing "for this ongoing session."

* feat: Enhance settings.json with permissions and deny access to sensitive files

refactor: Remove PLAN-full-observation-display.md and PR_SUMMARY.md as they are no longer needed
chore: Delete SECURITY_SUMMARY.md since it is redundant after recent changes
fix: Update worker-service.cjs to streamline observation generation instructions
cleanup: Remove src-analysis.md and src-tree.md for a cleaner codebase
refactor: Modify prompts.ts to clarify instructions for memory processing

* refactor: Remove legacy worker service implementation

* feat: Enhance summary hook to extract last assistant message and improve logging

- Added function to extract the last assistant message from the transcript.
- Updated summary hook to include last assistant message in the summary request.
- Modified SDKSession interface to store last assistant message.
- Adjusted buildSummaryPrompt to utilize last assistant message for generating summaries.
- Updated worker service and session manager to handle last assistant message in summarize requests.
- Introduced silentDebug utility for improved logging and diagnostics throughout the summary process.

* docs: Add comprehensive implementation plan for ROI metrics feature

Added detailed implementation plan covering:
- Token usage capture from Agent SDK
- Database schema changes (migration #8)
- Discovery cost tracking per observation
- Context hook display with ROI metrics
- Testing and rollout strategy

Timeline: ~20 hours over 4 days
Goal: Empirical data for YC application amendment

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* feat: Add transcript processing scripts for analysis and formatting

- Implemented `dump-transcript-readable.ts` to generate a readable markdown dump of transcripts, excluding certain entry types.
- Created `extract-rich-context-examples.ts` to extract and showcase rich context examples from transcripts, highlighting user requests and assistant reasoning.
- Developed `format-transcript-context.ts` to format transcript context into a structured markdown format for improved observation generation.
- Added `test-transcript-parser.ts` for validating data extraction from transcript JSONL files, including statistics and error reporting.
- Introduced `transcript-to-markdown.ts` for a complete representation of transcript data in markdown format, showing all context data.
- Enhanced type definitions in `transcript.ts` to support new features and ensure type safety.
- Built `transcript-parser.ts` to handle parsing of transcript JSONL files, including error handling and data extraction methods.

* Refactor hooks and SDKAgent for improved observation handling

- Updated `new-hook.ts` to clean user prompts by stripping leading slashes for better semantic clarity.
- Enhanced `save-hook.ts` to include additional tools in the SKIP_TOOLS set, preventing unnecessary observations from certain command invocations.
- Modified `prompts.ts` to change the structure of observation prompts, emphasizing the observational role and providing a detailed XML output format for observations.
- Adjusted `SDKAgent.ts` to enforce stricter tool usage restrictions, ensuring the memory agent operates solely as an observer without any tool access.

* feat: Enhance session initialization to accept user prompts and prompt numbers

- Updated `handleSessionInit` in `worker-service.ts` to extract `userPrompt` and `promptNumber` from the request body and pass them to `initializeSession`.
- Modified `initializeSession` in `SessionManager.ts` to handle optional `currentUserPrompt` and `promptNumber` parameters.
- Added logic to update the existing session's `userPrompt` and `lastPromptNumber` if a `currentUserPrompt` is provided.
- Implemented debug logging for session initialization and updates to track user prompts and prompt numbers.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-13 18:22:44 -05:00
parent ab5d78717f
commit 68290a9121
39 changed files with 4584 additions and 2809 deletions
@@ -17,11 +17,11 @@ export interface ParsedObservation {
 }

 export interface ParsedSummary {
-  request: string;
-  investigated: string;
-  learned: string;
-  completed: string;
-  next_steps: string;
+  request: string | null;
+  investigated: string | null;
+  learned: string | null;
+  completed: string | null;
+  next_steps: string | null;
  notes: string | null;
 }
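
Note on the hunk above: with every `ParsedSummary` field now typed `string | null`, code that consumes a parsed summary has to handle missing sections explicitly. The parser itself is not part of this excerpt, so the following is only a minimal sketch of how such fields might be filled. `extractField`, `parseSummary`, and `renderField` are hypothetical names used for illustration, and the regex-based extraction is an assumption, not the project's actual implementation.

```typescript
// Mirrors the ParsedSummary interface from the hunk above.
interface ParsedSummary {
  request: string | null;
  investigated: string | null;
  learned: string | null;
  completed: string | null;
  next_steps: string | null;
  notes: string | null;
}

// Hypothetical helper: returns the trimmed text inside <tag>...</tag>,
// or null when the tag is absent or contains only whitespace.
function extractField(xml: string, tag: string): string | null {
  const match = new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`).exec(xml);
  const value = match?.[1]?.trim();
  return value ? value : null;
}

// Hypothetical parser: returns null when no <summary> block is present,
// otherwise fills each field independently so one missing tag does not
// discard the rest of the summary.
function parseSummary(text: string): ParsedSummary | null {
  const block = /<summary>([\s\S]*?)<\/summary>/.exec(text);
  if (!block) return null;
  const body = block[1] ?? "";
  return {
    request: extractField(body, "request"),
    investigated: extractField(body, "investigated"),
    learned: extractField(body, "learned"),
    completed: extractField(body, "completed"),
    next_steps: extractField(body, "next_steps"),
    notes: extractField(body, "notes"),
  };
}

// Hypothetical consumer: skips a field entirely when it is null.
function renderField(label: string, value: string | null): string {
  return value === null ? "" : `${label}: ${value}\n`;
}
```

Treating an empty tag the same as a missing one is a design choice made for this sketch; the real parser may distinguish the two.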

@@ -18,6 +18,7 @@ export interface SDKSession {
  project: string;
  user_prompt: string;
  last_user_message?: string;
+  last_assistant_message?: string;
 }

 /**
@@ -28,8 +29,12 @@ export function buildInitPrompt(project: string, sessionId: string, userPrompt:

 CRITICAL: Record what was LEARNED/BUILT/FIXED/DEPLOYED/CONFIGURED, not what you (the observer) are doing.

-User's Goal: ${userPrompt}
-Date: ${new Date().toISOString().split('T')[0]}
+You do not have access to tools. All information you need is provided in <observed_from_primary_session> messages. Create observations from what you observe - no investigation needed.
+
+<observed_from_primary_session>
+  <user_request>${userPrompt}</user_request>
+  <requested_at>${new Date().toISOString().split('T')[0]}</requested_at>
+</observed_from_primary_session>

 Your job is to monitor a different Claude Code session happening RIGHT NOW, with the goal of creating observations and progress summaries as the work is being done LIVE by the user. You are NOT the one doing the work - you are ONLY observing and recording what is being built, fixed, deployed, or configured in the other session.

@@ -128,7 +133,11 @@ Output observations using this XML structure:
 </observation>
 \`\`\`

-IMPORTANT! DO NOT do any work other than generate the OBSERVATIONS or PROGRESS SUMMARIES - and remember that you are a memory agent designed to summarize a DIFFERENT claude code session, not this one. Never reference yourself or your own actions. Never output anything other than the XML structures defined for observations and summaries. All other output is ignored and would be better left unsaid.
+IMPORTANT! DO NOT do any work right now other than generating this OBSERVATIONS from tool use messages - and remember that you are a memory agent designed to summarize a DIFFERENT claude code session, not this one. 
+
+Never reference yourself or your own actions. Do not output anything other than the observation content formatted in the XML structure above. All other output is ignored by the system, and the system has been designed to be smart about token usage. Please spend your tokens wisely on useful observations. 
+
+Remember that we record these observations as a way of helping us stay on track with our progress, and to help us keep important decisions and changes at the forefront of our minds! :) Thank you so much for your help!

 MEMORY PROCESSING START
 =======================`;
@@ -154,30 +163,30 @@ export function buildObservationPrompt(obs: Observation): string {
    toolOutput = obs.tool_output;  // If parse fails, use raw value
  }

-  return `<tool_used>
-  <tool_name>${obs.tool_name}</tool_name>
-  <tool_time>${new Date(obs.created_at_epoch).toISOString()}</tool_time>${obs.cwd ? `\n  <tool_cwd>${obs.cwd}</tool_cwd>` : ''}
-  <tool_input>${JSON.stringify(toolInput, null, 2)}</tool_input>
-  <tool_output>${JSON.stringify(toolOutput, null, 2)}</tool_output>
-</tool_used>`;
+  return `<observed_from_primary_session>
+  <what_happened>${obs.tool_name}</what_happened>
+  <occurred_at>${new Date(obs.created_at_epoch).toISOString()}</occurred_at>${obs.cwd ? `\n  <working_directory>${obs.cwd}</working_directory>` : ''}
+  <parameters>${JSON.stringify(toolInput, null, 2)}</parameters>
+  <outcome>${JSON.stringify(toolOutput, null, 2)}</outcome>
+</observed_from_primary_session>`;
 }

 /**
 * Build prompt to generate progress summary
 */
 export function buildSummaryPrompt(session: SDKSession): string {
-  const lastUserMessage = session.last_user_message || '';
+  const lastAssistantMessage = session.last_assistant_message || '';

  return `PROGRESS SUMMARY CHECKPOINT
 ===========================
 Write progress notes of what was done, what was learned, and what's next. This is a checkpoint to capture progress so far. The session is ongoing - you may receive more requests and tool executions after this summary. Write "next_steps" as the current trajectory of work (what's actively being worked on or coming up next), not as post-session future work. Always write at least a minimal summary explaining current progress, even if work is still in early stages, so that users see a summary output tied to each request.

-Last User Message:
-${lastUserMessage}
+Claude's Full Response to User:
+${lastAssistantMessage}

 Respond in this XML format:
 <summary>
-  <request>[Short title related to the last user message above]</request>
+  <request>[Short title capturing the user's request AND the substance of what was discussed/done]</request>
  <investigated>[What has been explored so far? What was examined?]</investigated>
  <learned>[What have you learned about how things work?]</learned>
  <completed>[What work has been completed so far? What has shipped or changed?]</completed>
@@ -185,7 +194,11 @@ Respond in this XML format:
  <notes>[Additional insights or observations about the current progress]</notes>
 </summary>

-IMPORTANT! DO NOT do any work other than generate the PROGRESS SUMMARY  - and remember that you are a memory agent designed to summarize a DIFFERENT claude code session, not this one. Never reference yourself or your own actions. Never output anything other than the XML structures defined for observations and summaries. All other output is ignored and would be better left unsaid.`;
+IMPORTANT! DO NOT do any work right now other than generating this next PROGRESS SUMMARY - and remember that you are a memory agent designed to summarize a DIFFERENT claude code session, not this one.
+
+Never reference yourself or your own actions. Do not output anything other than the summary content formatted in the XML structure above. All other output is ignored by the system, and the system has been designed to be smart about token usage. Please spend your tokens wisely on useful summary content.
+
+Thank you, this summary will be very useful for keeping track of our progress!`;
 }

 /**
@@ -210,43 +223,17 @@ IMPORTANT! DO NOT do any work other than generate the PROGRESS SUMMARY  - and re
 * First prompt: Uses buildInitPrompt instead (promptNumber === 1)
 */
 export function buildContinuationPrompt(userPrompt: string, promptNumber: number, claudeSessionId: string): string {
-  return `This is continuation prompt #${promptNumber} for session ${claudeSessionId} that you're observing. 
+  return `
+Hello memory agent, you are continuing to observe the primary Claude session.

-CRITICAL: Record what was LEARNED/BUILT/FIXED/DEPLOYED/CONFIGURED, not what you (the observer) are doing.
+<observed_from_primary_session>
+  <user_request>${userPrompt}</user_request>
+  <requested_at>${new Date().toISOString().split('T')[0]}</requested_at>
+</observed_from_primary_session>

-User's Goal: ${userPrompt}
-Date: ${new Date().toISOString().split('T')[0]}
+You do not have access to tools. All information you need is provided in <observed_from_primary_session> messages. Create observations from what you observe - no investigation needed.

-Your job is to continue monitoring the different Claude Code session happening RIGHT NOW, with the goal of creating observations and a progress summary as the work is being done LIVE by the user. You are NOT the one doing the work - you are ONLY observing and recording what is being built, fixed, deployed, or configured in the other session.
-
-WHAT TO RECORD
--------------
-Focus on deliverables and capabilities:
- What the system NOW DOES differently (new capabilities)
- What shipped to users/production (features, fixes, configs, docs)
- Changes in technical domains (auth, data, UI, infra, DevOps, docs)
-
-Use verbs like: implemented, fixed, deployed, configured, migrated, optimized, added, refactored
-
-✅ GOOD EXAMPLES (describes what was built):
- "Authentication now supports OAuth2 with PKCE flow"
- "Deployment pipeline runs canary releases with auto-rollback"
- "Database indexes optimized for common query patterns"
-
-❌ BAD EXAMPLES (describes observation process - DO NOT DO THIS):
- "Analyzed authentication implementation and stored findings"
- "Tracked deployment steps and logged outcomes"
- "Monitored database performance and recorded metrics"
-
-WHEN TO SKIP
------------
-Skip routine operations:
- Empty status checks
- Package installations with no errors
- Simple file listings
- Repetitive operations you've already documented
- If file related research comes back as empty or not found
- **No output necessary if skipping.**
+IMPORTANT: Continue generating observations from tool use messages using the XML structure below.

 OUTPUT FORMAT
 -------------
@@ -309,9 +296,10 @@ Output observations using this XML structure:
 </observation>
 \`\`\`

-IMPORTANT! DO NOT do any work other than generate the OBSERVATIONS or PROGRESS SUMMARIES - and remember that you are a memory agent designed to summarize a DIFFERENT claude code session, not this one. Never reference yourself or your own actions. Never output anything other than the XML structures defined for observations and summaries. All other output is ignored and would be better left unsaid.
+Never reference yourself or your own actions. Do not output anything other than the observation content formatted in the XML structure above. All other output is ignored by the system, and the system has been designed to be smart about token usage. Please spend your tokens wisely on useful observations.

-MEMORY PROCESSING START
-=======================`;
+Remember that we record these observations as a way of helping us stay on track with our progress, and to help us keep important decisions and changes at the forefront of our minds! :) Thank you so much for your continued help!

-}
+MEMORY PROCESSING CONTINUED
+===========================`;
+}
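
Note on `last_assistant_message`: the commit message mentions a function that extracts the last assistant message from the transcript, but that function is not included in this excerpt. Below is a rough sketch of what such an extraction could look like. The name `extractLastAssistantMessage`, the `TranscriptEntry` shape, and the assumption that each JSONL line carries a `type` field and a `message.content` that is either a string or an array of typed blocks are all illustrative guesses, not the schema used by the project.

```typescript
// Hypothetical shape of one transcript line; the real schema is not shown in this diff.
interface TranscriptEntry {
  type?: string;
  message?: {
    content?: string | Array<{ type: string; text?: string }>;
  };
}

// Scans JSONL text from the end and returns the text of the most recent
// assistant entry, or an empty string if none is found.
function extractLastAssistantMessage(jsonl: string): string {
  const lines = jsonl.split("\n").filter((line) => line.trim() !== "");

  for (const line of lines.slice().reverse()) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // Skip malformed lines instead of failing the whole scan.
    }
    if (typeof parsed !== "object" || parsed === null) continue;

    const entry = parsed as TranscriptEntry;
    if (entry.type !== "assistant") continue;

    const content = entry.message?.content;
    if (typeof content === "string" && content !== "") return content;
    if (Array.isArray(content)) {
      // Keep only text blocks; tool calls and other block types are ignored.
      const text = content
        .filter((block) => block.type === "text" && typeof block.text === "string")
        .map((block) => block.text as string)
        .join("\n");
      if (text !== "") return text;
    }
  }
  return "";
}
```

Returning an empty string when nothing is found matches the `session.last_assistant_message || ''` fallback in `buildSummaryPrompt`, so the summary prompt would still be well-formed when the transcript has no assistant text.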