Alex Newman 501e929138 fix: enhance session continuity by propagating session ID in SDKAgent and adding diagnostic logging

- Updated SDKAgent to include session.claudeSessionId in the options for resuming sessions.
- Added comprehensive logging across multiple files to trace session ID and prompt number flow, aiding in diagnosing session continuity issues.
- Introduced a detailed plan for addressing session continuity regression, outlining phases for logging, testing, and implementing fixes.

2025-12-27 20:03:31 -05:00

Session Continuity Regression Fix - Phased Execution Plan

Project: claude-mem
Issue: Session continuity broken - each prompt creates a new session instead of continuing the existing one
Root Cause: Session SDK ID not propagated correctly from new-hook through to SDKAgent
History: Recurring issue over 3 months, with 7 previous fix attempts that added complexity

Phase 1: Add Diagnostic Logging

Goal: Add comprehensive logging to trace session ID and prompt number flow through the entire system.

Context: Session continuity requires claudeSessionId to flow from hook → SessionStore → SessionManager → SDKAgent. We need to verify this flow is working correctly.

Files to Modify:

src/hooks/new-hook.ts
src/services/worker/http/routes/SessionRoutes.ts
src/services/worker/SessionManager.ts
src/services/worker/SDKAgent.ts

Implementation Steps:

1.1 Add Logging to `src/hooks/new-hook.ts`

Add logging at these locations:

Line ~24 (after receiving hook input):

console.log('[NEW-HOOK] Received hook input:', {
  session_id: hookInput.session_id,
  has_prompt: !!hookInput.prompt,
  cwd: hookInput.cwd
});

Line ~46-47 (before first API call):

console.log('[NEW-HOOK] Calling /api/sessions/init:', {
  claudeSessionId: session_id,
  project,
  prompt_length: prompt?.length
});

Line ~51 (after first API call):

console.log('[NEW-HOOK] Received from /api/sessions/init:', {
  sessionDbId: sessionData.sessionDbId,
  promptNumber: sessionData.promptNumber,
  skipped: sessionData.skipped
});

Line ~68 (before second API call):

console.log('[NEW-HOOK] Calling /sessions/{sessionDbId}/init:', {
  sessionDbId: sessionData.sessionDbId,
  promptNumber: sessionData.promptNumber,
  userPrompt_length: cleanedPrompt?.length
});

1.2 Add Logging to `src/services/worker/http/routes/SessionRoutes.ts`

In handleSessionInitByClaudeId method (~line 483):

console.log('[SESSION-ROUTES] handleSessionInitByClaudeId called:', {
  claudeSessionId,
  project,
  prompt_length: prompt?.length
});

After createSDKSession call (~line 493):

console.log('[SESSION-ROUTES] createSDKSession returned:', {
  sessionDbId,
  claudeSessionId
});

After prompt number calculation (~line 497):

console.log('[SESSION-ROUTES] Calculated promptNumber:', {
  sessionDbId,
  promptNumber,
  currentCount
});

In handleSessionInit method (~line 175):

const { userPrompt, promptNumber } = req.body;
console.log('[SESSION-ROUTES] handleSessionInit called:', {
  sessionDbId,
  promptNumber,
  has_userPrompt: !!userPrompt
});

1.3 Add Logging to `src/services/worker/SessionManager.ts`

In initializeSession method at start (~line 50):

console.log('[SESSION-MANAGER] initializeSession called:', {
  sessionDbId,
  promptNumber,
  has_currentUserPrompt: !!currentUserPrompt
});

When session exists in memory (~line 55):

console.log('[SESSION-MANAGER] Returning cached session:', {
  sessionDbId,
  claudeSessionId: session.claudeSessionId,
  lastPromptNumber: session.lastPromptNumber
});

After fetching from database (~line 87):

console.log('[SESSION-MANAGER] Fetched session from database:', {
  sessionDbId,
  claude_session_id: dbSession.claude_session_id,
  sdk_session_id: dbSession.sdk_session_id
});

When creating new session object (~line 109-116):

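// Log after the session object is built so the resolved value is visible
console.log('[SESSION-MANAGER] Creating new session object:', {
  sessionDbId,
  claudeSessionId: dbSession.claude_session_id,
  promptNumber, // value passed in; may be undefined
  lastPromptNumber: session.lastPromptNumber // value after any fallback was applied
});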

1.4 Add Logging to `src/services/worker/SDKAgent.ts`

In startSession method (~line 72):

console.log('[SDK-AGENT] Starting SDK query with:', {
  sessionDbId: session.sessionDbId,
  claudeSessionId: session.claudeSessionId,
  resume_parameter: session.claudeSessionId,
  lastPromptNumber: session.lastPromptNumber
});

In createMessageGenerator method (~line 200):

const isInitPrompt = session.lastPromptNumber === 1;
console.log('[SDK-AGENT] Creating message generator:', {
  sessionDbId: session.sessionDbId,
  claudeSessionId: session.claudeSessionId,
  lastPromptNumber: session.lastPromptNumber,
  isInitPrompt,
  promptType: isInitPrompt ? 'INIT' : 'CONTINUATION'
});
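Passing an object to console.log may print it across several lines, in which case a grep on the tag shows only the first line. The sketch below is one way to keep each log point on a single greppable line. `formatTrace` and `trace` are hypothetical helpers, not existing claude-mem functions, and writing to stderr is an assumption based on the Debugging Tips note that hook output appears in Claude Code's stderr.

```typescript
// Hypothetical helper, not part of claude-mem: one tagged JSON line per log point.
type TraceTag = 'NEW-HOOK' | 'SESSION-ROUTES' | 'SESSION-MANAGER' | 'SDK-AGENT';

function formatTrace(tag: TraceTag, event: string, fields: Record<string, unknown>): string {
  // JSON.stringify omits keys whose value is undefined. Mapping undefined to
  // null keeps a missing claudeSessionId visible in the log line.
  const json = JSON.stringify(fields, (_key: string, value: unknown) =>
    value === undefined ? null : value
  );
  return `[${tag}] ${event}: ${json}`;
}

function trace(tag: TraceTag, event: string, fields: Record<string, unknown>): void {
  // Assumption: diagnostics go to stderr rather than stdout.
  console.error(formatTrace(tag, event, fields));
}
```

A call such as `trace('NEW-HOOK', 'Received hook input', { session_id: hookInput.session_id })` would then replace the matching console.log.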

Success Criteria:

All 14 log points added across 4 files (4 in new-hook.ts, 4 in SessionRoutes.ts, 4 in SessionManager.ts, 2 in SDKAgent.ts)
Build succeeds with no TypeScript errors
Worker service restarts successfully

Handoff to Phase 2: After adding logging, build with npm run build-and-sync

Phase 2: Test and Gather Diagnostic Data

Goal: Execute test conversation and collect logs to identify where session ID propagation breaks.

Prerequisites: Phase 1 completed, logging in place, worker service running

Test Procedure:

2.1 Start Fresh Conversation

In a new Claude Code session:

Start the worker with output redirected to a fresh log file (the > redirect truncates any existing /tmp/worker-logs.txt): bun ~/.claude/plugins/marketplaces/thedotmack/scripts/worker-service.cjs > /tmp/worker-logs.txt 2>&1 &
Send first prompt: "test prompt 1"
Send second prompt: "test prompt 2"
Send third prompt: "test prompt 3"

2.2 Collect Logs

View worker logs:

tail -f /tmp/worker-logs.txt | grep -E '\[NEW-HOOK\]|\[SESSION-ROUTES\]|\[SESSION-MANAGER\]|\[SDK-AGENT\]'

2.3 Check Database State

Query 1 - Check sessions table:

cd ~/.claude-mem
sqlite3 claude-mem.db "SELECT id, claude_session_id, sdk_session_id, status, started_at FROM sdk_sessions ORDER BY id DESC LIMIT 10;"

Expected: Same claude_session_id for all 3 prompts

Query 2 - Check user prompts table:

sqlite3 claude-mem.db "SELECT claude_session_id, prompt_number, created_at FROM user_prompts ORDER BY created_at DESC LIMIT 10;"

Expected: Same claude_session_id with prompt_number: 1, 2, 3

2.4 Analyze Data Flow

For each prompt (1, 2, 3), trace in logs:

NEW-HOOK receives session_id from Claude Code
SESSION-ROUTES receives claudeSessionId in API call
SESSION-ROUTES creates/gets sessionDbId
SESSION-ROUTES calculates promptNumber
SESSION-MANAGER fetches/creates session with claudeSessionId
SDK-AGENT uses claudeSessionId as resume parameter
SDK-AGENT selects INIT vs CONTINUATION prompt

Key Questions to Answer:

Does session_id from hook stay the same across all 3 prompts?
Does claudeSessionId match across all log entries for same conversation?
Does promptNumber increment: 1, 2, 3?
Does lastPromptNumber match promptNumber in SessionManager?
Does SDK-AGENT receive correct resume parameter on prompts 2+?
Does SDK-AGENT select CONTINUATION prompt for prompts 2+?
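The questions above can be applied mechanically once the values for each prompt are copied out of the logs. The following is a minimal sketch under that assumption: `PromptTrace` and `findFirstFailure` are hypothetical names, and the records are filled in by hand rather than parsed from the log file.

```typescript
// Hypothetical shape: one record per prompt, filled in by hand from the logs.
interface PromptTrace {
  hookSessionId?: string;    // session_id from [NEW-HOOK]
  claudeSessionId?: string;  // claudeSessionId from [SDK-AGENT]
  promptNumber?: number;     // promptNumber from [SESSION-ROUTES]
  lastPromptNumber?: number; // lastPromptNumber from [SESSION-MANAGER]
  resumeParameter?: string;  // resume_parameter from [SDK-AGENT]
  promptType?: 'INIT' | 'CONTINUATION';
}

// Checks one prompt (numbered from 1) against the first prompt's session ID.
function checkPrompt(t: PromptTrace, n: number, firstId: string): string | null {
  if (t.hookSessionId !== firstId) return `prompt ${n}: session_id changed at the hook`;
  if (t.claudeSessionId !== firstId) return `prompt ${n}: claudeSessionId does not match the hook session_id`;
  if (t.promptNumber !== n) return `prompt ${n}: promptNumber is ${t.promptNumber}`;
  if (t.lastPromptNumber !== t.promptNumber) return `prompt ${n}: lastPromptNumber differs from promptNumber`;
  if (n > 1 && t.resumeParameter !== firstId) return `prompt ${n}: resume parameter is ${t.resumeParameter}`;
  const expectedType = n === 1 ? 'INIT' : 'CONTINUATION';
  if (t.promptType !== expectedType) return `prompt ${n}: expected ${expectedType} prompt`;
  return null;
}

// Returns a description of the first failing check, or null if every check passes.
function findFirstFailure(traces: PromptTrace[]): string | null {
  const firstId = traces[0]?.hookSessionId;
  if (firstId === undefined) return 'no session_id recorded for prompt 1';
  const failures = traces.map((t, i) => checkPrompt(t, i + 1, firstId));
  return failures.find((f) => f !== null) ?? null;
}
```

The checks run in the same order as the data flow, so the first message returned points at the earliest component where a value goes wrong.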

Success Criteria:

Logs collected for 3 test prompts
Database queries run and results saved
Data flow analysis completed
Failure point identified

Handoff to Phase 3: Document exact failure point (which log entry shows incorrect value) and move to fix implementation

Phase 3: Implement Fix Based on Findings

Goal: Fix the identified root cause of session continuity failure.

Prerequisites: Phase 2 completed, failure point identified from logs/database

Common Fix Scenarios:

Scenario A: Hook Receives Different `session_id` Each Time

Symptom in Logs:

[NEW-HOOK] Received hook input: { session_id: 'abc-123', ... }  // Prompt 1
[NEW-HOOK] Received hook input: { session_id: 'def-456', ... }  // Prompt 2 - DIFFERENT!

Root Cause: Hook not receiving consistent session ID from Claude Code

Fix Location: This is external to codebase - investigate Claude Code hook configuration or report bug

Action: Create GitHub issue in claude-code repo with evidence

Scenario B: `promptNumber` Not Passed or Calculated Correctly

Symptom in Logs:

[SESSION-ROUTES] Calculated promptNumber: { promptNumber: 1, currentCount: 1 }  // Prompt 2 - WRONG!

Root Cause: User prompt not being saved to database, or count query failing

Fix Location: src/services/worker/http/routes/SessionRoutes.ts line 520

Fix:

// Add error handling around saveUserPrompt
try {
  this.dbManager.getSessionStore().saveUserPrompt(
    claudeSessionId,
    promptNumber,
    cleanedPrompt
  );
  console.log('[SESSION-ROUTES] Successfully saved user prompt:', {
    claudeSessionId,
    promptNumber
  });
} catch (error) {
  // Under strict TypeScript a caught value is `unknown`, so narrow it before reading .message
  const message = error instanceof Error ? error.message : String(error);
  console.error('[SESSION-ROUTES] Failed to save user prompt:', error);
  throw new Error(`Failed to save user prompt: ${message}`);
}

Scenario C: Session Manager Uses Wrong Fallback Logic

Symptom in Logs:

[SESSION-MANAGER] Creating new session object: { lastPromptNumber: 1 }  // Prompt 2 - WRONG!

Root Cause: Fragile || operator causing incorrect fallback when promptNumber is valid

Fix Location: src/services/worker/SessionManager.ts line 116

Fix:

// Replace fragile || with explicit undefined check
lastPromptNumber: promptNumber !== undefined
  ? promptNumber
  : this.dbManager.getSessionStore().getPromptNumberFromUserPrompts(dbSession.claude_session_id),
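To make the difference concrete, here is a small standalone comparison of the two fallback styles. The function names are illustrative only. With 1-based prompt numbers the two agree, so the difference shows up only when a falsy value such as 0 reaches the expression.

```typescript
// Falls back whenever promptNumber is falsy: undefined, but also 0 and NaN.
function resolveWithOr(promptNumber: number | undefined, fallback: number): number {
  return promptNumber || fallback;
}

// Falls back only when promptNumber is actually missing.
function resolveWithCheck(promptNumber: number | undefined, fallback: number): number {
  return promptNumber !== undefined ? promptNumber : fallback;
}

// Both agree for ordinary values and for undefined...
const agreeOnValue = resolveWithOr(2, 5) === resolveWithCheck(2, 5);                 // true
const agreeOnMissing = resolveWithOr(undefined, 5) === resolveWithCheck(undefined, 5); // true

// ...but differ for 0: || discards it, the explicit check keeps it.
const orResult = resolveWithOr(0, 5);       // 5
const checkResult = resolveWithCheck(0, 5); // 0
```

If the value can also be null, the nullish coalescing operator (`promptNumber ?? fallback`) covers both null and undefined in one expression.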

Scenario D: Database Session Not Found

Symptom in Logs:

[SESSION-MANAGER] Fetched session from database: { claude_session_id: undefined }

Root Cause: createSDKSession INSERT failed silently, or session was deleted

Fix Location: src/services/sqlite/SessionStore.ts line 1086-1101

Fix:

// Add validation after INSERT OR IGNORE.
// 'active' is a literal in the SQL, so only the six ? placeholders are bound.
this.db.prepare(`
  INSERT OR IGNORE INTO sdk_sessions
  (claude_session_id, sdk_session_id, project, user_prompt, started_at, started_at_epoch, status)
  VALUES (?, ?, ?, ?, ?, ?, 'active')
`).run(claudeSessionId, claudeSessionId, project, userPrompt, now, nowEpoch);

// Verify the session exists, whether it was just inserted or was already present
const row = this.db.prepare('SELECT id FROM sdk_sessions WHERE claude_session_id = ?')
  .get(claudeSessionId) as { id: number } | undefined;

if (!row) {
  throw new Error(`Failed to create or retrieve SDK session for claudeSessionId: ${claudeSessionId}`);
}

return row.id;

Scenario E: SDK Agent Receives Empty `claudeSessionId`

Symptom in Logs:

[SDK-AGENT] Starting SDK query with: { claudeSessionId: undefined, resume_parameter: undefined }

Root Cause: SessionManager created session object with missing claudeSessionId

Fix Location: src/services/worker/SessionManager.ts line 109

Fix:

// Add validation before using database values
if (!dbSession.claude_session_id) {
  throw new Error(`Database session ${sessionDbId} has no claude_session_id`);
}

session = {
  sessionDbId,
  claudeSessionId: dbSession.claude_session_id,
  // ... rest of session object
};
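The same validation can be pulled into a small function so that it is testable without a database. This is a sketch only: `DbSessionRow`, `ActiveSession`, and `toActiveSession` are hypothetical names, and the real session object carries more fields than shown here.

```typescript
// Hypothetical, trimmed-down shapes; the real row and session have more fields.
interface DbSessionRow {
  claude_session_id: string | null;
}

interface ActiveSession {
  sessionDbId: number;
  claudeSessionId: string;
}

// Fails loudly instead of building a session with a missing ID.
function toActiveSession(sessionDbId: number, row: DbSessionRow | undefined): ActiveSession {
  if (!row) {
    throw new Error(`Session ${sessionDbId} not found in database`);
  }
  if (!row.claude_session_id) {
    throw new Error(`Database session ${sessionDbId} has no claude_session_id`);
  }
  // After the checks above, TypeScript narrows claude_session_id to string.
  return { sessionDbId, claudeSessionId: row.claude_session_id };
}
```

Because the return type requires a string, later code such as the SDK agent cannot receive an undefined claudeSessionId from this path without a compile error.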

Success Criteria:

Fix implemented at identified failure point
Validation added to fail loudly on errors
Build succeeds
Worker service restarts successfully

Handoff to Phase 4: Build and deploy fix, then run verification tests

Phase 4: Verify Fix and Test Session Continuity

Goal: Confirm session continuity is working correctly after fix.

Prerequisites: Phase 3 completed, fix deployed, worker service running

Verification Procedure:

4.1 Run Full Test Conversation

In a fresh Claude Code session:

Prompt 1: "This is test prompt one for session continuity"
Prompt 2: "This is test prompt two, continuing the session"
Prompt 3: "This is test prompt three, still continuing"
Prompt 4: "Final test prompt four"

4.2 Check Logs

Verify in worker logs:

All prompts show same session_id:

[NEW-HOOK] Received hook input: { session_id: 'abc-123' }  // All 4 prompts

Prompt numbers increment:

[SESSION-ROUTES] Calculated promptNumber: { promptNumber: 1 }  // Prompt 1
[SESSION-ROUTES] Calculated promptNumber: { promptNumber: 2 }  // Prompt 2
[SESSION-ROUTES] Calculated promptNumber: { promptNumber: 3 }  // Prompt 3
[SESSION-ROUTES] Calculated promptNumber: { promptNumber: 4 }  // Prompt 4

SDK Agent uses continuation prompts:

[SDK-AGENT] Creating message generator: { promptType: 'INIT' }          // Prompt 1
[SDK-AGENT] Creating message generator: { promptType: 'CONTINUATION' }  // Prompt 2
[SDK-AGENT] Creating message generator: { promptType: 'CONTINUATION' }  // Prompt 3
[SDK-AGENT] Creating message generator: { promptType: 'CONTINUATION' }  // Prompt 4

4.3 Verify Database State

Check sessions table:

sqlite3 ~/.claude-mem/claude-mem.db "SELECT id, claude_session_id, sdk_session_id FROM sdk_sessions ORDER BY id DESC LIMIT 5;"

Expected: Only ONE session record for the 4 prompts, claude_session_id and sdk_session_id are identical

Check user_prompts table:

sqlite3 ~/.claude-mem/claude-mem.db "SELECT claude_session_id, prompt_number, created_at FROM user_prompts ORDER BY created_at DESC LIMIT 5;"

Expected: 4 records with same claude_session_id, prompt_number values: 4, 3, 2, 1
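The expectation above can also be checked in code once the rows are loaded. The sketch below assumes the rows have already been read from user_prompts by whatever means is convenient. `UserPromptRow` and `verifyUserPromptRows` are hypothetical names.

```typescript
// Hypothetical row shape matching the two columns used by the check.
interface UserPromptRow {
  claude_session_id: string;
  prompt_number: number;
}

// Returns a list of problems; an empty list means the rows look correct.
function verifyUserPromptRows(rows: UserPromptRow[], expectedCount: number): string[] {
  const problems: string[] = [];

  // All rows for one conversation should share a single claude_session_id.
  const sessionIds = new Set(rows.map((r) => r.claude_session_id));
  if (sessionIds.size !== 1) {
    problems.push(`expected 1 claude_session_id, found ${sessionIds.size}`);
  }

  // Row order does not matter (the query returns newest first), so sort before comparing.
  const numbers = rows.map((r) => r.prompt_number).sort((a, b) => a - b);
  const expected = Array.from({ length: expectedCount }, (_, i) => i + 1);
  const matches = numbers.length === expected.length && numbers.every((n, i) => n === expected[i]);
  if (!matches) {
    problems.push(`expected prompt_number values ${expected.join(', ')}, found ${numbers.join(', ')}`);
  }

  return problems;
}
```

For the four-prompt test conversation, `verifyUserPromptRows(rows, 4)` returning an empty array corresponds to the expected database state.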

4.4 Functional Test

Verify actual session continuity behavior:

Prompt 1: "My favorite color is blue"
Prompt 2: "What is my favorite color?"
- Expected: Response mentions "blue"
Prompt 3: "Change it to red"
Prompt 4: "What is my favorite color now?"
- Expected: Response mentions "red"

Success Criteria:

Same session_id across all 4 prompts in logs
Prompt numbers increment: 1, 2, 3, 4
INIT prompt only for first prompt
CONTINUATION prompts for prompts 2, 3, 4
Only one session record in database
Four user_prompts records with incremental prompt_number
Functional test shows session continuity working

Handoff to Phase 5: If all criteria pass, proceed to cleanup. If any fail, return to Phase 2 with new diagnostic focus.

Phase 5: Cleanup and Documentation

Goal: Remove excessive logging, update documentation, close issues.

Prerequisites: Phase 4 completed successfully, session continuity verified working

Cleanup Steps:

5.1 Reduce Logging Verbosity (Optional)

You can either:

Keep all diagnostic logging for future debugging (recommended)
Remove logging to reduce noise in production logs
Convert to debug level if logging framework supports it

If removing logging, remove the console.log statements added in Phase 1 from:

src/hooks/new-hook.ts
src/services/worker/http/routes/SessionRoutes.ts
src/services/worker/SessionManager.ts
src/services/worker/SDKAgent.ts

5.2 Update Documentation

If the fix revealed any architectural insights, update:

CLAUDE.md - Add any new gotchas or patterns discovered
README.md - Update if user-facing behavior changed
Code comments - Document the fix rationale

5.3 Create Regression Test (Future Work)

Consider adding automated test:

describe('Session Continuity', () => {
  it('should use same session ID across multiple prompts', async () => {
    // Test that verifies session ID propagation
  });

  it('should increment prompt numbers correctly', async () => {
    // Test that verifies prompt number calculation
  });
});
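As a starting point for filling in those test bodies, the sketch below exercises the two properties against an in-memory stand-in. Everything here is an assumption made for illustration: `FakeSessionStore` is not the real SessionStore, and the rule that promptNumber equals the number of saved prompts plus one is assumed rather than taken from SessionRoutes.ts. Plain throw-based checks are used so that the sketch does not depend on a test framework.

```typescript
// In-memory stand-in for SessionStore (hypothetical; the real one is SQLite-backed).
class FakeSessionStore {
  private nextId = 1;
  private sessionIds = new Map<string, number>();   // claudeSessionId -> sessionDbId
  private promptCounts = new Map<string, number>(); // claudeSessionId -> saved prompts

  // Mimics INSERT OR IGNORE followed by SELECT: same ID in, same row ID out.
  createSDKSession(claudeSessionId: string): number {
    const existing = this.sessionIds.get(claudeSessionId);
    if (existing !== undefined) return existing;
    const id = this.nextId++;
    this.sessionIds.set(claudeSessionId, id);
    return id;
  }

  getPromptCount(claudeSessionId: string): number {
    return this.promptCounts.get(claudeSessionId) ?? 0;
  }

  saveUserPrompt(claudeSessionId: string): void {
    this.promptCounts.set(claudeSessionId, this.getPromptCount(claudeSessionId) + 1);
  }
}

// Assumed init logic: promptNumber = prompts saved so far + 1.
function initPrompt(store: FakeSessionStore, claudeSessionId: string) {
  const sessionDbId = store.createSDKSession(claudeSessionId);
  const promptNumber = store.getPromptCount(claudeSessionId) + 1;
  store.saveUserPrompt(claudeSessionId);
  return { sessionDbId, promptNumber };
}

function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message);
}

function runContinuityChecks(): void {
  const store = new FakeSessionStore();
  const results = [1, 2, 3].map(() => initPrompt(store, 'abc-123'));

  // Same session ID across multiple prompts
  check(new Set(results.map((r) => r.sessionDbId)).size === 1, 'sessionDbId changed between prompts');
  // Prompt numbers increment correctly
  check(results.map((r) => r.promptNumber).join(',') === '1,2,3', 'prompt numbers did not increment');

  // A different conversation gets its own session and starts again at 1
  const other = initPrompt(store, 'def-456');
  check(other.promptNumber === 1, 'new conversation should start at prompt 1');
  check(!results.some((r) => r.sessionDbId === other.sessionDbId), 'new conversation reused a session');
}
```

A real regression test would swap the fake for the actual SessionStore against a temporary database and keep the same assertions.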

5.4 Close Related Issues

Search GitHub for related issues:

gh issue list --search "session continuity" --state open
gh issue list --search "session persistence" --state open
gh issue list --search "new session" --state open

Close with comment explaining the fix.

Success Criteria:

Logging cleaned up as desired
Documentation updated
Related GitHub issues closed
No regressions introduced

Quick Reference

Key Files and What They Do

| File | Purpose | Critical Lines |
|---|---|---|
| `src/hooks/new-hook.ts` | Hook entry point, receives session_id from Claude Code | 24, 34, 46-47, 63-68 |
| `src/services/worker/http/routes/SessionRoutes.ts` | HTTP endpoints for session init, calculates prompt numbers | 482-533, 171-227 |
| `src/services/sqlite/SessionStore.ts` | Database operations for sessions and user prompts | 1086-1101, 1053-1058 |
| `src/services/worker/SessionManager.ts` | In-memory session management, bridges DB and SDK | 49-141, esp. 109, 116 |
| `src/services/worker/SDKAgent.ts` | SDK integration, sends resume parameter and prompts | 68-77, 195-218, 200-202 |
| `src/sdk/prompts.ts` | Init and continuation prompt templates | 30-87, 169-229 |

Build and Deploy Commands

# Build TypeScript
npm run build

# Sync to marketplace and restart worker
npm run build-and-sync

# Restart worker only
killall bun
bun ~/.claude/plugins/marketplaces/thedotmack/scripts/worker-service.cjs &

# Check worker is running
curl http://localhost:37777/health
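For scripted checks, the health probe can be wrapped in a small function. This is a hedged sketch: `isWorkerHealthy` and `FetchLike` are hypothetical names, the function only looks at whether the response status is OK, and the fetch implementation is injected so that the check can be exercised without a running worker.

```typescript
// Minimal subset of the fetch API that this check relies on.
type FetchLike = (url: string) => Promise<{ ok: boolean; status: number }>;

// Resolves to true when GET {baseUrl}/health answers with a 2xx status.
async function isWorkerHealthy(
  fetchImpl: FetchLike,
  baseUrl: string = 'http://localhost:37777'
): Promise<boolean> {
  try {
    const response = await fetchImpl(`${baseUrl}/health`);
    return response.ok;
  } catch {
    // Connection refused or similar: treat the worker as not running.
    return false;
  }
}
```

In a runtime that provides a global `fetch`, it can be passed directly as `isWorkerHealthy(fetch)`.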

Database Queries

# Check sessions
sqlite3 ~/.claude-mem/claude-mem.db "SELECT * FROM sdk_sessions ORDER BY id DESC LIMIT 10;"

# Check user prompts
sqlite3 ~/.claude-mem/claude-mem.db "SELECT * FROM user_prompts ORDER BY created_at DESC LIMIT 10;"

# Count prompts per session
sqlite3 ~/.claude-mem/claude-mem.db "SELECT claude_session_id, COUNT(*) as prompt_count FROM user_prompts GROUP BY claude_session_id ORDER BY prompt_count DESC LIMIT 10;"

Debugging Tips

Check worker is running: curl http://localhost:37777/health
View worker logs: tail -f /tmp/worker-logs.txt
Check hook output: Logs appear in Claude Code's stderr
Database locked: killall bun then restart worker
Stale build: rm -rf plugin/scripts/*.js && npm run build

Phase Execution Checklist

Use this checklist when executing phases in new chat contexts:

Phase 1: Diagnostic Logging

Read this plan document
Read the 4 files to modify
Add all 14 log points
Build with npm run build-and-sync
Verify worker restarts
Mark phase complete, handoff to Phase 2

Phase 2: Test and Gather Data

Read Phase 2 section
Run 3 test prompts
Collect and save logs
Run database queries
Trace data flow
Identify failure point
Document failure point
Mark phase complete, handoff to Phase 3

Phase 3: Implement Fix

Read Phase 3 section
Review failure point from Phase 2
Select applicable scenario
Implement fix
Add validation
Build and deploy
Mark phase complete, handoff to Phase 4

Phase 4: Verify Fix

Read Phase 4 section
Run 4 test prompts
Check logs for correct behavior
Verify database state
Run functional test
All success criteria pass
Mark phase complete, handoff to Phase 5

Phase 5: Cleanup

Read Phase 5 section
Clean up logging (optional)
Update documentation
Close GitHub issues
Mark phase complete
Session continuity regression FIX COMPLETE ✅

Context for New Chat Sessions

When starting a new phase, provide this context:

I'm working on Phase [X] of the Session Continuity Regression Fix for claude-mem.

Background: Session continuity is broken - each prompt creates a new session instead of continuing. This has been a recurring issue for 3 months. The root cause is that session SDK ID is not being propagated correctly from new-hook through to SDKAgent.

Current Status: [Briefly describe what previous phases accomplished]

This Phase Goal: [Copy the goal from the phase section]

Plan Document: Read /Users/alexnewman/Scripts/claude-mem/PLAN-SESSION-CONTINUITY-FIX.md for full context.

Success Metrics

Overall Fix Success:

Same session ID used across multiple prompts in one conversation
Prompt numbers increment correctly (1, 2, 3, ...)
Init prompt only sent on first prompt
Continuation prompts sent on subsequent prompts
SDK receives correct resume parameter
Only one session record created per conversation
Functional session continuity test passes
No new regressions introduced

Regression Prevention:

Validation added to fail loudly on errors
No silent fallbacks that hide bugs
Database queries verified
Session ID propagation explicitly tested

Last Updated: 2025-12-27
Author: Claude (investigating 3-month recurring session continuity regression)
