Files
claude-mem/.claude/plans/fix-stale-session-resume-crash.md
T
Alex Newman e1ab73decc feat: Live Context System with Distributed CLAUDE.md Generation (#556)
* docs: add folder index generator plan

RFC for auto-generating folder-level CLAUDE.md files with observation
timelines. Includes IDE symlink support and root CLAUDE.md integration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: implement folder index generator (Phase 1)

Add automatic CLAUDE.md generation for folders containing observed files.
This enables IDE context providers to access relevant memory observations.

Core modules:
- FolderDiscovery: Extract folders from observation file paths
- FolderTimelineCompiler: Compile chronological timeline per folder
- ClaudeMdGenerator: Write CLAUDE.md with tag-based content replacement
- FolderIndexOrchestrator: Coordinate regeneration on observation save

Integration:
- Event-driven regeneration after observation save in ResponseProcessor
- HTTP endpoints for folder discovery, timeline, and manual generation
- Settings for enabling/configuring folder index behavior

The <claude-mem-context> tag wrapping ensures:
- Manual CLAUDE.md content is preserved
- Auto-generated content won't be recursively observed
- Clean separation between user and system content

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: add updateFolderClaudeMd function to CursorHooksInstaller

Adds function to update CLAUDE.md files for folders touched by observations.
Uses existing /api/search/by-file endpoint, preserves content outside
<claude-mem-context> tags, and writes atomically via temp file + rename.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: hook updateFolderClaudeMd into ResponseProcessor

Calls updateFolderClaudeMd after observation save to update folder-level
CLAUDE.md files. Uses fire-and-forget pattern with error logging.
Extracts file paths from saved observations and workspace path from registry.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: add timeline formatting for folder CLAUDE.md files

Implements formatTimelineForClaudeMd function that transforms API response
into compact markdown table format. Converts emojis to text labels,
handles ditto marks for timestamps, and groups under "Recent" header.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor: remove old folder-index implementation

Deletes redundant folder-index services that were replaced by the simpler
updateFolderClaudeMd approach in CursorHooksInstaller.ts.

Removed:
- src/services/folder-index/ directory (5 files)
- FolderIndexRoutes.ts
- folder-index settings from SettingsDefaultsManager
- folder-index route registration from worker-service

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: add worktree-aware project filtering for unified timelines

Detect git worktrees and show both parent repo and worktree observations
in the session start timeline. When running in a worktree, the context
now includes observations from both projects, interleaved chronologically.

- Add detectWorktree() utility to identify worktree directories
- Add getProjectContext() to return parent + worktree projects
- Update context hook to pass multi-project queries
- Add queryObservationsMulti() and querySummariesMulti() for IN clauses
- Maintain backward compatibility with single-project queries

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* fix: restructure logging to prove session correctness and reduce noise

Add critical logging at each stage of the session lifecycle to prove the session ID chain (contentSessionId → sessionDbId → memorySessionId) stays aligned. New logs include CREATED, ENQUEUED, CLAIMED, MEMORY_ID_CAPTURED, STORING, and STORED. Move intermediate migration and backfill progress logs to DEBUG level to reduce noise, keeping only essential initialization and completion logs at INFO level.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* refactor: extract folder CLAUDE.md utils to shared location

Moves folder CLAUDE.md utilities from CursorHooksInstaller to a new
shared utils file. Removes Cursor registry dependency - file paths
from observations are already absolute, no workspace lookup needed.

New file: src/utils/claude-md-utils.ts
- replaceTaggedContent() - preserves user content outside tags
- writeClaudeMdToFolder() - atomic writes with tag preservation
- formatTimelineForClaudeMd() - API response to compact markdown
- updateFolderClaudeMdFiles() - orchestrates folder updates

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: trigger folder CLAUDE.md updates when observations are saved

The folder CLAUDE.md update was previously only triggered in
syncAndBroadcastSummary, but summaries run with observationCount=0
(observations are saved separately). Moved the update logic to
syncAndBroadcastObservations where file paths are available.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* all the claudes

* test: add unit tests for claude-md-utils pure functions

Add 11 tests covering replaceTaggedContent and formatTimelineForClaudeMd:
- replaceTaggedContent: empty content, tag replacement, appending, partial tags
- formatTimelineForClaudeMd: empty input, parsing, ditto marks, session IDs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test: add integration tests for file operation functions

Add 9 tests for writeClaudeMdToFolder and updateFolderClaudeMdFiles:
- writeClaudeMdToFolder: folder creation, content preservation, nested dirs, atomic writes
- updateFolderClaudeMdFiles: empty skip, fetch/write, deduplication, error handling

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test: add unit tests for timeline-formatting utilities

Add 14 tests for extractFirstFile and groupByDate functions:
- extractFirstFile: relative paths, fallback to files_read, null handling, invalid JSON
- groupByDate: empty arrays, date grouping, chronological sorting, item preservation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore: rebuild plugin scripts with merged features

* docs: add project-specific CLAUDE.md with architecture and development notes

* fix: exclude project root from auto-generated CLAUDE.md updates

Skip folders containing .git directory when auto-updating subfolder
CLAUDE.md files. This ensures:

1. Root CLAUDE.md remains user-managed and untouched by the system
2. SessionStart context injection stays pristine throughout the session
3. Subfolder CLAUDE.md files continue to receive live context updates
4. Cleaner separation between user-authored root docs and auto-generated folder indexes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: prevent crash from resuming stale SDK sessions on worker restart

When the worker restarts, it was incorrectly passing the `resume` parameter
to INIT prompts (lastPromptNumber=1) when a memorySessionId existed from a
previous SDK session. This caused "Claude Code process exited with code 1"
crashes because the SDK tried to resume into a session that no longer exists.

Root cause: The resume condition only checked `hasRealMemorySessionId` but
did not verify that this was a CONTINUATION prompt (lastPromptNumber > 1).

Fix: Add `session.lastPromptNumber > 1` check to the resume condition:
- Before: `...(hasRealMemorySessionId && { resume: session.memorySessionId })`
- After: `...(hasRealMemorySessionId && session.lastPromptNumber > 1 && { resume: ... })`

Also added:
- Enhanced debug logging that warns when skipping resume for INIT prompts
- Unit tests in tests/sdk-agent-resume.test.ts (9 test cases)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: properly handle Chroma MCP connection errors

Previously, ensureCollection() caught ALL errors from chroma_get_collection_info
and assumed they meant "collection doesn't exist", triggering unnecessary
collection creation attempts. Connection errors like "Not connected" or
"MCP error -32000: Connection closed" would cascade into failed creation attempts.

Similarly, queryChroma() would silently return empty results when the MCP call
failed, masking the underlying connection problem.

Changes:
- ensureCollection(): Detect connection errors and re-throw immediately instead
  of attempting collection creation
- queryChroma(): Wrap MCP call in try-catch and throw connection errors instead
  of returning empty results
- Both methods reset connection state (connected=false, client=null) on
  connection errors so subsequent operations can attempt to reconnect

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* pushed

* fix: scope regenerate-claude-md.ts to current working directory

Critical bug fix: The script was querying ALL observations from the database
across ALL projects ever recorded (1396+ folders), then attempting to write
CLAUDE.md files everywhere including other projects, non-existent paths, and
ignored directories.

Changes:
- Use git ls-files to discover folders (respects .gitignore automatically)
- Filter database query to current project only (by folder name)
- Use relative paths for database queries (matches storage format)
- Add --clean flag to remove auto-generated CLAUDE.md files
- Add fallback directory walker for non-git repos

Now correctly scopes to 26 folders with observations instead of 1396+.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs and adjustments

* fix: cleanup mode strips tags instead of deleting files blindly

The cleanup mode was incorrectly deleting entire files that contained
<claude-mem-context> tags. The correct behavior (per original design):

1. Strip the <claude-mem-context>...</claude-mem-context> section
2. If empty after stripping → delete the file
3. If has remaining content → save the stripped version

Now properly preserves user content in CLAUDE.md files while removing
only the auto-generated sections.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* deleted some files

* chore: regenerate folder CLAUDE.md files with fixed script

Regenerated 23 folder CLAUDE.md files using the corrected script that:
- Scopes to current working directory only
- Uses git ls-files to respect .gitignore
- Filters by project name

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update CLAUDE.md files for January 5, 2026

- Regenerated and staged 23 CLAUDE.md files with a mix of new and modified content.
- Fixed cleanup mode to properly strip tags instead of deleting files blindly.
- Cleaned up empty CLAUDE.md files from various directories, including ~/.claude and ~/Scripts.
- Conducted dry-run cleanup that identified a significant reduction in auto-generated CLAUDE.md files.
- Removed the isAutoGeneratedClaudeMd function due to incorrect file deletion behavior.

* feat: use settings for observation limit in batch regeneration script

Replace hard-coded limit of 10 with configurable CLAUDE_MEM_CONTEXT_OBSERVATIONS
setting (default: 50). This allows users to control how many observations appear
in folder CLAUDE.md files.

Changes:
- Import SettingsDefaultsManager and load settings at script startup
- Use OBSERVATION_LIMIT constant derived from settings at both call sites
- Remove stale default parameter from findObservationsByFolder function

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: use settings for observation limit in event-driven updates

Replace hard-coded limit of 10 in updateFolderClaudeMdFiles with
configurable CLAUDE_MEM_CONTEXT_OBSERVATIONS setting (default: 50).

Changes:
- Import SettingsDefaultsManager and os module
- Load settings at function start (once, not in loop)
- Use limit from settings in API call

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: Implement configurable observation limits and enhance search functionality

- Added configurable observation limits to batch regeneration scripts.
- Enhanced SearchManager to handle folder queries and normalize parameters.
- Introduced methods to check for direct child files in observations and sessions.
- Updated SearchOptions interface to include isFolder flag for filtering.
- Improved code quality with comprehensive reviews and anti-pattern checks.
- Cleaned up auto-generated CLAUDE.md files across various directories.
- Documented recent changes and improvements in CLAUDE.md files.

* build asset

* Project Context from Claude-Mem auto-added (can be auto removed at any time)

* CLAUDE.md updates

* fix: resolve CLAUDE.md files to correct directory in worktree setups

When using git worktrees, CLAUDE.md files were being written relative to
the worker's process.cwd() instead of the actual project directory. This
fix threads the project's cwd from message processing through to the file
writing utilities, ensuring CLAUDE.md files are created in the correct
project directory regardless of where the worker was started.

Changes:
- Add projectRoot parameter to updateFolderClaudeMdFiles for path resolution
- Thread projectRoot through ResponseProcessor call chain
- Track lastCwd from messages in SDKAgent, GeminiAgent, OpenRouterAgent
- Add tests for relative/absolute path handling with projectRoot

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* more project context updates

* context updates

* fix: preserve actual dates in folder CLAUDE.md generation

Previously, formatTimelineForClaudeMd used today's date for all
observations because the API only returned time (e.g., "4:30 PM")
without date information. This caused all historical observations
to appear as if they happened today.

Changes:
- SearchManager.findByFile now groups results by date with headers
  (e.g., "### Jan 4, 2026") matching formatSearchResults behavior
- formatTimelineForClaudeMd now parses these date headers and uses
  the correct date when constructing epochs for date grouping

The timeline dates are critical for claude-mem context - LLMs need
accurate temporal context to understand when work happened.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* build: update worker assets with date parsing fix

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* claude-mem context: Fixed critical date parsing bug in PR #556

* fix: address PR #556 review items

- Use getWorkerHost() instead of hard-coded 127.0.0.1 in claude-md-utils
- Add error message and stack details to FOLDER_INDEX logging
- Add 5 new tests for worktree/projectRoot path resolution

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Refactor CLAUDE documentation across multiple components and tests

- Updated CLAUDE.md files in src/ui/viewer, src/ui/viewer/constants, src/ui/viewer/hooks, tests/server, tests/worker/agents, and plans to reflect recent changes and improvements.
- Removed outdated entries and consolidated recent activities for clarity.
- Enhanced documentation for hooks, settings, and pagination implementations.
- Streamlined test suite documentation for server and worker agents, indicating recent test audits and cleanup efforts.
- Adjusted plans to remove obsolete entries and focus on current implementation strategies.

* docs: comprehensive v9.0 documentation audit and updates

- Add usage/folder-context to docs.json navigation (was documented but hidden!)
- Update introduction.mdx with v9.0 release notes (Live Context, Worktree Support, Windows Fixes)
- Add CLAUDE_MEM_WORKER_HOST setting to configuration.mdx
- Add Folder Context Files section with link to detailed docs
- Document worktree support in folder-context.mdx
- Update terminology from "mem-search skill" to "MCP tools" throughout active docs
- Update Search Pipeline in architecture/overview.mdx
- Update usage/getting-started.mdx with MCP tools terminology
- Update usage/claude-desktop.mdx title and terminology
- Update hooks-architecture.mdx reference

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: add recent activity log for worker CLI with detailed entries

* chore: update CLAUDE.md context files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: add brainstorming report for CLAUDE.md distribution architecture

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-05 22:41:42 -05:00

9.1 KiB

Plan: Fix Stale Session Resume Crash

Problem Summary

The worker crashes repeatedly with "Claude Code process exited with code 1" when attempting to resume into a stale/non-existent SDK session.

Root Cause: In SDKAgent.ts:94, the resume parameter is passed whenever memorySessionId exists in the database, regardless of whether this is an INIT prompt or CONTINUATION prompt. When a worker restarts or re-initializes a session, it loads a stale memorySessionId from a previous SDK session and tries to resume into a session that no longer exists in Claude's context.

Evidence from logs:

[17:30:21.773] Starting SDK query {
  hasRealMemorySessionId=true,           ← DB has old memorySessionId
  resume_parameter=5439891b-...,         ← Trying to resume with it
  lastPromptNumber=1                     ← But this is a NEW SDK session!
}
[17:30:24.450] Generator failed {error=Claude Code process exited with code 1}

Phase 0: Documentation Discovery (COMPLETED)

Allowed APIs (from subagent research)

V1 SDK API (currently used):

// From @anthropic-ai/claude-agent-sdk
function query(options: {
  prompt: string | AsyncIterable<SDKUserMessage>;
  options: {
    model: string;
    resume?: string;  // SESSION ID - only use for CONTINUATION
    disallowedTools?: string[];
    abortController?: AbortController;
    pathToClaudeCodeExecutable?: string;
  }
}): AsyncIterable<SDKMessage>

Resume Parameter Rules (from docs/context/agent-sdk-v2-preview.md and SESSION_ID_ARCHITECTURE.md):

  • resume should only be used when continuing an existing SDK conversation
  • For INIT prompts (first prompt in a fresh SDK session), no resume parameter should be passed
  • Session ID is captured from first SDK message and stored for subsequent prompts

Anti-Patterns to Avoid

  • Passing resume parameter with INIT prompts (causes crash)
  • Using contentSessionId for resume (contaminates user session)
  • Assuming memorySessionId validity without checking prompt context

Phase 1: Fix the Resume Parameter Logic

What to Implement

Modify src/services/worker/SDKAgent.ts line 94 to check BOTH conditions:

  1. hasRealMemorySessionId - memorySessionId exists and is non-null
  2. session.lastPromptNumber > 1 - this is a CONTINUATION, not an INIT prompt

Current Code (line 89-99):

const queryResult = query({
  prompt: messageGenerator,
  options: {
    model: modelId,
    // Resume with captured memorySessionId (null on first prompt, real ID on subsequent)
    ...(hasRealMemorySessionId && { resume: session.memorySessionId }),
    disallowedTools,
    abortController: session.abortController,
    pathToClaudeCodeExecutable: claudePath
  }
});

Fixed Code:

const queryResult = query({
  prompt: messageGenerator,
  options: {
    model: modelId,
    // Only resume if BOTH: (1) we have a memorySessionId AND (2) this isn't the first prompt
    // On worker restart, memorySessionId may exist from a previous SDK session but we
    // need to start fresh since the SDK context was lost
    ...(hasRealMemorySessionId && session.lastPromptNumber > 1 && { resume: session.memorySessionId }),
    disallowedTools,
    abortController: session.abortController,
    pathToClaudeCodeExecutable: claudePath
  }
});

Also Update the Comment at Line 66-68:

// CRITICAL: Only resume if:
// 1. memorySessionId exists (was captured from a previous SDK response)
// 2. lastPromptNumber > 1 (this is a continuation within the same SDK session)
// On worker restart or crash recovery, memorySessionId may exist from a previous
// SDK session but we must NOT resume because the SDK context was lost.
// NEVER use contentSessionId for resume - that would inject messages into the user's transcript!

Verification Checklist

  • grep "hasRealMemorySessionId && session.lastPromptNumber > 1" src/services/worker/SDKAgent.ts returns the fix
  • Build succeeds: npm run build
  • No TypeScript errors

Phase 2: Add Logging for Debugging

What to Implement

Enhance the alignment log at line 81-85 to clearly indicate when resume is skipped due to INIT prompt:

// Debug-level alignment logs for detailed tracing
if (session.lastPromptNumber > 1) {
  const willResume = hasRealMemorySessionId;
  logger.debug('SDK', `[ALIGNMENT] Resume Decision | contentSessionId=${session.contentSessionId} | memorySessionId=${session.memorySessionId} | prompt#=${session.lastPromptNumber} | hasRealMemorySessionId=${hasRealMemorySessionId} | willResume=${willResume} | resumeWith=${willResume ? session.memorySessionId : 'NONE'}`);
} else {
  // INIT prompt - never resume even if memorySessionId exists (stale from previous session)
  const hasStaleMemoryId = hasRealMemorySessionId;
  logger.debug('SDK', `[ALIGNMENT] First Prompt (INIT) | contentSessionId=${session.contentSessionId} | prompt#=${session.lastPromptNumber} | hasStaleMemoryId=${hasStaleMemoryId} | action=START_FRESH | Will capture new memorySessionId from SDK response`);
  if (hasStaleMemoryId) {
    logger.warn('SDK', `Skipping resume for INIT prompt despite existing memorySessionId=${session.memorySessionId} - SDK context was lost (worker restart or crash recovery)`);
  }
}

Verification Checklist

  • Build succeeds: npm run build
  • Log message appears when running with stale session scenario

Phase 3: Add Unit Tests

What to Implement

Create tests in tests/sdk-agent-resume.test.ts following patterns from tests/session_id_usage_validation.test.ts:

import { describe, it, expect, beforeEach, afterEach, mock } from 'bun:test';

describe('SDKAgent Resume Parameter Logic', () => {
  describe('hasRealMemorySessionId check', () => {
    it('should NOT pass resume parameter when lastPromptNumber === 1 even if memorySessionId exists', () => {
      // Scenario: Worker restart with stale memorySessionId
      const session = {
        memorySessionId: 'stale-session-id-from-previous-run',
        lastPromptNumber: 1,  // INIT prompt
      };

      const hasRealMemorySessionId = !!session.memorySessionId;
      const shouldResume = hasRealMemorySessionId && session.lastPromptNumber > 1;

      expect(hasRealMemorySessionId).toBe(true);  // memorySessionId exists
      expect(shouldResume).toBe(false);  // but should NOT resume
    });

    it('should pass resume parameter when lastPromptNumber > 1 AND memorySessionId exists', () => {
      // Scenario: Normal continuation within same SDK session
      const session = {
        memorySessionId: 'valid-session-id',
        lastPromptNumber: 2,  // CONTINUATION prompt
      };

      const hasRealMemorySessionId = !!session.memorySessionId;
      const shouldResume = hasRealMemorySessionId && session.lastPromptNumber > 1;

      expect(hasRealMemorySessionId).toBe(true);
      expect(shouldResume).toBe(true);
    });

    it('should NOT pass resume parameter when memorySessionId is null', () => {
      // Scenario: Fresh session, no captured ID yet
      const session = {
        memorySessionId: null,
        lastPromptNumber: 1,
      };

      const hasRealMemorySessionId = !!session.memorySessionId;
      const shouldResume = hasRealMemorySessionId && session.lastPromptNumber > 1;

      expect(hasRealMemorySessionId).toBe(false);
      expect(shouldResume).toBe(false);
    });
  });
});

Documentation Reference

  • Pattern: tests/session_id_usage_validation.test.ts lines 1-50 for test structure
  • Mock pattern: tests/worker/agents/response-processor.test.ts for session mocking

Verification Checklist

  • Tests pass: bun test tests/sdk-agent-resume.test.ts
  • Test file follows project conventions

Phase 4: Build and Deploy

What to Implement

  1. Build the plugin: npm run build-and-sync
  2. Verify worker restarts with fix applied

Verification Checklist

  • npm run build-and-sync succeeds
  • Worker health check passes: curl http://localhost:37777/api/health
  • No "Claude Code process exited with code 1" errors in logs after restart

Phase 5: Final Verification

Verification Commands

# 1. Verify fix is in place
grep -n "hasRealMemorySessionId && session.lastPromptNumber > 1" src/services/worker/SDKAgent.ts

# 2. Verify no crashes in recent logs
tail -100 ~/.claude-mem/logs/claude-mem-$(date +%Y-%m-%d).log | grep -c "exited with code 1"

# 3. Run tests
bun test tests/sdk-agent-resume.test.ts

# 4. Check for anti-patterns (should return 0 results)
grep -n "hasRealMemorySessionId && { resume" src/services/worker/SDKAgent.ts

Success Criteria

  • Fix in place at SDKAgent.ts:94
  • Zero "exited with code 1" errors related to stale resume
  • All tests pass
  • Worker stable for 10+ minutes without crash loop

Files to Modify

  1. src/services/worker/SDKAgent.ts - Fix resume logic (Phase 1 & 2)
  2. tests/sdk-agent-resume.test.ts - New test file (Phase 3)

Estimated Complexity

  • Phase 1: Low - Single line change with updated condition
  • Phase 2: Low - Enhanced logging
  • Phase 3: Medium - New test file following existing patterns
  • Phase 4-5: Low - Standard build/verify process