fix: prevent zombie process accumulation via PID registry and signal propagation (Issue #737) (#806)
* Fix zombie process accumulation (Issue #737)

  Problem: Claude haiku subprocesses spawned by the SDK weren't terminating properly, causing zombie process accumulation (user reported 155 processes consuming 51GB RAM).

  Root causes:
  1. SDK's SpawnedProcess interface hides subprocess PIDs
  2. deleteSession() didn't verify subprocess exit
  3. abort() was fire-and-forget with no confirmation
  4. No mechanism to track or clean up orphaned processes

  Solution:
  - Add ProcessRegistry module to track spawned Claude subprocesses
  - Use SDK's spawnClaudeCodeProcess option to capture PIDs via custom spawn
  - Pass signal parameter to enable AbortController integration
  - Wait for subprocess exit in deleteSession() with 5s timeout
  - Escalate to SIGKILL if graceful exit fails
  - Add orphan reaper running every 5 minutes as safety net

  Files changed:
  - src/services/worker/ProcessRegistry.ts (new): PID registry and reaper
  - src/services/worker/SDKAgent.ts: Use custom spawn to capture PIDs
  - src/services/worker/SessionManager.ts: Verify subprocess exit on delete
  - src/services/worker-service.ts: Start/stop orphan reaper

  Fixes #737

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: address code review feedback

  - Replace busy-wait polling with event-based proc.once('exit')
  - Detect and warn about multiple processes per session (race condition)

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: bigphoot <bigphoot@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
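
The "wait for exit with a timeout, then escalate to SIGKILL" step from the commit message can be sketched as follows. This is a minimal illustration built only on Node's `child_process` API, not the code in `ProcessRegistry.ts`; the name `waitForExit` and its return values are hypothetical. It uses the event-based `proc.once('exit')` approach mentioned in the review follow-up instead of polling.

```typescript
import { spawn, type ChildProcess } from "node:child_process";

// Hypothetical helper (not the commit's ensureProcessExit): resolves once the
// child has exited, sending SIGKILL if it is still alive after timeoutMs.
export function waitForExit(
  proc: ChildProcess,
  timeoutMs = 5000,
): Promise<"exited" | "killed"> {
  return new Promise((resolve) => {
    // exitCode or signalCode is non-null once the process has already ended.
    if (proc.exitCode !== null || proc.signalCode !== null) {
      resolve("exited");
      return;
    }

    let escalated = false;
    const timer = setTimeout(() => {
      // Grace period is over: force termination.
      escalated = true;
      proc.kill("SIGKILL");
    }, timeoutMs);

    // Event-based wait: no busy loop, fires exactly once when the child ends.
    proc.once("exit", () => {
      clearTimeout(timer);
      resolve(escalated ? "killed" : "exited");
    });
  });
}

// Usage sketch: a child that would otherwise run for a minute.
export function spawnSleeper(): ChildProcess {
  return spawn(process.execPath, ["-e", "setTimeout(() => {}, 60000)"]);
}
```

The sketch assumes the child was spawned successfully; a child that failed to spawn emits `error` rather than `exit`, so a production version would likely need to handle that case as well.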
@@ -20,6 +20,7 @@ import { USER_SETTINGS_PATH } from '../../shared/paths.js';
 import type { ActiveSession, SDKUserMessage } from '../worker-types.js';
 import { ModeManager } from '../domain/ModeManager.js';
 import { processAgentResponse, type WorkerRef } from './agents/index.js';
+import { createPidCapturingSpawn, getProcessBySession, ensureProcessExit } from './ProcessRegistry.js';
 
 // Import Agent SDK (assumes it's installed)
 // @ts-ignore - Agent SDK types may not be available
@@ -99,6 +100,7 @@ export class SDKAgent {
 
     // Run Agent SDK query loop
     // Only resume if we have a captured memory session ID
+    // Use custom spawn to capture PIDs for zombie process cleanup (Issue #737)
     const queryResult = query({
       prompt: messageGenerator,
       options: {
@@ -109,7 +111,9 @@ export class SDKAgent {
         ...(hasRealMemorySessionId && session.lastPromptNumber > 1 && { resume: session.memorySessionId }),
         disallowedTools,
         abortController: session.abortController,
-        pathToClaudeCodeExecutable: claudePath
+        pathToClaudeCodeExecutable: claudePath,
+        // Custom spawn function captures PIDs to fix zombie process accumulation
+        spawnClaudeCodeProcess: createPidCapturingSpawn(session.sessionDbId)
       }
     });
 
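
The diff passes `createPidCapturingSpawn(session.sessionDbId)` to the SDK but does not show its body. A rough sketch of how such a factory could work is given here, under stated assumptions: the `SpawnRequest` shape is a guess at what the SDK hands to a custom spawn function, and `makePidCapturingSpawn`, `registry`, and `getTrackedProcess` are hypothetical names, not the contents of `ProcessRegistry.ts`. Only Node's `child_process.spawn` and its `signal` option are relied on.

```typescript
import { spawn, type ChildProcess } from "node:child_process";

// Assumed shape of the request a custom spawn function receives (hypothetical).
interface SpawnRequest {
  command: string;
  args: string[];
  cwd?: string;
  env?: NodeJS.ProcessEnv;
  signal?: AbortSignal;
}

// Hypothetical registry: session ID -> tracked child process.
const registry = new Map<number, ChildProcess>();

export function makePidCapturingSpawn(sessionDbId: number) {
  return (req: SpawnRequest): ChildProcess => {
    const child = spawn(req.command, req.args, {
      cwd: req.cwd,
      env: req.env,
      signal: req.signal, // aborting the controller terminates the child
    });

    // More than one live process per session points at a race; warn about it.
    if (registry.has(sessionDbId)) {
      console.warn(`session ${sessionDbId} already has a tracked process`);
    }
    registry.set(sessionDbId, child);

    // Drop the entry when this exact child exits.
    child.once("exit", () => {
      if (registry.get(sessionDbId) === child) registry.delete(sessionDbId);
    });

    // Aborting emits 'error' (AbortError); without a listener it would be thrown.
    // A failed spawn has no pid and never emits 'exit', so clean up here too.
    child.on("error", () => {
      if (child.pid === undefined && registry.get(sessionDbId) === child) {
        registry.delete(sessionDbId);
      }
    });

    return child;
  };
}

export function getTrackedProcess(sessionDbId: number): ChildProcess | undefined {
  return registry.get(sessionDbId);
}
```

Keeping the `ChildProcess` object rather than just the numeric PID lets later cleanup code wait on the `exit` event and avoids signalling a PID that the operating system may have reused.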