fix(windows): solve zombie port problem with wrapper architecture (#372)
On Windows, Bun doesn't properly release socket handles when the worker process exits, causing "zombie ports" that remain bound even after all processes are dead. This required a system reboot to clear.

Solution: Introduce a wrapper process (worker-wrapper.cjs) that:
- Spawns the actual worker as a child with IPC channel
- On restart/shutdown, uses `taskkill /T /F` to kill the entire process tree
- Exits itself, allowing hooks to start fresh

The wrapper has no sockets, so Bun's socket cleanup bug doesn't affect it. When the wrapper kills the inner worker tree and exits, the port is properly released and can be immediately reused.

Key changes:
- New worker-wrapper.ts for Windows process lifecycle management
- ProcessManager starts wrapper on Windows, worker directly on Unix
- Worker sends IPC messages to wrapper for restart/shutdown
- Health endpoint now includes debug info (build ID, managed status, hasIpc)

Tested: Restart API now properly releases port and new worker binds to same port.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
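The wrapper lifecycle described in the commit message could look roughly like the sketch below. It is not the contents of worker-wrapper.ts, which this commit page does not show. The message shape (`{ type: 'restart' | 'shutdown' }`) and the function names `classifyMessage`, `taskkillArgs`, `killTree`, and `runWrapper` are assumptions made for illustration, and the sketch assumes a Node-compatible `child_process` module.

```typescript
import { fork, execFileSync } from 'node:child_process';

// Hypothetical action type; the real IPC protocol is not shown in the commit.
type WrapperAction = 'restart' | 'shutdown' | 'ignore';

// Decide what to do with a message received from the inner worker.
function classifyMessage(msg: unknown): WrapperAction {
  if (typeof msg === 'object' && msg !== null && 'type' in msg) {
    const type = (msg as { type: unknown }).type;
    if (type === 'restart' || type === 'shutdown') return type;
  }
  return 'ignore';
}

// Arguments for taskkill: /T includes child processes, /F forces termination.
function taskkillArgs(pid: number): string[] {
  return ['/PID', String(pid), '/T', '/F'];
}

// Kill a process and its descendants on Windows.
function killTree(pid: number): void {
  try {
    execFileSync('taskkill', taskkillArgs(pid), { stdio: 'ignore', timeout: 10000 });
  } catch {
    // The process may already have exited.
  }
}

// Not invoked here; a wrapper script would call this once at startup.
function runWrapper(workerScript: string): void {
  // fork() starts the child with an IPC channel, so the worker can call process.send().
  const inner = fork(workerScript);
  inner.on('message', (msg) => {
    if (classifyMessage(msg) === 'ignore') return;
    // For both restart and shutdown: kill the inner tree, then exit the wrapper
    // so that whatever supervises it can start a fresh instance.
    if (inner.pid !== undefined) killTree(inner.pid);
    process.exit(0);
  });
  // If the inner worker exits on its own, the wrapper has nothing left to manage.
  inner.on('exit', (code) => process.exit(code ?? 0));
}
```

Because the wrapper in this sketch never opens a socket itself, the listening socket belongs only to the inner process tree, which is the property the commit message relies on.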
@@ -43,8 +43,10 @@ export class ProcessManager {
     // Ensure log directory exists
     mkdirSync(LOG_DIR, { recursive: true });

-    // Get worker script path
-    const workerScript = join(MARKETPLACE_ROOT, 'plugin', 'scripts', 'worker-service.cjs');
+    // On Windows, use the wrapper script to solve zombie port problem
+    // On Unix, use the worker directly
+    const scriptName = process.platform === 'win32' ? 'worker-wrapper.cjs' : 'worker-service.cjs';
+    const workerScript = join(MARKETPLACE_ROOT, 'plugin', 'scripts', scriptName);

     if (!existsSync(workerScript)) {
       return { success: false, error: `Worker script not found at ${workerScript}` };
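The platform switch in this hunk can be isolated into a pure function, which makes the choice easy to unit-test without touching `process.platform`. This is only a sketch: `workerScriptName` and `workerScriptPath` are hypothetical helper names, not part of ProcessManager, and the `root` argument stands in for `MARKETPLACE_ROOT`.

```typescript
import { join } from 'node:path';

// Pick the entry script: the wrapper on Windows, the worker itself elsewhere.
function workerScriptName(platform: string): string {
  return platform === 'win32' ? 'worker-wrapper.cjs' : 'worker-service.cjs';
}

// Build the full script path under <root>/plugin/scripts.
function workerScriptPath(root: string, platform: string = process.platform): string {
  return join(root, 'plugin', 'scripts', workerScriptName(platform));
}
```

Passing the platform as a parameter is a design choice for testability; the commit itself reads `process.platform` inline.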
@@ -86,6 +88,10 @@ export class ProcessManager {
     // Note: windowsHide: true doesn't work with detached: true (Bun inherits Node.js process spawning semantics)
     // See: https://github.com/nodejs/node/issues/21825 and PR #315 for detailed testing
     //
+    // On Windows, we start worker-wrapper.cjs which manages the actual worker-service.cjs.
+    // This solves the zombie port problem: the wrapper has no sockets, so when it kills
+    // and respawns the inner worker, the socket is properly released.
+    //
     // Security: All paths (bunPath, script, MARKETPLACE_ROOT) are application-controlled system paths,
     // not user input. If an attacker could modify these paths, they would already have full filesystem
     // access including direct access to ~/.claude-mem/claude-mem.db. Nevertheless, we properly escape
@@ -168,8 +174,21 @@ export class ProcessManager {
     if (!info) return true;

     try {
-      process.kill(info.pid, 'SIGTERM');
-      await this.waitForExit(info.pid, timeout);
+      if (process.platform === 'win32') {
+        // On Windows, use taskkill /T /F to kill entire process tree
+        // This ensures the wrapper AND all its children (inner worker, MCP, ChromaSync) are killed
+        // which is necessary to properly release the socket and avoid zombie ports
+        const { execSync } = await import('child_process');
+        try {
+          execSync(`taskkill /PID ${info.pid} /T /F`, { timeout: 10000, stdio: 'ignore' });
+        } catch {
+          // Process may already be dead
+        }
+      } else {
+        // On Unix, use signals
+        process.kill(info.pid, 'SIGTERM');
+        await this.waitForExit(info.pid, timeout);
+      }
     } catch {
       try {
         process.kill(info.pid, 'SIGKILL');
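The Unix branch calls `this.waitForExit(info.pid, timeout)`, whose body is not part of this diff. One plausible way to implement such a helper is to poll with signal 0, which checks whether a process exists without delivering a signal. The version below is a hypothetical sketch under that assumption; the names `isAlive` and `pollMs` and the choice to reject on timeout (so that a surrounding `catch` can escalate to `SIGKILL`) are not taken from the source.

```typescript
// Return true if a process with this PID currently exists.
function isAlive(pid: number): boolean {
  try {
    process.kill(pid, 0); // signal 0 performs the existence check only
    return true;
  } catch (err) {
    // EPERM means the process exists but we lack permission to signal it.
    return (err as { code?: string }).code === 'EPERM';
  }
}

// Resolve once the process is gone; reject if it is still alive after timeoutMs.
async function waitForExit(pid: number, timeoutMs: number, pollMs = 100): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (isAlive(pid)) {
    if (Date.now() >= deadline) {
      throw new Error(`Process ${pid} did not exit within ${timeoutMs}ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
}
```

With a helper shaped like this, a timeout surfaces as a rejected promise, which matches the fallback structure in the hunk where the outer `catch` sends `SIGKILL`.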