Merge branch 'fix/windows-console-popups'

This commit is contained in:
Alex Newman
2025-12-16 17:05:05 -05:00
11 changed files with 250 additions and 72 deletions
+125
View File
@@ -0,0 +1,125 @@
⏺ PR #335 Review Summary: Queue Monitoring UI + Windows Fixes
What PR #335 Does
Additions: 3,724 lines
Author: ToxMox
Main Changes:
1. Switch worker runtime from Bun → Node.js (all platforms)
2. Add persistent message queue in SQLite (pending_messages table)
3. Add queue monitoring UI with drawer interface
4. Add watchdog service for auto-recovery (30s polling)
5. Fix Windows console popups using PowerShell Start-Process -WindowStyle Hidden
Initial Concerns (My Review)
Raised red flags about:
- Lack of evidence for "zombie socket" issue (no GitHub issue, only ToxMox reported)
- Over-engineering: Full persistent queue + watchdog + retry logic + UI for unproven problems
- Mixing multiple concerns in one PR (should be 3-4 separate PRs)
- No automated tests for complex state machine logic
- Global runtime change (Bun→Node) affects all platforms for Windows-specific issue
- Command injection vulnerability in PowerShell string (ProcessManager.ts:67)
What We Discovered
1. Problems ARE Real & Documented
- Found detailed analysis in PR #315 comments by ToxMox
- Zombie socket issue has upstream Bun GitHub issues linked:
- oven-sh/bun#12127, #5774, #8786
- windowsHide: true doesn't work with detached: true (Node.js bug #21825)
- SDK subprocess hangs documented with testing details
2. Queue UI Has Real Value
- You saw video demo and said it's "gorgeous and helpful"
- Provides visibility into worker activity
- Recovery controls prevent user frustration
- Real-time updates via existing SSE infrastructure
3. Architecture Makes Sense
Why persistent worker is needed:
- Real-time UI at http://localhost:37777 requires persistent process
- SSE (Server-Sent Events) for live updates
- Can't do on-demand worker if UI needs to be always available
Why queue in database is justified:
- Transactional consistency (save observation + enqueue atomically)
- Relational queries (JOIN with sessions/projects)
- Foreign key cascades (session deleted → queue entries auto-cleaned)
- Already have SQLite infrastructure
4. Storage Optimization Strategy
Smart cleanup approach (your insight):
- Keep full data while pending/processing (needed for retry)
- Clear payloads immediately on completion: Set tool_input, tool_response, last_user_message, last_assistant_message to NULL
- Keep lightweight metadata for "Recently Processed" UI
- Eventually delete old completed records (after 1hr or >100 count)
- Result: 100x storage reduction while keeping UI history
Rejected approach: Linking to transcript files (overly complex, YAGNI)
Critical Insight: Bun → Node Switch Likely Unnecessary
Your final assessment:
"honestly thats more an llm hallucinating an overengineered solution based on incorrect data that probably could be solved by just killing the process correctly"
The real issue is probably:
- Missing cleanup handlers (server.close() before exit)
- Process killed too fast (SIGKILL before cleanup finishes)
- Not receiving SIGTERM properly
- No registered signal handlers for graceful shutdown
Simple fix to try FIRST:
const server = app.listen(port);
async function cleanup() {
server.close(); // Close server
sessionManager.abortAll(); // Stop active work
db.close(); // Close DB
process.exit(0);
}
process.on('SIGTERM', cleanup);
process.on('SIGINT', cleanup);
Final Assessment
PR #335 is mostly solid with real benefits, BUT:
✅ What's Good:
- Queue UI provides real value
- Persistent queue in DB is architecturally justified
- Auto-recovery prevents stuck sessions
- Problems are real and documented
- Comprehensive solution to multiple pain points
⚠️ What Needs Validation:
1. Bun zombie socket issue - Only ToxMox reported, not validated by you
2. Proper cleanup handlers - Try fixing process termination before switching runtimes
3. Platform-specific runtime - If Bun issue is real, use Node only on Windows, keep Bun on Mac/Linux
🔧 What Needs Fixing:
1. Add tests - Zero automated tests for complex state machine
2. Fix command injection - ProcessManager.ts:67 PowerShell string interpolation
3. Implement payload cleanup - Clear heavy data immediately on completion
4. Try simple fix first - Proper signal handlers before runtime switch
Action Items for Next Session
1. Ask ToxMox: Did you try proper cleanup handlers before switching runtimes?
2. Suggest: Platform-specific runtime (Bun on Unix, Node on Windows only if needed)
3. Request: Reproduction steps for zombie socket issue
4. Require: Basic tests before merge
5. Fix: Command injection vulnerability
6. Consider: Splitting into separate PRs (optional, not required)
Key Takeaway
The PR solves real problems with solid architecture, but the Bun→Node switch is likely over-engineered. Try proper process cleanup first.
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
Binary file not shown.
+82 -29
View File
@@ -1,7 +1,7 @@
import { existsSync, readFileSync, writeFileSync, unlinkSync, mkdirSync } from 'fs';
import { createWriteStream } from 'fs';
import { join } from 'path';
import { spawn } from 'child_process';
import { spawn, spawnSync } from 'child_process';
import { homedir } from 'os';
import { DATA_DIR } from '../../shared/paths.js';
import { getBunPath, isBunAvailable } from '../../utils/bun-path.js';
@@ -52,7 +52,7 @@ export class ProcessManager {
const logFile = this.getLogFilePath();
// Use Bun on all platforms
// Use Bun on all platforms with PowerShell workaround for Windows console popups
return this.startWithBun(workerScript, logFile, port);
}
@@ -60,6 +60,15 @@ export class ProcessManager {
return isBunAvailable();
}
/**
* Escapes a string for safe use in PowerShell single-quoted strings.
* In PowerShell single quotes, the only special character is the single quote itself,
* which must be doubled to escape it.
*/
private static escapePowerShellString(str: string): string {
return str.replace(/'/g, "''");
}
private static async startWithBun(script: string, logFile: string, port: number): Promise<{ success: boolean; pid?: number; error?: string }> {
const bunPath = getBunPath();
if (!bunPath) {
@@ -68,40 +77,84 @@ export class ProcessManager {
error: 'Bun is required but not found in PATH or common installation paths. Install from https://bun.sh'
};
}
try {
const isWindows = process.platform === 'win32';
const child = spawn(bunPath, [script], {
detached: true,
stdio: ['ignore', 'pipe', 'pipe'],
env: { ...process.env, CLAUDE_MEM_WORKER_PORT: String(port) },
cwd: MARKETPLACE_ROOT,
// Hide console window on Windows
...(isWindows && { windowsHide: true })
});
if (isWindows) {
// Windows: Use PowerShell Start-Process with -WindowStyle Hidden
// This properly hides the console window (affects both Bun and Node.js)
// Note: windowsHide: true doesn't work with detached: true (Bun inherits Node.js process spawning semantics)
// See: https://github.com/nodejs/node/issues/21825 and PR #315 for detailed testing
//
// Security: All paths (bunPath, script, MARKETPLACE_ROOT) are application-controlled system paths,
// not user input. If an attacker could modify these paths, they would already have full filesystem
// access including direct access to ~/.claude-mem/claude-mem.db. Nevertheless, we properly escape
// all values for PowerShell to follow security best practices.
const escapedBunPath = this.escapePowerShellString(bunPath);
const escapedScript = this.escapePowerShellString(script);
const escapedWorkDir = this.escapePowerShellString(MARKETPLACE_ROOT);
const envVars = `$env:CLAUDE_MEM_WORKER_PORT='${port}'`;
const psCommand = `${envVars}; Start-Process -FilePath '${escapedBunPath}' -ArgumentList '${escapedScript}' -WorkingDirectory '${escapedWorkDir}' -WindowStyle Hidden -PassThru | Select-Object -ExpandProperty Id`;
// Write logs
const logStream = createWriteStream(logFile, { flags: 'a' });
child.stdout?.pipe(logStream);
child.stderr?.pipe(logStream);
const result = spawnSync('powershell', ['-Command', psCommand], {
stdio: 'pipe',
timeout: 10000,
windowsHide: true
});
child.unref();
if (result.status !== 0) {
return {
success: false,
error: `PowerShell spawn failed: ${result.stderr?.toString() || 'unknown error'}`
};
}
if (!child.pid) {
return { success: false, error: 'Failed to get PID from spawned process' };
const pid = parseInt(result.stdout.toString().trim(), 10);
if (isNaN(pid)) {
return { success: false, error: 'Failed to get PID from PowerShell' };
}
// Write PID file
this.writePidFile({
pid,
port,
startedAt: new Date().toISOString(),
version: process.env.npm_package_version || 'unknown'
});
// Wait for health
return this.waitForHealth(pid, port);
} else {
// Unix: Use standard spawn with detached
const child = spawn(bunPath, [script], {
detached: true,
stdio: ['ignore', 'pipe', 'pipe'],
env: { ...process.env, CLAUDE_MEM_WORKER_PORT: String(port) },
cwd: MARKETPLACE_ROOT
});
// Write logs
const logStream = createWriteStream(logFile, { flags: 'a' });
child.stdout?.pipe(logStream);
child.stderr?.pipe(logStream);
child.unref();
if (!child.pid) {
return { success: false, error: 'Failed to get PID from spawned process' };
}
// Write PID file
this.writePidFile({
pid: child.pid,
port,
startedAt: new Date().toISOString(),
version: process.env.npm_package_version || 'unknown'
});
// Wait for health
return this.waitForHealth(child.pid, port);
}
// Write PID file
this.writePidFile({
pid: child.pid,
port,
startedAt: new Date().toISOString(),
version: process.env.npm_package_version || 'unknown'
});
// Wait for health
return this.waitForHealth(child.pid, port);
} catch (error) {
return {
success: false,