23591db589
Fixes blank console windows appearing on Windows 11 when spawning the worker process. ## Problem On Windows, `windowsHide: true` doesn't work when combined with `detached: true` in child_process.spawn(). This is a Node.js limitation (nodejs/node#21825) that **Bun also inherits** because Bun uses Node.js process spawning semantics. Result: Blank console windows with "claude" title appear during claude-mem operations. ## Solution Use PowerShell's `Start-Process -WindowStyle Hidden` on Windows to properly hide console windows. Unix platforms continue using standard `spawn()` with `detached: true`. ## Testing Validated by ToxMox on Windows 11 in PR #315: - windowsHide approach: ❌ Still shows blank consoles - PowerShell approach: ✅ Properly hides windows ## Implementation ```typescript // Windows: PowerShell workaround Start-Process -FilePath 'bun' -ArgumentList '${script}' -WindowStyle Hidden // Unix: Standard spawn (works fine) spawn(bunPath, [script], { detached: true }) ``` ## Notes - Affects BOTH Bun and Node.js runtimes on Windows - This is a **high-priority fix** for Windows users - Keeps Bun runtime (doesn't address zombie socket issue) - Zombie socket issue is separate and requires different solution ## References - Issue: #304 (Multiple visible console windows on Windows 11) - Testing: PR #315 (ToxMox's detailed analysis) - Node.js bug: nodejs/node#21825 - Extracted from: PR #335 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
125 lines
4.9 KiB
Markdown
125 lines
4.9 KiB
Markdown
⏺ PR #335 Review Summary: Queue Monitoring UI + Windows Fixes
|
|
|
|
What PR #335 Does
|
|
|
|
Additions: 3,724 lines
|
|
Author: ToxMox
|
|
Main Changes:
|
|
1. Switch worker runtime from Bun → Node.js (all platforms)
|
|
2. Add persistent message queue in SQLite (pending_messages table)
|
|
3. Add queue monitoring UI with drawer interface
|
|
4. Add watchdog service for auto-recovery (30s polling)
|
|
5. Fix Windows console popups using PowerShell Start-Process -WindowStyle Hidden
|
|
|
|
Initial Concerns (My Review)
|
|
|
|
Raised red flags about:
|
|
- Lack of evidence for "zombie socket" issue (no GitHub issue, only ToxMox reported)
|
|
- Over-engineering: Full persistent queue + watchdog + retry logic + UI for unproven problems
|
|
- Mixing multiple concerns in one PR (should be 3-4 separate PRs)
|
|
- No automated tests for complex state machine logic
|
|
- Global runtime change (Bun→Node) affects all platforms for Windows-specific issue
|
|
- Command injection vulnerability in PowerShell string (ProcessManager.ts:67)
|
|
|
|
What We Discovered
|
|
|
|
1. Problems ARE Real & Documented
|
|
|
|
- Found detailed analysis in PR #315 comments by ToxMox
|
|
- Zombie socket issue has upstream Bun GitHub issues linked:
|
|
- oven-sh/bun#12127, #5774, #8786
|
|
- windowsHide: true doesn't work with detached: true (Node.js bug #21825)
|
|
- SDK subprocess hangs documented with testing details
|
|
|
|
2. Queue UI Has Real Value
|
|
|
|
- You saw video demo and said it's "gorgeous and helpful"
|
|
- Provides visibility into worker activity
|
|
- Recovery controls prevent user frustration
|
|
- Real-time updates via existing SSE infrastructure
|
|
|
|
3. Architecture Makes Sense
|
|
|
|
Why persistent worker is needed:
|
|
- Real-time UI at http://localhost:37777 requires persistent process
|
|
- SSE (Server-Sent Events) for live updates
|
|
- Can't do on-demand worker if UI needs to be always available
|
|
|
|
Why queue in database is justified:
|
|
- Transactional consistency (save observation + enqueue atomically)
|
|
- Relational queries (JOIN with sessions/projects)
|
|
- Foreign key cascades (session deleted → queue entries auto-cleaned)
|
|
- Already have SQLite infrastructure
|
|
|
|
4. Storage Optimization Strategy
|
|
|
|
Smart cleanup approach (your insight):
|
|
- Keep full data while pending/processing (needed for retry)
|
|
- Clear payloads immediately on completion: Set tool_input, tool_response, last_user_message, last_assistant_message to NULL
|
|
- Keep lightweight metadata for "Recently Processed" UI
|
|
- Eventually delete old completed records (after 1hr or >100 count)
|
|
- Result: 100x storage reduction while keeping UI history
|
|
|
|
Rejected approach: Linking to transcript files (overly complex, YAGNI)
|
|
|
|
Critical Insight: Bun → Node Switch Likely Unnecessary
|
|
|
|
Your final assessment:
|
|
"honestly thats more an llm hallucinating an overengineered solution based on incorrect data that probably could be solved by just killing the process correctly"
|
|
|
|
The real issue is probably:
|
|
- Missing cleanup handlers (server.close() before exit)
|
|
- Process killed too fast (SIGKILL before cleanup finishes)
|
|
- Not receiving SIGTERM properly
|
|
- No registered signal handlers for graceful shutdown
|
|
|
|
Simple fix to try FIRST:
|
|
const server = app.listen(port);
|
|
|
|
async function cleanup() {
|
|
server.close(); // Close server
|
|
sessionManager.abortAll(); // Stop active work
|
|
db.close(); // Close DB
|
|
process.exit(0);
|
|
}
|
|
|
|
process.on('SIGTERM', cleanup);
|
|
process.on('SIGINT', cleanup);
|
|
|
|
Final Assessment
|
|
|
|
PR #335 is mostly solid with real benefits, BUT:
|
|
|
|
✅ What's Good:
|
|
|
|
- Queue UI provides real value
|
|
- Persistent queue in DB is architecturally justified
|
|
- Auto-recovery prevents stuck sessions
|
|
- Problems are real and documented
|
|
- Comprehensive solution to multiple pain points
|
|
|
|
⚠️ What Needs Validation:
|
|
|
|
1. Bun zombie socket issue - Only ToxMox reported, not validated by you
|
|
2. Proper cleanup handlers - Try fixing process termination before switching runtimes
|
|
3. Platform-specific runtime - If Bun issue is real, use Node only on Windows, keep Bun on Mac/Linux
|
|
|
|
🔧 What Needs Fixing:
|
|
|
|
1. Add tests - Zero automated tests for complex state machine
|
|
2. Fix command injection - ProcessManager.ts:67 PowerShell string interpolation
|
|
3. Implement payload cleanup - Clear heavy data immediately on completion
|
|
4. Try simple fix first - Proper signal handlers before runtime switch
|
|
|
|
Action Items for Next Session
|
|
|
|
1. Ask ToxMox: Did you try proper cleanup handlers before switching runtimes?
|
|
2. Suggest: Platform-specific runtime (Bun on Unix, Node on Windows only if needed)
|
|
3. Request: Reproduction steps for zombie socket issue
|
|
4. Require: Basic tests before merge
|
|
5. Fix: Command injection vulnerability
|
|
6. Consider: Splitting into separate PRs (optional, not required)
|
|
|
|
Key Takeaway
|
|
|
|
The PR solves real problems with solid architecture, but the Bun→Node switch is likely over-engineered. Try proper process cleanup first. |