fix(windows): Windows platform stabilization improvements (#378)

* chore: bump version to 7.3.6 in package.json

* Enhance worker readiness checks and MCP connection handling

- Updated health check endpoint to /api/readiness for better initialization tracking.
- Increased timeout for health checks and worker startup retries, especially for Windows.
- Added initialization flags to track MCP readiness and overall worker initialization status.
- Implemented a timeout guard for MCP connection to prevent hanging.
- Adjusted logging to reflect readiness state and errors more accurately.

* fix(windows): use Bun PATH detection in worker wrapper

Phase 2/8: Fix Bun PATH Detection in Worker Wrapper

- Import getBunPath() in worker-wrapper.ts for Bun detection
- Add Bun path resolution before spawning inner worker process
- Update spawn call to use detected Bun path instead of process.execPath
- Add logging to bun-path.ts when PATH detection succeeds
- Add logging when fallback paths are used
- Add Windows-specific validation for .exe extension
- Log warning with searched paths when Bun not found
- Fail fast with clear error message if Bun cannot be detected

This ensures worker-wrapper uses the correct Bun executable on Windows
even when Bun is not in PATH, fixing issue #371 where users reported
"Bun not in PATH" errors despite Bun being installed.

Addresses: #371

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(windows): standardize child process spawning with windowsHide

Phase 3/8: Standardize Child Process Spawning (Windows)

Changes:
- Added windowsHide flag to ChromaSync MCP subprocess spawn
- Added Windows-specific process tracking (childPid) in ChromaSync
- Force-kill subprocess on Windows before closing transport to prevent zombie processes
- Updated cleanupOrphanedProcesses() to support Windows using PowerShell Get-CimInstance
- Use taskkill /T /F for proper process tree cleanup on Windows
- Audited BranchManager - confirmed windowsHide already present on all spawn calls

This prevents PowerShell windows from appearing during ChromaSync operations
and ensures proper cleanup of subprocess trees on Windows.

Addresses: #363, #361, #367, #371, #373, #374

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(windows): enhance socket cleanup with recursive process tree management

Phase 4/8: Enhanced Socket Cleanup & Process Tree Management

Changes:
- Added recursive process tree enumeration in worker-wrapper.ts for Windows
- Enhanced killInner() to enumerate all descendants before killing
- Added fallback individual process kill if taskkill /T fails
- Added 10s timeout to ChromaSync.close() in DatabaseManager to prevent hangs
- Force nullify ChromaSync even on close failure to prevent resource leaks
- Improved logging to show full process tree during cleanup

This ensures complete cleanup of all child processes (ChromaSync MCP subprocess,
Python processes, etc.) preventing socket leaks and CLOSE_WAIT states.

Addresses: #363, #361

* fix(windows): consolidate project name extraction with drive root handling

Phase 5/8: Project Name Extraction Consolidation

- Created shared getProjectName() utility in src/utils/project-name.ts
- Handles edge case: drive roots (C:\, J:\) now return "drive-X" format
- Handles edge case: null/undefined/empty cwd now returns "unknown-project"
- Fixed missing null check bug in new-hook.ts
- Replaced duplicated path.basename(cwd) logic in:
  - src/hooks/context-hook.ts
  - src/hooks/new-hook.ts
  - src/services/context-generator.ts

Addresses: #374

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(windows): increase timeouts and improve error messages

Phase 6/8: Increase Timeouts & Improve Error Messages

- Enhanced logger.ts with platform prefix (WIN32/DARWIN) and PID in all logs
- Added comprehensive Windows troubleshooting to ProcessManager error messages
- Enhanced Bun detection error message with Windows-specific troubleshooting
- All error messages now include GitHub issue numbers and docs links
- Windows timeout already increased to 2.0x multiplier in previous phases

Changes:
- src/utils/logger.ts: Added platform prefix and PID to all log output
- src/services/process/ProcessManager.ts: Enhanced error messages with troubleshooting steps
- src/utils/bun-path.ts: Added Windows-specific Bun detection error guidance

Addresses: #363, #361, #367, #371, #373, #374

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(windows): add comprehensive Windows CI testing

Phase 7/8: Add Windows CI Testing

- Create automated Windows testing workflow
- Test worker startup/shutdown cycles
- Verify Bun PATH detection on Windows
- Test rapid restart scenarios
- Validate port cleanup after shutdown
- Check for zombie processes
- Run on all pushes and PRs to main/fix/feature branches

Addresses: #363, #361, #367, #371, #373, #374

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* ci(windows): remove build steps from Windows CI workflow

Build files are already included in the plugin folder, so npm install
and npm run build are unnecessary steps in the CI workflow.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* revert: remove Windows CI workflow

The CI workflow cannot be properly implemented in the current architecture
due to limitations in testing the worker service in CI environments.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* security: add PID validation and improve ChromaSync timeout handling

Address critical security and reliability issues identified in PR review:

**Security Fixes:**
- Add PID validation before all PowerShell/taskkill command execution
- Validate PIDs are positive integers to prevent command injection
- Apply validation in worker-wrapper.ts, worker-service.ts, and ChromaSync.ts

**Reliability Improvements:**
- Add timeout handling to ChromaSync client.close() (10s timeout)
- Add timeout handling to ChromaSync transport.close() (5s timeout)
- Implement force-kill fallback when ChromaSync close operations timeout
- Prevents hanging on shutdown and ensures subprocess cleanup

**Implementation Details:**
- PID validation checks: Number.isInteger(pid) && pid > 0
- Applied before all execSync taskkill calls on Windows
- Applied in process enumeration (Get-CimInstance) PowerShell commands
- ChromaSync.close() uses Promise.race for timeout enforcement
- Graceful degradation with force-kill fallback on timeout

Addresses PR #378 review feedback

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Refactor ChromaSync client and transport closure logic

- Removed timeout handling for closing the Chroma client and transport.
- Simplified error logging for client and transport closure.
- Ensured subprocess cleanup logic is more straightforward.

* fix(worker): streamline Windows process management and cleanup

* revert: remove speculative LLM-generated complexity

Reverts defensive code that was added speculatively without user-reported issues:

- ChromaSync: Remove PID extraction and explicit taskkill (wrapper handles this)
- worker-wrapper: Restore simple taskkill /T /F (validated in v7.3.5)
- DatabaseManager: Remove Promise.race timeout wrapper
- hook-constants: Restore original timeout values
- logger: Remove platform/PID additions to every log line
- bun-path: Remove speculative logging

Keeps only changes that map to actual GitHub issues:
- #374: Drive root project name fix (getProjectName utility)
- #363: Readiness endpoint and Windows orphan cleanup
- #367: windowsHide on ChromaSync transport

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
Alex Newman
2025-12-17 18:44:04 -05:00
committed by GitHub
parent 40a71d3250
commit bff10d49c9
20 changed files with 457 additions and 216 deletions
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long