* fix: add session lifecycle guards to prevent runaway API spend (#1590)
Three root causes allowed 30+ subprocesses to accumulate over 36 hours:
1. SIGTERM-killed processes (exit code 143) triggered crash recovery and
were immediately respawned. Exit code 143 is now detected and treated as
intentional termination (the controller is aborted so that
wasAborted=true in .finally).
2. No wall-clock limit: sessions ran for 13+ hours, continuously
spending tokens. A session now refuses new generators after 4 hours and
drains the pending queue to prevent further spawning.
3. Duplicate --resume processes could be spawned for the same session
UUID. Existing duplicates are now killed and unregistered before a new
spawn is registered.
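
A minimal sketch of the first guard, for illustration only. The helper
names (isIntentionalTermination, shouldRespawn) are hypothetical and are
not taken from the actual change; only the exit code 143 and the
wasAborted flag come from this message. 143 is the conventional shell
exit status for SIGTERM (128 + 15).

```typescript
// Hypothetical helpers; the real crash-recovery code is not shown in this message.
const SIGTERM_EXIT_CODE = 143; // 128 + 15 (SIGTERM)

// True when the exit looks like a deliberate SIGTERM rather than a crash.
function isIntentionalTermination(
  code: number | null,
  signal: string | null,
): boolean {
  return code === SIGTERM_EXIT_CODE || signal === "SIGTERM";
}

// Decide whether crash recovery should respawn the subprocess.
function shouldRespawn(
  code: number | null,
  signal: string | null,
  wasAborted: boolean,
): boolean {
  if (wasAborted) return false; // the session was aborted on purpose
  if (isIntentionalTermination(code, signal)) return false; // killed, not crashed
  return code !== 0; // only abnormal exits are candidates for recovery
}
```

Under these assumptions, shouldRespawn(143, null, false) is false, while
an ordinary crash such as shouldRespawn(1, null, false) is true.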
Generated by Claude Code
Vibe coded by ousamabenyounes
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: use normalized errorMsg in logger.error payload and annotate SIGTERM override
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: use persisted createdAt for wall-clock guard and bind abortController locally to prevent stale abort
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore: re-trigger CodeRabbit review after rate limit reset
* fix: defer process unregistration until exit and align boundary test with strict > (#1693)
- ProcessRegistry: don't unregister the PID immediately after sending
SIGTERM. The existing 'exit' handler now cleans up when the process
actually exits, so still-live processes are not lost from tracking.
- Test: align wall-clock boundary test with production's strict `>` operator
(exactly 4h is NOT terminated, only >4h is).
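
The boundary behaviour can be sketched as follows. The constant and
function names are hypothetical; only the 4-hour limit, the strict `>`
comparison, and the use of the persisted createdAt come from this
message.

```typescript
// Hypothetical names; illustrates the strict `>` wall-clock comparison only.
const MAX_SESSION_WALL_CLOCK_MS = 4 * 60 * 60 * 1000; // 4 hours

// createdAtMs is assumed to be the persisted session creation time (epoch ms).
function exceedsWallClockLimit(createdAtMs: number, nowMs: number): boolean {
  // Strict comparison: a session aged exactly 4h is still allowed.
  return nowMs - createdAtMs > MAX_SESSION_WALL_CLOCK_MS;
}
```

With this sketch, a session aged exactly 4h is not terminated, and one
aged 4h plus one millisecond is.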
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>