fix: Issue Blowout 2026 — 25 bugs across worker, hooks, security, and search (#2080)
* fix: resolve search, database, and docker bugs (#1913, #1916, #1956, #1957, #2048)

  - Fix concept/concepts param mismatch in SearchManager.normalizeParams (#1916)
  - Add FTS5 keyword fallback when ChromaDB is unavailable (#1913, #2048)
  - Add periodic WAL checkpoint and journal_size_limit to prevent unbounded WAL growth (#1956)
  - Add periodic clearFailed() to purge stale pending_messages (#1957)
  - Fix nounset-safe TTY_ARGS expansion in docker/claude-mem/run.sh

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: prevent silent data loss on non-XML responses, add queue info to /health (#1867, #1874)

  - ResponseProcessor: mark messages as failed (with retry) instead of confirming when the LLM returns non-XML garbage (auth errors, rate limits) (#1874)
  - Health endpoint: include activeSessions count for queue liveness monitoring (#1867)

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: cache isFts5Available() at construction time

  Addresses Greptile review: avoid DDL probe (CREATE + DROP) on every text query. Result is now cached in _fts5Available at construction.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve worker stability bugs — pool deadlock, MCP loopback, restart guard (#1868, #1876, #2053)

  - Replace flat consecutiveRestarts counter with time-windowed RestartGuard: only counts restarts within 60s window (cap=10), decays after 5min of success. Prevents stranding pending messages on long-running sessions. (#2053)
  - Add idle session eviction to pool slot allocation: when all slots are full, evict the idlest session (no pending work, oldest activity) to free a slot for new requests, preventing 60s timeout deadlock. (#1868)
  - Fix MCP loopback self-check: use process.execPath instead of bare 'node' which fails on non-interactive PATH. Fix crash misclassification by removing false "Generator exited unexpectedly" error log on normal completion.
    (#1876)

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve hooks reliability bugs — summarize exit code, session-init health wait (#1896, #1901, #1903, #1907)

  - Wrap summarize hook's workerHttpRequest in try/catch to prevent exit code 2 (blocking error) on network failures or malformed responses. Session exit no longer blocks on worker errors. (#1901)
  - Add health-check wait loop to UserPromptSubmit session-init command in hooks.json. On Linux/WSL where hook ordering fires UserPromptSubmit before SessionStart, session-init now waits up to 10s for worker health before proceeding. Also wrap session-init HTTP call in try/catch. (#1907)
  - Close #1896 as already-fixed: mtime comparison at file-context.ts:255-267 bypasses truncation when file is newer than latest observation.
  - Close #1903 as no-repro: hooks.json correctly declares all hook events. Issue was Claude Code 12.0.1/macOS platform event-dispatch bug.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: security hardening — bearer auth, path validation, rate limits, per-user port (#1932, #1933, #1934, #1935, #1936)

  - Add bearer token auth to all API endpoints: auto-generated 32-byte token stored at ~/.claude-mem/worker-auth-token (mode 0600). All hook, MCP, viewer, and OpenCode requests include Authorization header. Health/readiness endpoints exempt for polling. (#1932, #1933)
  - Add path traversal protection: watch.context.path validated against project root and ~/.claude-mem/ before write. Rejects ../../../etc style attacks. (#1934)
  - Reduce JSON body limit from 50MB to 5MB. Add in-memory rate limiter (300 req/min/IP) to prevent abuse. (#1935)
  - Derive default worker port from UID (37700 + uid%100) to prevent cross-user data leakage on multi-user macOS. Windows falls back to 37777. Shell hooks use same formula via id -u.
    (#1936)

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve search project filtering and import Chroma sync (#1911, #1912, #1914, #1918)

  - Fix per-type search endpoints to pass project filter to Chroma queries and SQLite hydration. searchObservations/Sessions/UserPrompts now use $or clause matching project + merged_into_project. (#1912)
  - Fix timeline/search methods to pass project to Chroma anchor queries. Prevents cross-project result leakage when project param omitted. (#1911)
  - Sync imported observations to ChromaDB after FTS rebuild. Import endpoint now calls chromaSync.syncObservation() for each imported row, making them visible to MCP search(). (#1914)
  - Fix session-init cwd fallback to match context.ts (process.cwd()). Prevents project key mismatch that caused "no previous sessions" on fresh sessions. (#1918)
  - Fix sync-marketplace restart to include auth token and per-user port.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve all CodeRabbit and Greptile review comments on PR #2080

  - Fix run.sh comment mismatch (no-op flag vs empty array)
  - Gate session-init on health check success (prevent running when worker unreachable)
  - Fix date_desc ordering ignored in FTS session search
  - Age-scope failed message purge (1h retention) instead of clearing all
  - Anchor RestartGuard decay to real successes (null init, not Date.now())
  - Add recordSuccess() calls in ResponseProcessor and completion path
  - Prevent caller headers from overriding bearer auth token
  - Add lazy cleanup for rate limiter map to prevent unbounded growth
  - Bound post-import Chroma sync with concurrency limit of 8
  - Add doc_type:'observation' filter to Chroma queries feeding observation hydration
  - Add FTS fallback to all specialized search handlers (observations, sessions, prompts, timeline)
  - Add response.ok check and error handling in viewer saveSettings

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix:
  resolve CodeRabbit round-2 review comments

  - Use failure timestamp (COALESCE) instead of created_at_epoch for stale purge
  - Downgrade _fts5Available flag when FTS table creation fails
  - Escape FTS5 MATCH input by quoting user queries as literal phrases
  - Escape LIKE metacharacters (%, _, \) in prompt text search
  - Add response.ok check in initial settings load (matches save flow)

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve CodeRabbit round-3 review comments

  - Include failed_at_epoch in COALESCE for age-scoped purge
  - Re-throw FTS5 errors so callers can distinguish failure from no-results
  - Wrap all FTS fallback calls in SearchManager with try/catch

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
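The per-user port rule in the security section above (37700 + uid % 100, with a fixed Windows fallback of 37777) can be sketched as follows. This is an illustrative re-statement of the formula, not the project's actual code; the function name and the explicit platform parameter are my own.

```typescript
// Sketch of the UID-derived default worker port described in the commit message.
// A numeric POSIX UID maps each user to one of 100 ports above a fixed base;
// Windows has no POSIX UID, so every user shares a fixed fallback port.
const BASE_PORT = 37700;
const WINDOWS_FALLBACK_PORT = 37777;

function defaultWorkerPort(uid: number, platform: string): number {
  if (platform === 'win32' || uid < 0) return WINDOWS_FALLBACK_PORT;
  return BASE_PORT + (uid % 100);
}

// uid 501 (first macOS user) -> 37701; uid 1000 (first Linux user) -> 37700
```

Shell hooks can reach the same value with `$((37700 + $(id -u) % 100))`, which is what "same formula via id -u" refers to.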
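Two of the round-2 review fixes above (quoting FTS5 MATCH input as a literal phrase, and escaping LIKE metacharacters) follow standard SQLite escaping rules. A hedged sketch with helper names of my own choosing, not the project's:

```typescript
// FTS5: a double-quoted string in a MATCH expression is a literal phrase;
// embedded double quotes are escaped by doubling them. This stops user input
// from being parsed as FTS5 query syntax (AND/OR/NEAR, column filters, etc.).
function quoteFtsPhrase(userQuery: string): string {
  return `"${userQuery.replace(/"/g, '""')}"`;
}

// LIKE: escape the escape character first, then % and _, and pair the result
// with ... LIKE ? ESCAPE '\' so user text cannot act as a wildcard pattern.
function escapeLikePattern(userText: string): string {
  return userText
    .replace(/\\/g, '\\\\')
    .replace(/%/g, '\\%')
    .replace(/_/g, '\\_');
}
```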
@@ -115,10 +115,15 @@ function notifySlotAvailable(): void {
  * Wait for a pool slot to become available (promise-based, not polling)
  * @param maxConcurrent Max number of concurrent agents
  * @param timeoutMs Max time to wait before giving up
+ * @param evictIdleSession Optional callback to evict an idle session when all slots are full (#1868)
  */
 const TOTAL_PROCESS_HARD_CAP = 10;
 
-export async function waitForSlot(maxConcurrent: number, timeoutMs: number = 60_000): Promise<void> {
+export async function waitForSlot(
+  maxConcurrent: number,
+  timeoutMs: number = 60_000,
+  evictIdleSession?: () => boolean
+): Promise<void> {
   // Hard cap: refuse to spawn if too many processes exist regardless of pool accounting
   const activeCount = getActiveCount();
   if (activeCount >= TOTAL_PROCESS_HARD_CAP) {
@@ -127,6 +132,17 @@ export async function waitForSlot(maxConcurrent: number, timeoutMs: number = 60_
 
   if (activeCount < maxConcurrent) return;
 
+  // Try to evict an idle session before waiting (#1868)
+  // Idle sessions hold pool slots during their 3-min idle timeout, blocking new sessions
+  // that would timeout after 60s. Eviction aborts the idle session asynchronously —
+  // the freed slot is picked up by the waiter mechanism below.
+  if (evictIdleSession) {
+    const evicted = evictIdleSession();
+    if (evicted) {
+      logger.info('PROCESS', 'Evicted idle session to free pool slot for waiting request');
+    }
+  }
+
   logger.info('PROCESS', `Pool limit reached (${activeCount}/${maxConcurrent}), waiting for slot...`);
 
   return new Promise<void>((resolve, reject) => {
@@ -0,0 +1,70 @@
+/**
+ * Time-windowed restart guard.
+ * Prevents tight-loop restarts (bug) while allowing legitimate occasional restarts
+ * over long sessions. Replaces the flat consecutiveRestarts counter that stranded
+ * pending messages after just 3 restarts over any timeframe (#2053).
+ */
+
+const RESTART_WINDOW_MS = 60_000; // Only count restarts within last 60 seconds
+const MAX_WINDOWED_RESTARTS = 10; // 10 restarts in 60s = runaway loop
+const DECAY_AFTER_SUCCESS_MS = 5 * 60_000; // Clear history after 5min of uninterrupted success
+
+export class RestartGuard {
+  private restartTimestamps: number[] = [];
+  private lastSuccessfulProcessing: number | null = null;
+
+  /**
+   * Record a restart and check if the guard should trip.
+   * @returns true if the restart is ALLOWED, false if it should be BLOCKED
+   */
+  recordRestart(): boolean {
+    const now = Date.now();
+
+    // Decay: clear history only after real success + 5min of uninterrupted success
+    if (this.lastSuccessfulProcessing !== null
+        && now - this.lastSuccessfulProcessing >= DECAY_AFTER_SUCCESS_MS) {
+      this.restartTimestamps = [];
+      this.lastSuccessfulProcessing = null;
+    }
+
+    // Prune old timestamps outside the window
+    this.restartTimestamps = this.restartTimestamps.filter(
+      ts => now - ts < RESTART_WINDOW_MS
+    );
+
+    // Record this restart
+    this.restartTimestamps.push(now);
+
+    // Check if we've exceeded the cap within the window
+    return this.restartTimestamps.length <= MAX_WINDOWED_RESTARTS;
+  }
+
+  /**
+   * Call when a message is successfully processed to update the success timestamp.
+   */
+  recordSuccess(): void {
+    this.lastSuccessfulProcessing = Date.now();
+  }
+
+  /**
+   * Get the number of restarts in the current window (for logging).
+   */
+  get restartsInWindow(): number {
+    const now = Date.now();
+    return this.restartTimestamps.filter(ts => now - ts < RESTART_WINDOW_MS).length;
+  }
+
+  /**
+   * Get the window size in ms (for logging).
+   */
+  get windowMs(): number {
+    return RESTART_WINDOW_MS;
+  }
+
+  /**
+   * Get the max allowed restarts (for logging).
+   */
+  get maxRestarts(): number {
+    return MAX_WINDOWED_RESTARTS;
+  }
+}
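The windowing behavior of the new guard can be exercised standalone. The snippet below is a condensed re-implementation of the core `recordRestart` logic for illustration only, with an injectable clock so the 60-second window is deterministic; the real class uses `Date.now()` directly and also carries the success-anchored decay.

```typescript
// Condensed illustration of the RestartGuard windowed counter: a restart is
// allowed as long as at most `maxRestarts` restarts occurred in the last
// `windowMs` milliseconds. Older timestamps are pruned on every call.
class WindowedRestartGuard {
  private stamps: number[] = [];
  constructor(
    private readonly windowMs: number = 60_000,
    private readonly maxRestarts: number = 10,
    private readonly now: () => number = Date.now
  ) {}

  /** @returns true if the restart is allowed, false if the guard trips */
  recordRestart(): boolean {
    const t = this.now();
    this.stamps = this.stamps.filter(ts => t - ts < this.windowMs);
    this.stamps.push(t);
    return this.stamps.length <= this.maxRestarts;
  }
}
```

Ten restarts spaced a second apart are all allowed; an eleventh inside the same 60s window is blocked; once the window passes, restarts are admitted again — which is exactly the property the flat counter lacked.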
@@ -90,9 +90,11 @@ export class SDKAgent {
     }
 
     // Wait for agent pool slot (configurable via CLAUDE_MEM_MAX_CONCURRENT_AGENTS)
+    // Pass idle session eviction callback to prevent pool deadlock (#1868):
+    // idle sessions hold slots during 3-min idle wait, blocking new sessions
     const settings = SettingsDefaultsManager.loadFromFile(USER_SETTINGS_PATH);
     const maxConcurrent = parseInt(settings.CLAUDE_MEM_MAX_CONCURRENT_AGENTS, 10) || 2;
-    await waitForSlot(maxConcurrent);
+    await waitForSlot(maxConcurrent, 60_000, () => this.sessionManager.evictIdlestSession());
 
     // Build isolated environment from ~/.claude-mem/.env
     // This prevents Issue #733: random ANTHROPIC_API_KEY from project .env files
@@ -67,8 +67,20 @@ export class SearchManager {
     return await this.chromaSync.queryChroma(query, limit, whereFilter);
   }
 
-  private async searchChromaForTimeline(query: string, ninetyDaysAgo: number): Promise<ObservationSearchResult[]> {
-    const chromaResults = await this.queryChroma(query, 100);
+  private async searchChromaForTimeline(query: string, ninetyDaysAgo: number, project?: string): Promise<ObservationSearchResult[]> {
+    // Build where filter scoped to observations only + project if provided
+    let whereFilter: Record<string, any> = { doc_type: 'observation' };
+    if (project) {
+      const projectFilter = {
+        $or: [
+          { project },
+          { merged_into_project: project }
+        ]
+      };
+      whereFilter = { $and: [whereFilter, projectFilter] };
+    }
+
+    const chromaResults = await this.queryChroma(query, 100, whereFilter);
     logger.debug('SEARCH', 'Chroma returned semantic matches for timeline', { matchCount: chromaResults?.ids?.length ?? 0 });
 
     if (chromaResults?.ids && chromaResults.ids.length > 0) {
@@ -78,7 +90,7 @@ export class SearchManager {
       });
 
       if (recentIds.length > 0) {
-        return this.sessionStore.getObservationsByIds(recentIds, { orderBy: 'date_desc', limit: 1 });
+        return this.sessionStore.getObservationsByIds(recentIds, { orderBy: 'date_desc', limit: 1, project });
       }
     }
     return [];
@@ -286,14 +298,20 @@ export class SearchManager {
     // ChromaDB not initialized - fall back to FTS5 keyword search (#1913, #2048)
     else if (query) {
       logger.debug('SEARCH', 'ChromaDB not initialized — falling back to FTS5 keyword search', {});
-      if (searchObservations) {
-        observations = this.sessionSearch.searchObservations(query, { ...options, type: obs_type, concepts, files });
-      }
-      if (searchSessions) {
-        sessions = this.sessionSearch.searchSessions(query, options);
-      }
-      if (searchPrompts) {
-        prompts = this.sessionSearch.searchUserPrompts(query, options);
+      try {
+        if (searchObservations) {
+          observations = this.sessionSearch.searchObservations(query, { ...options, type: obs_type, concepts, files });
+        }
+        if (searchSessions) {
+          sessions = this.sessionSearch.searchSessions(query, options);
+        }
+        if (searchPrompts) {
+          prompts = this.sessionSearch.searchUserPrompts(query, options);
+        }
+      } catch (ftsError) {
+        const errorObject = ftsError instanceof Error ? ftsError : new Error(String(ftsError));
+        logger.error('WORKER', 'FTS5 fallback search failed', {}, errorObject);
+        chromaFailed = true;
       }
     }
 
@@ -469,13 +487,25 @@ export class SearchManager {
       logger.debug('SEARCH', 'Using hybrid semantic search for timeline query', {});
       const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
-      results = await this.searchChromaForTimeline(query, ninetyDaysAgo);
+      try {
+        results = await this.searchChromaForTimeline(query, ninetyDaysAgo, project);
+      } catch (chromaError) {
+        const errorObject = chromaError instanceof Error ? chromaError : new Error(String(chromaError));
+        logger.error('WORKER', 'Chroma search failed for timeline, continuing without semantic results', {}, errorObject);
+      }
     }
 
+    // FTS fallback when Chroma is unavailable or returned no results
+    if (results.length === 0) {
+      try {
+        const ftsResults = this.sessionSearch.searchObservations(query, { project, limit: 1 });
+        if (ftsResults.length > 0) {
+          results = ftsResults;
+        }
+      } catch (ftsError) {
+        logger.warn('SEARCH', 'FTS fallback failed for timeline', {}, ftsError instanceof Error ? ftsError : undefined);
+      }
+    }
+
     if (results.length === 0) {
       return {
         content: [{
@@ -927,26 +957,55 @@ export class SearchManager {
     if (this.chromaSync) {
       logger.debug('SEARCH', 'Using hybrid semantic search (Chroma + SQLite)', {});
 
+      // Build Chroma where filter with doc_type and project scope
+      let whereFilter: Record<string, any> = { doc_type: 'observation' };
+      if (options.project) {
+        const projectFilter = {
+          $or: [
+            { project: options.project },
+            { merged_into_project: options.project }
+          ]
+        };
+        whereFilter = { $and: [whereFilter, projectFilter] };
+      }
+
       // Step 1: Chroma semantic search (top 100)
-      const chromaResults = await this.queryChroma(query, 100);
-      logger.debug('SEARCH', 'Chroma returned semantic matches', { matchCount: chromaResults.ids.length });
+      try {
+        const chromaResults = await this.queryChroma(query, 100, whereFilter);
+        logger.debug('SEARCH', 'Chroma returned semantic matches', { matchCount: chromaResults.ids.length });
 
-      if (chromaResults.ids.length > 0) {
-        // Step 2: Filter by recency (90 days)
-        const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
-        const recentIds = chromaResults.ids.filter((_id, idx) => {
-          const meta = chromaResults.metadatas[idx];
-          return meta && meta.created_at_epoch > ninetyDaysAgo;
-        });
+        if (chromaResults.ids.length > 0) {
+          // Step 2: Filter by recency (90 days)
+          const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
+          const recentIds = chromaResults.ids.filter((_id, idx) => {
+            const meta = chromaResults.metadatas[idx];
+            return meta && meta.created_at_epoch > ninetyDaysAgo;
+          });
 
-        logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });
+          logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });
 
-        // Step 3: Hydrate from SQLite in temporal order
-        if (recentIds.length > 0) {
-          const limit = options.limit || 20;
-          results = this.sessionStore.getObservationsByIds(recentIds, { orderBy: 'date_desc', limit });
-          logger.debug('SEARCH', 'Hydrated observations from SQLite', { count: results.length });
+          // Step 3: Hydrate from SQLite in temporal order
+          if (recentIds.length > 0) {
+            const limit = options.limit || 20;
+            results = this.sessionStore.getObservationsByIds(recentIds, { orderBy: 'date_desc', limit, project: options.project });
+            logger.debug('SEARCH', 'Hydrated observations from SQLite', { count: results.length });
+          }
+        }
+      } catch (chromaError) {
+        const errorObject = chromaError instanceof Error ? chromaError : new Error(String(chromaError));
+        logger.error('WORKER', 'Chroma search failed for observations, falling back to FTS', {}, errorObject);
       }
     }
 
+    // FTS fallback when Chroma is unavailable or returned no results
+    if (results.length === 0) {
+      try {
+        const ftsResults = this.sessionSearch.searchObservations(query, options);
+        if (ftsResults.length > 0) {
+          results = ftsResults;
+        }
+      } catch (ftsError) {
+        logger.warn('SEARCH', 'FTS fallback failed for observations', {}, ftsError instanceof Error ? ftsError : undefined);
+      }
+    }
+
@@ -984,26 +1043,55 @@ export class SearchManager {
     if (this.chromaSync) {
       logger.debug('SEARCH', 'Using hybrid semantic search for sessions', {});
 
+      // Build Chroma where filter with doc_type and project scope
+      let whereFilter: Record<string, any> = { doc_type: 'session_summary' };
+      if (options.project) {
+        const projectFilter = {
+          $or: [
+            { project: options.project },
+            { merged_into_project: options.project }
+          ]
+        };
+        whereFilter = { $and: [whereFilter, projectFilter] };
+      }
+
       // Step 1: Chroma semantic search (top 100)
-      const chromaResults = await this.queryChroma(query, 100, { doc_type: 'session_summary' });
-      logger.debug('SEARCH', 'Chroma returned semantic matches for sessions', { matchCount: chromaResults.ids.length });
+      try {
+        const chromaResults = await this.queryChroma(query, 100, whereFilter);
+        logger.debug('SEARCH', 'Chroma returned semantic matches for sessions', { matchCount: chromaResults.ids.length });
 
-      if (chromaResults.ids.length > 0) {
-        // Step 2: Filter by recency (90 days)
-        const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
-        const recentIds = chromaResults.ids.filter((_id, idx) => {
-          const meta = chromaResults.metadatas[idx];
-          return meta && meta.created_at_epoch > ninetyDaysAgo;
-        });
+        if (chromaResults.ids.length > 0) {
+          // Step 2: Filter by recency (90 days)
+          const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
+          const recentIds = chromaResults.ids.filter((_id, idx) => {
+            const meta = chromaResults.metadatas[idx];
+            return meta && meta.created_at_epoch > ninetyDaysAgo;
+          });
 
-        logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });
+          logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });
 
-        // Step 3: Hydrate from SQLite in temporal order
-        if (recentIds.length > 0) {
-          const limit = options.limit || 20;
-          results = this.sessionStore.getSessionSummariesByIds(recentIds, { orderBy: 'date_desc', limit });
-          logger.debug('SEARCH', 'Hydrated sessions from SQLite', { count: results.length });
+          // Step 3: Hydrate from SQLite in temporal order
+          if (recentIds.length > 0) {
+            const limit = options.limit || 20;
+            results = this.sessionStore.getSessionSummariesByIds(recentIds, { orderBy: 'date_desc', limit, project: options.project });
+            logger.debug('SEARCH', 'Hydrated sessions from SQLite', { count: results.length });
+          }
+        }
+      } catch (chromaError) {
+        const errorObject = chromaError instanceof Error ? chromaError : new Error(String(chromaError));
+        logger.error('WORKER', 'Chroma search failed for sessions, falling back to FTS', {}, errorObject);
      }
     }
 
+    // FTS fallback when Chroma is unavailable or returned no results
+    if (results.length === 0) {
+      try {
+        const ftsResults = this.sessionSearch.searchSessions(query, options);
+        if (ftsResults.length > 0) {
+          results = ftsResults;
+        }
+      } catch (ftsError) {
+        logger.warn('SEARCH', 'FTS fallback failed for sessions', {}, ftsError instanceof Error ? ftsError : undefined);
+      }
+    }
+
@@ -1041,26 +1129,55 @@ export class SearchManager {
     if (this.chromaSync) {
       logger.debug('SEARCH', 'Using hybrid semantic search for user prompts', {});
 
+      // Build Chroma where filter with doc_type and project scope
+      let whereFilter: Record<string, any> = { doc_type: 'user_prompt' };
+      if (options.project) {
+        const projectFilter = {
+          $or: [
+            { project: options.project },
+            { merged_into_project: options.project }
+          ]
+        };
+        whereFilter = { $and: [whereFilter, projectFilter] };
+      }
+
       // Step 1: Chroma semantic search (top 100)
-      const chromaResults = await this.queryChroma(query, 100, { doc_type: 'user_prompt' });
-      logger.debug('SEARCH', 'Chroma returned semantic matches for prompts', { matchCount: chromaResults.ids.length });
+      try {
+        const chromaResults = await this.queryChroma(query, 100, whereFilter);
+        logger.debug('SEARCH', 'Chroma returned semantic matches for prompts', { matchCount: chromaResults.ids.length });
 
-      if (chromaResults.ids.length > 0) {
-        // Step 2: Filter by recency (90 days)
-        const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
-        const recentIds = chromaResults.ids.filter((_id, idx) => {
-          const meta = chromaResults.metadatas[idx];
-          return meta && meta.created_at_epoch > ninetyDaysAgo;
-        });
+        if (chromaResults.ids.length > 0) {
+          // Step 2: Filter by recency (90 days)
+          const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
+          const recentIds = chromaResults.ids.filter((_id, idx) => {
+            const meta = chromaResults.metadatas[idx];
+            return meta && meta.created_at_epoch > ninetyDaysAgo;
+          });
 
-        logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });
+          logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });
 
-        // Step 3: Hydrate from SQLite in temporal order
-        if (recentIds.length > 0) {
-          const limit = options.limit || 20;
-          results = this.sessionStore.getUserPromptsByIds(recentIds, { orderBy: 'date_desc', limit });
-          logger.debug('SEARCH', 'Hydrated user prompts from SQLite', { count: results.length });
+          // Step 3: Hydrate from SQLite in temporal order
+          if (recentIds.length > 0) {
+            const limit = options.limit || 20;
+            results = this.sessionStore.getUserPromptsByIds(recentIds, { orderBy: 'date_desc', limit, project: options.project });
+            logger.debug('SEARCH', 'Hydrated user prompts from SQLite', { count: results.length });
+          }
+        }
+      } catch (chromaError) {
+        const errorObject = chromaError instanceof Error ? chromaError : new Error(String(chromaError));
+        logger.error('WORKER', 'Chroma search failed for user prompts, falling back to FTS', {}, errorObject);
      }
     }
 
+    // FTS fallback when Chroma is unavailable or returned no results
+    if (results.length === 0 && query) {
+      try {
+        const ftsResults = this.sessionSearch.searchUserPrompts(query, options);
+        if (ftsResults.length > 0) {
+          results = ftsResults;
+        }
+      } catch (ftsError) {
+        logger.warn('SEARCH', 'FTS fallback failed for user prompts', {}, ftsError instanceof Error ? ftsError : undefined);
+      }
+    }
+
@@ -1702,23 +1819,53 @@ export class SearchManager {
     // Use hybrid search if available
     if (this.chromaSync) {
       logger.debug('SEARCH', 'Using hybrid semantic search for timeline query', {});
-      const chromaResults = await this.queryChroma(query, 100);
-      logger.debug('SEARCH', 'Chroma returned semantic matches for timeline', { matchCount: chromaResults.ids.length });
-
-      if (chromaResults.ids.length > 0) {
-        // Filter by recency (90 days)
-        const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
-        const recentIds = chromaResults.ids.filter((_id, idx) => {
-          const meta = chromaResults.metadatas[idx];
-          return meta && meta.created_at_epoch > ninetyDaysAgo;
-        });
+      // Build Chroma where filter scoped to observations + project if provided
+      let whereFilter: Record<string, any> = { doc_type: 'observation' };
+      if (project) {
+        const projectFilter = {
+          $or: [
+            { project },
+            { merged_into_project: project }
+          ]
+        };
+        whereFilter = { $and: [whereFilter, projectFilter] };
+      }
 
-      logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });
+      try {
+        const chromaResults = await this.queryChroma(query, 100, whereFilter);
+        logger.debug('SEARCH', 'Chroma returned semantic matches for timeline', { matchCount: chromaResults.ids.length });
 
-      if (recentIds.length > 0) {
-        results = this.sessionStore.getObservationsByIds(recentIds, { orderBy: 'date_desc', limit: mode === 'auto' ? 1 : limit });
-        logger.debug('SEARCH', 'Hydrated observations from SQLite', { count: results.length });
+        if (chromaResults.ids.length > 0) {
+          // Filter by recency (90 days)
+          const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
+          const recentIds = chromaResults.ids.filter((_id, idx) => {
+            const meta = chromaResults.metadatas[idx];
+            return meta && meta.created_at_epoch > ninetyDaysAgo;
+          });
+
+          logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });
+
+          if (recentIds.length > 0) {
+            results = this.sessionStore.getObservationsByIds(recentIds, { orderBy: 'date_desc', limit: mode === 'auto' ? 1 : limit, project });
+            logger.debug('SEARCH', 'Hydrated observations from SQLite', { count: results.length });
+          }
+        }
+      } catch (chromaError) {
+        const errorObject = chromaError instanceof Error ? chromaError : new Error(String(chromaError));
+        logger.error('WORKER', 'Chroma search failed for timeline by query, falling back to FTS', {}, errorObject);
+      }
     }
 
+    // FTS fallback when Chroma is unavailable or returned no results
+    if (results.length === 0) {
+      try {
+        const ftsResults = this.sessionSearch.searchObservations(query, { project, limit: mode === 'auto' ? 1 : limit });
+        if (ftsResults.length > 0) {
+          results = ftsResults;
+        }
+      } catch (ftsError) {
+        logger.warn('SEARCH', 'FTS fallback failed for timeline by query', {}, ftsError instanceof Error ? ftsError : undefined);
+      }
+    }
+
@@ -17,6 +17,7 @@ import { SessionQueueProcessor } from '../queue/SessionQueueProcessor.js';
 import { getProcessBySession, ensureProcessExit } from './ProcessRegistry.js';
 import { getSupervisor } from '../../supervisor/index.js';
 import { MAX_CONSECUTIVE_SUMMARY_FAILURES } from '../../sdk/prompts.js';
+import { RestartGuard } from './RestartGuard.js';
 
 /** Idle threshold before a stuck generator (zombie subprocess) is force-killed. */
 export const MAX_GENERATOR_IDLE_MS = 5 * 60 * 1000; // 5 minutes
@@ -224,7 +225,8 @@ export class SessionManager {
       earliestPendingTimestamp: null,
       conversationHistory: [], // Initialize empty - will be populated by agents
       currentProvider: null, // Will be set when generator starts
-      consecutiveRestarts: 0, // Track consecutive restart attempts to prevent infinite loops
+      consecutiveRestarts: 0, // DEPRECATED: use restartGuard. Kept for logging compat.
+      restartGuard: new RestartGuard(),
       processingMessageIds: [], // CLAIM-CONFIRM: Track message IDs for confirmProcessed()
       lastGeneratorActivity: Date.now(), // Initialize for stale detection (Issue #1099)
       consecutiveSummaryFailures: 0, // Circuit breaker for summary retry loop (#1633)
@@ -465,6 +467,44 @@ export class SessionManager {
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Evict the idlest session to free a pool slot (#1868).
|
||||
* An "idle" session has an active generator but no pending work — it's sitting
|
||||
* in the 3-min idle wait before subprocess cleanup. Evicting it triggers abort
|
||||
* which kills the subprocess and frees the pool slot for a waiting new session.
|
||||
* @returns true if a session was evicted, false if no idle sessions found
|
||||
*/
|
||||
evictIdlestSession(): boolean {
|
||||
let idlestSessionId: number | null = null;
|
||||
let oldestActivity = Infinity;
|
||||
|
||||
for (const [sessionDbId, session] of this.sessions) {
|
||||
if (!session.generatorPromise) continue; // No generator = no slot held
|
||||
const pendingCount = this.getPendingStore().getPendingCount(sessionDbId);
|
||||
if (pendingCount > 0) continue; // Has work to do, don't evict
|
||||
|
||||
// Pick the session with the oldest lastGeneratorActivity (idlest)
|
||||
if (session.lastGeneratorActivity < oldestActivity) {
|
||||
oldestActivity = session.lastGeneratorActivity;
|
||||
idlestSessionId = sessionDbId;
|
||||
}
|
||||
}
|
||||
|
||||
if (idlestSessionId === null) return false;
|
||||
|
||||
const session = this.sessions.get(idlestSessionId);
|
||||
if (!session) return false;
|
||||
|
||||
logger.info('SESSION', 'Evicting idle session to free pool slot for new request (#1868)', {
|
||||
sessionDbId: idlestSessionId,
|
||||
idleDurationMs: Date.now() - oldestActivity
|
||||
});
|
||||
|
||||
session.idleTimedOut = true;
|
||||
session.abortController.abort();
|
||||
return true;
|
||||
}
|
||||
 
   /**
    * Reap sessions with no active generator and no pending work that have been idle too long.
    * Also reaps sessions whose generator has been stuck (no lastGeneratorActivity update) for

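The eviction pick in `evictIdlestSession` reduces to a linear scan: among sessions that hold a pool slot and have zero pending messages, take the minimum `lastGeneratorActivity`. A pure sketch of that selection, using simplified stand-in types rather than the real `SessionManager` state:

```typescript
// Stand-in for the session map entries: holdsSlot mirrors
// generatorPromise !== null, pendingCount mirrors the PendingMessageStore lookup.
interface SessionLike {
  id: number;
  holdsSlot: boolean;
  pendingCount: number;
  lastGeneratorActivity: number;
}

// Return the id of the idlest evictable session, or null if none qualifies.
function pickIdlest(sessions: SessionLike[]): number | null {
  let idlest: number | null = null;
  let oldest = Infinity;
  for (const s of sessions) {
    if (!s.holdsSlot || s.pendingCount > 0) continue; // not evictable
    if (s.lastGeneratorActivity < oldest) {
      oldest = s.lastGeneratorActivity;
      idlest = s.id;
    }
  }
  return idlest;
}
```

Sessions with pending work are never candidates, so eviction can only strand a subprocess that had nothing left to do.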
@@ -207,6 +207,8 @@ export async function processAgentResponse(
   }
   if (session.processingMessageIds.length > 0) {
     logger.debug('QUEUE', `CONFIRMED_BATCH | sessionDbId=${session.sessionDbId} | count=${session.processingMessageIds.length} | ids=[${session.processingMessageIds.join(',')}]`);
+    // Record successful processing so restart guard decay is anchored to real successes
+    session.restartGuard?.recordSuccess();
   }
   // Clear the tracking array after confirmation
   session.processingMessageIds = [];

@@ -9,6 +9,7 @@ import express, { Request, Response, NextFunction, RequestHandler } from 'expres
 import cors from 'cors';
 import path from 'path';
 import { getPackageRoot } from '../../../shared/paths.js';
+import { getAuthToken } from '../../../shared/auth-token.js';
 import { logger } from '../../../utils/logger.js';

 /**
@@ -21,8 +22,8 @@ export function createMiddleware(
 ): RequestHandler[] {
   const middlewares: RequestHandler[] = [];

-  // JSON parsing with 50mb limit
-  middlewares.push(express.json({ limit: '50mb' }));
+  // JSON parsing with 5mb limit (#1935)
+  middlewares.push(express.json({ limit: '5mb' }));

   // CORS - restrict to localhost origins only
   middlewares.push(cors({
@@ -42,6 +43,39 @@ export function createMiddleware(
     credentials: false
   }));

+  // Simple in-memory rate limiter (#1935)
+  const requestCounts = new Map<string, { count: number; resetAt: number }>();
+  const RATE_LIMIT_WINDOW_MS = 60_000; // 1 minute
+  const RATE_LIMIT_MAX_REQUESTS = 300; // 300 requests per minute per IP
+
+  const rateLimiter: RequestHandler = (req, res, next) => {
+    const clientIp = req.ip || 'unknown';
+    const now = Date.now();
+    let entry = requestCounts.get(clientIp);
+
+    if (!entry || now >= entry.resetAt) {
+      entry = { count: 0, resetAt: now + RATE_LIMIT_WINDOW_MS };
+      requestCounts.set(clientIp, entry);
+    }
+
+    // Lazy cleanup: remove expired entries when map grows large
+    if (requestCounts.size > 100) {
+      for (const [ip, e] of requestCounts) {
+        if (now >= e.resetAt) requestCounts.delete(ip);
+      }
+    }
+
+    entry.count++;
+    if (entry.count > RATE_LIMIT_MAX_REQUESTS) {
+      res.status(429).json({ error: 'Rate limit exceeded' });
+      return;
+    }
+
+    next();
+  };
+
+  middlewares.push(rateLimiter);
+
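The limiter above is a fixed-window counter per IP. Its bookkeeping can be exercised in isolation; this standalone sketch mirrors the middleware's logic with the clock passed in explicitly (names here are illustrative, not exports of the real module):

```typescript
// Fixed-window rate limiting: one counter per client, reset when the
// window expires. Time is a parameter so the behavior is testable.
const WINDOW_MS = 60_000;
const MAX_REQUESTS = 300;

interface Entry { count: number; resetAt: number }
const counts = new Map<string, Entry>();

// Returns true if the request is allowed, false if rate-limited.
function allow(ip: string, now: number): boolean {
  let entry = counts.get(ip);
  if (!entry || now >= entry.resetAt) {
    // New window: start counting from zero
    entry = { count: 0, resetAt: now + WINDOW_MS };
    counts.set(ip, entry);
  }
  entry.count++;
  return entry.count <= MAX_REQUESTS;
}
```

The usual fixed-window caveat applies: a client can burst up to roughly twice the limit across a window boundary. That is an accepted trade-off here in exchange for O(1) state per IP.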
   // HTTP request/response logging
   middlewares.push((req: Request, res: Response, next: NextFunction) => {
     // Skip logging for static assets, health checks, and polling endpoints
@@ -106,6 +140,27 @@ export function requireLocalhost(req: Request, res: Response, next: NextFunction
   next();
 }

+/**
+ * Bearer token auth middleware (#1932/#1933).
+ * Requires Authorization: Bearer <token> on all API requests.
+ * Token is auto-generated and stored in DATA_DIR/worker-auth-token.
+ */
+export function requireAuth(req: Request, res: Response, next: NextFunction): void {
+  const authHeader = req.headers.authorization;
+  if (!authHeader || !authHeader.startsWith('Bearer ')) {
+    res.status(401).json({ error: 'Missing or invalid Authorization header' });
+    return;
+  }
+
+  const token = authHeader.slice('Bearer '.length);
+  if (token !== getAuthToken()) {
+    res.status(401).json({ error: 'Invalid bearer token' });
+    return;
+  }
+
+  next();
+}
 
 /**
  * Summarize request body for logging
  * Used to avoid logging sensitive data or large payloads

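`requireAuth`'s header handling boils down to a small pure function, which makes both 401 branches easy to test in isolation. A sketch (not the actual module; the real middleware also compares against `getAuthToken()`):

```typescript
// Extract the bearer token from an Authorization header the way
// requireAuth does; null means the request should get a 401.
function parseBearer(header: string | undefined): string | null {
  if (!header || !header.startsWith('Bearer ')) return null;
  return header.slice('Bearer '.length);
}
```

Note that `parseBearer('Bearer ')` returns the empty string rather than null; that request still fails downstream because an empty string never equals the stored token.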
@@ -382,11 +382,13 @@ export class DataRoutes extends BaseRouteHandler {
     }

     // Import observations (depends on sessions)
+    const importedObservations: Array<{ id: number; obs: typeof observations[0] }> = [];
     if (Array.isArray(observations)) {
       for (const obs of observations) {
         const result = store.importObservation(obs);
         if (result.imported) {
           stats.observationsImported++;
+          importedObservations.push({ id: result.id, obs });
         } else {
           stats.observationsSkipped++;
         }
@@ -398,6 +400,53 @@ export class DataRoutes extends BaseRouteHandler {
     if (stats.observationsImported > 0) {
       store.rebuildObservationsFTSIndex();
     }

+    // Sync imported observations to ChromaDB for vector search.
+    // Fire-and-forget: Chroma sync failure should not block the import response.
+    // Bounded concurrency to prevent overwhelming Chroma on large imports.
+    const chromaSync = this.dbManager.getChromaSync();
+    if (chromaSync && importedObservations.length > 0) {
+      const CHROMA_SYNC_CONCURRENCY = 8;
+      const safeParseJson = (val: string | null): string[] => {
+        if (!val) return [];
+        try { return JSON.parse(val); } catch { return []; }
+      };
+
+      const syncOne = async ({ id, obs }: { id: number; obs: any }) => {
+        const parsedObs = {
+          type: obs.type || 'discovery',
+          title: obs.title || null,
+          subtitle: obs.subtitle || null,
+          facts: safeParseJson(obs.facts),
+          narrative: obs.narrative || null,
+          concepts: safeParseJson(obs.concepts),
+          files_read: safeParseJson(obs.files_read),
+          files_modified: safeParseJson(obs.files_modified),
+        };

+        await chromaSync.syncObservation(
+          id,
+          obs.memory_session_id,
+          obs.project,
+          parsedObs,
+          obs.prompt_number || 0,
+          obs.created_at_epoch,
+          obs.discovery_tokens || 0
+        ).catch(err => {
+          logger.error('CHROMA', 'Import ChromaDB sync failed', { id }, err as Error);
+        });
+      };
+
+      // Fire-and-forget: process in batches but don't block the response
+      (async () => {
+        for (let i = 0; i < importedObservations.length; i += CHROMA_SYNC_CONCURRENCY) {
+          const batch = importedObservations.slice(i, i + CHROMA_SYNC_CONCURRENCY);
+          await Promise.all(batch.map(syncOne));
+        }
+      })().catch(err => {
+        logger.error('CHROMA', 'Import ChromaDB batch sync failed', {}, err as Error);
+      });
+    }
   }

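The batch loop above is a generic bounded-concurrency pattern: slice the work into chunks of `CHROMA_SYNC_CONCURRENCY` and `Promise.all` each chunk before starting the next. The same pattern as a standalone sketch (generic helper, not part of the codebase):

```typescript
// Run an async task over items with bounded concurrency, batch by batch.
// Mirrors the import sync loop: slice batchSize items, await them together.
async function inBatches<T>(
  items: T[],
  batchSize: number,
  task: (item: T) => Promise<void>
): Promise<void> {
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    await Promise.all(batch.map(task));
  }
}
```

One slow item stalls its whole batch; a worker-pool variant would keep every slot busy, but batch-by-batch is simpler and sufficient for a background import path.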
 
     // Import prompts (depends on sessions)

@@ -24,6 +24,7 @@ import { USER_SETTINGS_PATH } from '../../../../shared/paths.js';
 import { getProcessBySession, ensureProcessExit } from '../../ProcessRegistry.js';
 import { getProjectContext } from '../../../../utils/project-name.js';
 import { normalizePlatformSource } from '../../../../shared/platform-source.js';
+import { RestartGuard } from '../../RestartGuard.js';

 export class SessionRoutes extends BaseRouteHandler {
   private completionHandler: SessionCompletionHandler;
@@ -279,9 +280,10 @@ export class SessionRoutes extends BaseRouteHandler {

       if (wasAborted) {
         logger.info('SESSION', `Generator aborted`, { sessionId: sessionDbId });
-      } else {
-        logger.error('SESSION', `Generator exited unexpectedly`, { sessionId: sessionDbId });
       }
+      // Don't log "exited unexpectedly" here — a non-abort exit is normal when
+      // the SDK subprocess completes its work. The crash-recovery block below
+      // checks pendingCount to distinguish real crashes from clean exits (#1876).

       session.generatorPromise = null;
       session.currentProvider = null;
@@ -290,7 +292,6 @@ export class SessionRoutes extends BaseRouteHandler {
       // Crash recovery: If not aborted and still has work, restart (with limit)
       if (!wasAborted) {
         const pendingStore = this.sessionManager.getPendingMessageStore();
-        const MAX_CONSECUTIVE_RESTARTS = 3;

         let pendingCount: number;
         try {
@@ -309,14 +310,18 @@ export class SessionRoutes extends BaseRouteHandler {
           return;
         }

-        session.consecutiveRestarts = (session.consecutiveRestarts || 0) + 1;
+        // Windowed restart guard: only blocks tight-loop restarts, not spread-out ones (#2053)
+        if (!session.restartGuard) session.restartGuard = new RestartGuard();
+        const restartAllowed = session.restartGuard.recordRestart();
+        session.consecutiveRestarts = (session.consecutiveRestarts || 0) + 1; // Keep for logging

-        if (session.consecutiveRestarts > MAX_CONSECUTIVE_RESTARTS) {
-          logger.error('SESSION', `CRITICAL: Generator restart limit exceeded - stopping to prevent runaway costs`, {
+        if (!restartAllowed) {
+          logger.error('SESSION', `CRITICAL: Restart guard tripped — too many restarts in window, stopping to prevent runaway costs`, {
             sessionId: sessionDbId,
             pendingCount,
             consecutiveRestarts: session.consecutiveRestarts,
-            maxRestarts: MAX_CONSECUTIVE_RESTARTS,
+            restartsInWindow: session.restartGuard.restartsInWindow,
+            windowMs: session.restartGuard.windowMs,
+            maxRestarts: session.restartGuard.maxRestarts,
             action: 'Generator will NOT restart. Check logs for root cause. Messages remain in pending state.'
           });
           // Don't restart - abort to prevent further API calls
@@ -328,7 +333,8 @@ export class SessionRoutes extends BaseRouteHandler {
           sessionId: sessionDbId,
           pendingCount,
           consecutiveRestarts: session.consecutiveRestarts,
-          maxRestarts: MAX_CONSECUTIVE_RESTARTS
+          restartsInWindow: session.restartGuard!.restartsInWindow,
+          maxRestarts: session.restartGuard!.maxRestarts
         });

         // Abort OLD controller before replacing to prevent child process leaks

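`RestartGuard` itself is not part of this diff; only its call sites are. Based on the commit message (restarts counted within a 60s window, cap 10, decay after 5 minutes of success) and the members referenced above (`recordRestart`, `recordSuccess`, `restartsInWindow`, `windowMs`, `maxRestarts`), a plausible shape looks like the following. This is a reconstruction from the described semantics, not the actual implementation:

```typescript
// Windowed restart guard (#2053): tolerate restarts spread over time,
// trip only on tight loops. Time is injectable for testability.
export class RestartGuard {
  readonly windowMs = 60_000;   // only restarts inside this window count
  readonly maxRestarts = 10;    // cap within the window
  private restartTimes: number[] = [];

  get restartsInWindow(): number {
    const cutoff = Date.now() - this.windowMs;
    this.restartTimes = this.restartTimes.filter(t => t > cutoff);
    return this.restartTimes.length;
  }

  /** Record a restart attempt; returns false if the guard trips. */
  recordRestart(now = Date.now()): boolean {
    this.restartTimes = this.restartTimes.filter(t => t > now - this.windowMs);
    this.restartTimes.push(now);
    return this.restartTimes.length <= this.maxRestarts;
  }

  /** Successful processing anchors decay: after 5 min with no restarts,
   * the history is cleared so old trouble doesn't count against the session. */
  recordSuccess(now = Date.now()): void {
    const last = this.restartTimes[this.restartTimes.length - 1];
    if (last === undefined || now - last > 5 * 60_000) {
      this.restartTimes = [];
    }
  }
}
```

Unlike the old flat counter, a long-lived session that restarts once an hour never approaches the cap, so its pending messages are never stranded.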
@@ -10,6 +10,7 @@ import path from 'path';
 import { readFileSync, existsSync } from 'fs';
 import { logger } from '../../../../utils/logger.js';
 import { getPackageRoot } from '../../../../shared/paths.js';
+import { getAuthToken } from '../../../../shared/auth-token.js';
 import { SSEBroadcaster } from '../../SSEBroadcaster.js';
 import { DatabaseManager } from '../../DatabaseManager.js';
 import { SessionManager } from '../../SessionManager.js';
@@ -66,7 +67,10 @@ export class ViewerRoutes extends BaseRouteHandler {
         throw new Error('Viewer UI not found at any expected location');
       }

-      const html = readFileSync(viewerPath, 'utf-8');
+      let html = readFileSync(viewerPath, 'utf-8');
+      // Inject auth token so viewer can authenticate API requests (#1932/#1933)
+      const tokenScript = `<script>window.__CLAUDE_MEM_AUTH_TOKEN__="${getAuthToken()}";</script>`;
+      html = html.replace('</head>', `${tokenScript}</head>`);
       res.setHeader('Content-Type', 'text/html');
       res.send(html);
     });

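On the viewer side, the injected `window.__CLAUDE_MEM_AUTH_TOKEN__` global presumably gets attached to each API request. The viewer code is not in this diff; a minimal sketch of what that consumption could look like (helper names are illustrative):

```typescript
// Minimal fetch-init shape so the sketch is self-contained.
type FetchInit = { headers?: Record<string, string> };

// Read the token that ViewerRoutes injects into the served HTML;
// empty string if the page was served without injection.
function getViewerToken(): string {
  return (globalThis as any).__CLAUDE_MEM_AUTH_TOKEN__ ?? '';
}

// Merge the Authorization header into a fetch init object,
// preserving any headers the caller already set.
function withAuth(init: FetchInit = {}): FetchInit {
  return {
    ...init,
    headers: { ...init.headers, Authorization: `Bearer ${getViewerToken()}` },
  };
}
```

A request would then be issued as `fetch(url, withAuth({ headers: { 'Content-Type': 'application/json' } }))`, satisfying the `requireAuth` middleware above.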