Files
claude-mem/src/servers/mcp-server.ts
T
Alex Newman 80a8c90a1a feat: add embedded Process Supervisor for unified process lifecycle (#1370)
* feat: add embedded Process Supervisor for unified process lifecycle management

Consolidates scattered process management (ProcessManager, GracefulShutdown,
HealthMonitor, ProcessRegistry) into a unified src/supervisor/ module.

New: ProcessRegistry with JSON persistence, env sanitizer (strips CLAUDECODE_*
vars), graceful shutdown cascade (SIGTERM → 5s wait → SIGKILL with tree-kill
on Windows), PID file liveness validation, and singleton Supervisor API.

Fixes #1352 (worker inherits CLAUDECODE env causing nested sessions)
Fixes #1356 (zombie TCP socket after Windows reboot)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add session-scoped process reaping to supervisor

Adds reapSession(sessionId) to ProcessRegistry for killing session-tagged
processes on session end. SessionManager.deleteSession() now triggers reaping.
Tightens orphan reaper interval from 60s to 30s.

Fixes #1351 (MCP server processes leak on session end)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Unix domain socket support for worker communication

Introduces socket-manager.ts for UDS-based worker communication, eliminating
port 37777 collisions between concurrent sessions. Worker listens on
~/.claude-mem/sockets/worker.sock by default with TCP fallback.

All hook handlers, MCP server, health checks, and admin commands updated to
use socket-aware workerHttpRequest(). Backwards compatible — settings can
force TCP mode via CLAUDE_MEM_WORKER_TRANSPORT=tcp.

Fixes #1346 (port 37777 collision across concurrent sessions)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove in-process worker fallback from hook command

Removes the fallback path where hook scripts started WorkerService in-process,
making the worker a grandchild of Claude Code (killed by sandbox). Hooks now
always delegate to ensureWorkerStarted() which spawns a fully detached daemon.

Fixes #1249 (grandchild process killed by sandbox)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add health checker and /api/admin/doctor endpoint

Adds 30-second periodic health sweep that prunes dead processes from the
supervisor registry and cleans stale socket files. Adds /api/admin/doctor
endpoint exposing supervisor state, process liveness, and environment health.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add comprehensive supervisor test suite

64 tests covering all supervisor modules: process registry (18 tests),
env sanitizer (8), shutdown cascade (10), socket manager (15), health
checker (5), and supervisor API (6). Includes persistence, isolation,
edge cases, and cross-module integration scenarios.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: revert Unix domain socket transport, restore TCP on port 37777

The socket-manager introduced UDS as default transport, but this broke
the HTTP server's TCP accessibility (viewer UI, curl, external monitoring).
Since there's only ever one worker process handling all sessions, the
port collision rationale for UDS doesn't apply. Reverts to TCP-only,
removing ~900 lines of unnecessary complexity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: remove dead code found in pre-landing review

Remove unused `acceptingSpawns` field from Supervisor class (written but
never read — assertCanSpawn uses stopPromise instead) and unused
`buildWorkerUrl` import from context handler.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* updated gitignore

* fix: address PR review feedback - downgrade HTTP logging, clean up gitignore, harden supervisor

- Downgrade request/response HTTP logging from info to debug to reduce noise
- Remove unused getWorkerPort imports, use buildWorkerUrl helper
- Export ENV_PREFIXES/ENV_EXACT_MATCHES from env-sanitizer, reuse in Server.ts
- Fix isPidAlive(0) returning true (should be false)
- Add shutdownInitiated flag to prevent signal handler race condition
- Make validateWorkerPidFile testable with pidFilePath option
- Remove unused dataDir from ShutdownCascadeOptions
- Upgrade reapSession log from debug to warn
- Rename zombiePidFiles to deadProcessPids (returns actual PIDs)
- Clean up gitignore: remove duplicate datasets/, stale ~*/ and http*/ patterns
- Fix tests to use temp directories instead of relying on real PID file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 14:49:23 -07:00

456 lines
14 KiB
TypeScript

/**
* Claude-mem MCP Search Server - Thin HTTP Wrapper
*
* Refactored from 2,718 lines to ~600-800 lines
* Delegates all business logic to Worker HTTP API at localhost:37777
* Maintains MCP protocol handling and tool schemas
*/
// Version injected at build time by esbuild define
declare const __DEFAULT_PACKAGE_VERSION__: string;
const packageVersion = typeof __DEFAULT_PACKAGE_VERSION__ !== 'undefined' ? __DEFAULT_PACKAGE_VERSION__ : '0.0.0-dev';
// Import logger first
import { logger } from '../utils/logger.js';
// CRITICAL: Redirect console to stderr BEFORE other imports
// MCP uses stdio transport where stdout is reserved for JSON-RPC protocol messages.
// Any logs to stdout break the protocol (Claude Desktop parses "[2025..." as JSON array).
const _originalLog = console['log'];
console['log'] = (...args: any[]) => {
logger.error('CONSOLE', 'Intercepted console output (MCP protocol protection)', undefined, { args });
};
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
import { workerHttpRequest } from '../shared/worker-utils.js';
import { searchCodebase, formatSearchResults } from '../services/smart-file-read/search.js';
import { parseFile, formatFoldedView, unfoldSymbol } from '../services/smart-file-read/parser.js';
import { readFile } from 'node:fs/promises';
import { resolve } from 'node:path';
/**
* Map tool names to Worker HTTP endpoints
*/
const TOOL_ENDPOINT_MAP: Record<string, string> = {
'search': '/api/search',
'timeline': '/api/timeline'
};
/**
* Call Worker HTTP API endpoint (uses socket or TCP automatically)
*/
async function callWorkerAPI(
endpoint: string,
params: Record<string, any>
): Promise<{ content: Array<{ type: 'text'; text: string }>; isError?: boolean }> {
logger.debug('SYSTEM', '→ Worker API', undefined, { endpoint, params });
try {
const searchParams = new URLSearchParams();
// Convert params to query string
for (const [key, value] of Object.entries(params)) {
if (value !== undefined && value !== null) {
searchParams.append(key, String(value));
}
}
const apiPath = `${endpoint}?${searchParams}`;
const response = await workerHttpRequest(apiPath);
if (!response.ok) {
const errorText = await response.text();
throw new Error(`Worker API error (${response.status}): ${errorText}`);
}
const data = await response.json() as { content: Array<{ type: 'text'; text: string }>; isError?: boolean };
logger.debug('SYSTEM', '← Worker API success', undefined, { endpoint });
// Worker returns { content: [...] } format directly
return data;
} catch (error) {
logger.error('SYSTEM', '← Worker API error', { endpoint }, error as Error);
return {
content: [{
type: 'text' as const,
text: `Error calling Worker API: ${error instanceof Error ? error.message : String(error)}`
}],
isError: true
};
}
}
/**
* Call Worker HTTP API with POST body
*/
async function callWorkerAPIPost(
endpoint: string,
body: Record<string, any>
): Promise<{ content: Array<{ type: 'text'; text: string }>; isError?: boolean }> {
logger.debug('HTTP', 'Worker API request (POST)', undefined, { endpoint });
try {
const response = await workerHttpRequest(endpoint, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(body)
});
if (!response.ok) {
const errorText = await response.text();
throw new Error(`Worker API error (${response.status}): ${errorText}`);
}
const data = await response.json();
logger.debug('HTTP', 'Worker API success (POST)', undefined, { endpoint });
// Wrap raw data in MCP format
return {
content: [{
type: 'text' as const,
text: JSON.stringify(data, null, 2)
}]
};
} catch (error) {
logger.error('HTTP', 'Worker API error (POST)', { endpoint }, error as Error);
return {
content: [{
type: 'text' as const,
text: `Error calling Worker API: ${error instanceof Error ? error.message : String(error)}`
}],
isError: true
};
}
}
/**
* Verify Worker is accessible
*/
async function verifyWorkerConnection(): Promise<boolean> {
try {
const response = await workerHttpRequest('/api/health');
return response.ok;
} catch (error) {
// Expected during worker startup or if worker is down
logger.debug('SYSTEM', 'Worker health check failed', {}, error as Error);
return false;
}
}
/**
* Tool definitions with HTTP-based handlers
* Minimal descriptions - use help() tool with operation parameter for detailed docs
*/
const tools = [
{
name: '__IMPORTANT',
description: `3-LAYER WORKFLOW (ALWAYS FOLLOW):
1. search(query) → Get index with IDs (~50-100 tokens/result)
2. timeline(anchor=ID) → Get context around interesting results
3. get_observations([IDs]) → Fetch full details ONLY for filtered IDs
NEVER fetch full details without filtering first. 10x token savings.`,
inputSchema: {
type: 'object',
properties: {}
},
handler: async () => ({
content: [{
type: 'text' as const,
text: `# Memory Search Workflow
**3-Layer Pattern (ALWAYS follow this):**
1. **Search** - Get index of results with IDs
\`search(query="...", limit=20, project="...")\`
Returns: Table with IDs, titles, dates (~50-100 tokens/result)
2. **Timeline** - Get context around interesting results
\`timeline(anchor=<ID>, depth_before=3, depth_after=3)\`
Returns: Chronological context showing what was happening
3. **Fetch** - Get full details ONLY for relevant IDs
\`get_observations(ids=[...])\` # ALWAYS batch for 2+ items
Returns: Complete details (~500-1000 tokens/result)
**Why:** 10x token savings. Never fetch full details without filtering first.`
}]
})
},
{
name: 'search',
description: 'Step 1: Search memory. Returns index with IDs. Params: query, limit, project, type, obs_type, dateStart, dateEnd, offset, orderBy',
inputSchema: {
type: 'object',
properties: {},
additionalProperties: true
},
handler: async (args: any) => {
const endpoint = TOOL_ENDPOINT_MAP['search'];
return await callWorkerAPI(endpoint, args);
}
},
{
name: 'timeline',
description: 'Step 2: Get context around results. Params: anchor (observation ID) OR query (finds anchor automatically), depth_before, depth_after, project',
inputSchema: {
type: 'object',
properties: {},
additionalProperties: true
},
handler: async (args: any) => {
const endpoint = TOOL_ENDPOINT_MAP['timeline'];
return await callWorkerAPI(endpoint, args);
}
},
{
name: 'get_observations',
description: 'Step 3: Fetch full details for filtered IDs. Params: ids (array of observation IDs, required), orderBy, limit, project',
inputSchema: {
type: 'object',
properties: {
ids: {
type: 'array',
items: { type: 'number' },
description: 'Array of observation IDs to fetch (required)'
}
},
required: ['ids'],
additionalProperties: true
},
handler: async (args: any) => {
return await callWorkerAPIPost('/api/observations/batch', args);
}
},
{
name: 'smart_search',
description: 'Search codebase for symbols, functions, classes using tree-sitter AST parsing. Returns folded structural views with token counts. Use path parameter to scope the search.',
inputSchema: {
type: 'object',
properties: {
query: {
type: 'string',
description: 'Search term — matches against symbol names, file names, and file content'
},
path: {
type: 'string',
description: 'Root directory to search (default: current working directory)'
},
max_results: {
type: 'number',
description: 'Maximum results to return (default: 20)'
},
file_pattern: {
type: 'string',
description: 'Substring filter for file paths (e.g. ".ts", "src/services")'
}
},
required: ['query']
},
handler: async (args: any) => {
const rootDir = resolve(args.path || process.cwd());
const result = await searchCodebase(rootDir, args.query, {
maxResults: args.max_results || 20,
filePattern: args.file_pattern
});
const formatted = formatSearchResults(result, args.query);
return {
content: [{ type: 'text' as const, text: formatted }]
};
}
},
{
name: 'smart_unfold',
description: 'Expand a specific symbol (function, class, method) from a file. Returns the full source code of just that symbol. Use after smart_search or smart_outline to read specific code.',
inputSchema: {
type: 'object',
properties: {
file_path: {
type: 'string',
description: 'Path to the source file'
},
symbol_name: {
type: 'string',
description: 'Name of the symbol to unfold (function, class, method, etc.)'
}
},
required: ['file_path', 'symbol_name']
},
handler: async (args: any) => {
const filePath = resolve(args.file_path);
const content = await readFile(filePath, 'utf-8');
const unfolded = unfoldSymbol(content, filePath, args.symbol_name);
if (unfolded) {
return {
content: [{ type: 'text' as const, text: unfolded }]
};
}
// Symbol not found — show available symbols
const parsed = parseFile(content, filePath);
if (parsed.symbols.length > 0) {
const available = parsed.symbols.map(s => ` - ${s.name} (${s.kind})`).join('\n');
return {
content: [{
type: 'text' as const,
text: `Symbol "${args.symbol_name}" not found in ${args.file_path}.\n\nAvailable symbols:\n${available}`
}]
};
}
return {
content: [{
type: 'text' as const,
text: `Could not parse ${args.file_path}. File may be unsupported or empty.`
}]
};
}
},
{
name: 'smart_outline',
description: 'Get structural outline of a file — shows all symbols (functions, classes, methods, types) with signatures but bodies folded. Much cheaper than reading the full file.',
inputSchema: {
type: 'object',
properties: {
file_path: {
type: 'string',
description: 'Path to the source file'
}
},
required: ['file_path']
},
handler: async (args: any) => {
const filePath = resolve(args.file_path);
const content = await readFile(filePath, 'utf-8');
const parsed = parseFile(content, filePath);
if (parsed.symbols.length > 0) {
return {
content: [{ type: 'text' as const, text: formatFoldedView(parsed) }]
};
}
return {
content: [{
type: 'text' as const,
text: `Could not parse ${args.file_path}. File may use an unsupported language or be empty.`
}]
};
}
}
];
// Create the MCP server
const server = new Server(
{
name: 'claude-mem',
version: packageVersion,
},
{
capabilities: {
tools: {}, // Exposes tools capability (handled by ListToolsRequestSchema and CallToolRequestSchema)
},
}
);
// Register tools/list handler
server.setRequestHandler(ListToolsRequestSchema, async () => {
return {
tools: tools.map(tool => ({
name: tool.name,
description: tool.description,
inputSchema: tool.inputSchema
}))
};
});
// Register tools/call handler
server.setRequestHandler(CallToolRequestSchema, async (request) => {
const tool = tools.find(t => t.name === request.params.name);
if (!tool) {
throw new Error(`Unknown tool: ${request.params.name}`);
}
try {
return await tool.handler(request.params.arguments || {});
} catch (error) {
logger.error('SYSTEM', 'Tool execution failed', { tool: request.params.name }, error as Error);
return {
content: [{
type: 'text' as const,
text: `Tool execution failed: ${error instanceof Error ? error.message : String(error)}`
}],
isError: true
};
}
});
// Parent heartbeat: self-exit when parent dies (ppid=1 on Unix means orphaned)
// Prevents orphaned MCP server processes when Claude Code exits unexpectedly
const HEARTBEAT_INTERVAL_MS = 30_000;
let heartbeatTimer: ReturnType<typeof setInterval> | null = null;
function startParentHeartbeat() {
// ppid-based orphan detection only works on Unix
if (process.platform === 'win32') return;
const initialPpid = process.ppid;
heartbeatTimer = setInterval(() => {
if (process.ppid === 1 || process.ppid !== initialPpid) {
logger.info('SYSTEM', 'Parent process died, self-exiting to prevent orphan', {
initialPpid,
currentPpid: process.ppid
});
cleanup();
}
}, HEARTBEAT_INTERVAL_MS);
// Don't let the heartbeat timer keep the process alive
if (heartbeatTimer.unref) heartbeatTimer.unref();
}
// Cleanup function — synchronous to ensure consistent behavior whether called
// from signal handlers, heartbeat interval, or awaited in async context
function cleanup() {
if (heartbeatTimer) clearInterval(heartbeatTimer);
logger.info('SYSTEM', 'MCP server shutting down');
process.exit(0);
}
// Register cleanup handlers for graceful shutdown
process.on('SIGTERM', cleanup);
process.on('SIGINT', cleanup);
// Start the server
async function main() {
// Start the MCP server
const transport = new StdioServerTransport();
await server.connect(transport);
logger.info('SYSTEM', 'Claude-mem search server started');
// Start parent heartbeat to detect orphaned MCP servers
startParentHeartbeat();
// Check Worker availability in background
setTimeout(async () => {
const workerAvailable = await verifyWorkerConnection();
if (!workerAvailable) {
logger.error('SYSTEM', 'Worker not available', undefined, {});
logger.error('SYSTEM', 'Tools will fail until Worker is started');
logger.error('SYSTEM', 'Start Worker with: npm run worker:restart');
} else {
logger.info('SYSTEM', 'Worker available', undefined, {});
}
}, 0);
}
main().catch((error) => {
logger.error('SYSTEM', 'Fatal error', undefined, error);
// Exit gracefully: Windows Terminal won't keep tab open on exit 0
// The wrapper/plugin will handle restart logic if needed
process.exit(0);
});