feat: disable subagent summaries, label subagent observations (#2073)
* feat: disable subagent summaries and label subagent observations Detect Claude Code subagent hook context via `agent_id`/`agent_type` on stdin, short-circuit the Stop-hook summary path when present, and thread the subagent identity end-to-end onto observation rows (new `agent_type` and `agent_id` columns, migration 010 at version 27). Main-session rows remain NULL; content-hash dedup is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address PR #2073 review feedback - Narrow summarize subagent guard to agentId only so --agent-started main sessions still own their summary (agentType alone is main-session). - Remove now-dead agentId/agentType spreads from the summarize POST body. - Always overwrite pendingAgentId/pendingAgentType in SDK/Gemini/OpenRouter agents (clears stale subagent identity on main-session messages after a subagent message in the same batch). - Add idx_observations_agent_id index in migration 010 + the mirror migration in SessionStore + the runner. - Replace console.log in migration010 with logger.debug. - Update summarize test: agentType alone no longer short-circuits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address CodeRabbit + claude-review iteration 4 feedback - SessionRoutes.handleSummarizeByClaudeId: narrow worker-side guard to agentId only (matches hook-side). agentType alone = --agent main session, which still owns its summary. - ResponseProcessor: wrap storeObservations in try/finally so pendingAgentId/Type clear even if storage throws. Prevents stale subagent identity from leaking into the next batch on error. - SessionStore.importObservation + bulk.importObservation: persist agent_type/agent_id so backup/import round-trips preserve subagent attribution. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * polish: claude-review iteration 5 cleanup - Use ?? not || for nullable subagent fields in PendingMessageStore (prevents treating empty string as null). - Simplify observation.ts body spread — include fields unconditionally; JSON.stringify drops undefined anyway. - Narrow any[] to Array<{ name: string }> in migration010 column checks. - Add trailing newline to migrations.ts. - Document in observations/store.ts why the dedup hash intentionally excludes agent fields. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * polish: claude-review iteration 7 feedback - claude-code adapter: add 128-char safety cap on agent_id/agent_type so a malformed Claude Code payload cannot balloon DB rows. Empty strings now also treated as absent. - migration010: state-aware debug log lists only columns actually added; idempotent re-runs log "already present; ensured indexes". - Add 3 adapter tests covering the length cap boundary and empty-string rejection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * perf: skip subagent summary before worker bootstrap Move the agentId short-circuit above ensureWorkerRunning() so a Stop hook fired inside a subagent does not trigger worker startup just to return early. Addresses CodeRabbit nit on summarize.ts:36-47. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -0,0 +1,120 @@
|
||||
/**
|
||||
* Tests for Claude Code adapter subagent field extraction.
|
||||
*
|
||||
* Validates that normalizeInput picks up the `agent_id` / `agent_type`
|
||||
* fields from Claude Code hook stdin and that the type guard rejects
|
||||
* non-string values. These fields are the discriminator for subagent
|
||||
* context; they are undefined in main-session payloads.
|
||||
*
|
||||
* Sources:
|
||||
* - Adapter: src/cli/adapters/claude-code.ts
|
||||
* - Types: src/cli/types.ts
|
||||
*/
|
||||
import { describe, it, expect } from 'bun:test';
|
||||
import { claudeCodeAdapter } from '../../../src/cli/adapters/claude-code.js';
|
||||
|
||||
describe('claudeCodeAdapter.normalizeInput — subagent fields', () => {
|
||||
it('extracts agentId and agentType when both are present', () => {
|
||||
const normalized = claudeCodeAdapter.normalizeInput({
|
||||
session_id: 's1',
|
||||
cwd: '/tmp',
|
||||
agent_id: 'agent-abc',
|
||||
agent_type: 'Explore',
|
||||
});
|
||||
|
||||
expect(normalized.sessionId).toBe('s1');
|
||||
expect(normalized.cwd).toBe('/tmp');
|
||||
expect(normalized.agentId).toBe('agent-abc');
|
||||
expect(normalized.agentType).toBe('Explore');
|
||||
});
|
||||
|
||||
it('leaves agentId and agentType undefined when fields are absent (main-session payload)', () => {
|
||||
const normalized = claudeCodeAdapter.normalizeInput({
|
||||
session_id: 's1',
|
||||
cwd: '/tmp',
|
||||
});
|
||||
|
||||
expect(normalized.sessionId).toBe('s1');
|
||||
expect(normalized.agentId).toBeUndefined();
|
||||
expect(normalized.agentType).toBeUndefined();
|
||||
});
|
||||
|
||||
it('rejects non-string agent_id via type guard (returns undefined)', () => {
|
||||
const normalized = claudeCodeAdapter.normalizeInput({
|
||||
session_id: 's1',
|
||||
cwd: '/tmp',
|
||||
agent_id: 42,
|
||||
});
|
||||
|
||||
expect(normalized.agentId).toBeUndefined();
|
||||
});
|
||||
|
||||
it('rejects non-string agent_type via type guard (returns undefined)', () => {
|
||||
const normalized = claudeCodeAdapter.normalizeInput({
|
||||
session_id: 's1',
|
||||
cwd: '/tmp',
|
||||
agent_type: { kind: 'Explore' },
|
||||
});
|
||||
|
||||
expect(normalized.agentType).toBeUndefined();
|
||||
});
|
||||
|
||||
it('extracts agentId alone even when agent_type is missing', () => {
|
||||
const normalized = claudeCodeAdapter.normalizeInput({
|
||||
session_id: 's1',
|
||||
cwd: '/tmp',
|
||||
agent_id: 'agent-only',
|
||||
});
|
||||
|
||||
expect(normalized.agentId).toBe('agent-only');
|
||||
expect(normalized.agentType).toBeUndefined();
|
||||
});
|
||||
|
||||
it('handles null/undefined raw input gracefully (SessionStart hook)', () => {
|
||||
const normalizedNull = claudeCodeAdapter.normalizeInput(null);
|
||||
const normalizedUndef = claudeCodeAdapter.normalizeInput(undefined);
|
||||
|
||||
expect(normalizedNull.agentId).toBeUndefined();
|
||||
expect(normalizedNull.agentType).toBeUndefined();
|
||||
expect(normalizedUndef.agentId).toBeUndefined();
|
||||
expect(normalizedUndef.agentType).toBeUndefined();
|
||||
});
|
||||
|
||||
it('drops agent fields that exceed the 128-char safety cap', () => {
|
||||
const oversized = 'a'.repeat(129);
|
||||
const normalized = claudeCodeAdapter.normalizeInput({
|
||||
session_id: 's1',
|
||||
cwd: '/tmp',
|
||||
agent_id: oversized,
|
||||
agent_type: oversized,
|
||||
});
|
||||
|
||||
expect(normalized.agentId).toBeUndefined();
|
||||
expect(normalized.agentType).toBeUndefined();
|
||||
});
|
||||
|
||||
it('keeps agent fields exactly at the 128-char boundary', () => {
|
||||
const atLimit = 'a'.repeat(128);
|
||||
const normalized = claudeCodeAdapter.normalizeInput({
|
||||
session_id: 's1',
|
||||
cwd: '/tmp',
|
||||
agent_id: atLimit,
|
||||
agent_type: atLimit,
|
||||
});
|
||||
|
||||
expect(normalized.agentId).toBe(atLimit);
|
||||
expect(normalized.agentType).toBe(atLimit);
|
||||
});
|
||||
|
||||
it('drops empty-string agent fields (treat as absent)', () => {
|
||||
const normalized = claudeCodeAdapter.normalizeInput({
|
||||
session_id: 's1',
|
||||
cwd: '/tmp',
|
||||
agent_id: '',
|
||||
agent_type: '',
|
||||
});
|
||||
|
||||
expect(normalized.agentId).toBeUndefined();
|
||||
expect(normalized.agentType).toBeUndefined();
|
||||
});
|
||||
});
|
||||
@@ -0,0 +1,143 @@
|
||||
/**
|
||||
* Tests for subagent-context short-circuit in summarizeHandler.
|
||||
*
|
||||
* Validates that when the Stop hook fires inside a Claude Code subagent
|
||||
* (identified by `agentId` or `agentType` on NormalizedHookInput), the
|
||||
* summarize handler exits before calling the worker — subagents must not
|
||||
* own the session summary.
|
||||
*
|
||||
* Sources:
|
||||
* - Handler: src/cli/handlers/summarize.ts
|
||||
* - Mock pattern: tests/hooks/context-reinjection-guard.test.ts
|
||||
*/
|
||||
import { describe, it, expect, beforeEach, afterEach, spyOn, mock } from 'bun:test';
|
||||
import { homedir } from 'os';
|
||||
import { join } from 'path';
|
||||
|
||||
// Mock modules that touch the filesystem / network at import time.
|
||||
// MUST be declared before the handler is imported.
|
||||
mock.module('../../../src/shared/SettingsDefaultsManager.js', () => ({
|
||||
SettingsDefaultsManager: {
|
||||
get: (key: string) => {
|
||||
if (key === 'CLAUDE_MEM_DATA_DIR') return join(homedir(), '.claude-mem');
|
||||
return '';
|
||||
},
|
||||
getInt: () => 0,
|
||||
loadFromFile: () => ({ CLAUDE_MEM_EXCLUDED_PROJECTS: [] }),
|
||||
},
|
||||
}));
|
||||
|
||||
// workerHttpRequest is the only worker entry point we must NOT call in
|
||||
// subagent context. It throws so we can assert "never called" by proving
|
||||
// the handler returns success anyway.
|
||||
const workerCallLog: Array<{ path: string; options: any }> = [];
|
||||
mock.module('../../../src/shared/worker-utils.js', () => ({
|
||||
ensureWorkerRunning: () => Promise.resolve(true),
|
||||
getWorkerPort: () => 37777,
|
||||
workerHttpRequest: (apiPath: string, options?: any) => {
|
||||
workerCallLog.push({ path: apiPath, options });
|
||||
throw new Error(
|
||||
`workerHttpRequest MUST NOT be called in subagent context (called with ${apiPath})`
|
||||
);
|
||||
},
|
||||
}));
|
||||
|
||||
// Suppress logger during tests
|
||||
import { logger } from '../../../src/utils/logger.js';
|
||||
|
||||
let loggerSpies: ReturnType<typeof spyOn>[] = [];
|
||||
|
||||
beforeEach(() => {
|
||||
workerCallLog.length = 0;
|
||||
loggerSpies = [
|
||||
spyOn(logger, 'info').mockImplementation(() => {}),
|
||||
spyOn(logger, 'debug').mockImplementation(() => {}),
|
||||
spyOn(logger, 'warn').mockImplementation(() => {}),
|
||||
spyOn(logger, 'error').mockImplementation(() => {}),
|
||||
spyOn(logger, 'failure').mockImplementation(() => {}),
|
||||
spyOn(logger, 'dataIn').mockImplementation(() => {}),
|
||||
];
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
loggerSpies.forEach(spy => spy.mockRestore());
|
||||
});
|
||||
|
||||
describe('summarizeHandler — subagent short-circuit', () => {
|
||||
it('skips summary and returns SUCCESS when agentId is set', async () => {
|
||||
const { summarizeHandler } = await import('../../../src/cli/handlers/summarize.js');
|
||||
|
||||
const result = await summarizeHandler.execute({
|
||||
sessionId: 'session-abc',
|
||||
cwd: '/tmp',
|
||||
platform: 'claude-code',
|
||||
transcriptPath: '/tmp/does-not-matter.jsonl',
|
||||
agentId: 'agent-abc',
|
||||
});
|
||||
|
||||
expect(result.continue).toBe(true);
|
||||
expect(result.suppressOutput).toBe(true);
|
||||
expect(result.exitCode).toBe(0);
|
||||
// Guard fires BEFORE any worker HTTP request. If workerHttpRequest were
|
||||
// called, our mock would have thrown — reaching this expect proves it.
|
||||
expect(workerCallLog.length).toBe(0);
|
||||
});
|
||||
|
||||
it('does NOT skip when only agentType is set (--agent main session still owns its summary)', async () => {
|
||||
// agent_type without agent_id is how Claude Code signals a main session started
|
||||
// with --agent. These are main sessions, not Task-spawned subagents, so the
|
||||
// summary path must proceed. Here the transcript path is missing so the handler
|
||||
// falls through to the existing no-transcriptPath return — the key assertion is
|
||||
// that the subagent guard did NOT short-circuit (handler reached the normal path).
|
||||
const { summarizeHandler } = await import('../../../src/cli/handlers/summarize.js');
|
||||
|
||||
const result = await summarizeHandler.execute({
|
||||
sessionId: 'session-def',
|
||||
cwd: '/tmp',
|
||||
platform: 'claude-code',
|
||||
agentType: 'Explore',
|
||||
// transcriptPath intentionally omitted
|
||||
});
|
||||
|
||||
expect(result.continue).toBe(true);
|
||||
expect(result.exitCode).toBe(0);
|
||||
expect(workerCallLog.length).toBe(0);
|
||||
});
|
||||
|
||||
it('skips summary when both agentId and agentType are set', async () => {
|
||||
const { summarizeHandler } = await import('../../../src/cli/handlers/summarize.js');
|
||||
|
||||
const result = await summarizeHandler.execute({
|
||||
sessionId: 'session-both',
|
||||
cwd: '/tmp',
|
||||
platform: 'claude-code',
|
||||
transcriptPath: '/tmp/does-not-matter.jsonl',
|
||||
agentId: 'agent-xyz',
|
||||
agentType: 'Plan',
|
||||
});
|
||||
|
||||
expect(result.continue).toBe(true);
|
||||
expect(result.suppressOutput).toBe(true);
|
||||
expect(result.exitCode).toBe(0);
|
||||
expect(workerCallLog.length).toBe(0);
|
||||
});
|
||||
|
||||
it('falls through to existing no-transcriptPath guard in main-session context', async () => {
|
||||
// Neither agentId nor agentType → NOT a subagent. Handler should
|
||||
// proceed past the subagent guard and hit the existing
|
||||
// "no transcriptPath" early return. Worker must still not be called.
|
||||
const { summarizeHandler } = await import('../../../src/cli/handlers/summarize.js');
|
||||
|
||||
const result = await summarizeHandler.execute({
|
||||
sessionId: 'session-main',
|
||||
cwd: '/tmp',
|
||||
platform: 'claude-code',
|
||||
// transcriptPath intentionally omitted
|
||||
});
|
||||
|
||||
expect(result.continue).toBe(true);
|
||||
expect(result.suppressOutput).toBe(true);
|
||||
expect(result.exitCode).toBe(0);
|
||||
expect(workerCallLog.length).toBe(0);
|
||||
});
|
||||
});
|
||||
@@ -0,0 +1,161 @@
|
||||
/**
|
||||
* Tests for storeObservation subagent labeling (agent_type, agent_id).
|
||||
*
|
||||
* Validates:
|
||||
* 1. Rows carry agent_type / agent_id when set on ObservationInput.
|
||||
* 2. Omitted subagent fields store as NULL (main-session rows).
|
||||
* 3. Dedup is intentionally UNAFFECTED by agent_type — the content hash
|
||||
* covers (memory_session_id, title, narrative) only, so two observations
|
||||
* with the same semantic identity but different originating subagents
|
||||
* dedup to the same row. This preserves stable observation identity
|
||||
* across main-session and subagent contexts and is the documented
|
||||
* intended behavior per Phase 4 anti-pattern guard in the plan.
|
||||
*
|
||||
* Sources:
|
||||
* - Store: src/services/sqlite/observations/store.ts
|
||||
* - Types: src/services/sqlite/observations/types.ts
|
||||
* - Test pattern: tests/sqlite/observations.test.ts
|
||||
*/
|
||||
import { describe, it, expect, beforeEach, afterEach } from 'bun:test';
|
||||
import { ClaudeMemDatabase } from '../../../../src/services/sqlite/Database.js';
|
||||
import { storeObservation } from '../../../../src/services/sqlite/Observations.js';
|
||||
import {
|
||||
createSDKSession,
|
||||
updateMemorySessionId,
|
||||
} from '../../../../src/services/sqlite/Sessions.js';
|
||||
import type { ObservationInput } from '../../../../src/services/sqlite/observations/types.js';
|
||||
import type { Database } from 'bun:sqlite';
|
||||
|
||||
describe('storeObservation — subagent labeling', () => {
|
||||
let db: Database;
|
||||
|
||||
beforeEach(() => {
|
||||
db = new ClaudeMemDatabase(':memory:').db;
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
db.close();
|
||||
});
|
||||
|
||||
function createObservationInput(overrides: Partial<ObservationInput> = {}): ObservationInput {
|
||||
return {
|
||||
type: 'discovery',
|
||||
title: 'Test Observation',
|
||||
subtitle: 'Subtitle',
|
||||
facts: ['fact1'],
|
||||
narrative: 'Narrative body',
|
||||
concepts: ['concept1'],
|
||||
files_read: ['/path/to/file1.ts'],
|
||||
files_modified: [],
|
||||
...overrides,
|
||||
};
|
||||
}
|
||||
|
||||
function createSessionWithMemoryId(
|
||||
contentSessionId: string,
|
||||
memorySessionId: string,
|
||||
project = 'test-project'
|
||||
): string {
|
||||
const sessionId = createSDKSession(db, contentSessionId, project, 'initial prompt');
|
||||
updateMemorySessionId(db, sessionId, memorySessionId);
|
||||
return memorySessionId;
|
||||
}
|
||||
|
||||
it('stores agent_type and agent_id when provided', () => {
|
||||
const memorySessionId = createSessionWithMemoryId('content-sub-1', 'mem-sub-1');
|
||||
const input = createObservationInput({
|
||||
agent_type: 'Explore',
|
||||
agent_id: 'agent-abc',
|
||||
});
|
||||
|
||||
const result = storeObservation(db, memorySessionId, 'test-project', input);
|
||||
|
||||
const row = db
|
||||
.prepare('SELECT agent_type, agent_id FROM observations WHERE id = ?')
|
||||
.get(result.id) as { agent_type: string | null; agent_id: string | null };
|
||||
|
||||
expect(row).not.toBeNull();
|
||||
expect(row.agent_type).toBe('Explore');
|
||||
expect(row.agent_id).toBe('agent-abc');
|
||||
});
|
||||
|
||||
it('stores NULL for agent_type and agent_id when fields are omitted (main-session row)', () => {
|
||||
const memorySessionId = createSessionWithMemoryId('content-main-1', 'mem-main-1');
|
||||
const input = createObservationInput();
|
||||
// input has no agent_type / agent_id
|
||||
|
||||
const result = storeObservation(db, memorySessionId, 'test-project', input);
|
||||
|
||||
const row = db
|
||||
.prepare('SELECT agent_type, agent_id FROM observations WHERE id = ?')
|
||||
.get(result.id) as { agent_type: string | null; agent_id: string | null };
|
||||
|
||||
expect(row).not.toBeNull();
|
||||
expect(row.agent_type).toBeNull();
|
||||
expect(row.agent_id).toBeNull();
|
||||
});
|
||||
|
||||
it('stores agent_type only when agent_id is absent', () => {
|
||||
const memorySessionId = createSessionWithMemoryId('content-partial-1', 'mem-partial-1');
|
||||
const input = createObservationInput({
|
||||
agent_type: 'Plan',
|
||||
// agent_id intentionally omitted
|
||||
});
|
||||
|
||||
const result = storeObservation(db, memorySessionId, 'test-project', input);
|
||||
|
||||
const row = db
|
||||
.prepare('SELECT agent_type, agent_id FROM observations WHERE id = ?')
|
||||
.get(result.id) as { agent_type: string | null; agent_id: string | null };
|
||||
|
||||
expect(row.agent_type).toBe('Plan');
|
||||
expect(row.agent_id).toBeNull();
|
||||
});
|
||||
|
||||
it('dedup is NOT affected by agent fields — second insert with different agent_type returns existing id', () => {
|
||||
// INTENDED BEHAVIOR (per plan Phase 4 anti-pattern guard):
|
||||
// The content hash covers (memory_session_id, title, narrative) only.
|
||||
// Two observations with identical title + narrative but different
|
||||
// agent_type must dedup to the same row so observation identity is
|
||||
// stable across main-session and subagent contexts.
|
||||
const memorySessionId = createSessionWithMemoryId('content-dedup-1', 'mem-dedup-1');
|
||||
|
||||
const first = storeObservation(
|
||||
db,
|
||||
memorySessionId,
|
||||
'test-project',
|
||||
createObservationInput({
|
||||
title: 'Identical Title',
|
||||
narrative: 'Identical narrative body.',
|
||||
agent_type: 'Explore',
|
||||
agent_id: 'agent-first',
|
||||
})
|
||||
);
|
||||
|
||||
const second = storeObservation(
|
||||
db,
|
||||
memorySessionId,
|
||||
'test-project',
|
||||
createObservationInput({
|
||||
title: 'Identical Title',
|
||||
narrative: 'Identical narrative body.',
|
||||
agent_type: 'Plan',
|
||||
agent_id: 'agent-second',
|
||||
})
|
||||
);
|
||||
|
||||
// Second insert is deduped → same id, no new row, original agent fields preserved.
|
||||
expect(second.id).toBe(first.id);
|
||||
|
||||
const rowCount = db
|
||||
.prepare('SELECT COUNT(*) as n FROM observations WHERE memory_session_id = ?')
|
||||
.get(memorySessionId) as { n: number };
|
||||
expect(rowCount.n).toBe(1);
|
||||
|
||||
const row = db
|
||||
.prepare('SELECT agent_type, agent_id FROM observations WHERE id = ?')
|
||||
.get(first.id) as { agent_type: string | null; agent_id: string | null };
|
||||
expect(row.agent_type).toBe('Explore');
|
||||
expect(row.agent_id).toBe('agent-first');
|
||||
});
|
||||
});
|
||||
Reference in New Issue
Block a user