feat: semantic context injection via Chroma on UserPromptSubmit (#1568)

* feat: semantic context injection via Chroma on every UserPromptSubmit

On each prompt, queries ChromaDB for the top-N most relevant past
observations and injects them as additionalContext. Replaces the
recency-based "last N observations" approach with relevance-based
semantic search.

Changes:
- session-init.ts: After session init, query /api/context/semantic
  with user's prompt text. If results found, return as
  hookSpecificOutput with hookEventName 'UserPromptSubmit'.
- SearchRoutes.ts: New GET /api/context/semantic endpoint that queries
  SearchManager with format='json' and formats results as markdown.
- SettingsDefaultsManager.ts: New settings CLAUDE_MEM_SEMANTIC_INJECT
  (default: true) and CLAUDE_MEM_SEMANTIC_INJECT_LIMIT (default: 5).

Key behaviors:
- Fires on every UserPromptSubmit (not just SessionStart)
- Minimum prompt length: 20 chars (skips "ok", "yes", etc.)
- Skips media-only prompts
- Graceful degradation: if worker/Chroma unavailable, no injection
- Survives /clear: re-injects on next prompt (not session-bound)
- Uses workerHttpRequest (v10.6.3 API, not raw fetch)

Production data (23 days, 3,400+ observations):
- Before: 8 most recent observations (often irrelevant to current topic)
- After: 5 most relevant observations (semantic match)
- Token cost: ~1800 → ~800-1200 per injection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address CodeRabbit review on PR #1568

- session-init: don't skip semantic injection when contextInjected=true
  (only skip agent re-init, semantic lookup must run every prompt)
- session-init: normalize SEMANTIC_INJECT toggle via String().toLowerCase()
- semantic endpoint: change from GET to POST to avoid URL-length limits
  and prompt exposure in access logs. Handler accepts both body and query
  for backwards compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Alessandro Costa <alessandro@claudio.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Alessandro Costa
2026-04-04 19:16:46 -03:00
committed by GitHub
parent 64cce2bf10
commit 876cc4d837
3 changed files with 105 additions and 5 deletions
@@ -38,6 +38,7 @@ export class SearchRoutes extends BaseRouteHandler {
app.get('/api/context/timeline', this.handleGetContextTimeline.bind(this));
app.get('/api/context/preview', this.handleContextPreview.bind(this));
app.get('/api/context/inject', this.handleContextInject.bind(this));
app.post('/api/context/semantic', this.handleSemanticContext.bind(this));
// Timeline and help endpoints
app.get('/api/timeline/by-query', this.handleGetTimelineByQuery.bind(this));
@@ -246,6 +247,54 @@ export class SearchRoutes extends BaseRouteHandler {
res.send(contextText);
});
/**
* Semantic context search for per-prompt injection
* GET /api/context/semantic?q=<prompt>&project=<project>&limit=5
*
* Queries Chroma for observations semantically similar to the user's prompt.
* Returns compact markdown for injection as additionalContext.
*/
private handleSemanticContext = this.wrapHandler(async (req: Request, res: Response): Promise<void> => {
const query = (req.body?.q || req.query.q) as string;
const project = (req.body?.project || req.query.project) as string;
const limit = parseInt(String(req.body?.limit || req.query.limit || '5'), 10);
if (!query || query.length < 20) {
res.json({ context: '', count: 0 });
return;
}
try {
const result = await this.searchManager.search({
query,
type: 'observations',
project,
limit: String(limit),
format: 'json'
});
const observations = (result as any)?.observations || [];
if (!observations.length) {
res.json({ context: '', count: 0 });
return;
}
// Format as compact markdown for context injection
const lines: string[] = ['## Relevant Past Work (semantic match)\n'];
for (const obs of observations.slice(0, limit)) {
const date = obs.created_at?.slice(0, 10) || '';
lines.push(`### ${obs.title || 'Observation'} (${date})`);
if (obs.narrative) lines.push(obs.narrative);
lines.push('');
}
res.json({ context: lines.join('\n'), count: observations.length });
} catch (error) {
logger.error('SEARCH', 'Semantic context query failed', {}, error as Error);
res.json({ context: '', count: 0 });
}
});
/**
* Get timeline by query (search first, then get timeline around best match)
* GET /api/timeline/by-query?query=...&mode=auto&depth_before=10&depth_after=10