feat: semantic context injection via Chroma on UserPromptSubmit (#1568)

* feat: semantic context injection via Chroma on every UserPromptSubmit

On each prompt, queries ChromaDB for the top-N most relevant past
observations and injects them as additionalContext. Replaces the
recency-based "last N observations" approach with relevance-based
semantic search.

Changes:
- session-init.ts: After session init, query /api/context/semantic
  with user's prompt text. If results found, return as
  hookSpecificOutput with hookEventName 'UserPromptSubmit'.
- SearchRoutes.ts: New GET /api/context/semantic endpoint that queries
  SearchManager with format='json' and formats results as markdown.
- SettingsDefaultsManager.ts: New settings CLAUDE_MEM_SEMANTIC_INJECT
  (default: true) and CLAUDE_MEM_SEMANTIC_INJECT_LIMIT (default: 5).

Key behaviors:
- Fires on every UserPromptSubmit (not just SessionStart)
- Minimum prompt length: 20 chars (skips "ok", "yes", etc.)
- Skips media-only prompts
- Graceful degradation: if worker/Chroma unavailable, no injection
- Survives /clear: re-injects on next prompt (not session-bound)
- Uses workerHttpRequest (v10.6.3 API, not raw fetch)

Production data (23 days, 3,400+ observations):
- Before: 8 most recent observations (often irrelevant to current topic)
- After: 5 most relevant observations (semantic match)
- Token cost: ~1800 → ~800-1200 per injection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address CodeRabbit review on PR #1568

- session-init: don't skip semantic injection when contextInjected=true
  (only skip agent re-init, semantic lookup must run every prompt)
- session-init: normalize SEMANTIC_INJECT toggle via String().toLowerCase()
- semantic endpoint: change from GET to POST to avoid URL-length limits
  and prompt exposure in access logs. Handler accepts both body and query
  for backwards compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Alessandro Costa <alessandro@claudio.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Alessandro Costa
2026-04-04 19:16:46 -03:00
committed by GitHub
parent 64cce2bf10
commit 876cc4d837
3 changed files with 105 additions and 5 deletions
+6
View File
@@ -54,6 +54,9 @@ export interface SettingsDefaults {
// Exclusion Settings
CLAUDE_MEM_EXCLUDED_PROJECTS: string; // Comma-separated glob patterns for excluded project paths
CLAUDE_MEM_FOLDER_MD_EXCLUDE: string; // JSON array of folder paths to exclude from CLAUDE.md generation
// Semantic Context Injection (per-prompt via Chroma)
CLAUDE_MEM_SEMANTIC_INJECT: string; // 'true' | 'false' - inject relevant observations on each prompt
CLAUDE_MEM_SEMANTIC_INJECT_LIMIT: string; // Max observations to inject per prompt
// Chroma Vector Database Configuration
CLAUDE_MEM_CHROMA_ENABLED: string; // 'true' | 'false' - set to 'false' for SQLite-only mode
CLAUDE_MEM_CHROMA_MODE: string; // 'local' | 'remote'
@@ -113,6 +116,9 @@ export class SettingsDefaultsManager {
// Exclusion Settings
CLAUDE_MEM_EXCLUDED_PROJECTS: '', // Comma-separated glob patterns for excluded project paths
CLAUDE_MEM_FOLDER_MD_EXCLUDE: '[]', // JSON array of folder paths to exclude from CLAUDE.md generation
// Semantic Context Injection (per-prompt via Chroma vector search)
CLAUDE_MEM_SEMANTIC_INJECT: 'true', // Inject relevant past observations on every UserPromptSubmit
CLAUDE_MEM_SEMANTIC_INJECT_LIMIT: '5', // Top-N most relevant observations to inject per prompt
// Chroma Vector Database Configuration
CLAUDE_MEM_CHROMA_ENABLED: 'true', // Set to 'false' to disable Chroma and use SQLite-only search
CLAUDE_MEM_CHROMA_MODE: 'local', // 'local' uses persistent chroma-mcp via uvx, 'remote' connects to existing server