Core implementation: - Added Chroma MCP client integration to search-server.ts - Implemented queryChroma() helper with Python dict parsing - Added VECTOR_DB_DIR constant to paths.ts - Added SessionStore.getObservationsByIds() method Search handlers updated: - search_observations: Semantic-first with 90-day temporal filter - find_by_concept/type/file: Metadata-first, semantic-enhanced ranking - All handlers fall back to FTS5 if Chroma unavailable Technical details: - Direct MCP client usage (no abstractions) - Regex parsing of Chroma Python dict responses - Semantic ranking preserved in final results - Graceful degradation to FTS5-only search 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
6.1 KiB
Prompt for Next Session: Hybrid Search Implementation
Copy this entire prompt into a new Claude Code session to continue the hybrid search feature implementation.
Context
I'm working on the claude-mem project (persistent memory system for Claude Code). I have an experimental branch experiment/chroma-mcp that attempted to add semantic search via ChromaDB, but it has implementation issues and was done in the wrong order.
Current Status:
- ✅ Experiment validated: Semantic search (Chroma) + temporal filtering (SQLite) works
- ✅ Chroma collection
cm__claude-memhas 2,800+ documents synced - ✅ Search quality tests show semantic search provides value
- ❌ Production implementation has issues (dead code, uncommitted fixes, wrong process)
- ✅ Feature plan written and ready to execute
Your Task:
Follow the feature implementation plan in FEATURE_PLAN_HYBRID_SEARCH.md to implement hybrid search correctly from the ground up.
Immediate Actions
-
Read the feature plan:
Read: /Users/alexnewman/Scripts/claude-mem/FEATURE_PLAN_HYBRID_SEARCH.md -
Understand the experiment results:
- The experiment scripts work correctly
- Chroma semantic search is functional
- We just need to implement it properly in production
-
Execute Phase 1 of the plan:
- Create new
feature/hybrid-searchbranch frommain - Port working experiment scripts from
experiment/chroma-mcp - Clean up any dead code references
- Create new
Key Principles for This Implementation
- Start clean: New branch from
main, no baggage from failed attempt - No abstractions: Direct MCP client usage, no ChromaOrchestrator wrapper
- Validate at each step: Don't commit until you've tested it works
- Proper parsing: Chroma MCP returns Python dicts, not JSON - use regex parsing
- Temporal boundaries: 90-day filter prevents stale semantic matches
Files You'll Need to Work With
Core Implementation:
src/servers/search-server.ts- Add hybrid search workflowssrc/services/sync/ChromaSync.ts- NEW: Auto-sync observations to Chromasrc/services/worker-service.ts- Integrate auto-syncsrc/shared/paths.ts- Add VECTOR_DB_DIR constant
Experiment Files (keep these, they work):
experiment/chroma-sync-experiment.ts- Manual sync toolexperiment/chroma-search-test.ts- Search quality validator
Files to DELETE (dead code from failed attempt):
src/services/chroma/ChromaOrchestrator.ts- Broken wrapper, never usedtest-chroma-connection.ts- Uses broken ChromaOrchestratorplugin/scripts/search-server.cjs- Stale CommonJS build
Validation Checklist
Before committing any code, verify:
# 1. Build succeeds
npm run build
# 2. Sync works
npx tsx experiment/chroma-sync-experiment.ts
# 3. Search works
npx tsx experiment/chroma-search-test.ts
# 4. MCP server starts
node plugin/scripts/search-server.js
# (Ctrl+C to stop)
# 5. No dead code
grep -r "ChromaOrchestrator" src/ # Should return nothing
# 6. No stale builds
ls plugin/scripts/search-server.cjs # Should not exist
# 7. Git status clean
git status # No uncommitted changes to production files
Implementation Workflow (from Phase 3 of plan)
Step 1: Add queryChroma Helper
In src/servers/search-server.ts, add a helper function that:
- Takes:
query: string, limit: number, whereFilter?: object - Calls:
chromaClient.callTool({ name: 'chroma_query_documents', ... }) - Parses: Python dict response with regex (see lines 256-318 in current branch for example)
- Returns:
{ ids: number[], distances: number[], metadatas: any[] }
Step 2: Initialize Chroma Client
In main() function:
const chromaTransport = new StdioClientTransport({
command: 'uvx',
args: ['chroma-mcp', '--client-type', 'persistent', '--data-dir', VECTOR_DB_DIR]
});
chromaClient = new Client({ name: 'claude-mem-search-chroma-client', version: '1.0.0' }, { capabilities: {} });
await chromaClient.connect(chromaTransport);
Step 3: Update search_observations Handler
Replace FTS5 keyword search with:
- Chroma semantic search (top 100)
- Filter by recency (90 days)
- Hydrate from SQLite in temporal order
- Return results
Step 4: Update Metadata Search Handlers
For find_by_concept, find_by_type, find_by_file:
- SQLite metadata filter first
- Chroma semantic ranking second
- Preserve semantic rank order in results
Expected Timeline
- Phase 1 (Clean Start): 15 minutes
- Phase 2 (Architecture Review): Already done, read the plan
- Phase 3 (Implementation): 2-3 hours
- Phase 4 (Validation): 1 hour
- Phase 5 (Documentation): 1 hour
- Phase 6 (Deployment): 30 minutes
Total: ~5-6 hours
Questions to Ask Me
If you encounter any issues:
- "The Chroma MCP client isn't connecting" → Check if
uvx chroma-mcpis available - "Parsing errors from Chroma responses" → Show me the response format, I'll help fix regex
- "Not sure about the search workflow logic" → Reference Phase 2.2 in the plan
- "Should I commit now?" → Only if validation checklist passes
- "Merge to main or PR?" → I'll decide, just get to Phase 6 first
Success Criteria
Don't merge until ALL of these are true:
- ✅ Sync experiment completes without errors
- ✅ Search test shows Chroma returning relevant results
- ✅ MCP server starts and responds to queries
- ✅ Fallback to FTS5 works if Chroma unavailable
- ✅ No breaking changes to MCP tool interfaces
- ✅ Documentation updated (CLAUDE.md + release notes)
- ✅ No uncommitted changes in git status
- ✅ No dead code (ChromaOrchestrator removed)
- ✅ No stale build artifacts (.cjs files deleted)
Start Here
1. Read the feature plan:
Read: /Users/alexnewman/Scripts/claude-mem/FEATURE_PLAN_HYBRID_SEARCH.md
2. Create the feature branch:
Bash: git checkout main && git pull && git checkout -b feature/hybrid-search
3. Begin Phase 1 of the plan (porting experiment scripts)
4. Work through each phase systematically, validating at each step
5. Ask me questions if anything is unclear
Let's build this correctly, from the ground up. Take your time and validate at each step.