refactor: Clean up search architecture, remove experimental contextualize endpoint (#133)

* Refactor code structure for improved readability and maintainability * Add test results for search API and related functionalities - Created test result files for various search-related functionalities, including: - test-11-search-server-changes.json - test-12-context-hook-changes.json - test-13-worker-service-changes.json - test-14-patterns.json - test-15-gotchas.json - test-16-discoveries.json - test-17-all-bugfixes.json - test-18-all-features.json - test-19-all-decisions.json - test-20-session-search.json - test-21-prompt-search.json - test-22-decisions-endpoint.json - test-23-changes-endpoint.json - test-24-how-it-works-endpoint.json - test-25-contextualize-endpoint.json - test-26-timeline-around-observation.json - test-27-multi-param-combo.json - test-28-file-type-combo.json - Each test result file captures specific search failures or outcomes, including issues with undefined properties and successful execution of search queries. - Enhanced documentation of search architecture and testing strategies, ensuring compliance with established guidelines and improving overall search functionality. * feat: Enhance unified search API with catch-all parameters and backward compatibility - Implemented a unified search API at /api/search that accepts catch-all parameters for filtering by type, observation type, concepts, and files. - Maintained backward compatibility by keeping granular endpoints functional while routing through the same infrastructure. - Completed comprehensive testing of search capabilities with real-world query scenarios. fix: Address missing debug output in search API query tests - Flushed PM2 logs and executed search queries to verify functionality. - Diagnosed absence of "Raw Chroma" debug messages in worker logs, indicating potential issues with logging or query processing. refactor: Improve build and deployment pipeline for claude-mem plugin - Successfully built and synced all hooks and services to the marketplace directory. - Ensured all dependencies are installed and up-to-date in the deployment location. feat: Implement hybrid search filters with 90-day recency window - Enhanced search server to apply a 90-day recency filter to Chroma results before categorizing by document type. fix: Correct parameter handling in searchUserPrompts method - Added support for filter-only queries and improved dual-path logic for clarity. refactor: Rename FTS5 method to clarify fallback status - Renamed escapeFTS5 to escapeFTS5_fallback_when_chroma_unavailable to indicate its temporary usage. feat: Introduce contextualize tool for comprehensive project overview - Added a new tool to fetch recent observations, sessions, and user prompts, providing a quick project overview. feat: Add semantic shortcut tools for common search patterns - Implemented 'decisions', 'changes', and 'how_it_works' tools for convenient access to frequently searched observation categories. feat: Unified timeline tool supports anchor and query modes - Combined get_context_timeline and get_timeline_by_query into a single interface for timeline exploration. feat: Unified search tool added to MCP server - New tool queries all memory types simultaneously, providing combined chronological results for improved search efficiency. * Refactor search functionality to clarify FTS5 fallback usage - Updated `worker-service.cjs` to replace FTS5 fallback function with a more descriptive name and improved error handling. - Enhanced documentation in `SKILL.md` to specify the unified API endpoint and clarify the behavior of the search engine, including the conditions under which FTS5 is used. - Modified `search-server.ts` to provide clearer logging and descriptions regarding the fallback to FTS5 when UVX/Python is unavailable. - Renamed and updated the `SessionSearch.ts` methods to reflect the conditions for using FTS5, emphasizing the lack of semantic understanding in fallback scenarios. * feat: Add ID-based fetch endpoints and simplify mem-search skill **Problem:** - Search returns IDs but no way to fetch by ID - Skill documentation was bloated with too many options - Claude wasn't using IDs because we didn't tell it how **Solution:** 1. Added three new HTTP endpoints: - GET /api/observation/:id - GET /api/session/:id - GET /api/prompt/:id 2. Completely rewrote SKILL.md: - Stripped complexity down to essentials - Clear 3-step prescriptive workflow: Search → Review IDs → Fetch by ID - Emphasized ID usage: "The IDs are there for a reason - USE THEM" - Removed confusing multi-endpoint documentation - Kept only unified search with filters **Impact:** - Token efficiency: Claude can now fetch full details only for relevant IDs - Clarity: One clear workflow instead of 10+ options to choose from - Usability: IDs are no longer wasted context - they're actionable 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * chore: Move internal docs to private directory Moved POSTMORTEM and planning docs to ./private to exclude from PR reviews. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * refactor: Remove experimental contextualize endpoint - Removed contextualize MCP tool from search-server (saves ~4KB) - Disabled FTS5 fallback paths in SessionSearch (now vector-first) - Cleaned up CLAUDE.md documentation - Removed contextualize-rewrite-plan.md doc Rationale: - Contextualize is better suited as a skill (LLM-powered) than an endpoint - Search API already provides vector search with configurable limits - Created issue #132 to track future contextualize skill implementation Changes: - src/servers/search-server.ts: Removed contextualize tool definition - src/services/sqlite/SessionSearch.ts: Disabled FTS5 fallback, added deprecation warnings - CLAUDE.md: Cleaned up outdated skill documentation - docs/: Removed contextualize plan document 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * refactor: Complete FTS5 cleanup - remove all deprecated search code This completes the FTS5 cleanup work by removing all commented-out FTS5 search code while preserving database tables for backward compatibility. Changes: - Removed 200+ lines of commented FTS5 search code from SessionSearch.ts - Removed deprecated degraded_search_query__when_uvx_unavailable method - Updated all method documentation to clarify vector-first architecture - Updated class documentation to reflect filter-only query support - Updated CLAUDE.md to remove FTS5 search references - Clarified that FTS5 tables exist for backward compatibility only - Updated "Why SQLite FTS5" section to "Why Vector-First Search" Database impact: NONE - FTS5 tables remain intact for existing installations Search architecture: - ChromaDB: All text-based vector search queries - SQLite: Filter-only queries (date ranges, metadata, no query text) - FTS5 tables: Maintained but unused (backward compatibility) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * refactor: Remove all FTS5 fallback execution code from search-server Completes the FTS5 cleanup by removing all fallback execution paths that attempted to use FTS5 when ChromaDB was unavailable. Changes: - Removed all FTS5 fallback code execution paths - When ChromaDB fails or is unavailable, return empty results with helpful error messages - Updated all deprecated tool descriptions (search_observations, search_sessions, search_user_prompts) - Changed error messages to indicate FTS5 fallback has been removed - Added installation instructions for UVX/Python when vector search is unavailable - Updated comments from "hybrid search" to "vector-first search" - Removed ~100 lines of dead FTS5 fallback code Database impact: NONE - FTS5 tables remain intact (backward compatibility) Search behavior when ChromaDB unavailable: - Text queries: Return empty results with error explaining ChromaDB is required - Filter-only queries (no text): Continue to work via direct SQLite 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Address PR 133 review feedback Critical fixes: - Remove contextualize endpoint from worker-service (route + handler) - Fix build script logging to show correct .cjs extension (was .mjs) Documentation improvements: - Add comprehensive FTS5 retention rationale documentation - Include v7.0.0 removal TODO for future cleanup Testing: - Build succeeds with correct output logging - Worker restarts successfully (30th restart) - Contextualize endpoint properly removed (404 response) - Search endpoint verified working This addresses all critical review feedback from PR 133. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>
2025-11-21 18:59:23 -05:00
parent d64939c379
commit c5e68a17c8
42 changed files with 2511 additions and 559 deletions
@@ -0,0 +1,86 @@
+#!/bin/bash
+
+# Comprehensive Search API Test Suite
+# Tests all endpoints and parameter combinations
+
+API_URL="http://localhost:37777"
+RESULTS_DIR="test-results"
+
+echo "🔍 Starting comprehensive search API tests..."
+echo ""
+
+# SEMANTIC QUERIES - Understanding how things work
+echo "📚 Running semantic queries..."
+curl -s "$API_URL/api/search?type=observations&query=worker%20service%20startup&format=full&limit=5&orderBy=relevance" > "$RESULTS_DIR/test-01-worker-service-startup.json"
+curl -s "$API_URL/api/search?type=observations&query=SQLite%20FTS5%20implementation&format=full&limit=5&orderBy=relevance" > "$RESULTS_DIR/test-02-sqlite-fts5-implementation.json"
+curl -s "$API_URL/api/search?type=observations&query=hook%20lifecycle%20flow&format=full&limit=5&orderBy=relevance" > "$RESULTS_DIR/test-03-hook-lifecycle-flow.json"
+curl -s "$API_URL/api/search?type=observations&query=build%20pipeline%20process&format=full&limit=5&orderBy=relevance" > "$RESULTS_DIR/test-04-build-pipeline-process.json"
+echo "✅ Semantic queries complete (4 tests)"
+
+# DECISION QUERIES - Architectural choices
+echo "⚖️  Running decision queries..."
+curl -s "$API_URL/api/search?type=observations&obs_type=decision&query=PM2%20instead%20of%20direct%20process&format=full&limit=5&orderBy=relevance" > "$RESULTS_DIR/test-05-pm2-decision.json"
+curl -s "$API_URL/api/search?type=observations&obs_type=decision&query=search%20architecture%20guidelines&format=full&limit=5&orderBy=relevance" > "$RESULTS_DIR/test-06-search-architecture-decision.json"
+curl -s "$API_URL/api/search?type=observations&obs_type=decision&query=MCP%20as%20DRY%20source&format=full&limit=5&orderBy=relevance" > "$RESULTS_DIR/test-07-mcp-dry-decision.json"
+echo "✅ Decision queries complete (3 tests)"
+
+# TROUBLESHOOTING QUERIES - Finding bugfixes
+echo "🔴 Running troubleshooting queries..."
+curl -s "$API_URL/api/search?type=observations&obs_type=bugfix&query=worker%20service%20debugging&format=full&limit=5&orderBy=relevance" > "$RESULTS_DIR/test-08-worker-debugging.json"
+curl -s "$API_URL/api/search?type=observations&query=hook%20timeout%20problems&format=full&limit=5&orderBy=relevance" > "$RESULTS_DIR/test-09-hook-timeout.json"
+curl -s "$API_URL/api/search?type=observations&query=database%20migration%20issues&format=full&limit=5&orderBy=relevance" > "$RESULTS_DIR/test-10-database-migration.json"
+echo "✅ Troubleshooting queries complete (3 tests)"
+
+# FILE-SPECIFIC QUERIES - Tracking file changes
+echo "📁 Running file-specific queries..."
+curl -s "$API_URL/api/search?type=observations&files=search-server.ts&format=full&limit=5&orderBy=date_desc" > "$RESULTS_DIR/test-11-search-server-changes.json"
+curl -s "$API_URL/api/search?type=observations&files=context-hook&format=full&limit=5&orderBy=date_desc" > "$RESULTS_DIR/test-12-context-hook-changes.json"
+curl -s "$API_URL/api/search?type=observations&files=worker-service&format=full&limit=5&orderBy=date_desc" > "$RESULTS_DIR/test-13-worker-service-changes.json"
+echo "✅ File-specific queries complete (3 tests)"
+
+# CONCEPT-BASED QUERIES - Patterns, gotchas, discoveries
+echo "🏷️  Running concept-based queries..."
+curl -s "$API_URL/api/search?type=observations&concepts=pattern&format=full&limit=5&orderBy=date_desc" > "$RESULTS_DIR/test-14-patterns.json"
+curl -s "$API_URL/api/search?type=observations&concepts=gotcha&format=full&limit=5&orderBy=date_desc" > "$RESULTS_DIR/test-15-gotchas.json"
+curl -s "$API_URL/api/search?type=observations&concepts=discovery&format=full&limit=5&orderBy=date_desc" > "$RESULTS_DIR/test-16-discoveries.json"
+echo "✅ Concept-based queries complete (3 tests)"
+
+# TYPE-FILTERED QUERIES - Bugfixes, features, decisions
+echo "🔖 Running type-filtered queries..."
+curl -s "$API_URL/api/search?type=observations&obs_type=bugfix&format=full&limit=5&orderBy=date_desc" > "$RESULTS_DIR/test-17-all-bugfixes.json"
+curl -s "$API_URL/api/search?type=observations&obs_type=feature&format=full&limit=5&orderBy=date_desc" > "$RESULTS_DIR/test-18-all-features.json"
+curl -s "$API_URL/api/search?type=observations&obs_type=decision&format=full&limit=5&orderBy=date_desc" > "$RESULTS_DIR/test-19-all-decisions.json"
+echo "✅ Type-filtered queries complete (3 tests)"
+
+# SESSION QUERIES - Testing session search
+echo "📝 Running session queries..."
+curl -s "$API_URL/api/search?type=sessions&query=search%20architecture&format=full&limit=5&orderBy=relevance" > "$RESULTS_DIR/test-20-session-search.json"
+echo "✅ Session queries complete (1 test)"
+
+# USER PROMPT QUERIES - Testing prompt search
+echo "💬 Running user prompt queries..."
+curl -s "$API_URL/api/search?type=prompts&query=build%20and%20deploy&format=full&limit=5&orderBy=relevance" > "$RESULTS_DIR/test-21-prompt-search.json"
+echo "✅ User prompt queries complete (1 test)"
+
+# DEDICATED ENDPOINTS - Timeline and semantic shortcuts
+echo "🎯 Running dedicated endpoint tests..."
+curl -s "$API_URL/api/decisions?format=full&limit=5" > "$RESULTS_DIR/test-22-decisions-endpoint.json"
+curl -s "$API_URL/api/changes?format=full&limit=5" > "$RESULTS_DIR/test-23-changes-endpoint.json"
+curl -s "$API_URL/api/how-it-works?format=full&limit=5" > "$RESULTS_DIR/test-24-how-it-works-endpoint.json"
+curl -s "$API_URL/api/contextualize?format=full" > "$RESULTS_DIR/test-25-contextualize-endpoint.json"
+echo "✅ Dedicated endpoint tests complete (4 tests)"
+
+# TIMELINE QUERY - Get context around a specific observation
+echo "⏱️  Running timeline query..."
+curl -s "$API_URL/api/timeline?anchor=10630&depth_before=3&depth_after=3&format=full" > "$RESULTS_DIR/test-26-timeline-around-observation.json"
+echo "✅ Timeline query complete (1 test)"
+
+# MULTI-PARAMETER COMBO - Test complex query combinations
+echo "🎛️  Running multi-parameter combination tests..."
+curl -s "$API_URL/api/search?type=observations&obs_type=decision&concepts=pattern&query=search&format=full&limit=5&orderBy=relevance" > "$RESULTS_DIR/test-27-multi-param-combo.json"
+curl -s "$API_URL/api/search?type=observations&files=search-server&obs_type=feature&format=full&limit=5&orderBy=date_desc" > "$RESULTS_DIR/test-28-file-type-combo.json"
+echo "✅ Multi-parameter tests complete (2 tests)"
+
+echo ""
+echo "✨ All tests complete! 28 total queries executed."
+echo "📊 Results saved to $RESULTS_DIR/"
@@ -0,0 +1 @@
+[{"type":"text","text":"## User Prompt #5\n*Source: claude-mem://user-prompt/2735*\n\nAre you ok? You seem a little excited and it's hard to do this project. Can you tell me what's going on?\n\n---\nDate: 11/17/2025, 9:38:06 PM\n\n---\n\n## User Prompt #2\n*Source: claude-mem://user-prompt/2719*\n\nedit the build script, we need it to always do this.\n\n---\nDate: 11/17/2025, 7:49:17 PM\n\n---\n\n## User Prompt #3\n*Source: claude-mem://user-prompt/2662*\n\nthose built files, the js, the cjs, if the source files don't have merge issues, then these are just way outdated build files.... is that the case?\n\n---\nDate: 11/17/2025, 3:03:13 PM\n\n---\n\n## User Prompt #12\n*Source: claude-mem://user-prompt/2651*\n\nI FUCKING DID BUILD AND SYNC AND DELETE. I would not have told you to fucking check pm2 info had I not done that\n\n---\nDate: 11/17/2025, 2:25:18 PM\n\n---\n\n## User Prompt #7\n*Source: claude-mem://user-prompt/2646*\n\nCan you fix it so that it runs from the marketplace folder so it doesn't break on EVERYONES SYSTEM WHO DOESNT FUCKING BUILD MANUALLY\n\n---\nDate: 11/17/2025, 2:19:29 PM"}]
@@ -0,0 +1 @@
+[{"type":"text","text":"# Project Context: claude-mem\n\nNo activity found for this project."}]
@@ -0,0 +1 @@
+[{"type":"text","text":"# Timeline around anchor: 10630\n**Window:** 3 records before → 3 records after | **Items:** 3 (3 obs, 0 sessions, 0 prompts)\n\n**Legend:** 🎯 session-request | 🔴 bugfix | 🟣 feature | 🔄 refactor | ✅ change | 🔵 discovery | 🧠 decision\n\n### Nov 17, 2025\n\n**General**\n| ID | Time | T | Title | Tokens |\n|----|------|---|-------|--------|\n| #10756 | 11:50 PM | 🔴 | Fixed Incorrect Parameter Array in searchUserPrompts FTS5 Path | ~213 |\n| #10757 | ″ | 🔵 | Unified search handler implements Chroma-first with FTS5 fallback on zero results | ~224 |\n| #10758 | 11:51 PM | ✅ | Build and sync claude-mem plugin to marketplace location | ~189 |\n"}]
				`@@ -0,0 +1 @@`
				[{"type":"text","text":"## User Prompt #5\nSource: claude-mem://user-prompt/2735\n\nAre you ok? You seem a little excited and it's hard to do this project. Can you tell me what's going on?\n\n---\nDate: 11/17/2025, 9:38:06 PM\n\n---\n\n## User Prompt #2\nSource: claude-mem://user-prompt/2719\n\nedit the build script, we need it to always do this.\n\n---\nDate: 11/17/2025, 7:49:17 PM\n\n---\n\n## User Prompt #3\nSource: claude-mem://user-prompt/2662\n\nthose built files, the js, the cjs, if the source files don't have merge issues, then these are just way outdated build files.... is that the case?\n\n---\nDate: 11/17/2025, 3:03:13 PM\n\n---\n\n## User Prompt #12\nSource: claude-mem://user-prompt/2651\n\nI FUCKING DID BUILD AND SYNC AND DELETE. I would not have told you to fucking check pm2 info had I not done that\n\n---\nDate: 11/17/2025, 2:25:18 PM\n\n---\n\n## User Prompt #7\nSource: claude-mem://user-prompt/2646\n\nCan you fix it so that it runs from the marketplace folder so it doesn't break on EVERYONES SYSTEM WHO DOESNT FUCKING BUILD MANUALLY\n\n---\nDate: 11/17/2025, 2:19:29 PM"}]
				`@@ -0,0 +1 @@`
				`[{"type":"text","text":"# Project Context: claude-mem\n\nNo activity found for this project."}]`
				`@@ -0,0 +1 @@`
				[{"type":"text","text":"# Timeline around anchor: 10630\nWindow: 3 records before → 3 records after \| Items: 3 (3 obs, 0 sessions, 0 prompts)\n\nLegend: 🎯 session-request \| 🔴 bugfix \| 🟣 feature \| 🔄 refactor \| ✅ change \| 🔵 discovery \| 🧠 decision\n\n### Nov 17, 2025\n\nGeneral\n\| ID \| Time \| T \| Title \| Tokens \|\n\|----\|------\|---\|-------\|--------\|\n\| #10756 \| 11:50 PM \| 🔴 \| Fixed Incorrect Parameter Array in searchUserPrompts FTS5 Path \| ~213 \|\n\| #10757 \| ″ \| 🔵 \| Unified search handler implements Chroma-first with FTS5 fallback on zero results \| ~224 \|\n\| #10758 \| 11:51 PM \| ✅ \| Build and sync claude-mem plugin to marketplace location \| ~189 \|\n"}]