Merge branch 'thedotmack/add-lang-parsers' into integration/validation-batch
Adds 24-language support for smart-explore: Kotlin, Swift, Elixir, Lua, Scala, Bash, Haskell, Zig, CSS, SCSS, TOML, YAML, SQL, Markdown.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+16 -1
@@ -14,7 +14,22 @@
     "tree-sitter-python": "^0.25.0",
     "tree-sitter-ruby": "^0.23.1",
     "tree-sitter-rust": "^0.24.0",
-    "tree-sitter-typescript": "^0.23.2"
+    "tree-sitter-typescript": "^0.23.2",
+    "tree-sitter-kotlin": "^0.3.8",
+    "tree-sitter-swift": "^0.7.1",
+    "tree-sitter-php": "^0.24.2",
+    "tree-sitter-elixir": "^0.3.5",
+    "@tree-sitter-grammars/tree-sitter-lua": "^0.4.1",
+    "tree-sitter-scala": "^0.24.0",
+    "tree-sitter-bash": "^0.25.1",
+    "tree-sitter-haskell": "^0.23.1",
+    "@tree-sitter-grammars/tree-sitter-zig": "^1.1.2",
+    "tree-sitter-css": "^0.25.0",
+    "tree-sitter-scss": "^1.0.0",
+    "@tree-sitter-grammars/tree-sitter-toml": "^0.7.0",
+    "@tree-sitter-grammars/tree-sitter-yaml": "^0.7.1",
+    "@derekstride/tree-sitter-sql": "^0.3.11",
+    "@tree-sitter-grammars/tree-sitter-markdown": "^0.3.2"
   },
   "engines": {
     "node": ">=18.0.0",

@@ -125,3 +125,51 @@ get_observations(ids=[11131, 10942, 10855], orderBy="date_desc")
- **Full observation:** ~500-1000 tokens each
- **Batch fetch:** 1 HTTP request vs N individual requests
- **10x token savings** by filtering before fetching

## Smart-Explore Language Support

Smart-explore tools (`smart_search`, `smart_outline`, `smart_unfold`) use tree-sitter AST parsing. The following languages are supported out of the box.

### 24 Bundled Languages

JS, TS, Python, Go, Rust, Ruby, Java, C, C++, Kotlin, Swift, PHP, Elixir, Lua, Scala, Bash, Haskell, Zig, CSS, SCSS, TOML, YAML, SQL, Markdown

### Markdown Special Support

Markdown files get structure-aware parsing beyond generic tree-sitter:

- **Heading hierarchy** -- `#`/`##`/`###` headings are extracted as nested symbols (sections contain subsections)
- **Code block detection** -- fenced code blocks are surfaced as `code` symbols with language annotation
- **Section-aware unfold** -- `smart_unfold` on a heading returns the full section content (heading through all subsections until the next heading of equal or higher level)
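
The section-aware unfold rule can be illustrated with a small sketch. This is not the code from this commit (the implementation is not shown in the diff); `Heading`, `parseHeadings`, and `sectionRange` are hypothetical names, and the line-based heading scan is a simplification of what a tree-sitter parse would provide.

```typescript
// Hypothetical shape for a Markdown heading; not taken from the project.
interface Heading {
  level: number; // 1 for '#', 2 for '##', and so on
  line: number;  // 0-based index of the heading line
}

// Collect ATX headings, skipping lines inside fenced code blocks.
// Simplification: any fence line toggles the "inside a fence" state.
function parseHeadings(lines: string[]): Heading[] {
  const headings: Heading[] = [];
  let inFence = false;
  lines.forEach((text, line) => {
    if (/^\s{0,3}(`{3,}|~{3,})/.test(text)) {
      inFence = !inFence;
      return;
    }
    if (inFence) return;
    const match = /^(#{1,6})\s+\S/.exec(text);
    if (match) headings.push({ level: match[1].length, line });
  });
  return headings;
}

// Return [start, end) line indices for the section that begins at headingLine:
// the heading itself, its body, and all deeper subsections, stopping at the
// next heading of equal or higher level (or at the end of the file).
function sectionRange(lines: string[], headingLine: number): [number, number] | null {
  const headings = parseHeadings(lines);
  const index = headings.findIndex((h) => h.line === headingLine);
  if (index === -1) return null; // the given line is not a heading
  const start = headings[index];
  const next = headings.slice(index + 1).find((h) => h.level <= start.level);
  return [start.line, next ? next.line : lines.length];
}
```

For example, in a file with lines `# A`, `text`, `## B`, `more`, `### C`, `x`, `## D`, calling `sectionRange(lines, 2)` covers `## B` through the `### C` subsection and stops before `## D`.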

### User-Installable Grammars via `.claude-mem.json`

Add custom tree-sitter grammars for languages not in the bundled set. Place `.claude-mem.json` in the project root:

```json
{
  "grammars": {
    "gleam": {
      "package": "tree-sitter-gleam",
      "extensions": [".gleam"]
    },
    "protobuf": {
      "package": "tree-sitter-proto",
      "extensions": [".proto"],
      "query": ".claude-mem/queries/proto.scm"
    }
  }
}
```

**Fields:**

- `package` (string, required) -- npm package name for the tree-sitter grammar
- `extensions` (array of strings, required) -- file extensions to associate with this language
- `query` (string, optional) -- path to a custom `.scm` query file for symbol extraction. If omitted, a generic query is used.

**Rules:**

- User grammars do NOT override bundled languages. If a language is already bundled, the entry is ignored.
- The npm package must be installed in the project (`npm install tree-sitter-gleam`).
- Config is cached per project root. Changes to `.claude-mem.json` take effect on next worker restart.
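
As a rough illustration of the fields and rules above, the sketch below validates an already-parsed `.claude-mem.json` object and builds an extension-to-grammar lookup. It is an assumption-laden example, not the project's loader: `resolveUserGrammars`, `GrammarEntry`, `ResolvedGrammar`, and the `BUNDLED` identifiers are all hypothetical, and the choice to silently skip invalid entries and let the first grammar claim an extension is made only for the sake of the example.

```typescript
// Hypothetical types mirroring the documented fields.
interface GrammarEntry {
  package: string;      // npm package name (required)
  extensions: string[]; // file extensions (required)
  query?: string;       // optional path to a custom .scm query
}

interface ResolvedGrammar extends GrammarEntry {
  language: string; // the key under "grammars"
}

// Illustrative subset only; the real bundled language ids are an assumption.
const BUNDLED: ReadonlySet<string> = new Set(["javascript", "typescript", "python", "markdown"]);

function resolveUserGrammars(
  config: unknown,
  bundled: ReadonlySet<string> = BUNDLED,
): Map<string, ResolvedGrammar> {
  const byExtension = new Map<string, ResolvedGrammar>();
  const grammars = (config as { grammars?: Record<string, unknown> } | null)?.grammars;
  if (!grammars || typeof grammars !== "object") return byExtension;

  for (const [language, raw] of Object.entries(grammars)) {
    // Rule: user grammars do not override bundled languages.
    if (bundled.has(language)) continue;
    if (typeof raw !== "object" || raw === null) continue;

    const { package: pkg, extensions, query } = raw as Record<string, unknown>;
    // Both required fields must be present and well-formed.
    if (typeof pkg !== "string" || pkg === "") continue;
    if (!Array.isArray(extensions) || extensions.length === 0) continue;
    if (!extensions.every((e): e is string => typeof e === "string")) continue;

    const resolved: ResolvedGrammar = {
      language,
      package: pkg,
      extensions,
      ...(typeof query === "string" ? { query } : {}),
    };
    for (const ext of extensions) {
      // Assumption: the first grammar to claim an extension keeps it.
      if (!byExtension.has(ext)) byExtension.set(ext, resolved);
    }
  }
  return byExtension;
}
```

Passing the example config from this section would yield entries for `.gleam` and `.proto`; an entry keyed `python` would be dropped because that language is in the bundled set used here.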
File diff suppressed because one or more lines are too long