Commit Graph

8 Commits

Author SHA1 Message Date
Blue-B 07d6689d87 fix(claude): add interleaved-thinking beta header, AMP gzip error decoding, normalizeClaudeBudget max_tokens
1. Always include interleaved-thinking-2025-05-14 beta header so that
   thinking blocks are returned correctly for all Claude models.

2. Remove status-code guard in AMP reverse proxy ModifyResponse so that
   error responses (4xx/5xx) with hidden gzip encoding are decoded
   properly — prevents garbled error messages reaching the client.

3. In normalizeClaudeBudget, when the adjusted budget falls below the
   model minimum, set max_tokens = budgetTokens+1 instead of leaving
   the request unchanged (which causes a 400 from the API).
2026-03-07 21:31:10 +09:00
hkfires c44793789b feat(thinking): add adaptive thinking support for Claude models
Add support for Claude's "adaptive" and "auto" thinking modes using `output_config.effort`. Introduce support for new effort level "max" in adaptive thinking. Update thinking logic, validate model capabilities, and extend converters and handling to ensure compatibility with adaptive modes. Adjust static model data with supported levels and refine handling across translators and executors.
2026-03-03 09:05:31 +08:00
hkfires 239a28793c feat(claude): clamp thinking budget to max_tokens constraints 2026-01-19 16:32:20 +08:00
hkfires c421d653e7 refactor(claude): move max_tokens constraint enforcement to Apply method 2026-01-19 15:50:35 +08:00
hkfires 33d66959e9 test(thinking): remove legacy unit and integration tests 2026-01-15 13:06:40 +08:00
hkfires 7f1b2b3f6e fix(thinking): improve model lookup and validation 2026-01-15 13:06:40 +08:00
hkfires 40ee065eff fix(thinking): use static lookup to avoid alias issues 2026-01-15 13:06:40 +08:00
hkfires 0b06d637e7 refactor: improve thinking logic 2026-01-15 13:06:39 +08:00