fix(claude): add interleaved-thinking beta header, decode gzip AMP error responses, raise max_tokens in normalizeClaudeBudget

1. Always include interleaved-thinking-2025-05-14 beta header so that
   thinking blocks are returned correctly for all Claude models.

2. Remove the status-code guard in the AMP reverse proxy's ModifyResponse
   so that error responses (4xx/5xx) with hidden gzip encoding are also
   decoded, preventing garbled error messages from reaching the client.

3. In normalizeClaudeBudget, when the adjusted budget falls below the
   model minimum, set max_tokens = budgetTokens+1 instead of leaving
   the request unchanged (which causes a 400 from the API).
Blue-B
2026-03-07 21:31:10 +09:00
parent 5ebc58fab4
commit 07d6689d87
3 changed files with 6 additions and 6 deletions
+3 -1
@@ -194,7 +194,9 @@ func (a *Applier) normalizeClaudeBudget(body []byte, budgetTokens int, modelInfo
 	}
 	if minBudget > 0 && adjustedBudget > 0 && adjustedBudget < minBudget {
 		// If enforcing the max_tokens constraint would push the budget below the model minimum,
-		// leave the request unchanged.
+		// increase max_tokens to accommodate the original budget instead of leaving the
+		// request unchanged (which would cause a 400 error from the API).
+		body, _ = sjson.SetBytes(body, "max_tokens", budgetTokens+1)
 		return body
 	}
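
The arithmetic of this branch can be sketched in isolation. The function resolveBudget below is a hypothetical, simplified stand-in that works on plain integers rather than the JSON body. It assumes, based on the comment in the hunk, that the budget is first clamped to stay below max_tokens; the rest of normalizeClaudeBudget is not shown in this diff and may differ.

```go
package main

// resolveBudget is a hypothetical illustration, not the project's function.
// It returns the thinking budget and the max_tokens value the request would
// carry after normalization.
func resolveBudget(budgetTokens, maxTokens, minBudget int) (budget, newMax int) {
	// Assumed earlier step: keep the budget strictly below max_tokens.
	adjusted := budgetTokens
	if maxTokens > 0 && adjusted >= maxTokens {
		adjusted = maxTokens - 1
	}
	// The changed branch: if clamping pushed the budget under the model
	// minimum, keep the original budget and raise max_tokens just above it.
	if minBudget > 0 && adjusted > 0 && adjusted < minBudget {
		return budgetTokens, budgetTokens + 1
	}
	return adjusted, maxTokens
}
```

For example, with budgetTokens = 2048, maxTokens = 512 and minBudget = 1024, clamping would give 511, which is below the minimum, so the sketch returns a budget of 2048 and a max_tokens of 2049. With maxTokens = 1500 the clamped budget 1499 is still above the minimum, so it returns 1499 and 1500.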