Compare commits
16 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
3e7b645346 | ||
|
|
24446a4dc4 | ||
|
|
475f473dab | ||
|
|
8dba32a077 | ||
|
|
1bbbd16df6 | ||
|
|
5cb378256b | ||
|
|
3ac5f05e8c | ||
|
|
58d30369b4 | ||
|
|
7dd93a4a25 | ||
|
|
2a3ee8d0e3 | ||
|
|
41577bce07 | ||
|
|
3d7aca22c0 | ||
|
|
680b3f5010 | ||
|
|
9d42e4b239 | ||
|
|
97af785aad | ||
|
|
0defb68c6c |
55
README.md
55
README.md
@@ -23,6 +23,7 @@ Chinese providers have now been added: [Qwen Code](https://github.com/QwenLM/qwe
|
||||
- Multiple accounts with round-robin load balancing (Gemini, OpenAI, Claude, Qwen and iFlow)
|
||||
- Simple CLI authentication flows (Gemini, OpenAI, Claude, Qwen and iFlow)
|
||||
- Generative Language API Key support
|
||||
- AI Studio Build multi-account load balancing
|
||||
- Gemini CLI multi-account load balancing
|
||||
- Claude Code multi-account load balancing
|
||||
- Qwen Code multi-account load balancing
|
||||
@@ -68,6 +69,14 @@ brew install cliproxyapi
|
||||
brew services start cliproxyapi
|
||||
```
|
||||
|
||||
### Installation via CLIProxyAPI Linux Installer
|
||||
|
||||
```bash
|
||||
curl -fsSL https://raw.githubusercontent.com/brokechubb/cliproxyapi-installer/refs/heads/master/cliproxyapi-installer | bash
|
||||
```
|
||||
|
||||
Thanks to [brokechubb](https://github.com/brokechubb) for building the Linux installer!
|
||||
|
||||
## Usage
|
||||
|
||||
### GUI Client & Official WebUI
|
||||
@@ -260,12 +269,16 @@ console.log(await claudeResponse.json());
|
||||
- gemini-2.5-flash-lite
|
||||
- gemini-2.5-flash-image
|
||||
- gemini-2.5-flash-image-preview
|
||||
- gemini-pro-latest
|
||||
- gemini-flash-latest
|
||||
- gemini-flash-lite-latest
|
||||
- gpt-5
|
||||
- gpt-5-codex
|
||||
- claude-opus-4-1-20250805
|
||||
- claude-opus-4-20250514
|
||||
- claude-sonnet-4-20250514
|
||||
- claude-sonnet-4-5-20250929
|
||||
- claude-haiku-4-5-20251001
|
||||
- claude-3-7-sonnet-20250219
|
||||
- claude-3-5-haiku-20241022
|
||||
- qwen3-coder-plus
|
||||
@@ -277,7 +290,6 @@ console.log(await claudeResponse.json());
|
||||
- deepseek-r1
|
||||
- deepseek-v3
|
||||
- kimi-k2
|
||||
- glm-4.5
|
||||
- glm-4.6
|
||||
- tstars2.0
|
||||
- And other iFlow-supported models
|
||||
@@ -510,28 +522,37 @@ openai-compatibility:
|
||||
alias: "kimi-k2"
|
||||
```
|
||||
|
||||
Legacy format (still supported):
|
||||
|
||||
```yaml
|
||||
openai-compatibility:
|
||||
- name: "openrouter"
|
||||
base-url: "https://openrouter.ai/api/v1"
|
||||
api-keys:
|
||||
- "sk-or-v1-...b780"
|
||||
- "sk-or-v1-...b781"
|
||||
models:
|
||||
- name: "moonshotai/kimi-k2:free"
|
||||
alias: "kimi-k2"
|
||||
```
|
||||
|
||||
Usage:
|
||||
|
||||
Call OpenAI's endpoint `/v1/chat/completions` with `model` set to the alias (e.g., `kimi-k2`). The proxy routes to the configured provider/model automatically.
|
||||
|
||||
Also, you may call Claude's endpoint `/v1/messages`, Gemini's `/v1beta/models/model-name:streamGenerateContent` or `/v1beta/models/model-name:generateContent`.
|
||||
|
||||
And you can always use Gemini CLI with `CODE_ASSIST_ENDPOINT` set to `http://127.0.0.1:8317` for these OpenAI-compatible provider's models.
|
||||
|
||||
### AI Studio Instructions
|
||||
|
||||
You can use this service (CLIProxyAPI) as a backend for [this AI Studio App](https://aistudio.google.com/apps/drive/1CPW7FpWGsDZzkaYgYOyXQ_6FWgxieLmL). Follow the steps below to configure it:
|
||||
|
||||
1. **Start the CLIProxyAPI Service**: Ensure your CLIProxyAPI instance is running, either locally or remotely.
|
||||
2. **Access the AI Studio App**: Log in to your Google account in your browser, then open the following link:
|
||||
- [https://aistudio.google.com/apps/drive/1CPW7FpWGsDZzkaYgYOyXQ_6FWgxieLmL](https://aistudio.google.com/apps/drive/1CPW7FpWGsDZzkaYgYOyXQ_6FWgxieLmL)
|
||||
|
||||
#### Connection Configuration
|
||||
|
||||
By default, the AI Studio App attempts to connect to a local CLIProxyAPI instance at `ws://127.0.0.1:8317`.
|
||||
|
||||
- **Connecting to a Remote Service**:
|
||||
If you need to connect to a remotely deployed CLIProxyAPI, modify the `config.ts` file in the AI Studio App to update the `WEBSOCKET_PROXY_URL` value.
|
||||
- Use the `wss://` protocol if your remote service has SSL enabled.
|
||||
- Use the `ws://` protocol if SSL is not enabled.
|
||||
|
||||
#### Authentication Configuration
|
||||
|
||||
By default, WebSocket connections to CLIProxyAPI do not require authentication.
|
||||
|
||||
- **Enable Authentication on the CLIProxyAPI Server**:
|
||||
In your `config.yaml` file, set `ws_auth` to `true`.
|
||||
- **Configure Authentication on the AI Studio Client**:
|
||||
In the `config.ts` file of the AI Studio App, set the `JWT_TOKEN` value to your authentication token.
|
||||
|
||||
### Authentication Directory
|
||||
|
||||
|
||||
54
README_CN.md
54
README_CN.md
@@ -43,6 +43,7 @@
|
||||
- 多账户支持与轮询负载均衡(Gemini、OpenAI、Claude、Qwen 与 iFlow)
|
||||
- 简单的 CLI 身份验证流程(Gemini、OpenAI、Claude、Qwen 与 iFlow)
|
||||
- 支持 Gemini AIStudio API 密钥
|
||||
- 支持 AI Studio Build 多账户轮询
|
||||
- 支持 Gemini CLI 多账户轮询
|
||||
- 支持 Claude Code 多账户轮询
|
||||
- 支持 Qwen Code 多账户轮询
|
||||
@@ -82,6 +83,14 @@ brew install cliproxyapi
|
||||
brew services start cliproxyapi
|
||||
```
|
||||
|
||||
### 通过 CLIProxyAPI Linux Installer 安装
|
||||
|
||||
```bash
|
||||
curl -fsSL https://raw.githubusercontent.com/brokechubb/cliproxyapi-installer/refs/heads/master/cliproxyapi-installer | bash
|
||||
```
|
||||
|
||||
感谢 [brokechubb](https://github.com/brokechubb) 构建了 Linux installer!
|
||||
|
||||
## 使用方法
|
||||
|
||||
### 图形客户端与官方 WebUI
|
||||
@@ -273,12 +282,16 @@ console.log(await claudeResponse.json());
|
||||
- gemini-2.5-flash-lite
|
||||
- gemini-2.5-flash-image
|
||||
- gemini-2.5-flash-image-preview
|
||||
- gemini-pro-latest
|
||||
- gemini-flash-latest
|
||||
- gemini-flash-lite-latest
|
||||
- gpt-5
|
||||
- gpt-5-codex
|
||||
- claude-opus-4-1-20250805
|
||||
- claude-opus-4-20250514
|
||||
- claude-sonnet-4-20250514
|
||||
- claude-sonnet-4-5-20250929
|
||||
- claude-haiku-4-5-20251001
|
||||
- claude-3-7-sonnet-20250219
|
||||
- claude-3-5-haiku-20241022
|
||||
- qwen3-coder-plus
|
||||
@@ -290,7 +303,6 @@ console.log(await claudeResponse.json());
|
||||
- deepseek-r1
|
||||
- deepseek-v3
|
||||
- kimi-k2
|
||||
- glm-4.5
|
||||
- glm-4.6
|
||||
- tstars2.0
|
||||
- 以及其他 iFlow 支持的模型
|
||||
@@ -523,24 +535,36 @@ openai-compatibility:
|
||||
alias: "kimi-k2"
|
||||
```
|
||||
|
||||
旧格式(仍支持):
|
||||
|
||||
```yaml
|
||||
openai-compatibility:
|
||||
- name: "openrouter"
|
||||
base-url: "https://openrouter.ai/api/v1"
|
||||
api-keys:
|
||||
- "sk-or-v1-...b780"
|
||||
- "sk-or-v1-...b781"
|
||||
models:
|
||||
- name: "moonshotai/kimi-k2:free"
|
||||
alias: "kimi-k2"
|
||||
```
|
||||
|
||||
使用方式:在 `/v1/chat/completions` 中将 `model` 设为别名(如 `kimi-k2`),代理将自动路由到对应提供商与模型。
|
||||
|
||||
并且,对于这些与OpenAI兼容的提供商模型,您始终可以通过将CODE_ASSIST_ENDPOINT设置为 http://127.0.0.1:8317 来使用Gemini CLI。
|
||||
|
||||
### AI Studio 使用说明
|
||||
|
||||
您可以将本服务 (CLIProxyAPI) 作为后端,配合 [这个 AI Studio 应用](https://aistudio.google.com/apps/drive/1CPW7FpWGsDZzkaYgYOyXQ_6FWgxieLmL) 使用。请遵循以下步骤进行配置:
|
||||
|
||||
1. **启动 CLIProxyAPI 服务**:确保您的 CLIProxyAPI 实例正在本地或远程运行。
|
||||
2. **访问 AI Studio 应用**:在浏览器中登录您的 Google 账户,然后打开以下链接:
|
||||
- [https://aistudio.google.com/apps/drive/1CPW7FpWGsDZzkaYgYOyXQ_6FWgxieLmL](https://aistudio.google.com/apps/drive/1CPW7FpWGsDZzkaYgYOyXQ_6FWgxieLmL)
|
||||
|
||||
#### 连接配置
|
||||
|
||||
默认情况下,AI Studio 应用会尝试连接到本地的 CLIProxyAPI (`ws://127.0.0.1:8317`)。
|
||||
|
||||
- **连接到远程服务**:
|
||||
如果您需要连接到远程部署的 CLIProxyAPI,请修改 AI Studio 应用中的 `config.ts` 文件,更新 `WEBSOCKET_PROXY_URL` 的值。
|
||||
- 如果您的远程服务启用了 SSL,请使用 `wss://` 协议。
|
||||
- 如果未启用 SSL,请使用 `ws://` 协议。
|
||||
|
||||
#### 认证配置
|
||||
|
||||
默认情况下,CLIProxyAPI 的 WebSocket 连接不要求认证。
|
||||
|
||||
- **在 CLIProxyAPI 服务端启用认证**:
|
||||
在您的 `config.yaml` 文件中,将 `ws_auth` 设置为 `true`。
|
||||
- **在 AI Studio 客户端配置认证**:
|
||||
在 AI Studio 应用的 `config.ts` 文件中,设置 `JWT_TOKEN` 的值为您的认证令牌。
|
||||
|
||||
### 身份验证目录
|
||||
|
||||
`auth-dir` 参数指定身份验证令牌的存储位置。当您运行登录命令时,应用程序将在此目录中创建包含 Google 账户身份验证令牌的 JSON 文件。多个账户可用于轮询。
|
||||
|
||||
@@ -84,6 +84,7 @@ func GeminiModels() []*ModelInfo {
|
||||
InputTokenLimit: 1048576,
|
||||
OutputTokenLimit: 65536,
|
||||
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
|
||||
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
|
||||
},
|
||||
{
|
||||
ID: "gemini-2.5-pro",
|
||||
@@ -98,6 +99,7 @@ func GeminiModels() []*ModelInfo {
|
||||
InputTokenLimit: 1048576,
|
||||
OutputTokenLimit: 65536,
|
||||
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
|
||||
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
|
||||
},
|
||||
{
|
||||
ID: "gemini-2.5-flash-lite",
|
||||
@@ -112,6 +114,7 @@ func GeminiModels() []*ModelInfo {
|
||||
InputTokenLimit: 1048576,
|
||||
OutputTokenLimit: 65536,
|
||||
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
|
||||
Thinking: &ThinkingSupport{Min: 512, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
|
||||
},
|
||||
{
|
||||
ID: "gemini-2.5-flash-image-preview",
|
||||
@@ -126,6 +129,7 @@ func GeminiModels() []*ModelInfo {
|
||||
InputTokenLimit: 1048576,
|
||||
OutputTokenLimit: 8192,
|
||||
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
|
||||
// image models don't support thinkingConfig; leave Thinking nil
|
||||
},
|
||||
{
|
||||
ID: "gemini-2.5-flash-image",
|
||||
@@ -140,6 +144,7 @@ func GeminiModels() []*ModelInfo {
|
||||
InputTokenLimit: 1048576,
|
||||
OutputTokenLimit: 8192,
|
||||
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
|
||||
// image models don't support thinkingConfig; leave Thinking nil
|
||||
},
|
||||
}
|
||||
}
|
||||
@@ -152,9 +157,8 @@ func GetGeminiCLIModels() []*ModelInfo { return GeminiModels() }
|
||||
|
||||
// GetAIStudioModels returns the Gemini model definitions for AI Studio integrations
|
||||
func GetAIStudioModels() []*ModelInfo {
|
||||
models := make([]*ModelInfo, 0, 8)
|
||||
models = append(models, GeminiModels()...)
|
||||
models = append(models,
|
||||
base := GeminiModels()
|
||||
return append(base,
|
||||
&ModelInfo{
|
||||
ID: "gemini-pro-latest",
|
||||
Object: "model",
|
||||
@@ -168,6 +172,7 @@ func GetAIStudioModels() []*ModelInfo {
|
||||
InputTokenLimit: 1048576,
|
||||
OutputTokenLimit: 65536,
|
||||
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
|
||||
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
|
||||
},
|
||||
&ModelInfo{
|
||||
ID: "gemini-flash-latest",
|
||||
@@ -182,6 +187,7 @@ func GetAIStudioModels() []*ModelInfo {
|
||||
InputTokenLimit: 1048576,
|
||||
OutputTokenLimit: 65536,
|
||||
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
|
||||
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
|
||||
},
|
||||
&ModelInfo{
|
||||
ID: "gemini-flash-lite-latest",
|
||||
@@ -196,9 +202,9 @@ func GetAIStudioModels() []*ModelInfo {
|
||||
InputTokenLimit: 1048576,
|
||||
OutputTokenLimit: 65536,
|
||||
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
|
||||
Thinking: &ThinkingSupport{Min: 512, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
|
||||
},
|
||||
)
|
||||
return models
|
||||
}
|
||||
|
||||
// GetOpenAIModels returns the standard OpenAI model definitions
|
||||
|
||||
@@ -45,6 +45,23 @@ type ModelInfo struct {
|
||||
MaxCompletionTokens int `json:"max_completion_tokens,omitempty"`
|
||||
// SupportedParameters lists supported parameters
|
||||
SupportedParameters []string `json:"supported_parameters,omitempty"`
|
||||
|
||||
// Thinking holds provider-specific reasoning/thinking budget capabilities.
|
||||
// This is optional and currently used for Gemini thinking budget normalization.
|
||||
Thinking *ThinkingSupport `json:"thinking,omitempty"`
|
||||
}
|
||||
|
||||
// ThinkingSupport describes a model family's supported internal reasoning budget range.
|
||||
// Values are interpreted in provider-native token units.
|
||||
type ThinkingSupport struct {
|
||||
// Min is the minimum allowed thinking budget (inclusive).
|
||||
Min int `json:"min,omitempty"`
|
||||
// Max is the maximum allowed thinking budget (inclusive).
|
||||
Max int `json:"max,omitempty"`
|
||||
// ZeroAllowed indicates whether 0 is a valid value (to disable thinking).
|
||||
ZeroAllowed bool `json:"zero_allowed,omitempty"`
|
||||
// DynamicAllowed indicates whether -1 is a valid value (dynamic thinking budget).
|
||||
DynamicAllowed bool `json:"dynamic_allowed,omitempty"`
|
||||
}
|
||||
|
||||
// ModelRegistration tracks a model's availability
|
||||
@@ -652,6 +669,17 @@ func (r *ModelRegistry) GetModelProviders(modelID string) []string {
|
||||
return result
|
||||
}
|
||||
|
||||
// GetModelInfo returns the registered ModelInfo for the given model ID, if present.
|
||||
// Returns nil if the model is unknown to the registry.
|
||||
func (r *ModelRegistry) GetModelInfo(modelID string) *ModelInfo {
|
||||
r.mutex.RLock()
|
||||
defer r.mutex.RUnlock()
|
||||
if reg, ok := r.models[modelID]; ok && reg != nil {
|
||||
return reg.Info
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// convertModelToMap converts ModelInfo to the appropriate format for different handler types
|
||||
func (r *ModelRegistry) convertModelToMap(model *ModelInfo, handlerType string) map[string]any {
|
||||
if model == nil {
|
||||
|
||||
@@ -256,10 +256,14 @@ func (e *AIStudioExecutor) translateRequest(req cliproxyexecutor.Request, opts c
|
||||
from := opts.SourceFormat
|
||||
to := sdktranslator.FromString("gemini")
|
||||
payload := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), stream)
|
||||
if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok {
|
||||
if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok && util.ModelSupportsThinking(req.Model) {
|
||||
if budgetOverride != nil {
|
||||
norm := util.NormalizeThinkingBudget(req.Model, *budgetOverride)
|
||||
budgetOverride = &norm
|
||||
}
|
||||
payload = util.ApplyGeminiThinkingConfig(payload, budgetOverride, includeOverride)
|
||||
}
|
||||
payload = disableGeminiThinkingConfig(payload, req.Model)
|
||||
payload = util.StripThinkingConfigIfUnsupported(req.Model, payload)
|
||||
payload = fixGeminiImageAspectRatio(req.Model, payload)
|
||||
metadataAction := "generateContent"
|
||||
if req.Metadata != nil {
|
||||
|
||||
@@ -551,8 +551,8 @@ func applyClaudeHeaders(r *http.Request, apiKey string, stream bool) {
|
||||
misc.EnsureHeader(r.Header, ginHeaders, "X-Stainless-Arch", "arm64")
|
||||
misc.EnsureHeader(r.Header, ginHeaders, "X-Stainless-Os", "MacOS")
|
||||
misc.EnsureHeader(r.Header, ginHeaders, "X-Stainless-Timeout", "60")
|
||||
misc.EnsureHeader(r.Header, ginHeaders, "User-Agent", "claude-cli/1.0.83 (external, cli)")
|
||||
r.Header.Set("Connection", "keep-alive")
|
||||
r.Header.Set("User-Agent", "claude-cli/1.0.83 (external, cli)")
|
||||
r.Header.Set("Accept-Encoding", "gzip, deflate, br, zstd")
|
||||
if stream {
|
||||
r.Header.Set("Accept", "text/event-stream")
|
||||
|
||||
@@ -532,6 +532,7 @@ func applyCodexHeaders(r *http.Request, auth *cliproxyauth.Auth, token string) {
|
||||
misc.EnsureHeader(r.Header, ginHeaders, "Version", "0.21.0")
|
||||
misc.EnsureHeader(r.Header, ginHeaders, "Openai-Beta", "responses=experimental")
|
||||
misc.EnsureHeader(r.Header, ginHeaders, "Session_id", uuid.NewString())
|
||||
misc.EnsureHeader(r.Header, ginHeaders, "User-Agent", "codex_cli_rs/0.50.0 (Mac OS 26.0.1; arm64) Apple_Terminal/464")
|
||||
|
||||
r.Header.Set("Accept", "text/event-stream")
|
||||
r.Header.Set("Connection", "Keep-Alive")
|
||||
|
||||
@@ -63,9 +63,14 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
|
||||
to := sdktranslator.FromString("gemini-cli")
|
||||
budgetOverride, includeOverride, hasOverride := util.GeminiThinkingFromMetadata(req.Metadata)
|
||||
basePayload := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), false)
|
||||
if hasOverride {
|
||||
if hasOverride && util.ModelSupportsThinking(req.Model) {
|
||||
if budgetOverride != nil {
|
||||
norm := util.NormalizeThinkingBudget(req.Model, *budgetOverride)
|
||||
budgetOverride = &norm
|
||||
}
|
||||
basePayload = util.ApplyGeminiCLIThinkingConfig(basePayload, budgetOverride, includeOverride)
|
||||
}
|
||||
basePayload = util.StripThinkingConfigIfUnsupported(req.Model, basePayload)
|
||||
basePayload = fixGeminiCLIImageAspectRatio(req.Model, basePayload)
|
||||
|
||||
action := "generateContent"
|
||||
@@ -92,7 +97,7 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
|
||||
var lastStatus int
|
||||
var lastBody []byte
|
||||
|
||||
for _, attemptModel := range models {
|
||||
for idx, attemptModel := range models {
|
||||
payload := append([]byte(nil), basePayload...)
|
||||
if action == "countTokens" {
|
||||
payload = deleteJSONField(payload, "project")
|
||||
@@ -101,7 +106,6 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
|
||||
payload = setJSONField(payload, "project", projectID)
|
||||
payload = setJSONField(payload, "model", attemptModel)
|
||||
}
|
||||
payload = disableGeminiThinkingConfig(payload, attemptModel)
|
||||
|
||||
tok, errTok := tokenSource.Token()
|
||||
if errTok != nil {
|
||||
@@ -166,7 +170,11 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
|
||||
lastBody = append([]byte(nil), data...)
|
||||
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, string(data))
|
||||
if httpResp.StatusCode == 429 {
|
||||
log.Debugf("gemini cli executor: rate limited, retrying with next model")
|
||||
if idx+1 < len(models) {
|
||||
log.Debugf("gemini cli executor: rate limited, retrying with next model: %s", models[idx+1])
|
||||
} else {
|
||||
log.Debug("gemini cli executor: rate limited, no additional fallback model")
|
||||
}
|
||||
continue
|
||||
}
|
||||
|
||||
@@ -196,9 +204,14 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
|
||||
to := sdktranslator.FromString("gemini-cli")
|
||||
budgetOverride, includeOverride, hasOverride := util.GeminiThinkingFromMetadata(req.Metadata)
|
||||
basePayload := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), true)
|
||||
if hasOverride {
|
||||
if hasOverride && util.ModelSupportsThinking(req.Model) {
|
||||
if budgetOverride != nil {
|
||||
norm := util.NormalizeThinkingBudget(req.Model, *budgetOverride)
|
||||
budgetOverride = &norm
|
||||
}
|
||||
basePayload = util.ApplyGeminiCLIThinkingConfig(basePayload, budgetOverride, includeOverride)
|
||||
}
|
||||
basePayload = util.StripThinkingConfigIfUnsupported(req.Model, basePayload)
|
||||
basePayload = fixGeminiCLIImageAspectRatio(req.Model, basePayload)
|
||||
|
||||
projectID := strings.TrimSpace(stringValue(auth.Metadata, "project_id"))
|
||||
@@ -219,11 +232,10 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
|
||||
var lastStatus int
|
||||
var lastBody []byte
|
||||
|
||||
for _, attemptModel := range models {
|
||||
for idx, attemptModel := range models {
|
||||
payload := append([]byte(nil), basePayload...)
|
||||
payload = setJSONField(payload, "project", projectID)
|
||||
payload = setJSONField(payload, "model", attemptModel)
|
||||
payload = disableGeminiThinkingConfig(payload, attemptModel)
|
||||
|
||||
tok, errTok := tokenSource.Token()
|
||||
if errTok != nil {
|
||||
@@ -282,7 +294,11 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
|
||||
lastBody = append([]byte(nil), data...)
|
||||
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, string(data))
|
||||
if httpResp.StatusCode == 429 {
|
||||
log.Debugf("gemini cli executor: rate limited, retrying with next model")
|
||||
if idx+1 < len(models) {
|
||||
log.Debugf("gemini cli executor: rate limited, retrying with next model: %s", models[idx+1])
|
||||
} else {
|
||||
log.Debug("gemini cli executor: rate limited, no additional fallback model")
|
||||
}
|
||||
continue
|
||||
}
|
||||
err = statusErr{code: httpResp.StatusCode, msg: string(data)}
|
||||
@@ -393,12 +409,16 @@ func (e *GeminiCLIExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.
|
||||
budgetOverride, includeOverride, hasOverride := util.GeminiThinkingFromMetadata(req.Metadata)
|
||||
for _, attemptModel := range models {
|
||||
payload := sdktranslator.TranslateRequest(from, to, attemptModel, bytes.Clone(req.Payload), false)
|
||||
if hasOverride {
|
||||
if hasOverride && util.ModelSupportsThinking(req.Model) {
|
||||
if budgetOverride != nil {
|
||||
norm := util.NormalizeThinkingBudget(req.Model, *budgetOverride)
|
||||
budgetOverride = &norm
|
||||
}
|
||||
payload = util.ApplyGeminiCLIThinkingConfig(payload, budgetOverride, includeOverride)
|
||||
}
|
||||
payload = deleteJSONField(payload, "project")
|
||||
payload = deleteJSONField(payload, "model")
|
||||
payload = disableGeminiThinkingConfig(payload, attemptModel)
|
||||
payload = util.StripThinkingConfigIfUnsupported(req.Model, payload)
|
||||
payload = fixGeminiCLIImageAspectRatio(attemptModel, payload)
|
||||
|
||||
tok, errTok := tokenSource.Token()
|
||||
@@ -623,29 +643,6 @@ func cliPreviewFallbackOrder(model string) []string {
|
||||
}
|
||||
}
|
||||
|
||||
func disableGeminiThinkingConfig(body []byte, model string) []byte {
|
||||
if !geminiModelDisallowsThinking(model) {
|
||||
return body
|
||||
}
|
||||
|
||||
updated := deleteJSONField(body, "request.generationConfig.thinkingConfig")
|
||||
updated = deleteJSONField(updated, "generationConfig.thinkingConfig")
|
||||
return updated
|
||||
}
|
||||
|
||||
func geminiModelDisallowsThinking(model string) bool {
|
||||
if model == "" {
|
||||
return false
|
||||
}
|
||||
lower := strings.ToLower(model)
|
||||
for _, marker := range []string{"gemini-2.5-flash-image-preview", "gemini-2.5-flash-image"} {
|
||||
if strings.Contains(lower, marker) {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
// setJSONField sets a top-level JSON field on a byte slice payload via sjson.
|
||||
func setJSONField(body []byte, key, value string) []byte {
|
||||
if key == "" {
|
||||
|
||||
@@ -78,10 +78,14 @@ func (e *GeminiExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
|
||||
from := opts.SourceFormat
|
||||
to := sdktranslator.FromString("gemini")
|
||||
body := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), false)
|
||||
if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok {
|
||||
if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok && util.ModelSupportsThinking(req.Model) {
|
||||
if budgetOverride != nil {
|
||||
norm := util.NormalizeThinkingBudget(req.Model, *budgetOverride)
|
||||
budgetOverride = &norm
|
||||
}
|
||||
body = util.ApplyGeminiThinkingConfig(body, budgetOverride, includeOverride)
|
||||
}
|
||||
body = disableGeminiThinkingConfig(body, req.Model)
|
||||
body = util.StripThinkingConfigIfUnsupported(req.Model, body)
|
||||
body = fixGeminiImageAspectRatio(req.Model, body)
|
||||
|
||||
action := "generateContent"
|
||||
@@ -166,10 +170,14 @@ func (e *GeminiExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
|
||||
from := opts.SourceFormat
|
||||
to := sdktranslator.FromString("gemini")
|
||||
body := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), true)
|
||||
if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok {
|
||||
if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok && util.ModelSupportsThinking(req.Model) {
|
||||
if budgetOverride != nil {
|
||||
norm := util.NormalizeThinkingBudget(req.Model, *budgetOverride)
|
||||
budgetOverride = &norm
|
||||
}
|
||||
body = util.ApplyGeminiThinkingConfig(body, budgetOverride, includeOverride)
|
||||
}
|
||||
body = disableGeminiThinkingConfig(body, req.Model)
|
||||
body = util.StripThinkingConfigIfUnsupported(req.Model, body)
|
||||
body = fixGeminiImageAspectRatio(req.Model, body)
|
||||
|
||||
url := fmt.Sprintf("%s/%s/models/%s:%s", glEndpoint, glAPIVersion, req.Model, "streamGenerateContent")
|
||||
@@ -269,10 +277,14 @@ func (e *GeminiExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.Aut
|
||||
from := opts.SourceFormat
|
||||
to := sdktranslator.FromString("gemini")
|
||||
translatedReq := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), false)
|
||||
if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok {
|
||||
if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok && util.ModelSupportsThinking(req.Model) {
|
||||
if budgetOverride != nil {
|
||||
norm := util.NormalizeThinkingBudget(req.Model, *budgetOverride)
|
||||
budgetOverride = &norm
|
||||
}
|
||||
translatedReq = util.ApplyGeminiThinkingConfig(translatedReq, budgetOverride, includeOverride)
|
||||
}
|
||||
translatedReq = disableGeminiThinkingConfig(translatedReq, req.Model)
|
||||
translatedReq = util.StripThinkingConfigIfUnsupported(req.Model, translatedReq)
|
||||
translatedReq = fixGeminiImageAspectRatio(req.Model, translatedReq)
|
||||
respCtx := context.WithValue(ctx, "alt", opts.Alt)
|
||||
translatedReq, _ = sjson.DeleteBytes(translatedReq, "tools")
|
||||
|
||||
@@ -11,6 +11,7 @@ import (
|
||||
"strings"
|
||||
|
||||
client "github.com/router-for-me/CLIProxyAPI/v6/internal/interfaces"
|
||||
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
|
||||
"github.com/tidwall/gjson"
|
||||
"github.com/tidwall/sjson"
|
||||
)
|
||||
@@ -136,7 +137,7 @@ func ConvertClaudeRequestToCLI(modelName string, inputRawJSON []byte, _ bool) []
|
||||
}
|
||||
|
||||
// Build output Gemini CLI request JSON
|
||||
out := `{"model":"","request":{"contents":[],"generationConfig":{"thinkingConfig":{"include_thoughts":true}}}}`
|
||||
out := `{"model":"","request":{"contents":[]}}`
|
||||
out, _ = sjson.Set(out, "model", modelName)
|
||||
if systemInstruction != nil {
|
||||
b, _ := json.Marshal(systemInstruction)
|
||||
@@ -151,21 +152,16 @@ func ConvertClaudeRequestToCLI(modelName string, inputRawJSON []byte, _ bool) []
|
||||
out, _ = sjson.SetRaw(out, "request.tools", string(b))
|
||||
}
|
||||
|
||||
// Map reasoning and sampling configs
|
||||
reasoningEffortResult := gjson.GetBytes(rawJSON, "reasoning_effort")
|
||||
if reasoningEffortResult.String() == "none" {
|
||||
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.include_thoughts", false)
|
||||
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", 0)
|
||||
} else if reasoningEffortResult.String() == "auto" {
|
||||
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
|
||||
} else if reasoningEffortResult.String() == "low" {
|
||||
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", 1024)
|
||||
} else if reasoningEffortResult.String() == "medium" {
|
||||
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", 8192)
|
||||
} else if reasoningEffortResult.String() == "high" {
|
||||
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", 24576)
|
||||
} else {
|
||||
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
|
||||
// Map Anthropic thinking -> Gemini thinkingBudget/include_thoughts when type==enabled
|
||||
if t := gjson.GetBytes(rawJSON, "thinking"); t.Exists() && t.IsObject() && util.ModelSupportsThinking(modelName) {
|
||||
if t.Get("type").String() == "enabled" {
|
||||
if b := t.Get("budget_tokens"); b.Exists() && b.Type == gjson.Number {
|
||||
budget := int(b.Int())
|
||||
budget = util.NormalizeThinkingBudget(modelName, budget)
|
||||
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", budget)
|
||||
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
|
||||
}
|
||||
}
|
||||
}
|
||||
if v := gjson.GetBytes(rawJSON, "temperature"); v.Exists() && v.Type == gjson.Number {
|
||||
out, _ = sjson.Set(out, "request.generationConfig.temperature", v.Num)
|
||||
|
||||
@@ -26,32 +26,57 @@ import (
|
||||
// - []byte: The transformed request data in Gemini CLI API format
|
||||
func ConvertOpenAIRequestToGeminiCLI(modelName string, inputRawJSON []byte, _ bool) []byte {
|
||||
rawJSON := bytes.Clone(inputRawJSON)
|
||||
// Base envelope
|
||||
out := []byte(`{"project":"","request":{"contents":[],"generationConfig":{"thinkingConfig":{"include_thoughts":true}}},"model":"gemini-2.5-pro"}`)
|
||||
// Base envelope (no default thinkingConfig)
|
||||
out := []byte(`{"project":"","request":{"contents":[]},"model":"gemini-2.5-pro"}`)
|
||||
|
||||
// Model
|
||||
out, _ = sjson.SetBytes(out, "model", modelName)
|
||||
|
||||
// Reasoning effort -> thinkingBudget/include_thoughts
|
||||
// Note: OpenAI official fields take precedence over extra_body.google.thinking_config
|
||||
re := gjson.GetBytes(rawJSON, "reasoning_effort")
|
||||
if re.Exists() {
|
||||
hasOfficialThinking := re.Exists()
|
||||
if hasOfficialThinking && util.ModelSupportsThinking(modelName) {
|
||||
switch re.String() {
|
||||
case "none":
|
||||
out, _ = sjson.DeleteBytes(out, "request.generationConfig.thinkingConfig.include_thoughts")
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", 0)
|
||||
case "auto":
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
|
||||
case "low":
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", 1024)
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 1024))
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
|
||||
case "medium":
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", 8192)
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 8192))
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
|
||||
case "high":
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", 24576)
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 32768))
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
|
||||
default:
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
|
||||
}
|
||||
}
|
||||
|
||||
// Cherry Studio extension extra_body.google.thinking_config (effective only when official fields are absent)
|
||||
if !hasOfficialThinking && util.ModelSupportsThinking(modelName) {
|
||||
if tc := gjson.GetBytes(rawJSON, "extra_body.google.thinking_config"); tc.Exists() && tc.IsObject() {
|
||||
var setBudget bool
|
||||
var normalized int
|
||||
if v := tc.Get("thinking_budget"); v.Exists() {
|
||||
normalized = util.NormalizeThinkingBudget(modelName, int(v.Int()))
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", normalized)
|
||||
setBudget = true
|
||||
}
|
||||
if v := tc.Get("include_thoughts"); v.Exists() {
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", v.Bool())
|
||||
} else if setBudget {
|
||||
if normalized != 0 {
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
|
||||
}
|
||||
}
|
||||
}
|
||||
} else {
|
||||
out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
|
||||
}
|
||||
|
||||
// Temperature/top_p/top_k
|
||||
@@ -250,8 +275,34 @@ func ConvertOpenAIRequestToGeminiCLI(modelName string, inputRawJSON []byte, _ bo
|
||||
if t.Get("type").String() == "function" {
|
||||
fn := t.Get("function")
|
||||
if fn.Exists() && fn.IsObject() {
|
||||
parametersJsonSchema, _ := util.RenameKey(fn.Raw, "parameters", "parametersJsonSchema")
|
||||
out, _ = sjson.SetRawBytes(out, fdPath+".-1", []byte(parametersJsonSchema))
|
||||
fnRaw := fn.Raw
|
||||
if fn.Get("parameters").Exists() {
|
||||
renamed, errRename := util.RenameKey(fnRaw, "parameters", "parametersJsonSchema")
|
||||
if errRename != nil {
|
||||
log.Warnf("Failed to rename parameters for tool '%s': %v", fn.Get("name").String(), errRename)
|
||||
} else {
|
||||
fnRaw = renamed
|
||||
}
|
||||
} else {
|
||||
var errSet error
|
||||
fnRaw, errSet = sjson.Set(fnRaw, "parametersJsonSchema.type", "object")
|
||||
if errSet != nil {
|
||||
log.Warnf("Failed to set default schema type for tool '%s': %v", fn.Get("name").String(), errSet)
|
||||
continue
|
||||
}
|
||||
fnRaw, errSet = sjson.Set(fnRaw, "parametersJsonSchema.properties", map[string]interface{}{})
|
||||
if errSet != nil {
|
||||
log.Warnf("Failed to set default schema properties for tool '%s': %v", fn.Get("name").String(), errSet)
|
||||
continue
|
||||
}
|
||||
}
|
||||
|
||||
tmp, errSet := sjson.SetRawBytes(out, fdPath+".-1", []byte(fnRaw))
|
||||
if errSet != nil {
|
||||
log.Warnf("Failed to append tool declaration for '%s': %v", fn.Get("name").String(), errSet)
|
||||
continue
|
||||
}
|
||||
out = tmp
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -11,6 +11,7 @@ import (
|
||||
"strings"
|
||||
|
||||
client "github.com/router-for-me/CLIProxyAPI/v6/internal/interfaces"
|
||||
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
|
||||
"github.com/tidwall/gjson"
|
||||
"github.com/tidwall/sjson"
|
||||
)
|
||||
@@ -129,7 +130,7 @@ func ConvertClaudeRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
|
||||
}
|
||||
|
||||
// Build output Gemini CLI request JSON
|
||||
out := `{"contents":[],"generationConfig":{"thinkingConfig":{"include_thoughts":true}}}`
|
||||
out := `{"contents":[]}`
|
||||
out, _ = sjson.Set(out, "model", modelName)
|
||||
if systemInstruction != nil {
|
||||
b, _ := json.Marshal(systemInstruction)
|
||||
@@ -144,21 +145,16 @@ func ConvertClaudeRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
|
||||
out, _ = sjson.SetRaw(out, "tools", string(b))
|
||||
}
|
||||
|
||||
// Map reasoning and sampling configs
|
||||
reasoningEffortResult := gjson.GetBytes(rawJSON, "reasoning_effort")
|
||||
if reasoningEffortResult.String() == "none" {
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", false)
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 0)
|
||||
} else if reasoningEffortResult.String() == "auto" {
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", -1)
|
||||
} else if reasoningEffortResult.String() == "low" {
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 1024)
|
||||
} else if reasoningEffortResult.String() == "medium" {
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 8192)
|
||||
} else if reasoningEffortResult.String() == "high" {
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 24576)
|
||||
} else {
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", -1)
|
||||
// Map Anthropic thinking -> Gemini thinkingBudget/include_thoughts when enabled
|
||||
if t := gjson.GetBytes(rawJSON, "thinking"); t.Exists() && t.IsObject() && util.ModelSupportsThinking(modelName) {
|
||||
if t.Get("type").String() == "enabled" {
|
||||
if b := t.Get("budget_tokens"); b.Exists() && b.Type == gjson.Number {
|
||||
budget := int(b.Int())
|
||||
budget = util.NormalizeThinkingBudget(modelName, budget)
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", budget)
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
|
||||
}
|
||||
}
|
||||
}
|
||||
if v := gjson.GetBytes(rawJSON, "temperature"); v.Exists() && v.Type == gjson.Number {
|
||||
out, _ = sjson.Set(out, "generationConfig.temperature", v.Num)
|
||||
|
||||
@@ -26,32 +26,58 @@ import (
|
||||
// - []byte: The transformed request data in Gemini API format
|
||||
func ConvertOpenAIRequestToGemini(modelName string, inputRawJSON []byte, _ bool) []byte {
|
||||
rawJSON := bytes.Clone(inputRawJSON)
|
||||
// Base envelope
|
||||
out := []byte(`{"contents":[],"generationConfig":{"thinkingConfig":{"include_thoughts":true}}}`)
|
||||
// Base envelope (no default thinkingConfig)
|
||||
out := []byte(`{"contents":[]}`)
|
||||
|
||||
// Model
|
||||
out, _ = sjson.SetBytes(out, "model", modelName)
|
||||
|
||||
// Reasoning effort -> thinkingBudget/include_thoughts
|
||||
// Note: OpenAI official fields take precedence over extra_body.google.thinking_config
|
||||
re := gjson.GetBytes(rawJSON, "reasoning_effort")
|
||||
if re.Exists() {
|
||||
hasOfficialThinking := re.Exists()
|
||||
if hasOfficialThinking && util.ModelSupportsThinking(modelName) {
|
||||
switch re.String() {
|
||||
case "none":
|
||||
out, _ = sjson.DeleteBytes(out, "generationConfig.thinkingConfig.include_thoughts")
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", 0)
|
||||
case "auto":
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", -1)
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.include_thoughts", true)
|
||||
case "low":
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", 1024)
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 1024))
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.include_thoughts", true)
|
||||
case "medium":
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", 8192)
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 8192))
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.include_thoughts", true)
|
||||
case "high":
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", 24576)
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 32768))
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.include_thoughts", true)
|
||||
default:
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", -1)
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.include_thoughts", true)
|
||||
}
|
||||
}
|
||||
|
||||
// Cherry Studio extension extra_body.google.thinking_config (effective only when official fields are absent)
|
||||
if !hasOfficialThinking && util.ModelSupportsThinking(modelName) {
|
||||
if tc := gjson.GetBytes(rawJSON, "extra_body.google.thinking_config"); tc.Exists() && tc.IsObject() {
|
||||
var setBudget bool
|
||||
var normalized int
|
||||
if v := tc.Get("thinking_budget"); v.Exists() {
|
||||
// Normalize budget to model range
|
||||
normalized = util.NormalizeThinkingBudget(modelName, int(v.Int()))
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", normalized)
|
||||
setBudget = true
|
||||
}
|
||||
if v := tc.Get("include_thoughts"); v.Exists() {
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.include_thoughts", v.Bool())
|
||||
} else if setBudget {
|
||||
if normalized != 0 {
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.include_thoughts", true)
|
||||
}
|
||||
}
|
||||
}
|
||||
} else {
|
||||
out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", -1)
|
||||
}
|
||||
|
||||
// Temperature/top_p/top_k
|
||||
|
||||
@@ -4,6 +4,7 @@ import (
|
||||
"bytes"
|
||||
"strings"
|
||||
|
||||
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
|
||||
"github.com/tidwall/gjson"
|
||||
"github.com/tidwall/sjson"
|
||||
)
|
||||
@@ -15,8 +16,8 @@ func ConvertOpenAIResponsesRequestToGemini(modelName string, inputRawJSON []byte
|
||||
_ = modelName // Unused but required by interface
|
||||
_ = stream // Unused but required by interface
|
||||
|
||||
// Base Gemini API template
|
||||
out := `{"contents":[],"generationConfig":{"thinkingConfig":{"include_thoughts":true}}}`
|
||||
// Base Gemini API template (do not include thinkingConfig by default)
|
||||
out := `{"contents":[]}`
|
||||
|
||||
root := gjson.ParseBytes(rawJSON)
|
||||
|
||||
@@ -242,23 +243,52 @@ func ConvertOpenAIResponsesRequestToGemini(modelName string, inputRawJSON []byte
|
||||
out, _ = sjson.Set(out, "generationConfig.stopSequences", sequences)
|
||||
}
|
||||
|
||||
if reasoningEffort := root.Get("reasoning.effort"); reasoningEffort.Exists() {
|
||||
// OpenAI official reasoning fields take precedence
|
||||
hasOfficialThinking := root.Get("reasoning.effort").Exists()
|
||||
if hasOfficialThinking && util.ModelSupportsThinking(modelName) {
|
||||
reasoningEffort := root.Get("reasoning.effort")
|
||||
switch reasoningEffort.String() {
|
||||
case "none":
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", false)
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 0)
|
||||
case "auto":
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", -1)
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
|
||||
case "minimal":
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 1024)
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 1024))
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
|
||||
case "low":
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 4096)
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 4096))
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
|
||||
case "medium":
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 8192)
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 8192))
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
|
||||
case "high":
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 24576)
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 32768))
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
|
||||
default:
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", -1)
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
|
||||
}
|
||||
}
|
||||
|
||||
// Cherry Studio extension (applies only when official fields are missing)
|
||||
if !hasOfficialThinking && util.ModelSupportsThinking(modelName) {
|
||||
if tc := root.Get("extra_body.google.thinking_config"); tc.Exists() && tc.IsObject() {
|
||||
var setBudget bool
|
||||
var normalized int
|
||||
if v := tc.Get("thinking_budget"); v.Exists() {
|
||||
normalized = util.NormalizeThinkingBudget(modelName, int(v.Int()))
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", normalized)
|
||||
setBudget = true
|
||||
}
|
||||
if v := tc.Get("include_thoughts"); v.Exists() {
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", v.Bool())
|
||||
} else if setBudget {
|
||||
if normalized != 0 {
|
||||
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
return []byte(out)
|
||||
|
||||
@@ -179,3 +179,19 @@ func GeminiThinkingFromMetadata(metadata map[string]any) (*int, *bool, bool) {
|
||||
}
|
||||
return budgetPtr, includePtr, matched
|
||||
}
|
||||
|
||||
// StripThinkingConfigIfUnsupported removes thinkingConfig from the request body
|
||||
// when the target model does not advertise Thinking capability. It cleans both
|
||||
// standard Gemini and Gemini CLI JSON envelopes. This acts as a final safety net
|
||||
// in case upstream injected thinking for an unsupported model.
|
||||
func StripThinkingConfigIfUnsupported(model string, body []byte) []byte {
|
||||
if ModelSupportsThinking(model) || len(body) == 0 {
|
||||
return body
|
||||
}
|
||||
updated := body
|
||||
// Gemini CLI path
|
||||
updated, _ = sjson.DeleteBytes(updated, "request.generationConfig.thinkingConfig")
|
||||
// Standard Gemini path
|
||||
updated, _ = sjson.DeleteBytes(updated, "generationConfig.thinkingConfig")
|
||||
return updated
|
||||
}
|
||||
|
||||
69
internal/util/thinking.go
Normal file
69
internal/util/thinking.go
Normal file
@@ -0,0 +1,69 @@
|
||||
package util
|
||||
|
||||
import (
|
||||
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
|
||||
)
|
||||
|
||||
// ModelSupportsThinking reports whether the given model has Thinking capability
|
||||
// according to the model registry metadata (provider-agnostic).
|
||||
func ModelSupportsThinking(model string) bool {
|
||||
if model == "" {
|
||||
return false
|
||||
}
|
||||
if info := registry.GetGlobalRegistry().GetModelInfo(model); info != nil {
|
||||
return info.Thinking != nil
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
// NormalizeThinkingBudget clamps the requested thinking budget to the
|
||||
// supported range for the specified model using registry metadata only.
|
||||
// If the model is unknown or has no Thinking metadata, returns the original budget.
|
||||
// For dynamic (-1), returns -1 if DynamicAllowed; otherwise approximates mid-range
|
||||
// or min (0 if zero is allowed and mid <= 0).
|
||||
func NormalizeThinkingBudget(model string, budget int) int {
|
||||
if budget == -1 { // dynamic
|
||||
if found, min, max, zeroAllowed, dynamicAllowed := thinkingRangeFromRegistry(model); found {
|
||||
if dynamicAllowed {
|
||||
return -1
|
||||
}
|
||||
mid := (min + max) / 2
|
||||
if mid <= 0 && zeroAllowed {
|
||||
return 0
|
||||
}
|
||||
if mid <= 0 {
|
||||
return min
|
||||
}
|
||||
return mid
|
||||
}
|
||||
return -1
|
||||
}
|
||||
if found, min, max, zeroAllowed, _ := thinkingRangeFromRegistry(model); found {
|
||||
if budget == 0 {
|
||||
if zeroAllowed {
|
||||
return 0
|
||||
}
|
||||
return min
|
||||
}
|
||||
if budget < min {
|
||||
return min
|
||||
}
|
||||
if budget > max {
|
||||
return max
|
||||
}
|
||||
return budget
|
||||
}
|
||||
return budget
|
||||
}
|
||||
|
||||
// thinkingRangeFromRegistry attempts to read thinking ranges from the model registry.
|
||||
func thinkingRangeFromRegistry(model string) (found bool, min int, max int, zeroAllowed bool, dynamicAllowed bool) {
|
||||
if model == "" {
|
||||
return false, 0, 0, false, false
|
||||
}
|
||||
info := registry.GetGlobalRegistry().GetModelInfo(model)
|
||||
if info == nil || info.Thinking == nil {
|
||||
return false, 0, 0, false, false
|
||||
}
|
||||
return true, info.Thinking.Min, info.Thinking.Max, info.Thinking.ZeroAllowed, info.Thinking.DynamicAllowed
|
||||
}
|
||||
@@ -872,6 +872,11 @@ func (m *Manager) persist(ctx context.Context, auth *Auth) error {
|
||||
if m.store == nil || auth == nil {
|
||||
return nil
|
||||
}
|
||||
if auth.Attributes != nil {
|
||||
if v := strings.ToLower(strings.TrimSpace(auth.Attributes["runtime_only"])); v == "true" {
|
||||
return nil
|
||||
}
|
||||
}
|
||||
// Skip persistence when metadata is absent (e.g., runtime-only auths).
|
||||
if auth.Metadata == nil {
|
||||
return nil
|
||||
|
||||
@@ -210,13 +210,14 @@ func (s *Service) wsOnConnected(channelID string) {
|
||||
}
|
||||
now := time.Now().UTC()
|
||||
auth := &coreauth.Auth{
|
||||
ID: channelID, // keep channel identifier as ID
|
||||
Provider: "aistudio", // logical provider for switch routing
|
||||
Label: channelID, // display original channel id
|
||||
Status: coreauth.StatusActive,
|
||||
CreatedAt: now,
|
||||
UpdatedAt: now,
|
||||
Metadata: map[string]any{"email": channelID}, // inject email inline
|
||||
ID: channelID, // keep channel identifier as ID
|
||||
Provider: "aistudio", // logical provider for switch routing
|
||||
Label: channelID, // display original channel id
|
||||
Status: coreauth.StatusActive,
|
||||
CreatedAt: now,
|
||||
UpdatedAt: now,
|
||||
Attributes: map[string]string{"runtime_only": "true"},
|
||||
Metadata: map[string]any{"email": channelID}, // metadata drives logging and usage tracking
|
||||
}
|
||||
log.Infof("websocket provider connected: %s", channelID)
|
||||
s.applyCoreAuthAddOrUpdate(context.Background(), auth)
|
||||
|
||||
Reference in New Issue
Block a user