Merge pull request #186 from router-for-me/doc

docs: add AI Studio setup
feat(cliproxy): skip persisting runtime-only websocket auths
2025-10-29 21:53:49 +08:00 · 2025-10-29 21:49:35 +08:00 · 2025-10-29 21:10:14 +08:00 · 2025-10-29 20:27:07 +08:00 · 2025-10-29 19:19:18 +08:00 · 2025-10-29 19:19:18 +08:00
18 changed files with 425 additions and 142 deletions
--- a/README.md
+++ b/README.md
@@ -23,6 +23,7 @@ Chinese providers have now been added: [Qwen Code](https://github.com/QwenLM/qwe
 - Multiple accounts with round-robin load balancing (Gemini, OpenAI, Claude, Qwen and iFlow)
 - Simple CLI authentication flows (Gemini, OpenAI, Claude, Qwen and iFlow)
 - Generative Language API Key support
+- AI Studio Build multi-account load balancing
 - Gemini CLI multi-account load balancing
 - Claude Code multi-account load balancing
 - Qwen Code multi-account load balancing
@@ -68,6 +69,14 @@ brew install cliproxyapi
 brew services start cliproxyapi
 ```

+### Installation via CLIProxyAPI Linux Installer
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/brokechubb/cliproxyapi-installer/refs/heads/master/cliproxyapi-installer | bash
+```
+
+Thanks to [brokechubb](https://github.com/brokechubb) for building the Linux installer!
+
 ## Usage

 ### GUI Client & Official WebUI
@@ -260,12 +269,16 @@ console.log(await claudeResponse.json());
 - gemini-2.5-flash-lite
 - gemini-2.5-flash-image
 - gemini-2.5-flash-image-preview
+- gemini-pro-latest
+- gemini-flash-latest
+- gemini-flash-lite-latest
 - gpt-5
 - gpt-5-codex
 - claude-opus-4-1-20250805
 - claude-opus-4-20250514
 - claude-sonnet-4-20250514
 - claude-sonnet-4-5-20250929
+- claude-haiku-4-5-20251001
 - claude-3-7-sonnet-20250219
 - claude-3-5-haiku-20241022
 - qwen3-coder-plus
@@ -277,7 +290,6 @@ console.log(await claudeResponse.json());
 - deepseek-r1
 - deepseek-v3
 - kimi-k2
-- glm-4.5
 - glm-4.6
 - tstars2.0
 - And other iFlow-supported models
@@ -510,28 +522,37 @@ openai-compatibility:
        alias: "kimi-k2"
 ```

-Legacy format (still supported):
-
-```yaml
-openai-compatibility:
-  - name: "openrouter"
-    base-url: "https://openrouter.ai/api/v1"
-    api-keys:
-      - "sk-or-v1-...b780"
-      - "sk-or-v1-...b781"
-    models:
-      - name: "moonshotai/kimi-k2:free"
-        alias: "kimi-k2"
-```
-
 Usage: 

 Call OpenAI's endpoint `/v1/chat/completions` with `model` set to the alias (e.g., `kimi-k2`). The proxy routes to the configured provider/model automatically.

-Also, you may call Claude's endpoint `/v1/messages`, Gemini's `/v1beta/models/model-name:streamGenerateContent` or `/v1beta/models/model-name:generateContent`.
-
 And you can always use Gemini CLI with `CODE_ASSIST_ENDPOINT` set to `http://127.0.0.1:8317` for these OpenAI-compatible provider's models.

+### AI Studio Instructions
+
+You can use this service (CLIProxyAPI) as a backend for [this AI Studio App](https://aistudio.google.com/apps/drive/1CPW7FpWGsDZzkaYgYOyXQ_6FWgxieLmL). Follow the steps below to configure it:
+
+1.  **Start the CLIProxyAPI Service**: Ensure your CLIProxyAPI instance is running, either locally or remotely.
+2.  **Access the AI Studio App**: Log in to your Google account in your browser, then open the following link:
+    - [https://aistudio.google.com/apps/drive/1CPW7FpWGsDZzkaYgYOyXQ_6FWgxieLmL](https://aistudio.google.com/apps/drive/1CPW7FpWGsDZzkaYgYOyXQ_6FWgxieLmL)
+
+#### Connection Configuration
+
+By default, the AI Studio App attempts to connect to a local CLIProxyAPI instance at `ws://127.0.0.1:8317`.
+
+-   **Connecting to a Remote Service**:
+    If you need to connect to a remotely deployed CLIProxyAPI, modify the `config.ts` file in the AI Studio App to update the `WEBSOCKET_PROXY_URL` value.
+    -   Use the `wss://` protocol if your remote service has SSL enabled.
+    -   Use the `ws://` protocol if SSL is not enabled.
+
+#### Authentication Configuration
+
+By default, WebSocket connections to CLIProxyAPI do not require authentication.
+
+-   **Enable Authentication on the CLIProxyAPI Server**:
+    In your `config.yaml` file, set `ws_auth` to `true`.
+-   **Configure Authentication on the AI Studio Client**:
+    In the `config.ts` file of the AI Studio App, set the `JWT_TOKEN` value to your authentication token.

 ### Authentication Directory

--- a/README_CN.md
+++ b/README_CN.md
@@ -43,6 +43,7 @@
 - 多账户支持与轮询负载均衡（Gemini、OpenAI、Claude、Qwen 与 iFlow）
 - 简单的 CLI 身份验证流程（Gemini、OpenAI、Claude、Qwen 与 iFlow）
 - 支持 Gemini AIStudio API 密钥
+- 支持 AI Studio Build 多账户轮询
 - 支持 Gemini CLI 多账户轮询
 - 支持 Claude Code 多账户轮询
 - 支持 Qwen Code 多账户轮询
@@ -82,6 +83,14 @@ brew install cliproxyapi
 brew services start cliproxyapi
 ```

+### 通过 CLIProxyAPI Linux Installer 安装
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/brokechubb/cliproxyapi-installer/refs/heads/master/cliproxyapi-installer | bash
+```
+
+感谢 [brokechubb](https://github.com/brokechubb) 构建了 Linux installer！
+
 ## 使用方法

 ### 图形客户端与官方 WebUI
@@ -273,12 +282,16 @@ console.log(await claudeResponse.json());
 - gemini-2.5-flash-lite
 - gemini-2.5-flash-image
 - gemini-2.5-flash-image-preview
+- gemini-pro-latest
+- gemini-flash-latest
+- gemini-flash-lite-latest
 - gpt-5
 - gpt-5-codex
 - claude-opus-4-1-20250805
 - claude-opus-4-20250514
 - claude-sonnet-4-20250514
 - claude-sonnet-4-5-20250929
+- claude-haiku-4-5-20251001
 - claude-3-7-sonnet-20250219
 - claude-3-5-haiku-20241022
 - qwen3-coder-plus
@@ -290,7 +303,6 @@ console.log(await claudeResponse.json());
 - deepseek-r1
 - deepseek-v3
 - kimi-k2
-- glm-4.5
 - glm-4.6
 - tstars2.0
 - 以及其他 iFlow 支持的模型
@@ -523,24 +535,36 @@ openai-compatibility:
        alias: "kimi-k2"
 ```

-旧格式（仍支持）：
-
-```yaml
-openai-compatibility:
-  - name: "openrouter"
-    base-url: "https://openrouter.ai/api/v1"
-    api-keys:
-      - "sk-or-v1-...b780"
-      - "sk-or-v1-...b781"
-    models:
-      - name: "moonshotai/kimi-k2:free"
-        alias: "kimi-k2"
-```
-
 使用方式：在 `/v1/chat/completions` 中将 `model` 设为别名（如 `kimi-k2`），代理将自动路由到对应提供商与模型。

 并且，对于这些与OpenAI兼容的提供商模型，您始终可以通过将CODE_ASSIST_ENDPOINT设置为 http://127.0.0.1:8317 来使用Gemini CLI。

+### AI Studio 使用说明
+
+您可以将本服务 (CLIProxyAPI) 作为后端，配合 [这个 AI Studio 应用](https://aistudio.google.com/apps/drive/1CPW7FpWGsDZzkaYgYOyXQ_6FWgxieLmL) 使用。请遵循以下步骤进行配置：
+
+1.  **启动 CLIProxyAPI 服务**：确保您的 CLIProxyAPI 实例正在本地或远程运行。
+2.  **访问 AI Studio 应用**：在浏览器中登录您的 Google 账户，然后打开以下链接：
+    - [https://aistudio.google.com/apps/drive/1CPW7FpWGsDZzkaYgYOyXQ_6FWgxieLmL](https://aistudio.google.com/apps/drive/1CPW7FpWGsDZzkaYgYOyXQ_6FWgxieLmL)
+
+#### 连接配置
+
+默认情况下，AI Studio 应用会尝试连接到本地的 CLIProxyAPI (`ws://127.0.0.1:8317`)。
+
+-   **连接到远程服务**：
+    如果您需要连接到远程部署的 CLIProxyAPI，请修改 AI Studio 应用中的 `config.ts` 文件，更新 `WEBSOCKET_PROXY_URL` 的值。
+    -   如果您的远程服务启用了 SSL，请使用 `wss://` 协议。
+    -   如果未启用 SSL，请使用 `ws://` 协议。
+
+#### 认证配置
+
+默认情况下，CLIProxyAPI 的 WebSocket 连接不要求认证。
+
+-   **在 CLIProxyAPI 服务端启用认证**：
+    在您的 `config.yaml` 文件中，将 `ws_auth` 设置为 `true`。
+-   **在 AI Studio 客户端配置认证**：
+    在 AI Studio 应用的 `config.ts` 文件中，设置 `JWT_TOKEN` 的值为您的认证令牌。
+
 ### 身份验证目录

 `auth-dir` 参数指定身份验证令牌的存储位置。当您运行登录命令时，应用程序将在此目录中创建包含 Google 账户身份验证令牌的 JSON 文件。多个账户可用于轮询。
--- a/internal/registry/model_definitions.go
+++ b/internal/registry/model_definitions.go
@@ -84,6 +84,7 @@ func GeminiModels() []*ModelInfo {
 			InputTokenLimit:            1048576,
 			OutputTokenLimit:           65536,
 			SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
+			Thinking:                   &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
 		},
 		{
 			ID:                         "gemini-2.5-pro",
@@ -98,6 +99,7 @@ func GeminiModels() []*ModelInfo {
 			InputTokenLimit:            1048576,
 			OutputTokenLimit:           65536,
 			SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
+			Thinking:                   &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
 		},
 		{
 			ID:                         "gemini-2.5-flash-lite",
@@ -112,6 +114,7 @@ func GeminiModels() []*ModelInfo {
 			InputTokenLimit:            1048576,
 			OutputTokenLimit:           65536,
 			SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
+			Thinking:                   &ThinkingSupport{Min: 512, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
 		},
 		{
 			ID:                         "gemini-2.5-flash-image-preview",
@@ -126,6 +129,7 @@ func GeminiModels() []*ModelInfo {
 			InputTokenLimit:            1048576,
 			OutputTokenLimit:           8192,
 			SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
+			// image models don't support thinkingConfig; leave Thinking nil
 		},
 		{
 			ID:                         "gemini-2.5-flash-image",
@@ -140,6 +144,7 @@ func GeminiModels() []*ModelInfo {
 			InputTokenLimit:            1048576,
 			OutputTokenLimit:           8192,
 			SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
+			// image models don't support thinkingConfig; leave Thinking nil
 		},
 	}
 }
@@ -152,9 +157,8 @@ func GetGeminiCLIModels() []*ModelInfo { return GeminiModels() }

 // GetAIStudioModels returns the Gemini model definitions for AI Studio integrations
 func GetAIStudioModels() []*ModelInfo {
-	models := make([]*ModelInfo, 0, 8)
-	models = append(models, GeminiModels()...)
-	models = append(models,
+	base := GeminiModels()
+	return append(base,
 		&ModelInfo{
 			ID:                         "gemini-pro-latest",
 			Object:                     "model",
@@ -168,6 +172,7 @@ func GetAIStudioModels() []*ModelInfo {
 			InputTokenLimit:            1048576,
 			OutputTokenLimit:           65536,
 			SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
+			Thinking:                   &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
 		},
 		&ModelInfo{
 			ID:                         "gemini-flash-latest",
@@ -182,6 +187,7 @@ func GetAIStudioModels() []*ModelInfo {
 			InputTokenLimit:            1048576,
 			OutputTokenLimit:           65536,
 			SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
+			Thinking:                   &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
 		},
 		&ModelInfo{
 			ID:                         "gemini-flash-lite-latest",
@@ -196,9 +202,9 @@ func GetAIStudioModels() []*ModelInfo {
 			InputTokenLimit:            1048576,
 			OutputTokenLimit:           65536,
 			SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
+			Thinking:                   &ThinkingSupport{Min: 512, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
 		},
 	)
-	return models
 }

 // GetOpenAIModels returns the standard OpenAI model definitions
--- a/internal/registry/model_registry.go
+++ b/internal/registry/model_registry.go
@@ -45,6 +45,23 @@ type ModelInfo struct {
 	MaxCompletionTokens int `json:"max_completion_tokens,omitempty"`
 	// SupportedParameters lists supported parameters
 	SupportedParameters []string `json:"supported_parameters,omitempty"`
+
+	// Thinking holds provider-specific reasoning/thinking budget capabilities.
+	// This is optional and currently used for Gemini thinking budget normalization.
+	Thinking *ThinkingSupport `json:"thinking,omitempty"`
+}
+
+// ThinkingSupport describes a model family's supported internal reasoning budget range.
+// Values are interpreted in provider-native token units.
+type ThinkingSupport struct {
+	// Min is the minimum allowed thinking budget (inclusive).
+	Min int `json:"min,omitempty"`
+	// Max is the maximum allowed thinking budget (inclusive).
+	Max int `json:"max,omitempty"`
+	// ZeroAllowed indicates whether 0 is a valid value (to disable thinking).
+	ZeroAllowed bool `json:"zero_allowed,omitempty"`
+	// DynamicAllowed indicates whether -1 is a valid value (dynamic thinking budget).
+	DynamicAllowed bool `json:"dynamic_allowed,omitempty"`
 }

 // ModelRegistration tracks a model's availability
@@ -652,6 +669,17 @@ func (r *ModelRegistry) GetModelProviders(modelID string) []string {
 	return result
 }

+// GetModelInfo returns the registered ModelInfo for the given model ID, if present.
+// Returns nil if the model is unknown to the registry.
+func (r *ModelRegistry) GetModelInfo(modelID string) *ModelInfo {
+	r.mutex.RLock()
+	defer r.mutex.RUnlock()
+	if reg, ok := r.models[modelID]; ok && reg != nil {
+		return reg.Info
+	}
+	return nil
+}
+
 // convertModelToMap converts ModelInfo to the appropriate format for different handler types
 func (r *ModelRegistry) convertModelToMap(model *ModelInfo, handlerType string) map[string]any {
 	if model == nil {
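
Review note: the `ThinkingSupport` fields suggest a simple clamping rule for thinking budgets. The body of `util.NormalizeThinkingBudget` is not part of this diff, so the following is only a minimal sketch of how such a normalization could work. The helper `normalizeBudget` and its edge-case choices (an unknown model passes through unchanged; a disallowed `0` or `-1` is clamped up to `Min`) are assumptions, not the project's implementation.

```go
package main

import "fmt"

// ThinkingSupport mirrors the fields of the struct added in model_registry.go.
type ThinkingSupport struct {
	Min            int
	Max            int
	ZeroAllowed    bool
	DynamicAllowed bool
}

// normalizeBudget is a hypothetical stand-in for util.NormalizeThinkingBudget.
// It keeps the sentinel values 0 and -1 when the model allows them and
// otherwise clamps the budget into the inclusive range [Min, Max].
func normalizeBudget(ts *ThinkingSupport, budget int) int {
	if ts == nil {
		// Assumption: no metadata means the value is passed through as-is.
		return budget
	}
	if budget == -1 && ts.DynamicAllowed {
		return -1 // dynamic thinking budget
	}
	if budget == 0 && ts.ZeroAllowed {
		return 0 // thinking disabled
	}
	if budget < ts.Min {
		return ts.Min
	}
	if budget > ts.Max {
		return ts.Max
	}
	return budget
}

func main() {
	// Ranges copied from the gemini-2.5-pro and gemini-2.5-flash-lite entries in this diff.
	pro := &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true}
	flashLite := &ThinkingSupport{Min: 512, Max: 24576, ZeroAllowed: true, DynamicAllowed: true}

	fmt.Println(normalizeBudget(pro, 0))           // 128: zero is not allowed, clamp to Min
	fmt.Println(normalizeBudget(flashLite, 100))   // 512: below Min
	fmt.Println(normalizeBudget(flashLite, 50000)) // 24576: above Max
	fmt.Println(normalizeBudget(pro, -1))          // -1: dynamic budget is allowed
}
```

Under these assumptions, a `reasoning_effort` of `high` (mapped to 32768 in the OpenAI translator) would come out as 24576 for the flash models and stay at 32768 for the pro models.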
--- a/internal/runtime/executor/aistudio_executor.go
+++ b/internal/runtime/executor/aistudio_executor.go
@@ -256,10 +256,14 @@ func (e *AIStudioExecutor) translateRequest(req cliproxyexecutor.Request, opts c
 	from := opts.SourceFormat
 	to := sdktranslator.FromString("gemini")
 	payload := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), stream)
-	if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok {
+	if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok && util.ModelSupportsThinking(req.Model) {
+		if budgetOverride != nil {
+			norm := util.NormalizeThinkingBudget(req.Model, *budgetOverride)
+			budgetOverride = &norm
+		}
 		payload = util.ApplyGeminiThinkingConfig(payload, budgetOverride, includeOverride)
 	}
-	payload = disableGeminiThinkingConfig(payload, req.Model)
+	payload = util.StripThinkingConfigIfUnsupported(req.Model, payload)
 	payload = fixGeminiImageAspectRatio(req.Model, payload)
 	metadataAction := "generateContent"
 	if req.Metadata != nil {
--- a/internal/runtime/executor/claude_executor.go
+++ b/internal/runtime/executor/claude_executor.go
@@ -551,8 +551,8 @@ func applyClaudeHeaders(r *http.Request, apiKey string, stream bool) {
 	misc.EnsureHeader(r.Header, ginHeaders, "X-Stainless-Arch", "arm64")
 	misc.EnsureHeader(r.Header, ginHeaders, "X-Stainless-Os", "MacOS")
 	misc.EnsureHeader(r.Header, ginHeaders, "X-Stainless-Timeout", "60")
+	misc.EnsureHeader(r.Header, ginHeaders, "User-Agent", "claude-cli/1.0.83 (external, cli)")
 	r.Header.Set("Connection", "keep-alive")
-	r.Header.Set("User-Agent", "claude-cli/1.0.83 (external, cli)")
 	r.Header.Set("Accept-Encoding", "gzip, deflate, br, zstd")
 	if stream {
 		r.Header.Set("Accept", "text/event-stream")
--- a/internal/runtime/executor/codex_executor.go
+++ b/internal/runtime/executor/codex_executor.go
@@ -532,6 +532,7 @@ func applyCodexHeaders(r *http.Request, auth *cliproxyauth.Auth, token string) {
 	misc.EnsureHeader(r.Header, ginHeaders, "Version", "0.21.0")
 	misc.EnsureHeader(r.Header, ginHeaders, "Openai-Beta", "responses=experimental")
 	misc.EnsureHeader(r.Header, ginHeaders, "Session_id", uuid.NewString())
+	misc.EnsureHeader(r.Header, ginHeaders, "User-Agent", "codex_cli_rs/0.50.0 (Mac OS 26.0.1; arm64) Apple_Terminal/464")

 	r.Header.Set("Accept", "text/event-stream")
 	r.Header.Set("Connection", "Keep-Alive")
--- a/internal/runtime/executor/gemini_cli_executor.go
+++ b/internal/runtime/executor/gemini_cli_executor.go
@@ -63,9 +63,14 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
 	to := sdktranslator.FromString("gemini-cli")
 	budgetOverride, includeOverride, hasOverride := util.GeminiThinkingFromMetadata(req.Metadata)
 	basePayload := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), false)
-	if hasOverride {
+	if hasOverride && util.ModelSupportsThinking(req.Model) {
+		if budgetOverride != nil {
+			norm := util.NormalizeThinkingBudget(req.Model, *budgetOverride)
+			budgetOverride = &norm
+		}
 		basePayload = util.ApplyGeminiCLIThinkingConfig(basePayload, budgetOverride, includeOverride)
 	}
+	basePayload = util.StripThinkingConfigIfUnsupported(req.Model, basePayload)
 	basePayload = fixGeminiCLIImageAspectRatio(req.Model, basePayload)

 	action := "generateContent"
@@ -92,7 +97,7 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
 	var lastStatus int
 	var lastBody []byte

-	for _, attemptModel := range models {
+	for idx, attemptModel := range models {
 		payload := append([]byte(nil), basePayload...)
 		if action == "countTokens" {
 			payload = deleteJSONField(payload, "project")
@@ -101,7 +106,6 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
 			payload = setJSONField(payload, "project", projectID)
 			payload = setJSONField(payload, "model", attemptModel)
 		}
-		payload = disableGeminiThinkingConfig(payload, attemptModel)

 		tok, errTok := tokenSource.Token()
 		if errTok != nil {
@@ -166,7 +170,11 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
 		lastBody = append([]byte(nil), data...)
 		log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, string(data))
 		if httpResp.StatusCode == 429 {
-			log.Debugf("gemini cli executor: rate limited, retrying with next model")
+			if idx+1 < len(models) {
+				log.Debugf("gemini cli executor: rate limited, retrying with next model: %s", models[idx+1])
+			} else {
+				log.Debug("gemini cli executor: rate limited, no additional fallback model")
+			}
 			continue
 		}

@@ -196,9 +204,14 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
 	to := sdktranslator.FromString("gemini-cli")
 	budgetOverride, includeOverride, hasOverride := util.GeminiThinkingFromMetadata(req.Metadata)
 	basePayload := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), true)
-	if hasOverride {
+	if hasOverride && util.ModelSupportsThinking(req.Model) {
+		if budgetOverride != nil {
+			norm := util.NormalizeThinkingBudget(req.Model, *budgetOverride)
+			budgetOverride = &norm
+		}
 		basePayload = util.ApplyGeminiCLIThinkingConfig(basePayload, budgetOverride, includeOverride)
 	}
+	basePayload = util.StripThinkingConfigIfUnsupported(req.Model, basePayload)
 	basePayload = fixGeminiCLIImageAspectRatio(req.Model, basePayload)

 	projectID := strings.TrimSpace(stringValue(auth.Metadata, "project_id"))
@@ -219,11 +232,10 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
 	var lastStatus int
 	var lastBody []byte

-	for _, attemptModel := range models {
+	for idx, attemptModel := range models {
 		payload := append([]byte(nil), basePayload...)
 		payload = setJSONField(payload, "project", projectID)
 		payload = setJSONField(payload, "model", attemptModel)
-		payload = disableGeminiThinkingConfig(payload, attemptModel)

 		tok, errTok := tokenSource.Token()
 		if errTok != nil {
@@ -282,7 +294,11 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
 			lastBody = append([]byte(nil), data...)
 			log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, string(data))
 			if httpResp.StatusCode == 429 {
-				log.Debugf("gemini cli executor: rate limited, retrying with next model")
+				if idx+1 < len(models) {
+					log.Debugf("gemini cli executor: rate limited, retrying with next model: %s", models[idx+1])
+				} else {
+					log.Debug("gemini cli executor: rate limited, no additional fallback model")
+				}
 				continue
 			}
 			err = statusErr{code: httpResp.StatusCode, msg: string(data)}
@@ -393,12 +409,16 @@ func (e *GeminiCLIExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.
 	budgetOverride, includeOverride, hasOverride := util.GeminiThinkingFromMetadata(req.Metadata)
 	for _, attemptModel := range models {
 		payload := sdktranslator.TranslateRequest(from, to, attemptModel, bytes.Clone(req.Payload), false)
-		if hasOverride {
+		if hasOverride && util.ModelSupportsThinking(req.Model) {
+			if budgetOverride != nil {
+				norm := util.NormalizeThinkingBudget(req.Model, *budgetOverride)
+				budgetOverride = &norm
+			}
 			payload = util.ApplyGeminiCLIThinkingConfig(payload, budgetOverride, includeOverride)
 		}
 		payload = deleteJSONField(payload, "project")
 		payload = deleteJSONField(payload, "model")
-		payload = disableGeminiThinkingConfig(payload, attemptModel)
+		payload = util.StripThinkingConfigIfUnsupported(req.Model, payload)
 		payload = fixGeminiCLIImageAspectRatio(attemptModel, payload)

 		tok, errTok := tokenSource.Token()
@@ -623,29 +643,6 @@ func cliPreviewFallbackOrder(model string) []string {
 	}
 }

-func disableGeminiThinkingConfig(body []byte, model string) []byte {
-	if !geminiModelDisallowsThinking(model) {
-		return body
-	}
-
-	updated := deleteJSONField(body, "request.generationConfig.thinkingConfig")
-	updated = deleteJSONField(updated, "generationConfig.thinkingConfig")
-	return updated
-}
-
-func geminiModelDisallowsThinking(model string) bool {
-	if model == "" {
-		return false
-	}
-	lower := strings.ToLower(model)
-	for _, marker := range []string{"gemini-2.5-flash-image-preview", "gemini-2.5-flash-image"} {
-		if strings.Contains(lower, marker) {
-			return true
-		}
-	}
-	return false
-}
-
 // setJSONField sets a top-level JSON field on a byte slice payload via sjson.
 func setJSONField(body []byte, key, value string) []byte {
 	if key == "" {
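
Review note: the removed `disableGeminiThinkingConfig` deleted `thinkingConfig` at two JSON paths, `request.generationConfig.thinkingConfig` (Gemini CLI envelope) and `generationConfig.thinkingConfig` (plain Gemini payload). The replacement `util.StripThinkingConfigIfUnsupported` is not shown in this diff. The sketch below only illustrates the stripping step for both payload shapes, using the standard library instead of `sjson`; the names `stripThinkingConfig` and `dropThinking` are hypothetical, and the model-capability check is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dropThinking deletes generationConfig.thinkingConfig from obj if present.
func dropThinking(obj map[string]any) {
	if gc, ok := obj["generationConfig"].(map[string]any); ok {
		delete(gc, "thinkingConfig")
	}
}

// stripThinkingConfig is a hypothetical helper that removes thinkingConfig
// from both the top-level payload and the nested "request" envelope.
// On any JSON error it returns the body unchanged.
func stripThinkingConfig(body []byte) []byte {
	var root map[string]any
	if err := json.Unmarshal(body, &root); err != nil {
		return body
	}
	dropThinking(root)
	if req, ok := root["request"].(map[string]any); ok {
		dropThinking(req)
	}
	out, err := json.Marshal(root)
	if err != nil {
		return body
	}
	return out
}

func main() {
	in := []byte(`{"model":"gemini-2.5-flash-image","request":{"contents":[],"generationConfig":{"temperature":0.5,"thinkingConfig":{"thinkingBudget":1024}}}}`)
	fmt.Println(string(stripThinkingConfig(in)))
	// prints {"model":"gemini-2.5-flash-image","request":{"contents":[],"generationConfig":{"temperature":0.5}}}
}
```

Note that `encoding/json` re-serializes map keys in sorted order, so unlike an in-place `sjson` delete this sketch does not preserve the original key order.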
--- a/internal/runtime/executor/gemini_executor.go
+++ b/internal/runtime/executor/gemini_executor.go
@@ -78,10 +78,14 @@ func (e *GeminiExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
 	from := opts.SourceFormat
 	to := sdktranslator.FromString("gemini")
 	body := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), false)
-	if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok {
+	if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok && util.ModelSupportsThinking(req.Model) {
+		if budgetOverride != nil {
+			norm := util.NormalizeThinkingBudget(req.Model, *budgetOverride)
+			budgetOverride = &norm
+		}
 		body = util.ApplyGeminiThinkingConfig(body, budgetOverride, includeOverride)
 	}
-	body = disableGeminiThinkingConfig(body, req.Model)
+	body = util.StripThinkingConfigIfUnsupported(req.Model, body)
 	body = fixGeminiImageAspectRatio(req.Model, body)

 	action := "generateContent"
@@ -166,10 +170,14 @@ func (e *GeminiExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
 	from := opts.SourceFormat
 	to := sdktranslator.FromString("gemini")
 	body := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), true)
-	if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok {
+	if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok && util.ModelSupportsThinking(req.Model) {
+		if budgetOverride != nil {
+			norm := util.NormalizeThinkingBudget(req.Model, *budgetOverride)
+			budgetOverride = &norm
+		}
 		body = util.ApplyGeminiThinkingConfig(body, budgetOverride, includeOverride)
 	}
-	body = disableGeminiThinkingConfig(body, req.Model)
+	body = util.StripThinkingConfigIfUnsupported(req.Model, body)
 	body = fixGeminiImageAspectRatio(req.Model, body)

 	url := fmt.Sprintf("%s/%s/models/%s:%s", glEndpoint, glAPIVersion, req.Model, "streamGenerateContent")
@@ -269,10 +277,14 @@ func (e *GeminiExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.Aut
 	from := opts.SourceFormat
 	to := sdktranslator.FromString("gemini")
 	translatedReq := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), false)
-	if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok {
+	if budgetOverride, includeOverride, ok := util.GeminiThinkingFromMetadata(req.Metadata); ok && util.ModelSupportsThinking(req.Model) {
+		if budgetOverride != nil {
+			norm := util.NormalizeThinkingBudget(req.Model, *budgetOverride)
+			budgetOverride = &norm
+		}
 		translatedReq = util.ApplyGeminiThinkingConfig(translatedReq, budgetOverride, includeOverride)
 	}
-	translatedReq = disableGeminiThinkingConfig(translatedReq, req.Model)
+	translatedReq = util.StripThinkingConfigIfUnsupported(req.Model, translatedReq)
 	translatedReq = fixGeminiImageAspectRatio(req.Model, translatedReq)
 	respCtx := context.WithValue(ctx, "alt", opts.Alt)
 	translatedReq, _ = sjson.DeleteBytes(translatedReq, "tools")
--- a/internal/translator/gemini-cli/claude/gemini-cli_claude_request.go
+++ b/internal/translator/gemini-cli/claude/gemini-cli_claude_request.go
@@ -11,6 +11,7 @@ import (
 	"strings"

 	client "github.com/router-for-me/CLIProxyAPI/v6/internal/interfaces"
+	"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
 	"github.com/tidwall/gjson"
 	"github.com/tidwall/sjson"
 )
@@ -136,7 +137,7 @@ func ConvertClaudeRequestToCLI(modelName string, inputRawJSON []byte, _ bool) []
 	}

 	// Build output Gemini CLI request JSON
-	out := `{"model":"","request":{"contents":[],"generationConfig":{"thinkingConfig":{"include_thoughts":true}}}}`
+	out := `{"model":"","request":{"contents":[]}}`
 	out, _ = sjson.Set(out, "model", modelName)
 	if systemInstruction != nil {
 		b, _ := json.Marshal(systemInstruction)
@@ -151,21 +152,16 @@ func ConvertClaudeRequestToCLI(modelName string, inputRawJSON []byte, _ bool) []
 		out, _ = sjson.SetRaw(out, "request.tools", string(b))
 	}

-	// Map reasoning and sampling configs
-	reasoningEffortResult := gjson.GetBytes(rawJSON, "reasoning_effort")
-	if reasoningEffortResult.String() == "none" {
-		out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.include_thoughts", false)
-		out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", 0)
-	} else if reasoningEffortResult.String() == "auto" {
-		out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
-	} else if reasoningEffortResult.String() == "low" {
-		out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", 1024)
-	} else if reasoningEffortResult.String() == "medium" {
-		out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", 8192)
-	} else if reasoningEffortResult.String() == "high" {
-		out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", 24576)
-	} else {
-		out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
+	// Map Anthropic thinking -> Gemini thinkingBudget/include_thoughts when type==enabled
+	if t := gjson.GetBytes(rawJSON, "thinking"); t.Exists() && t.IsObject() && util.ModelSupportsThinking(modelName) {
+		if t.Get("type").String() == "enabled" {
+			if b := t.Get("budget_tokens"); b.Exists() && b.Type == gjson.Number {
+				budget := int(b.Int())
+				budget = util.NormalizeThinkingBudget(modelName, budget)
+				out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", budget)
+				out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
+			}
+		}
 	}
 	if v := gjson.GetBytes(rawJSON, "temperature"); v.Exists() && v.Type == gjson.Number {
 		out, _ = sjson.Set(out, "request.generationConfig.temperature", v.Num)
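
Review note: the new Claude mapping only emits a thinking budget when the request carries a `thinking` object with `type` equal to `enabled` and a numeric `budget_tokens`. The sketch below restates that condition with the standard library instead of `gjson`. The types `claudeThinking` and `claudeRequest` and the function `thinkingBudgetFromClaude` are hypothetical names for illustration; the model-support check and budget normalization that the diff applies afterwards are omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// claudeThinking models the Anthropic-style "thinking" request field.
type claudeThinking struct {
	Type         string `json:"type"`
	BudgetTokens *int   `json:"budget_tokens"`
}

// claudeRequest captures only the part of the request this sketch needs.
type claudeRequest struct {
	Thinking *claudeThinking `json:"thinking"`
}

// thinkingBudgetFromClaude returns the requested budget and true only when
// thinking is an object, its type is "enabled", and budget_tokens is an integer.
func thinkingBudgetFromClaude(raw []byte) (int, bool) {
	var req claudeRequest
	if err := json.Unmarshal(raw, &req); err != nil || req.Thinking == nil {
		return 0, false
	}
	if req.Thinking.Type != "enabled" || req.Thinking.BudgetTokens == nil {
		return 0, false
	}
	return *req.Thinking.BudgetTokens, true
}

func main() {
	budget, ok := thinkingBudgetFromClaude([]byte(`{"thinking":{"type":"enabled","budget_tokens":4096}}`))
	fmt.Println(budget, ok) // 4096 true

	_, ok = thinkingBudgetFromClaude([]byte(`{"thinking":{"type":"disabled"}}`))
	fmt.Println(ok) // false
}
```

When the function returns false, the translated request carries no `thinkingConfig` at all, which matches the new base envelope that no longer sets `include_thoughts` by default.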
--- a/internal/translator/gemini-cli/openai/chat-completions/gemini-cli_openai_request.go
+++ b/internal/translator/gemini-cli/openai/chat-completions/gemini-cli_openai_request.go
@@ -26,32 +26,57 @@ import (
 //   - []byte: The transformed request data in Gemini CLI API format
 func ConvertOpenAIRequestToGeminiCLI(modelName string, inputRawJSON []byte, _ bool) []byte {
 	rawJSON := bytes.Clone(inputRawJSON)
-	// Base envelope
-	out := []byte(`{"project":"","request":{"contents":[],"generationConfig":{"thinkingConfig":{"include_thoughts":true}}},"model":"gemini-2.5-pro"}`)
+	// Base envelope (no default thinkingConfig)
+	out := []byte(`{"project":"","request":{"contents":[]},"model":"gemini-2.5-pro"}`)

 	// Model
 	out, _ = sjson.SetBytes(out, "model", modelName)

 	// Reasoning effort -> thinkingBudget/include_thoughts
+	// Note: OpenAI official fields take precedence over extra_body.google.thinking_config
 	re := gjson.GetBytes(rawJSON, "reasoning_effort")
-	if re.Exists() {
+	hasOfficialThinking := re.Exists()
+	if hasOfficialThinking && util.ModelSupportsThinking(modelName) {
 		switch re.String() {
 		case "none":
 			out, _ = sjson.DeleteBytes(out, "request.generationConfig.thinkingConfig.include_thoughts")
 			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", 0)
 		case "auto":
 			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
+			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
 		case "low":
-			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", 1024)
+			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 1024))
+			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
 		case "medium":
-			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", 8192)
+			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 8192))
+			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
 		case "high":
-			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", 24576)
+			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 32768))
+			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
 		default:
 			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
+			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
+		}
+	}
+
+	// Cherry Studio extension extra_body.google.thinking_config (effective only when official fields are absent)
+	if !hasOfficialThinking && util.ModelSupportsThinking(modelName) {
+		if tc := gjson.GetBytes(rawJSON, "extra_body.google.thinking_config"); tc.Exists() && tc.IsObject() {
+			var setBudget bool
+			var normalized int
+			if v := tc.Get("thinking_budget"); v.Exists() {
+				normalized = util.NormalizeThinkingBudget(modelName, int(v.Int()))
+				out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", normalized)
+				setBudget = true
+			}
+			if v := tc.Get("include_thoughts"); v.Exists() {
+				out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", v.Bool())
+			} else if setBudget {
+				if normalized != 0 {
+					out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
+				}
+			}
 		}
-	} else {
-		out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
 	}

 	// Temperature/top_p/top_k
@@ -250,8 +275,34 @@ func ConvertOpenAIRequestToGeminiCLI(modelName string, inputRawJSON []byte, _ bo
 			if t.Get("type").String() == "function" {
 				fn := t.Get("function")
 				if fn.Exists() && fn.IsObject() {
-					parametersJsonSchema, _ := util.RenameKey(fn.Raw, "parameters", "parametersJsonSchema")
-					out, _ = sjson.SetRawBytes(out, fdPath+".-1", []byte(parametersJsonSchema))
+					fnRaw := fn.Raw
+					if fn.Get("parameters").Exists() {
+						renamed, errRename := util.RenameKey(fnRaw, "parameters", "parametersJsonSchema")
+						if errRename != nil {
+							log.Warnf("Failed to rename parameters for tool '%s': %v", fn.Get("name").String(), errRename)
+						} else {
+							fnRaw = renamed
+						}
+					} else {
+						var errSet error
+						fnRaw, errSet = sjson.Set(fnRaw, "parametersJsonSchema.type", "object")
+						if errSet != nil {
+							log.Warnf("Failed to set default schema type for tool '%s': %v", fn.Get("name").String(), errSet)
+							continue
+						}
+						fnRaw, errSet = sjson.Set(fnRaw, "parametersJsonSchema.properties", map[string]interface{}{})
+						if errSet != nil {
+							log.Warnf("Failed to set default schema properties for tool '%s': %v", fn.Get("name").String(), errSet)
+							continue
+						}
+					}
+
+					tmp, errSet := sjson.SetRawBytes(out, fdPath+".-1", []byte(fnRaw))
+					if errSet != nil {
+						log.Warnf("Failed to append tool declaration for '%s': %v", fn.Get("name").String(), errSet)
+						continue
+					}
+					out = tmp
 				}
 			}
 		}
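The fallback in the hunk above can be tried in isolation. The following is a minimal sketch, not the project's code: it swaps `gjson`/`sjson` and `util.RenameKey` for the standard library's `encoding/json`, and the helper name `toGeminiDeclaration` is hypothetical. It shows the two branches: rename `parameters` to `parametersJsonSchema` when present, otherwise install an empty object schema.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// toGeminiDeclaration (hypothetical helper) converts one OpenAI-style
// "function" object into a declaration that carries parametersJsonSchema.
func toGeminiDeclaration(fnRaw []byte) ([]byte, error) {
	var fn map[string]json.RawMessage
	if err := json.Unmarshal(fnRaw, &fn); err != nil {
		return nil, err
	}
	if fn == nil { // input was JSON null
		return nil, errors.New("function declaration is not an object")
	}
	if params, ok := fn["parameters"]; ok {
		// Branch 1: the tool declares parameters, so only the key changes.
		fn["parametersJsonSchema"] = params
		delete(fn, "parameters")
	} else {
		// Branch 2: no parameters, so default to an empty object schema.
		fn["parametersJsonSchema"] = json.RawMessage(`{"type":"object","properties":{}}`)
	}
	// Note: json.Marshal emits map keys in sorted order, unlike sjson,
	// which edits the original bytes in place.
	return json.Marshal(fn)
}

func main() {
	out, err := toGeminiDeclaration([]byte(`{"name":"ping"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Unlike the diff, which logs a warning and skips the tool on failure, this sketch simply returns the error to the caller.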
--- a/internal/translator/gemini/claude/gemini_claude_request.go
+++ b/internal/translator/gemini/claude/gemini_claude_request.go
@@ -11,6 +11,7 @@ import (
 	"strings"

 	client "github.com/router-for-me/CLIProxyAPI/v6/internal/interfaces"
+	"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
 	"github.com/tidwall/gjson"
 	"github.com/tidwall/sjson"
 )
@@ -129,7 +130,7 @@ func ConvertClaudeRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
 	}

 	// Build output Gemini CLI request JSON
-	out := `{"contents":[],"generationConfig":{"thinkingConfig":{"include_thoughts":true}}}`
+	out := `{"contents":[]}`
 	out, _ = sjson.Set(out, "model", modelName)
 	if systemInstruction != nil {
 		b, _ := json.Marshal(systemInstruction)
@@ -144,21 +145,16 @@ func ConvertClaudeRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
 		out, _ = sjson.SetRaw(out, "tools", string(b))
 	}

-	// Map reasoning and sampling configs
-	reasoningEffortResult := gjson.GetBytes(rawJSON, "reasoning_effort")
-	if reasoningEffortResult.String() == "none" {
-		out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", false)
-		out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 0)
-	} else if reasoningEffortResult.String() == "auto" {
-		out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", -1)
-	} else if reasoningEffortResult.String() == "low" {
-		out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 1024)
-	} else if reasoningEffortResult.String() == "medium" {
-		out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 8192)
-	} else if reasoningEffortResult.String() == "high" {
-		out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 24576)
-	} else {
-		out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", -1)
+	// Map Anthropic thinking -> Gemini thinkingBudget/include_thoughts when enabled
+	if t := gjson.GetBytes(rawJSON, "thinking"); t.Exists() && t.IsObject() && util.ModelSupportsThinking(modelName) {
+		if t.Get("type").String() == "enabled" {
+			if b := t.Get("budget_tokens"); b.Exists() && b.Type == gjson.Number {
+				budget := int(b.Int())
+				budget = util.NormalizeThinkingBudget(modelName, budget)
+				out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", budget)
+				out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
+			}
+		}
 	}
 	if v := gjson.GetBytes(rawJSON, "temperature"); v.Exists() && v.Type == gjson.Number {
 		out, _ = sjson.Set(out, "generationConfig.temperature", v.Num)
--- a/internal/translator/gemini/openai/chat-completions/gemini_openai_request.go
+++ b/internal/translator/gemini/openai/chat-completions/gemini_openai_request.go
@@ -26,32 +26,58 @@ import (
 //   - []byte: The transformed request data in Gemini API format
 func ConvertOpenAIRequestToGemini(modelName string, inputRawJSON []byte, _ bool) []byte {
 	rawJSON := bytes.Clone(inputRawJSON)
-	// Base envelope
-	out := []byte(`{"contents":[],"generationConfig":{"thinkingConfig":{"include_thoughts":true}}}`)
+	// Base envelope (no default thinkingConfig)
+	out := []byte(`{"contents":[]}`)

 	// Model
 	out, _ = sjson.SetBytes(out, "model", modelName)

 	// Reasoning effort -> thinkingBudget/include_thoughts
+	// Note: OpenAI official fields take precedence over extra_body.google.thinking_config
 	re := gjson.GetBytes(rawJSON, "reasoning_effort")
-	if re.Exists() {
+	hasOfficialThinking := re.Exists()
+	if hasOfficialThinking && util.ModelSupportsThinking(modelName) {
 		switch re.String() {
 		case "none":
 			out, _ = sjson.DeleteBytes(out, "generationConfig.thinkingConfig.include_thoughts")
 			out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", 0)
 		case "auto":
 			out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", -1)
+			out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.include_thoughts", true)
 		case "low":
-			out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", 1024)
+			out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 1024))
+			out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.include_thoughts", true)
 		case "medium":
-			out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", 8192)
+			out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 8192))
+			out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.include_thoughts", true)
 		case "high":
-			out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", 24576)
+			out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 32768))
+			out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.include_thoughts", true)
 		default:
 			out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", -1)
+			out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.include_thoughts", true)
+		}
+	}
+
+	// Cherry Studio extension extra_body.google.thinking_config (effective only when official fields are absent)
+	if !hasOfficialThinking && util.ModelSupportsThinking(modelName) {
+		if tc := gjson.GetBytes(rawJSON, "extra_body.google.thinking_config"); tc.Exists() && tc.IsObject() {
+			var setBudget bool
+			var normalized int
+			if v := tc.Get("thinking_budget"); v.Exists() {
+				// Normalize budget to model range
+				normalized = util.NormalizeThinkingBudget(modelName, int(v.Int()))
+				out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", normalized)
+				setBudget = true
+			}
+			if v := tc.Get("include_thoughts"); v.Exists() {
+				out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.include_thoughts", v.Bool())
+			} else if setBudget {
+				if normalized != 0 {
+					out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.include_thoughts", true)
+				}
+			}
 		}
-	} else {
-		out, _ = sjson.SetBytes(out, "generationConfig.thinkingConfig.thinkingBudget", -1)
 	}

 	// Temperature/top_p/top_k
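The precedence rule in this hunk (official `reasoning_effort` first, `extra_body.google.thinking_config` only as a fallback) can be sketched on its own. This is an illustrative reduction under stated assumptions: it uses `encoding/json` rather than `gjson`/`sjson`, omits the `util.ModelSupportsThinking` gate and `util.NormalizeThinkingBudget` clamping, and the `thinking`, `request`, `ptr`, and `resolveThinking` names are hypothetical. A `nil` field means the key is left unset in the outgoing request.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// thinking holds the resolved thinkingConfig values; nil means "leave unset".
type thinking struct {
	Budget          *int
	IncludeThoughts *bool
}

// request models only the fields this sketch reads.
type request struct {
	ReasoningEffort *string `json:"reasoning_effort"`
	ExtraBody       struct {
		Google struct {
			ThinkingConfig *struct {
				ThinkingBudget  *int  `json:"thinking_budget"`
				IncludeThoughts *bool `json:"include_thoughts"`
			} `json:"thinking_config"`
		} `json:"google"`
	} `json:"extra_body"`
}

func ptr[T any](v T) *T { return &v }

func resolveThinking(raw []byte) (thinking, error) {
	var req request
	if err := json.Unmarshal(raw, &req); err != nil {
		return thinking{}, err
	}
	// Official field wins: extra_body is not consulted when it is present.
	if req.ReasoningEffort != nil {
		switch *req.ReasoningEffort {
		case "none":
			return thinking{Budget: ptr(0)}, nil // include_thoughts stays unset
		case "low":
			return thinking{Budget: ptr(1024), IncludeThoughts: ptr(true)}, nil
		case "medium":
			return thinking{Budget: ptr(8192), IncludeThoughts: ptr(true)}, nil
		case "high":
			return thinking{Budget: ptr(32768), IncludeThoughts: ptr(true)}, nil
		default: // "auto" and unrecognized values use the dynamic budget
			return thinking{Budget: ptr(-1), IncludeThoughts: ptr(true)}, nil
		}
	}
	var out thinking
	tc := req.ExtraBody.Google.ThinkingConfig
	if tc == nil {
		return out, nil // no thinkingConfig is emitted at all
	}
	out.Budget = tc.ThinkingBudget
	switch {
	case tc.IncludeThoughts != nil:
		out.IncludeThoughts = tc.IncludeThoughts // an explicit value is honored
	case tc.ThinkingBudget != nil && *tc.ThinkingBudget != 0:
		out.IncludeThoughts = ptr(true) // a non-zero budget implies thoughts
	}
	return out, nil
}

func main() {
	t, err := resolveThinking([]byte(`{"extra_body":{"google":{"thinking_config":{"thinking_budget":2048}}}}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(*t.Budget, *t.IncludeThoughts)
}
```

One difference worth noting: a JSON `null` for `reasoning_effort` decodes to a nil pointer here and is treated as absent, which may not match how `gjson`'s `Exists()` treats it.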
--- a/internal/translator/gemini/openai/responses/gemini_openai-responses_request.go
+++ b/internal/translator/gemini/openai/responses/gemini_openai-responses_request.go
@@ -4,6 +4,7 @@ import (
 	"bytes"
 	"strings"

+	"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
 	"github.com/tidwall/gjson"
 	"github.com/tidwall/sjson"
 )
@@ -15,8 +16,8 @@ func ConvertOpenAIResponsesRequestToGemini(modelName string, inputRawJSON []byte
 	_ = modelName // Unused but required by interface
 	_ = stream    // Unused but required by interface

-	// Base Gemini API template
-	out := `{"contents":[],"generationConfig":{"thinkingConfig":{"include_thoughts":true}}}`
+	// Base Gemini API template (do not include thinkingConfig by default)
+	out := `{"contents":[]}`

 	root := gjson.ParseBytes(rawJSON)

@@ -242,23 +243,52 @@ func ConvertOpenAIResponsesRequestToGemini(modelName string, inputRawJSON []byte
 		out, _ = sjson.Set(out, "generationConfig.stopSequences", sequences)
 	}

-	if reasoningEffort := root.Get("reasoning.effort"); reasoningEffort.Exists() {
+	// OpenAI official reasoning fields take precedence
+	hasOfficialThinking := root.Get("reasoning.effort").Exists()
+	if hasOfficialThinking && util.ModelSupportsThinking(modelName) {
+		reasoningEffort := root.Get("reasoning.effort")
 		switch reasoningEffort.String() {
 		case "none":
 			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", false)
 			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 0)
 		case "auto":
 			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", -1)
+			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
 		case "minimal":
-			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 1024)
+			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 1024))
+			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
 		case "low":
-			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 4096)
+			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 4096))
+			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
 		case "medium":
-			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 8192)
+			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 8192))
+			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
 		case "high":
-			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", 24576)
+			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 32768))
+			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
 		default:
 			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", -1)
+			out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
+		}
+	}
+
+	// Cherry Studio extension (applies only when official fields are missing)
+	if !hasOfficialThinking && util.ModelSupportsThinking(modelName) {
+		if tc := root.Get("extra_body.google.thinking_config"); tc.Exists() && tc.IsObject() {
+			var setBudget bool
+			var normalized int
+			if v := tc.Get("thinking_budget"); v.Exists() {
+				normalized = util.NormalizeThinkingBudget(modelName, int(v.Int()))
+				out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", normalized)
+				setBudget = true
+			}
+			if v := tc.Get("include_thoughts"); v.Exists() {
+				out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", v.Bool())
+			} else if setBudget {
+				if normalized != 0 {
+					out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
+				}
+			}
 		}
 	}
 	return []byte(out)
--- a/internal/util/gemini_thinking.go
+++ b/internal/util/gemini_thinking.go
@@ -179,3 +179,19 @@ func GeminiThinkingFromMetadata(metadata map[string]any) (*int, *bool, bool) {
 	}
 	return budgetPtr, includePtr, matched
 }
+
+// StripThinkingConfigIfUnsupported removes thinkingConfig from the request body
+// when the target model does not advertise Thinking capability. It cleans both
+// standard Gemini and Gemini CLI JSON envelopes. This acts as a final safety net
+// in case upstream injected thinking for an unsupported model.
+func StripThinkingConfigIfUnsupported(model string, body []byte) []byte {
+	if ModelSupportsThinking(model) || len(body) == 0 {
+		return body
+	}
+	updated := body
+	// Gemini CLI path
+	updated, _ = sjson.DeleteBytes(updated, "request.generationConfig.thinkingConfig")
+	// Standard Gemini path
+	updated, _ = sjson.DeleteBytes(updated, "generationConfig.thinkingConfig")
+	return updated
+}
--- a/internal/util/thinking.go
+++ b/internal/util/thinking.go
@@ -0,0 +1,69 @@
+package util
+
+import (
+	"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
+)
+
+// ModelSupportsThinking reports whether the given model has Thinking capability
+// according to the model registry metadata (provider-agnostic).
+func ModelSupportsThinking(model string) bool {
+	if model == "" {
+		return false
+	}
+	if info := registry.GetGlobalRegistry().GetModelInfo(model); info != nil {
+		return info.Thinking != nil
+	}
+	return false
+}
+
+// NormalizeThinkingBudget clamps the requested thinking budget to the
+// supported range for the specified model using registry metadata only.
+// If the model is unknown or has no Thinking metadata, returns the original budget.
+// For dynamic (-1), returns -1 if DynamicAllowed; otherwise approximates mid-range
+// or min (0 if zero is allowed and mid <= 0).
+func NormalizeThinkingBudget(model string, budget int) int {
+	if budget == -1 { // dynamic
+		if found, min, max, zeroAllowed, dynamicAllowed := thinkingRangeFromRegistry(model); found {
+			if dynamicAllowed {
+				return -1
+			}
+			mid := (min + max) / 2
+			if mid <= 0 && zeroAllowed {
+				return 0
+			}
+			if mid <= 0 {
+				return min
+			}
+			return mid
+		}
+		return -1
+	}
+	if found, min, max, zeroAllowed, _ := thinkingRangeFromRegistry(model); found {
+		if budget == 0 {
+			if zeroAllowed {
+				return 0
+			}
+			return min
+		}
+		if budget < min {
+			return min
+		}
+		if budget > max {
+			return max
+		}
+		return budget
+	}
+	return budget
+}
+
+// thinkingRangeFromRegistry attempts to read thinking ranges from the model registry.
+func thinkingRangeFromRegistry(model string) (found bool, min int, max int, zeroAllowed bool, dynamicAllowed bool) {
+	if model == "" {
+		return false, 0, 0, false, false
+	}
+	info := registry.GetGlobalRegistry().GetModelInfo(model)
+	if info == nil || info.Thinking == nil {
+		return false, 0, 0, false, false
+	}
+	return true, info.Thinking.Min, info.Thinking.Max, info.Thinking.ZeroAllowed, info.Thinking.DynamicAllowed
+}
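The clamping rules of `NormalizeThinkingBudget` are easier to check without the registry lookup. The sketch below is an assumption-laden reduction: `thinkingRange` is a hypothetical stand-in for the registry's Thinking metadata, `normalizeBudget` is a hypothetical name, and the numeric ranges are illustrative only, not values taken from the model registry.

```go
package main

import "fmt"

// thinkingRange is a hypothetical stand-in for the registry's Thinking metadata.
type thinkingRange struct {
	Min, Max       int
	ZeroAllowed    bool
	DynamicAllowed bool
}

// normalizeBudget applies the same clamping rules as the diff for a model
// whose range is already known.
func normalizeBudget(r thinkingRange, budget int) int {
	if budget == -1 { // dynamic budget requested
		if r.DynamicAllowed {
			return -1
		}
		// Dynamic is not supported: approximate with the midpoint.
		mid := (r.Min + r.Max) / 2
		if mid <= 0 {
			if r.ZeroAllowed {
				return 0
			}
			return r.Min
		}
		return mid
	}
	if budget == 0 { // "thinking off" is only kept when the model allows it
		if r.ZeroAllowed {
			return 0
		}
		return r.Min
	}
	if budget < r.Min {
		return r.Min
	}
	if budget > r.Max {
		return r.Max
	}
	return budget
}

func main() {
	// Illustrative range only; real values come from the model registry.
	r := thinkingRange{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true}
	for _, b := range []int{-1, 0, 64, 8192, 100000} {
		fmt.Printf("%d -> %d\n", b, normalizeBudget(r, b))
	}
}
```

With this illustrative range, a request for 0 is raised to 128 because zero is not allowed, and 100000 is capped at 32768. The real function additionally returns the budget unchanged when the model is unknown to the registry.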
--- a/sdk/cliproxy/auth/manager.go
+++ b/sdk/cliproxy/auth/manager.go
@@ -872,6 +872,11 @@ func (m *Manager) persist(ctx context.Context, auth *Auth) error {
 	if m.store == nil || auth == nil {
 		return nil
 	}
+	if auth.Attributes != nil {
+		if v := strings.ToLower(strings.TrimSpace(auth.Attributes["runtime_only"])); v == "true" {
+			return nil
+		}
+	}
 	// Skip persistence when metadata is absent (e.g., runtime-only auths).
 	if auth.Metadata == nil {
 		return nil
--- a/sdk/cliproxy/service.go
+++ b/sdk/cliproxy/service.go
@@ -210,13 +210,14 @@ func (s *Service) wsOnConnected(channelID string) {
 	}
 	now := time.Now().UTC()
 	auth := &coreauth.Auth{
-		ID:        channelID,  // keep channel identifier as ID
-		Provider:  "aistudio", // logical provider for switch routing
-		Label:     channelID,  // display original channel id
-		Status:    coreauth.StatusActive,
-		CreatedAt: now,
-		UpdatedAt: now,
-		Metadata:  map[string]any{"email": channelID}, // inject email inline
+		ID:         channelID,  // keep channel identifier as ID
+		Provider:   "aistudio", // logical provider for switch routing
+		Label:      channelID,  // display original channel id
+		Status:     coreauth.StatusActive,
+		CreatedAt:  now,
+		UpdatedAt:  now,
+		Attributes: map[string]string{"runtime_only": "true"},
+		Metadata:   map[string]any{"email": channelID}, // metadata drives logging and usage tracking
 	}
 	log.Infof("websocket provider connected: %s", channelID)
 	s.applyCoreAuthAddOrUpdate(context.Background(), auth)
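Taken together, the `manager.go` and `service.go` hunks mean a websocket-created auth is tagged `runtime_only` and then skipped by `persist`. The decision can be sketched as a pure function; `shouldPersist` is a hypothetical name, and the store and nil-auth checks from the real method are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// shouldPersist (hypothetical helper) reports whether an auth with the given
// attributes and metadata would be written to the store.
func shouldPersist(attrs map[string]string, metadata map[string]any) bool {
	// Reading from a nil map is safe in Go and yields the empty string.
	if v := strings.ToLower(strings.TrimSpace(attrs["runtime_only"])); v == "true" {
		return false
	}
	// Auths without metadata are also treated as runtime-only.
	return metadata != nil
}

func main() {
	ws := map[string]string{"runtime_only": "true"}
	meta := map[string]any{"email": "channel-1"} // placeholder channel id
	fmt.Println(shouldPersist(ws, meta))  // websocket auth: skipped
	fmt.Println(shouldPersist(nil, meta)) // regular auth: persisted
}
```

The attribute comparison is case-insensitive and whitespace-tolerant, so `" TRUE "` has the same effect as `"true"`.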
Author	SHA1	Message	Date
Luis Pater	3e7b645346	Merge pull request #186 from router-for-me/doc docs: add AI Studio setup	2025-10-29 21:53:49 +08:00
hkfires	24446a4dc4	feat(cliproxy): skip persisting runtime-only websocket auths	2025-10-29 21:49:35 +08:00
hkfires	475f473dab	docs: add AI Studio setup	2025-10-29 21:10:14 +08:00
Luis Pater	8dba32a077	Merge pull request #185 from router-for-me/thinking Feat: Add reasoning effort support for Gemini models	2025-10-29 20:27:07 +08:00
hkfires	1bbbd16df6	chore(logging): clarify 429 rate-limit retries in Gemini executor	2025-10-29 19:19:18 +08:00
hkfires	5cb378256b	feat(gemini-translators): set include_thoughts when mapping thinking	2025-10-29 19:19:18 +08:00
hkfires	3ac5f05e8c	feat(gemini): prefer official reasoning fields, add extra_body(cherry studio) fallback	2025-10-29 19:19:18 +08:00
hkfires	58d30369b4	fix(gemini-cli): correctly strip/normalize thinking config by model	2025-10-29 19:19:18 +08:00
hkfires	7dd93a4a25	fix(executor): only apply thinking config to supported models	2025-10-29 19:19:17 +08:00
hkfires	2a3ee8d0e3	fix(translators): normalize thinking budgets	2025-10-29 19:19:17 +08:00
hkfires	41577bce07	feat(claude): map Anthropic 'thinking' to Gemini thinkingBudget	2025-10-29 19:19:17 +08:00
hkfires	3d7aca22c0	feat(registry): add thinking budget support; populate Gemini models	2025-10-29 19:19:17 +08:00
hkfires	680b3f5010	fix(translator): avoid default thinkingConfig in Gemini requests	2025-10-29 19:19:17 +08:00
Luis Pater	9d42e4b239	feat(runtime): add User-Agent headers to codex and claude executors - Standardized User-Agent strings for Codex and Claude executors to improve request tracing and compatibility. - Updated header insertion logic in both executors for consistency.	2025-10-29 12:57:37 +08:00
Luis Pater	97af785aad	docs(readme): add CLIProxyAPI Linux installer instructions - Updated `README.md` and `README_CN.md` with steps to install via the Linux installer. - Acknowledged [brokechubb](https://github.com/brokechubb) for building the installer.	2025-10-28 23:17:08 +08:00
Luis Pater	0defb68c6c	fix(translator): improve error handling for function parameters schema transformation - Added fallback to set default `parametersJsonSchema` when `parameters` key is absent. - Enhanced logging to capture detailed errors during schema transformation. - Refined tool declaration appending logic for robustness.	2025-10-28 22:57:26 +08:00