CLIProxyAPI

Author	SHA1	Message	Date
Luis Pater	13eb5268de	Merge pull request #582 from ben-vargas/fix-gemini-3-thinking-level feat: use thinkingLevel for Gemini 3 models per Google documentation	2025-12-18 07:19:37 +08:00
Ben Vargas	598f0af19b	fix: apply thinkingLevel from model suffix metadata for Gemini 3 The previous commit added thinkingLevel support but didn't apply it when the reasoning effort came from model name suffix (e.g., model(minimal)). This was because ResolveThinkingConfigFromMetadata returns nil for level-based models, bypassing the metadata application. Changes: - Add ApplyGemini3ThinkingLevelFromMetadata for standard Gemini API - Add ApplyGemini3ThinkingLevelFromMetadataCLI for CLI API format - Update gemini_cli_executor to apply Gemini 3 thinkingLevel from metadata - Update antigravity_executor to apply Gemini 3 thinkingLevel from metadata - Update aistudio_executor to apply Gemini 3 thinkingLevel from metadata - Add comprehensive test coverage for Gemini 3 thinkingLevel functions	2025-12-17 16:08:38 -07:00
Ben Vargas	a33f5d31fc	feat: use thinkingLevel for Gemini 3 models per Google documentation Per Google's official documentation, Gemini 3 models should use thinkingLevel (string) instead of thinkingBudget (number) for optimal performance. From Google's Gemini Thinking docs: > Use the thinkingLevel parameter with Gemini 3 models. While > thinkingBudget is accepted for backwards compatibility, using > it with Gemini 3 Pro may result in suboptimal performance. Changes: - Add model family detection functions (IsGemini3Model, IsGemini25Model, IsGemini3ProModel, IsGemini3FlashModel) - Add ApplyGeminiThinkingLevel and ApplyGeminiCLIThinkingLevel functions for applying thinkingLevel config - Add ValidateGemini3ThinkingLevel for model-specific level validation - Add ThinkingBudgetToGemini3Level for backward compatibility conversion - Update NormalizeGeminiThinkingBudget to convert budget to level for Gemini 3 models - Update ApplyDefaultThinkingIfNeeded to not set a default level for Gemini 3 (lets API use its dynamic default "high") - Update ConvertThinkingLevelToBudget to preserve thinkingLevel for Gemini 3 models - Add Levels field to all Gemini 3 model definitions: - Gemini 3 Pro: ["low", "high"] - Gemini 3 Flash: ["minimal", "low", "medium", "high"] Backward compatibility: - Gemini 2.5 models continue to use thinkingBudget as before - If thinkingBudget is provided for Gemini 3, it's converted to the appropriate thinkingLevel - Existing configurations continue to work	2025-12-17 15:28:20 -07:00
Luis Pater	68a27772b3	feat(antigravity): enable token counting via API with resilient routing docker-image / docker (push) Has been cancelled Details goreleaser / goreleaser (push) Has been cancelled Details Introduces the capability to count tokens for Antigravity-backed requests. This implementation leverages the `countTokens` endpoint of the Antigravity API, replacing the prior unsupported stub. Key aspects of this update include: - API Integration: Direct integration with the Antigravity `countTokens` API, including necessary request payload translation and authentication. - Resilient Infrastructure: A fallback mechanism has been established, allowing the system to attempt connections across multiple Antigravity base URLs to ensure request success even in the event of temporary service interruptions. - Model Aliasing: Added mappings for `gemini-3-flash` and `gemini-3-flash-preview` to ensure compatibility with the latest model variants. - Robust Error Handling: Comprehensive error handling and logging are in place to manage failures during API interactions.	2025-12-18 03:12:46 +08:00
Luis Pater	5fda6f8ef3	feat(antigravity): implement non-streaming execution for Claude model requests	2025-12-17 23:17:11 +08:00
Luis Pater	09923f654c	feat(antigravity): add streaming support for Claude model requests	2025-12-17 22:16:57 +08:00
이대희	1b8e538a77	feature: Improves Gemini JSON schema compatibility Enhances compatibility with the Gemini API by implementing a schema cleaning process. This includes: - Centralizing schema cleaning logic for Gemini in a dedicated utility function. - Converting unsupported schema keywords to hints within the description field. - Flattening complex schema structures like `anyOf`, `oneOf`, and type arrays to simplify the schema. - Handling streaming responses with empty tool names, which can occur in subsequent chunks after the initial tool use.	2025-12-17 17:10:53 +09:00
hkfires	28a428ae2f	fix(thinking): align budget effort mapping across translators Unify thinking budget-to-effort conversion in a shared helper, handle disabled/default thinking cases in translators, adjust zero-budget mapping, and drop the old OpenAI-specific helper with updated tests.	2025-12-16 18:34:43 +08:00
hkfires	b326ec3641	feat(iflow): add thinking support for iFlow models	2025-12-16 18:34:43 +08:00
hkfires	367a05bdf6	refactor(thinking): export thinking helpers Expose thinking/effort normalization helpers from the executor package so conversion tests use production code and stay aligned with runtime validation behavior.	2025-12-15 09:16:15 +08:00
hkfires	712ce9f781	fix(thinking): drop unsupported none effort When budget 0 maps to "none" for models that use thinking levels but don't support that effort level, strip thinking fields instead of setting an invalid reasoning_effort value. Tests now expect removal for this edge case.	2025-12-15 09:16:14 +08:00
hkfires	27a5ad8ec2	Fixed: #534 fix(aistudio): correct JSON string boundary detection for backslash sequences	2025-12-15 09:00:14 +08:00
Luis Pater	14ce6aebd1	Merge pull request #449 from sususu98/fix/gemini-cli-429-retry-delay-parsing fix(gemini-cli): enhance 429 retry delay parsing	2025-12-14 14:04:14 +08:00
Luis Pater	660aabc437	fix(executor): add `allowCompat` support for reasoning effort normalization Introduced `allowCompat` parameter to improve compatibility handling for reasoning effort in payloads across OpenAI and similar models.	2025-12-13 04:06:02 +08:00
Luis Pater	f3f0f1717d	Merge branch 'dev' into think	2025-12-12 22:16:44 +08:00
Luis Pater	7621ec609e	Merge pull request #501 from huynguyen03dev/fix/openai-compat-model-alias-resolution docker-image / docker (push) Has been cancelled Details goreleaser / goreleaser (push) Has been cancelled Details fix(openai-compat): prevent model alias from being overwritten	2025-12-12 21:58:15 +08:00
Luis Pater	9f511f0024	fix(executor): improve model compatibility handling for OpenAI-compatibility Enhances payload handling by introducing OpenAI-compatibility checks and refining how reasoning metadata is resolved, ensuring broader model support.	2025-12-12 21:57:25 +08:00
hkfires	374faa2640	fix(thinking): map budgets to effort levels Ensure thinking settings translate correctly across providers: - Only apply reasoning_effort to level-based models and derive it from numeric budget suffixes when present - Strip effort string fields for budget-based models and skip Claude/Gemini budget resolution for level-based or unsupported models - Default Gemini include_thoughts when a nonzero budget override is set - Add cross-protocol conversion and budget range tests	2025-12-12 21:33:20 +08:00
huynguyen03.dev	15c3cc3a50	fix(openai-compat): prevent model alias from being overwritten by ResolveOriginalModel When using OpenAI-compatible providers with model aliases (e.g., glm-4.6-zai -> glm-4.6), the alias resolution was correctly applied but then immediately overwritten by ResolveOriginalModel, causing 'Unknown Model' errors from upstream APIs. This fix skips the ResolveOriginalModel override when a model alias has already been resolved, ensuring the correct model name is sent to the upstream provider. Co-authored-by: Amp <amp@ampcode.com>	2025-12-12 17:20:24 +07:00
hkfires	d131435e25	fix(codex): raise default reasoning effort to medium	2025-12-12 18:18:48 +08:00
hkfires	3c315551b0	refactor(executor): relocate gemini token counters	2025-12-11 21:56:44 +08:00
hkfires	27c9c5c4da	refactor(executor): clarify executor comments and oauth names	2025-12-11 21:56:44 +08:00
hkfires	fc9f6c974a	refactor(executor): clarify providers and streams Add package and constructor documentation for AI Studio, Antigravity, Gemini CLI, Gemini API, and Vertex executors to describe their roles and inputs. Introduce a shared stream scanner buffer constant in the Gemini API executor and reuse it in Gemini CLI and Vertex streaming code so stream handling uses a consistent configuration. Update Refresh implementations for AI Studio, Gemini CLI, Gemini API (API key), and Vertex executors to short‑circuit and simply return the incoming auth object, while keeping Antigravity token renewal as the only executor that performs OAuth refresh. Remove OAuth2-based token refresh logic and related dependencies from the Gemini API executor, since it now operates strictly with API key credentials.	2025-12-11 21:56:43 +08:00
Luis Pater	a74ee3f319	Merge pull request #481 from sususu98/fix/increase-buffer-size docker-image / docker (push) Has been cancelled Details goreleaser / goreleaser (push) Has been cancelled Details fix: increase buffer size for stream scanners to 50MB across multiple executors	2025-12-11 21:20:54 +08:00
hkfires	e79f65fd8e	refactor(thinking): use parentheses for metadata suffix	2025-12-11 18:39:07 +08:00
hkfires	facfe7c518	refactor(thinking): use bracket tags for thinking meta Align thinking suffix handling on a single bracket-style marker. NormalizeThinkingModel strips a terminal `[value]` segment from model identifiers and turns it into either a thinking budget (for numeric values) or a reasoning effort hint (for strings). Emission of `ThinkingIncludeThoughtsMetadataKey` is removed. Executor helpers and the example config are updated so their comments reference the new `[value]` suffix format instead of the legacy dash variants. BREAKING CHANGE: dash-based thinking suffixes (`-thinking`, `-thinking-N`, `-reasoning`, `-nothinking`) are no longer parsed for thinking metadata; only `[value]` annotations are recognized.	2025-12-11 18:17:28 +08:00
hkfires	6285459c08	fix(runtime): unify claude thinking config resolution	2025-12-11 17:20:44 +08:00
hkfires	21bbceca0c	docs(runtime): document reasoning effort precedence	2025-12-11 16:35:36 +08:00
hkfires	f6300c72b7	fix(runtime): validate thinking config in iflow and qwen	2025-12-11 16:21:50 +08:00
hkfires	3a81ab22fd	fix(runtime): unify reasoning effort metadata overrides	2025-12-11 14:35:05 +08:00
hkfires	519da2e042	fix(runtime): validate reasoning effort levels	2025-12-11 12:36:54 +08:00
hkfires	3ffd120ae9	feat(runtime): add thinking config normalization	2025-12-11 11:51:33 +08:00
sususu	07d21463ca	fix(gemini-cli): enhance 429 retry delay parsing Add fallback parsing for quota reset delay when RetryInfo is not present: - Try ErrorInfo.metadata.quotaResetDelay (e.g., "373.801628ms") - Parse from error.message "Your quota will reset after Xs." This ensures proper cooldown timing for rate-limited requests. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-11 09:34:39 +08:00
Luis Pater	423ce97665	feat(util): implement dynamic thinking suffix normalization and refactor budget resolution logic docker-image / docker (push) Has been cancelled Details goreleaser / goreleaser (push) Has been cancelled Details - Added support for parsing and normalizing dynamic thinking model suffixes. - Centralized budget resolution across executors and payload helpers. - Retired legacy Gemini-specific thinking handlers in favor of unified logic. - Updated executors to use metadata-based thinking configuration. - Added `ResolveOriginalModel` utility for resolving normalized upstream models using request metadata. - Updated executors (Gemini, Codex, iFlow, OpenAI, Qwen) to incorporate upstream model resolution and substitute model values in payloads and request URLs. - Ensured fallbacks handle cases with missing or malformed metadata to derive models robustly. - Refactored upstream model resolution to dynamically incorporate metadata for selecting and normalizing models. - Improved handling of thinking configurations and model overrides in executors. - Removed hardcoded thinking model entries and migrated logic to metadata-based resolution. - Updated payload mutations to always include the resolved model.	2025-12-11 03:10:50 +08:00
sususu	76c563d161	fix(executor): increase buffer size for stream scanners to 50MB across multiple executors	2025-12-10 23:20:04 +08:00
Luis Pater	94d61c7b2b	fix(logging): update response aggregation logic to include all attempts docker-image / docker (push) Has been cancelled Details goreleaser / goreleaser (push) Has been cancelled Details	2025-12-10 16:53:48 +08:00
Luis Pater	f25f419e5a	fix(antigravity): remove references to `autopush` endpoint and update fallback logic docker-image / docker (push) Has been cancelled Details goreleaser / goreleaser (push) Has been cancelled Details	2025-12-10 00:13:20 +08:00
hkfires	9b202b6c1c	fix(executor): centralize default thinking config	2025-12-09 21:05:06 +08:00
hkfires	6a66b6801a	feat(executor): enforce minimum thinking budget for antigravity models	2025-12-09 21:05:06 +08:00
hkfires	5ec9b5e5a9	feat(executor): normalize thinking budget across all Gemini executors	2025-12-09 21:05:06 +08:00
Luis Pater	5db3b58717	Merge pull request #470 from router-for-me/agry fix(gemini): normalize model listing output	2025-12-09 21:00:29 +08:00
Luis Pater	39b6b3b289	Fixed: #463 docker-image / docker (push) Has been cancelled Details goreleaser / goreleaser (push) Has been cancelled Details fix(antigravity): remove `$ref` and `$defs` from JSON during key deletion	2025-12-09 17:32:17 +08:00
hkfires	e5312fb5a2	feat(antigravity): support canonical names for antigravity models	2025-12-09 16:54:13 +08:00
hkfires	96b55acff8	feat(aistudio): normalize thinking budget in request translation	2025-12-09 08:27:44 +08:00
Luis Pater	af00304b0c	fix(antigravity): remove `exclusiveMaximum` from JSON during key deletion	2025-12-08 23:28:01 +08:00
Luis Pater	6ad188921c	refactor(logging): remove unused variable in `ensureAttempt` and redundant function call docker-image / docker (push) Has been cancelled Details goreleaser / goreleaser (push) Has been cancelled Details	2025-12-08 22:25:58 +08:00
hkfires	a283545b6b	feat(antigravity): enforce thinking budget limits for Claude models	2025-12-08 20:36:17 +08:00
hkfires	9c09128e00	feat(registry): add explicit thinking support config for antigravity models	2025-12-07 19:12:55 +08:00
Luis Pater	fd29ab418a	Fixed: #424 docker-image / docker (push) Has been cancelled Details goreleaser / goreleaser (push) Has been cancelled Details feat(antigravity): add support for maxOutputTokens and refine Claude model handling	2025-12-07 01:55:57 +08:00
Luis Pater	d7564173dd	fix(antigravity): restore production base URL in the executor docker-image / docker (push) Has been cancelled Details goreleaser / goreleaser (push) Has been cancelled Details	2025-12-06 01:11:37 +08:00

1 2 3 4 5 ...

299 Commits