Compare commits

...

22 Commits

Author SHA1 Message Date
Luis Pater
bdac24bb4e Update PassthroughGeminiResponseStream to handle [DONE] marker
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Added logic to return an empty slice when the raw JSON equals the `[DONE]` marker.
- Ensures proper termination of streamed Gemini responses.
2025-09-01 02:00:55 +08:00
Luis Pater
6d30faf9c9 Update Management API CN docs for authentication requirements
- Changed authentication requirements to mandate management keys for all requests, including local access.
- Clarified remote access setup and key provision methods.
- Adjusted the section header.
2025-08-31 15:29:20 +08:00
Luis Pater
c0eaa41c7a Add Gemini-to-Gemini request normalization and passthrough support
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Introduced a `ConvertGeminiRequestToGemini` function to normalize Gemini v1beta requests by ensuring valid or default roles.
- Added passthrough response handlers for both streamed and non-streamed Gemini responses.
- Registered translators for Gemini-to-Gemini traffic in the initialization process.
- Updated `gemini-cli` request normalization to align with the new Gemini translator logic.
2025-08-31 15:05:16 +08:00
Luis Pater
8a2285e706 Reorganize and reintroduce Management API section in README files 2025-08-31 14:41:57 +08:00
Luis Pater
db43930b98 Add management API handlers for config and auth file management
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Implemented CRUD operations for authentication files.
- Added endpoints for managing API keys, quotas, proxy settings, and other configurations.
- Enhanced management access with robust validation, remote access control, and persistence support.
- Updated README with new configuration details.

Fixed OpenAI Chat Completions for codex
2025-08-31 14:29:23 +08:00
Luis Pater
b1254106ee Enhance client reload process with new OpenAI compatibility support
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Added handling for OpenAI-compatible providers during client reload.
- Implemented client unregistration for old clients during reload.
- Improved logging for detailed client reload insights.

Expand `AuthDir` handling to support tilde (`~`) for home directory resolution

- Added logic to replace `~` with the user's home directory in `AuthDir`.
- Prevents errors when using `~` in configuration paths.
2025-08-31 03:04:46 +08:00
Luis Pater
9c9ea99380 Add support for new GPT-5 model variants
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Renamed existing GPT-5 variants for consistency (`nano` → `minimal`, `mini` → `low`, etc.).
- Added metadata definitions for new variants: `gpt-5-minimal`, `gpt-5-low`, `gpt-5-medium`, and updated logic to reflect variant-specific reasoning efforts.
2025-08-30 22:00:37 +08:00
Luis Pater
ba4c11428c Merge pull request #16 from hkfires/main
Set the default Docker timezone to Asia/Shanghai
2025-08-30 19:05:32 +08:00
Luis Pater
0331660fe2 Add token refresh handling for 401 responses across clients
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Implemented `RefreshTokens` method in client interfaces and Gemini clients.
- Updated handlers to call `RefreshTokens` on 401 responses and retry requests if token refresh succeeds.
- Enhanced error handling and retry logic to accommodate token refresh flow.

Update README to include Qwen login instructions

- Added Qwen OAuth login command instructions in both English and Chinese README files.
- Made minor updates to existing command examples for consistency.
2025-08-30 17:26:41 +08:00
hkfires
3f7840188e Set the default Docker timezone to Asia/Shanghai 2025-08-30 16:41:27 +08:00
Luis Pater
512c8b600a Add token refresh handling for 401 responses across clients
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Implemented `RefreshTokens` method in client interfaces and Gemini clients.
- Updated handlers to call `RefreshTokens` on 401 responses and retry requests if token refresh succeeds.
- Enhanced error handling and retry logic to accommodate token refresh flow.
2025-08-30 16:10:56 +08:00
Luis Pater
1aad033fec Update issue templates 2025-08-30 04:18:17 +08:00
Luis Pater
f1d9364ef4 Update README documentation to clarify auth-dir configuration for Windows users
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Added a note for setting `auth-dir` on Windows systems in both English and Chinese README files.
- Improved descriptions for existing configuration options.

Address Qwen3 tool injection issue to prevent random token insertions

- Modify Qwen client to insert a placeholder tool when none is defined, avoiding erratic behavior in streaming responses.
2025-08-29 17:28:55 +08:00
Luis Pater
c2b2c9eafe Update README documentation to clarify auth-dir configuration for Windows users
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Added a note for setting `auth-dir` on Windows systems in both English and Chinese README files.
- Improved descriptions for existing configuration options.
2025-08-29 09:52:10 +08:00
Luis Pater
09b9d3b3fa Unlock mutex before returning error in handlers.go to prevent deadlocks
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
2025-08-29 09:46:12 +08:00
Luis Pater
e9e0016a63 Fix some bugs. 2025-08-29 04:05:08 +08:00
Luis Pater
3704dae342 Add nil-check for GetRequestMutex across handlers to prevent potential panics
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Updated all handlers to safely unlock the request mutex only if it's non-nil.
- Enhanced mutex locking and unlocking logic to avoid runtime errors.
- Improved robustness of resource cleanup across clients.

Add `GetRequestMutex` method for synchronization across clients

- Introduced a new `GetRequestMutex` method in OpenAICompatibilityClient, CodexClient, GeminiCLIClient, GeminiClient, and QwenClient for request synchronization.
- Ensures only one request is processed at a time to manage quotas effectively.
2025-08-29 00:23:37 +08:00
Luis Pater
bea5f97cbf Add /v1/completions endpoint with OpenAI compatibility
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Implemented `/v1/completions` endpoint mirroring OpenAI's completions API specification.
- Added conversion functions to translate between completions and chat completions formats.
- Introduced streaming and non-streaming response handling for completions requests.
- Updated `server.go` to register the new endpoint and include it in the API's metadata.
2025-08-28 00:30:46 +08:00
Luis Pater
7a6adfa97e Suppress debug logs for model routing and ignore empty tools arrays
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Comment out verbose routing logs in the API server to reduce noise.
- Remove the `tools` field from Qwen client requests when it is an empty array.
- Add guards in Claude, Codex, Gemini‑CLI, and Gemini translators to skip tool conversion when the `tools` array is empty, preventing unnecessary payload modifications.
2025-08-27 22:29:08 +08:00
Luis Pater
1c4183d943 Add support for localhost unauthenticated requests
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Introduced `AllowLocalhostUnauthenticated` flag allowing unauthenticated requests from localhost.
- Updated authentication middleware to bypass checks for localhost when enabled.

Add new Gemini CLI models and update model registry function

- Introduced `GetGeminiCLIModels` for updated Gemini CLI model definitions.
- Added new models: "Gemini 2.5 Flash Lite" and "Gemini 2.5 Pro".
- Updated `RegisterModels` to use `GetGeminiCLIModels` in Gemini client initialization.
2025-08-27 21:20:25 +08:00
Luis Pater
dff31a7a4c Improved the /v1/models endpoint
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
2025-08-27 21:01:37 +08:00
Luis Pater
ed8873fbb0 Add OpenAI compatibility support and improve resource cleanup
Some checks failed
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Introduced OpenAI compatibility configurations for external providers, enabling model alias routing via the OpenAI API format.
- Enhanced provider logic in `GetProviderName` to handle OpenAI aliases and added new helper functions for compatibility checks.
- Updated API handlers and client initialization to support OpenAI compatibility models.
- Improved resource cleanup across clients by closing response bodies and streams using deferred functions.
2025-08-26 03:33:46 +08:00
42 changed files with 4204 additions and 248 deletions

37
.github/ISSUE_TEMPLATE/bug_report.md vendored Normal file
View File

@@ -0,0 +1,37 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''
---
**Describe the bug**
A clear and concise description of what the bug is.
**CLI Type**
What type of CLI account do you use? (gemini-cli, gemini, codex, claude code or openai-compatibility)
**Model Name**
What model are you using? (example: gemini-2.5-pro, claude-sonnet-4-20250514, gpt-5, etc.)
**LLM Client**
What LLM Client are you using? (example: roo-code, cline, claude code, etc.)
**Request Information**
The best way is to paste the cURL command of the HTTP request here.
Alternatively, you can set `request-log: true` in the `config.yaml` file and then upload the detailed log file.
**Expected behavior**
A clear and concise description of what you expected to happen.
**Screenshots**
If applicable, add screenshots to help explain your problem.
**OS Type**
- OS: [e.g. macOS]
- Version [e.g. 15.6.0]
**Additional context**
Add any other context about the problem here.

View File

@@ -12,6 +12,8 @@ RUN CGO_ENABLED=0 GOOS=linux go build -o ./CLIProxyAPI ./cmd/server/
FROM alpine:3.22.0
RUN apk add --no-cache tzdata
RUN mkdir /CLIProxyAPI
COPY --from=builder ./app/CLIProxyAPI /CLIProxyAPI/CLIProxyAPI
@@ -20,4 +22,8 @@ WORKDIR /CLIProxyAPI
EXPOSE 8317
ENV TZ=Asia/Shanghai
RUN cp /usr/share/zoneinfo/${TZ} /etc/localtime && echo "${TZ}" > /etc/timezone
CMD ["./CLIProxyAPI"]

248
MANAGEMENT_API.md Normal file
View File

@@ -0,0 +1,248 @@
# Management API
Base URL: `http://localhost:8317/v0/management`
This API manages runtime configuration and authentication files for the CLI Proxy API. All changes persist to the YAML config file and are hotreloaded by the server.
Note: The following options cannot be changed via API and must be edited in the config file, then restart if needed:
- `allow-remote-management`
- `remote-management-key` (stored as bcrypt hash after startup if plaintext was provided)
## Authentication
- All requests (including localhost) must include a management key.
- Remote access additionally requires `allow-remote-management: true` in config.
- Provide the key via one of:
- `Authorization: Bearer <plaintext-key>`
- `X-Management-Key: <plaintext-key>`
If a plaintext key is present in the config on startup, it is bcrypt-hashed and written back to the config file automatically. If `remote-management-key` is empty, the Management API is entirely disabled (404 for `/v0/management/*`).
## Request/Response Conventions
- Content type: `application/json` unless noted.
- Boolean/int/string updates use body: `{ "value": <type> }`.
- Array PUT bodies can be either a raw array (e.g. `["a","b"]`) or `{ "items": [ ... ] }`.
- Array PATCH accepts either `{ "old": "k1", "new": "k2" }` or `{ "index": 0, "value": "k2" }`.
- Object-array PATCH supports either index or key match (documented per endpoint).
## Endpoints
### Debug
- GET `/debug` — get current debug flag
- PUT/PATCH `/debug` — set debug (boolean)
Example (set true):
```bash
curl -X PUT \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-H 'Content-Type: application/json' \
-d '{"value":true}' \
http://localhost:8317/v0/management/debug
```
Response:
```json
{ "status": "ok" }
```
### Proxy URL
- GET `/proxy-url` — get proxy URL string
- PUT/PATCH `/proxy-url` — set proxy URL string
- DELETE `/proxy-url` — clear proxy URL
Example (set):
```bash
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"value":"socks5://user:pass@127.0.0.1:1080/"}' \
http://localhost:8317/v0/management/proxy-url
```
Response:
```json
{ "status": "ok" }
```
### Quota Exceeded Behavior
- GET `/quota-exceeded/switch-project`
- PUT/PATCH `/quota-exceeded/switch-project` — boolean
- GET `/quota-exceeded/switch-preview-model`
- PUT/PATCH `/quota-exceeded/switch-preview-model` — boolean
Example:
```bash
curl -X PUT -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"value":false}' \
http://localhost:8317/v0/management/quota-exceeded/switch-project
```
Response:
```json
{ "status": "ok" }
```
### API Keys (proxy server auth)
- GET `/api-keys` — return the full list
- PUT `/api-keys` — replace the full list
- PATCH `/api-keys` — update one entry (by `old/new` or `index/value`)
- DELETE `/api-keys` — remove one entry (by `?value=` or `?index=`)
Examples:
```bash
# Replace list
curl -X PUT -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '["k1","k2","k3"]' \
http://localhost:8317/v0/management/api-keys
# Patch: replace k2 -> k2b
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"old":"k2","new":"k2b"}' \
http://localhost:8317/v0/management/api-keys
# Delete by value
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' -X DELETE 'http://localhost:8317/v0/management/api-keys?value=k1'
```
Response (GET):
```json
{ "api-keys": ["k1","k2b","k3"] }
```
### Generative Language API Keys (Gemini)
- GET `/generative-language-api-key`
- PUT `/generative-language-api-key`
- PATCH `/generative-language-api-key`
- DELETE `/generative-language-api-key`
Same request/response shapes as API keys.
### Request Logging
- GET `/request-log` — get boolean
- PUT/PATCH `/request-log` — set boolean
### Request Retry
- GET `/request-retry` — get integer
- PUT/PATCH `/request-retry` — set integer
### Allow Localhost Unauthenticated
- GET `/allow-localhost-unauthenticated` — get boolean
- PUT/PATCH `/allow-localhost-unauthenticated` — set boolean
### Claude API Keys (object array)
- GET `/claude-api-key` — full list
- PUT `/claude-api-key` — replace list
- PATCH `/claude-api-key` — update one item (by `index` or `match` API key)
- DELETE `/claude-api-key` — remove one item (`?api-key=` or `?index=`)
Object shape:
```json
{
"api-key": "sk-...",
"base-url": "https://custom.example.com" // optional
}
```
Examples:
```bash
# Replace list
curl -X PUT -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '[{"api-key":"sk-a"},{"api-key":"sk-b","base-url":"https://c.example.com"}]' \
http://localhost:8317/v0/management/claude-api-key
# Patch by index
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"index":1,"value":{"api-key":"sk-b2","base-url":"https://c.example.com"}}' \
http://localhost:8317/v0/management/claude-api-key
# Patch by match (api-key)
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"match":"sk-a","value":{"api-key":"sk-a","base-url":""}}' \
http://localhost:8317/v0/management/claude-api-key
# Delete by api-key
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' -X DELETE 'http://localhost:8317/v0/management/claude-api-key?api-key=sk-b2'
```
Response (GET):
```json
{
"claude-api-key": [
{ "api-key": "sk-a", "base-url": "" }
]
}
```
### OpenAI Compatibility Providers (object array)
- GET `/openai-compatibility` — full list
- PUT `/openai-compatibility` — replace list
- PATCH `/openai-compatibility` — update one item by `index` or `name`
- DELETE `/openai-compatibility` — remove by `?name=` or `?index=`
Object shape:
```json
{
"name": "openrouter",
"base-url": "https://openrouter.ai/api/v1",
"api-keys": ["sk-..."],
"models": [ {"name": "moonshotai/kimi-k2:free", "alias": "kimi-k2"} ]
}
```
Examples:
```bash
# Replace list
curl -X PUT -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '[{"name":"openrouter","base-url":"https://openrouter.ai/api/v1","api-keys":["sk"],"models":[{"name":"m","alias":"a"}]}]' \
http://localhost:8317/v0/management/openai-compatibility
# Patch by name
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"name":"openrouter","value":{"name":"openrouter","base-url":"https://openrouter.ai/api/v1","api-keys":[],"models":[]}}' \
http://localhost:8317/v0/management/openai-compatibility
# Delete by index
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' -X DELETE 'http://localhost:8317/v0/management/openai-compatibility?index=0'
```
Response (GET):
```json
{ "openai-compatibility": [ { "name": "openrouter", "base-url": "...", "api-keys": [], "models": [] } ] }
```
### Auth Files Management
List JSON token files under `auth-dir`, download/upload/delete.
- GET `/auth-files` — list
- Response:
```json
{ "files": [ { "name": "acc1.json", "size": 1234, "modtime": "2025-08-30T12:34:56Z" } ] }
```
- GET `/auth-files/download?name=<file.json>` — download a single file
- POST `/auth-files` — upload
- Multipart form: field `file` (must be `.json`)
- Or raw JSON body with `?name=<file.json>`
- Response: `{ "status": "ok" }`
- DELETE `/auth-files?name=<file.json>` — delete a single file
- DELETE `/auth-files?all=true` — delete all `.json` files in `auth-dir`
## Error Responses
Generic error shapes:
- 400 Bad Request: `{ "error": "invalid body" }`
- 401 Unauthorized: `{ "error": "missing management key" }` or `{ "error": "invalid management key" }`
- 403 Forbidden: `{ "error": "remote management disabled" }`
- 404 Not Found: `{ "error": "item not found" }` or `{ "error": "file not found" }`
- 500 Internal Server Error: `{ "error": "failed to save config: ..." }`
## Notes
- Changes are written to the YAML configuration file and picked up by the servers file watcher to hot-reload clients and settings.
- `allow-remote-management` and `remote-management-key` must be edited in the configuration file and cannot be changed via the API.

487
MANAGEMENT_API_CN.md Normal file
View File

@@ -0,0 +1,487 @@
# 管理 API
基础路径:`http://localhost:8317/v0/management`
该 API 用于管理 CLI Proxy API 的运行时配置与认证文件。所有变更会持久化写入 YAML 配置文件,并由服务自动热重载。
注意:以下选项不能通过 API 修改,需在配置文件中设置(如有必要可重启):
- `allow-remote-management`
- `remote-management-key`(若在启动时检测到明文,会自动进行 bcrypt 加密并写回配置)
## 认证
- 所有请求(包括本地访问)都必须提供有效的管理密钥.
- 远程访问需要在配置文件中开启远程访问: `allow-remote-management: true`
- 通过以下任意方式提供管理密钥(明文):
- `Authorization: Bearer <plaintext-key>`
- `X-Management-Key: <plaintext-key>`
若在启动时检测到配置中的管理密钥为明文,会自动使用 bcrypt 加密并回写到配置文件中。
## 请求/响应约定
- Content-Type`application/json`(除非另有说明)。
- 布尔/整数/字符串更新:请求体为 `{ "value": <type> }`
- 数组 PUT既可使用原始数组`["a","b"]`),也可使用 `{ "items": [ ... ] }`
- 数组 PATCH支持 `{ "old": "k1", "new": "k2" }``{ "index": 0, "value": "k2" }`
- 对象数组 PATCH支持按索引或按关键字段匹配各端点中单独说明
## 端点说明
### Debug
- GET `/debug` — 获取当前 debug 状态
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' http://localhost:8317/v0/management/debug
```
- 响应:
```json
{ "debug": false }
```
- PUT/PATCH `/debug` — 设置 debug布尔值
- 请求:
```bash
curl -X PUT -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"value":true}' \
http://localhost:8317/v0/management/debug
```
- 响应:
```json
{ "status": "ok" }
```
### 代理服务器 URL
- GET `/proxy-url` — 获取代理 URL 字符串
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' http://localhost:8317/v0/management/proxy-url
```
- 响应:
```json
{ "proxy-url": "socks5://user:pass@127.0.0.1:1080/" }
```
- PUT/PATCH `/proxy-url` — 设置代理 URL 字符串
- 请求PUT
```bash
curl -X PUT -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"value":"socks5://user:pass@127.0.0.1:1080/"}' \
http://localhost:8317/v0/management/proxy-url
```
- 请求PATCH
```bash
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"value":"http://127.0.0.1:8080"}' \
http://localhost:8317/v0/management/proxy-url
```
- 响应:
```json
{ "status": "ok" }
```
- DELETE `/proxy-url` — 清空代理 URL
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' -X DELETE http://localhost:8317/v0/management/proxy-url
```
- 响应:
```json
{ "status": "ok" }
```
### 超出配额行为
- GET `/quota-exceeded/switch-project`
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' http://localhost:8317/v0/management/quota-exceeded/switch-project
```
- 响应:
```json
{ "switch-project": true }
```
- PUT/PATCH `/quota-exceeded/switch-project` — 布尔值
- 请求:
```bash
curl -X PUT -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"value":false}' \
http://localhost:8317/v0/management/quota-exceeded/switch-project
```
- 响应:
```json
{ "status": "ok" }
```
- GET `/quota-exceeded/switch-preview-model`
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' http://localhost:8317/v0/management/quota-exceeded/switch-preview-model
```
- 响应:
```json
{ "switch-preview-model": true }
```
- PUT/PATCH `/quota-exceeded/switch-preview-model` — 布尔值
- 请求:
```bash
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"value":true}' \
http://localhost:8317/v0/management/quota-exceeded/switch-preview-model
```
- 响应:
```json
{ "status": "ok" }
```
### API Keys代理服务认证
- GET `/api-keys` — 返回完整列表
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' http://localhost:8317/v0/management/api-keys
```
- 响应:
```json
{ "api-keys": ["k1","k2","k3"] }
```
- PUT `/api-keys` — 完整改写列表
- 请求:
```bash
curl -X PUT -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '["k1","k2","k3"]' \
http://localhost:8317/v0/management/api-keys
```
- 响应:
```json
{ "status": "ok" }
```
- PATCH `/api-keys` — 修改其中一个(`old/new` 或 `index/value`
- 请求(按 old/new
```bash
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"old":"k2","new":"k2b"}' \
http://localhost:8317/v0/management/api-keys
```
- 请求(按 index/value
```bash
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"index":0,"value":"k1b"}' \
http://localhost:8317/v0/management/api-keys
```
- 响应:
```json
{ "status": "ok" }
```
- DELETE `/api-keys` — 删除其中一个(`?value=` 或 `?index=`
- 请求(按值删除):
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' -X DELETE 'http://localhost:8317/v0/management/api-keys?value=k1'
```
- 请求(按索引删除):
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' -X DELETE 'http://localhost:8317/v0/management/api-keys?index=0'
```
- 响应:
```json
{ "status": "ok" }
```
### Gemini API Key生成式语言
- GET `/generative-language-api-key`
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' http://localhost:8317/v0/management/generative-language-api-key
```
- 响应:
```json
{ "generative-language-api-key": ["AIzaSy...01","AIzaSy...02"] }
```
- PUT `/generative-language-api-key`
- 请求:
```bash
curl -X PUT -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '["AIzaSy-1","AIzaSy-2"]' \
http://localhost:8317/v0/management/generative-language-api-key
```
- 响应:
```json
{ "status": "ok" }
```
- PATCH `/generative-language-api-key`
- 请求:
```bash
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"old":"AIzaSy-1","new":"AIzaSy-1b"}' \
http://localhost:8317/v0/management/generative-language-api-key
```
- 响应:
```json
{ "status": "ok" }
```
- DELETE `/generative-language-api-key`
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' -X DELETE 'http://localhost:8317/v0/management/generative-language-api-key?value=AIzaSy-2'
```
- 响应:
```json
{ "status": "ok" }
```
### 开启请求日志
- GET `/request-log` — 获取布尔值
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' http://localhost:8317/v0/management/request-log
```
- 响应:
```json
{ "request-log": true }
```
- PUT/PATCH `/request-log` — 设置布尔值
- 请求:
```bash
curl -X PUT -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"value":true}' \
http://localhost:8317/v0/management/request-log
```
- 响应:
```json
{ "status": "ok" }
```
### 请求重试次数
- GET `/request-retry` — 获取整数
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' http://localhost:8317/v0/management/request-retry
```
- 响应:
```json
{ "request-retry": 3 }
```
- PUT/PATCH `/request-retry` — 设置整数
- 请求:
```bash
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"value":5}' \
http://localhost:8317/v0/management/request-retry
```
- 响应:
```json
{ "status": "ok" }
```
### 允许本地未认证访问
- GET `/allow-localhost-unauthenticated` — 获取布尔值
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' http://localhost:8317/v0/management/allow-localhost-unauthenticated
```
- 响应:
```json
{ "allow-localhost-unauthenticated": false }
```
- PUT/PATCH `/allow-localhost-unauthenticated` — 设置布尔值
- 请求:
```bash
curl -X PUT -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"value":true}' \
http://localhost:8317/v0/management/allow-localhost-unauthenticated
```
- 响应:
```json
{ "status": "ok" }
```
### Claude API KEY对象数组
- GET `/claude-api-key` — 列出全部
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' http://localhost:8317/v0/management/claude-api-key
```
- 响应:
```json
{ "claude-api-key": [ { "api-key": "sk-a", "base-url": "" } ] }
```
- PUT `/claude-api-key` — 完整改写列表
- 请求:
```bash
curl -X PUT -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '[{"api-key":"sk-a"},{"api-key":"sk-b","base-url":"https://c.example.com"}]' \
http://localhost:8317/v0/management/claude-api-key
```
- 响应:
```json
{ "status": "ok" }
```
- PATCH `/claude-api-key` — 修改其中一个(按 `index` 或 `match`
- 请求(按索引):
```bash
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"index":1,"value":{"api-key":"sk-b2","base-url":"https://c.example.com"}}' \
http://localhost:8317/v0/management/claude-api-key
```
- 请求(按匹配):
```bash
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"match":"sk-a","value":{"api-key":"sk-a","base-url":""}}' \
http://localhost:8317/v0/management/claude-api-key
```
- 响应:
```json
{ "status": "ok" }
```
- DELETE `/claude-api-key` — 删除其中一个(`?api-key=` 或 `?index=`
- 请求(按 api-key
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' -X DELETE 'http://localhost:8317/v0/management/claude-api-key?api-key=sk-b2'
```
- 请求(按索引):
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' -X DELETE 'http://localhost:8317/v0/management/claude-api-key?index=0'
```
- 响应:
```json
{ "status": "ok" }
```
### OpenAI 兼容提供商(对象数组)
- GET `/openai-compatibility` — 列出全部
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' http://localhost:8317/v0/management/openai-compatibility
```
- 响应:
```json
{ "openai-compatibility": [ { "name": "openrouter", "base-url": "https://openrouter.ai/api/v1", "api-keys": [], "models": [] } ] }
```
- PUT `/openai-compatibility` — 完整改写列表
- 请求:
```bash
curl -X PUT -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '[{"name":"openrouter","base-url":"https://openrouter.ai/api/v1","api-keys":["sk"],"models":[{"name":"m","alias":"a"}]}]' \
http://localhost:8317/v0/management/openai-compatibility
```
- 响应:
```json
{ "status": "ok" }
```
- PATCH `/openai-compatibility` — 修改其中一个(按 `index` 或 `name`
- 请求(按名称):
```bash
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"name":"openrouter","value":{"name":"openrouter","base-url":"https://openrouter.ai/api/v1","api-keys":[],"models":[]}}' \
http://localhost:8317/v0/management/openai-compatibility
```
- 请求(按索引):
```bash
curl -X PATCH -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d '{"index":0,"value":{"name":"openrouter","base-url":"https://openrouter.ai/api/v1","api-keys":[],"models":[]}}' \
http://localhost:8317/v0/management/openai-compatibility
```
- 响应:
```json
{ "status": "ok" }
```
- DELETE `/openai-compatibility` — 删除(`?name=` 或 `?index=`
- 请求(按名称):
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' -X DELETE 'http://localhost:8317/v0/management/openai-compatibility?name=openrouter'
```
- 请求(按索引):
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' -X DELETE 'http://localhost:8317/v0/management/openai-compatibility?index=0'
```
- 响应:
```json
{ "status": "ok" }
```
### 认证文件管理
管理 `auth-dir` 下的 JSON 令牌文件:列出、下载、上传、删除。
- GET `/auth-files` — 列表
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' http://localhost:8317/v0/management/auth-files
```
- 响应:
```json
{ "files": [ { "name": "acc1.json", "size": 1234, "modtime": "2025-08-30T12:34:56Z" } ] }
```
- GET `/auth-files/download?name=<file.json>` — 下载单个文件
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' -OJ 'http://localhost:8317/v0/management/auth-files/download?name=acc1.json'
```
- POST `/auth-files` — 上传
- 请求multipart
```bash
curl -X POST -F 'file=@/path/to/acc1.json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
http://localhost:8317/v0/management/auth-files
```
- 请求(原始 JSON
```bash
curl -X POST -H 'Content-Type: application/json' \
-H 'Authorization: Bearer <MANAGEMENT_KEY>' \
-d @/path/to/acc1.json \
'http://localhost:8317/v0/management/auth-files?name=acc1.json'
```
- 响应:
```json
{ "status": "ok" }
```
- DELETE `/auth-files?name=<file.json>` — 删除单个文件
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' -X DELETE 'http://localhost:8317/v0/management/auth-files?name=acc1.json'
```
- 响应:
```json
{ "status": "ok" }
```
- DELETE `/auth-files?all=true` — 删除 `auth-dir` 下所有 `.json` 文件
- 请求:
```bash
curl -H 'Authorization: Bearer <MANAGEMENT_KEY>' -X DELETE 'http://localhost:8317/v0/management/auth-files?all=true'
```
- 响应:
```json
{ "status": "ok", "deleted": 3 }
```
## 错误响应
通用错误格式:
- 400 Bad Request: `{ "error": "invalid body" }`
- 401 Unauthorized: `{ "error": "missing management key" }` 或 `{ "error": "invalid management key" }`
- 403 Forbidden: `{ "error": "remote management disabled" }`
- 404 Not Found: `{ "error": "item not found" }` 或 `{ "error": "file not found" }`
- 500 Internal Server Error: `{ "error": "failed to save config: ..." }`
## 说明
- 变更会写回 YAML 配置文件,并由文件监控器热重载配置与客户端。
- `allow-remote-management` 与 `remote-management-key` 不能通过 API 修改,需在配置文件中设置。

138
README.md
View File

@@ -6,9 +6,9 @@ A proxy server that provides OpenAI/Gemini/Claude compatible API interfaces for
It now also supports OpenAI Codex (GPT models) and Claude Code via OAuth.
so you can use local or multiaccount CLI access with OpenAIcompatible clients and SDKs.
So you can use local or multi-account CLI access with OpenAI-compatible clients and SDKs.
Now, We added the first Chinese provider: [Qwen Code](https://github.com/QwenLM/qwen-code).
The first Chinese provider has now been added: [Qwen Code](https://github.com/QwenLM/qwen-code).
## Features
@@ -19,12 +19,13 @@ Now, We added the first Chinese provider: [Qwen Code](https://github.com/QwenLM/
- Streaming and non-streaming responses
- Function calling/tools support
- Multimodal input support (text and images)
- Multiple accounts with roundrobin load balancing (Gemini, OpenAI, Claude and Qwen)
- Multiple accounts with round-robin load balancing (Gemini, OpenAI, Claude and Qwen)
- Simple CLI authentication flows (Gemini, OpenAI, Claude and Qwen)
- Generative Language API Key support
- Gemini CLI multiaccount load balancing
- Claude Code multiaccount load balancing
- Qwen Code multiaccount load balancing
- Gemini CLI multi-account load balancing
- Claude Code multi-account load balancing
- Qwen Code multi-account load balancing
- OpenAI-compatible upstream providers via config (e.g., OpenRouter)
## Installation
@@ -59,13 +60,13 @@ You can authenticate for Gemini, OpenAI, and/or Claude. All can coexist in the s
```bash
./cli-proxy-api --login
```
If you are an old gemini code user, you may need to specify a project ID:
If you are an existing Gemini Code user, you may need to specify a project ID:
```bash
./cli-proxy-api --login --project_id <your_project_id>
```
The local OAuth callback uses port `8085`.
Options: add `--no-browser` to print the login URL instead of opening a browser. The local OAuth callback uses port `1455`.
Options: add `--no-browser` to print the login URL instead of opening a browser. The local OAuth callback uses port `8085`.
- OpenAI (Codex/GPT via OAuth):
```bash
@@ -126,7 +127,7 @@ Request body example:
```
Notes:
- Use a `gemini-*` model for Gemini (e.g., `gemini-2.5-pro`), a `gpt-*` model for OpenAI (e.g., `gpt-5`), a `claude-*` model for Claude (e.g., `claude-3-5-sonnet-20241022`), or a `qwen-*` model for Qwen (e.g., `qwen3-coder-plus`). The proxy will route to the correct provider automatically.
- Use a `gemini-*` model for Gemini (e.g., "gemini-2.5-pro"), a `gpt-*` model for OpenAI (e.g., "gpt-5"), a `claude-*` model for Claude (e.g., "claude-3-5-sonnet-20241022"), or a `qwen-*` model for Qwen (e.g., "qwen3-coder-plus"). The proxy will route to the correct provider automatically.
#### Claude Messages (SSE-compatible)
@@ -226,7 +227,7 @@ console.log(await claudeResponse.json());
- claude-3-5-haiku-20241022
- qwen3-coder-plus
- qwen3-coder-flash
- Gemini models autoswitch to preview variants when needed
- Gemini models auto-switch to preview variants when needed
## Configuration
@@ -238,21 +239,30 @@ The server uses a YAML configuration file (`config.yaml`) located in the project
### Configuration Options
| Parameter | Type | Default | Description |
|---------------------------------------|----------|--------------------|-----------------------------------------------------------------------------------------------|
| `port` | integer | 8317 | The port number on which the server will listen |
| `auth-dir` | string | "~/.cli-proxy-api" | Directory where authentication tokens are stored. Supports using `~` for home directory |
| `proxy-url` | string | "" | Proxy url, support socks5/http/https protocol, example: socks5://user:pass@192.168.1.1:1080/ |
| `request-retry` | integer | 0 | Request retry times, if the response http code is 403, 408, 500, 502, 503, 504, it will retry |
| `quota-exceeded` | object | {} | Configuration for handling quota exceeded |
| `quota-exceeded.switch-project` | boolean | true | Whether to automatically switch to another project when a quota is exceeded |
| `quota-exceeded.switch-preview-model` | boolean | true | Whether to automatically switch to a preview model when a quota is exceeded |
| `debug` | boolean | false | Enable debug mode for verbose logging |
| `api-keys` | string[] | [] | List of API keys that can be used to authenticate requests |
| `generative-language-api-key` | string[] | [] | List of Generative Language API keys |
| `claude-api-key` | object | {} | List of Claude API keys |
| `claude-api-key.api-key` | string | "" | Claude API key |
| `claude-api-key.base-url` | string | "" | Custom Claude API endpoint, if you use the third party API endpoint |
| Parameter | Type | Default | Description |
|-----------------------------------------|----------|--------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `port` | integer | 8317 | The port number on which the server will listen. |
| `auth-dir` | string | "~/.cli-proxy-api" | Directory where authentication tokens are stored. Supports using `~` for the home directory. If you use Windows, please set the directory like this: `C:/cli-proxy-api/` |
| `proxy-url` | string | "" | Proxy URL. Supports socks5/http/https protocols. Example: socks5://user:pass@192.168.1.1:1080/ |
| `request-retry` | integer | 0 | Number of times to retry a request. Retries will occur if the HTTP response code is 403, 408, 500, 502, 503, or 504. |
| `remote-management.allow-remote` | boolean | false | Whether to allow remote (non-localhost) access to the management API. If false, only localhost can access. A management key is still required for localhost. |
| `remote-management.secret-key` | string | "" | Management key. If a plaintext value is provided, it will be hashed on startup using bcrypt and persisted back to the config file. If empty, the entire management API is disabled (404). |
| `quota-exceeded` | object | {} | Configuration for handling quota exceeded. |
| `quota-exceeded.switch-project` | boolean | true | Whether to automatically switch to another project when a quota is exceeded. |
| `quota-exceeded.switch-preview-model` | boolean | true | Whether to automatically switch to a preview model when a quota is exceeded. |
| `debug` | boolean | false | Enable debug mode for verbose logging. |
| `api-keys` | string[] | [] | List of API keys that can be used to authenticate requests. |
| `generative-language-api-key` | string[] | [] | List of Generative Language API keys. |
| `claude-api-key` | object | {} | List of Claude API keys. |
| `claude-api-key.api-key` | string | "" | Claude API key. |
| `claude-api-key.base-url` | string | "" | Custom Claude API endpoint, if you use a third-party API endpoint. |
| `openai-compatibility` | object[] | [] | Upstream OpenAI-compatible providers configuration (name, base-url, api-keys, models). |
| `openai-compatibility.*.name` | string | "" | The name of the provider. It will be used in the user agent and other places. |
| `openai-compatibility.*.base-url` | string | "" | The base URL of the provider. |
| `openai-compatibility.*.api-keys` | string[] | [] | The API keys for the provider. Add multiple keys if needed. Omit if unauthenticated access is allowed. |
| `openai-compatibility.*.models` | object[] | [] | The actual model name. |
| `openai-compatibility.*.models.*.name` | string | "" | The models supported by the provider. |
| `openai-compatibility.*.models.*.alias` | string | "" | The alias used in the API. |
### Example Configuration File
@@ -260,16 +270,27 @@ The server uses a YAML configuration file (`config.yaml`) located in the project
# Server port
port: 8317
# Authentication directory (supports ~ for home directory)
# Management API settings
remote-management:
# Whether to allow remote (non-localhost) management access.
# When false, only localhost can access management endpoints (a key is still required).
allow-remote: false
# Management key. If a plaintext value is provided here, it will be hashed on startup.
# All management requests (even from localhost) require this key.
# Leave empty to disable the Management API entirely (404 for all /v0/management routes).
secret-key: ""
# Authentication directory (supports ~ for home directory). If you use Windows, please set the directory like this: `C:/cli-proxy-api/`
auth-dir: "~/.cli-proxy-api"
# Enable debug logging
debug: false
# Proxy url, support socks5/http/https protocol, example: socks5://user:pass@192.168.1.1:1080/
# Proxy URL. Supports socks5/http/https protocols. Example: socks5://user:pass@192.168.1.1:1080/
proxy-url: ""
# Request retry times, if the response http code is 403, 408, 500, 502, 503, 504, it will retry
# Number of times to retry a request. Retries will occur if the HTTP response code is 403, 408, 500, 502, 503, or 504.
request-retry: 3
# Quota exceeded behavior
@@ -294,8 +315,51 @@ claude-api-key:
- api-key: "sk-atSM..." # use the official claude API key, no need to set the base url
- api-key: "sk-atSM..."
base-url: "https://www.example.com" # use the custom claude API endpoint
# OpenAI compatibility providers
openai-compatibility:
- name: "openrouter" # The name of the provider; it will be used in the user agent and other places.
base-url: "https://openrouter.ai/api/v1" # The base URL of the provider.
api-keys: # The API keys for the provider. Add multiple keys if needed. Omit if unauthenticated access is allowed.
- "sk-or-v1-...b780"
- "sk-or-v1-...b781"
models: # The models supported by the provider.
- name: "moonshotai/kimi-k2:free" # The actual model name.
alias: "kimi-k2" # The alias used in the API.
```
### OpenAI Compatibility Providers
Configure upstream OpenAI-compatible providers (e.g., OpenRouter) via `openai-compatibility`.
- name: provider identifier used internally
- base-url: provider base URL
- api-keys: optional list of API keys (omit if provider allows unauthenticated requests)
- models: list of mappings from upstream model `name` to local `alias`
Example:
```yaml
openai-compatibility:
- name: "openrouter"
base-url: "https://openrouter.ai/api/v1"
api-keys:
- "sk-or-v1-...b780"
- "sk-or-v1-...b781"
models:
- name: "moonshotai/kimi-k2:free"
alias: "kimi-k2"
```
Usage:
Call OpenAI's endpoint `/v1/chat/completions` with `model` set to the alias (e.g., `kimi-k2`). The proxy routes to the configured provider/model automatically.
Also, you may call Claude's endpoint `/v1/messages`, Gemini's `/v1beta/models/model-name:streamGenerateContent` or `/v1beta/models/model-name:generateContent`.
And you can always use Gemini CLI with `CODE_ASSIST_ENDPOINT` set to `http://127.0.0.1:8317` for these OpenAI-compatible provider's models.
### Authentication Directory
The `auth-dir` parameter specifies where authentication tokens are stored. When you run the login command, the application will create JSON files in this directory containing the authentication tokens for your Google accounts. Multiple accounts can be used for load balancing.
@@ -327,8 +391,8 @@ export CODE_ASSIST_ENDPOINT="http://127.0.0.1:8317"
The server will relay the `loadCodeAssist`, `onboardUser`, and `countTokens` requests. And automatically load balance the text generation requests between the multiple accounts.
> [!NOTE]
> This feature only allows local access because I couldn't find a way to authenticate the requests.
> I hardcoded `127.0.0.1` into the load balancing.
> This feature only allows local access because there is currently no way to authenticate the requests.
> 127.0.0.1 is hardcoded for load balancing.
## Claude Code with multiple account load balancing
@@ -380,10 +444,16 @@ Run the following command to login (OpenAI OAuth on port 1455):
docker run --rm -p 1455:1455 -v /path/to/your/config.yaml:/CLIProxyAPI/config.yaml -v /path/to/your/auth-dir:/root/.cli-proxy-api eceasy/cli-proxy-api:latest /CLIProxyAPI/CLIProxyAPI --codex-login
```
Run the following command to login (Claude OAuth on port 54545):
Run the following command to logi (Claude OAuth on port 54545):
```bash
docker run --rm -p 54545:54545 -v /path/to/your/config.yaml:/CLIProxyAPI/config.yaml -v /path/to/your/auth-dir:/root/.cli-proxy-api eceasy/cli-proxy-api:latest /CLIProxyAPI/CLIProxyAPI --claude-login
docker run -rm -p 54545:54545 -v /path/to/your/config.yaml:/CLIProxyAPI/config.yaml -v /path/to/your/auth-dir:/root/.cli-proxy-api eceasy/cli-proxy-api:latest /CLIProxyAPI/CLIProxyAPI --claude-login
```
Run the following command to login (Qwen OAuth):
```bash
docker run -it -rm -v /path/to/your/config.yaml:/CLIProxyAPI/config.yaml -v /path/to/your/auth-dir:/root/.cli-proxy-api eceasy/cli-proxy-api:latest /CLIProxyAPI/CLIProxyAPI --qwen-login
```
Run the following command to start the server:
@@ -392,6 +462,10 @@ Run the following command to start the server:
docker run --rm -p 8317:8317 -v /path/to/your/config.yaml:/CLIProxyAPI/config.yaml -v /path/to/your/auth-dir:/root/.cli-proxy-api eceasy/cli-proxy-api:latest
```
## Management API
see [MANAGEMENT_API.md](MANAGEMENT_API.md)
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.

View File

@@ -6,9 +6,9 @@
现已支持通过 OAuth 登录接入 OpenAI CodexGPT 系列)和 Claude Code。
可与本地或多账户方式配合,使用任何 OpenAI 兼容的客户端SDK。
您可以使用本地或多账户的CLI方式通过任何OpenAI兼容的客户端SDK进行访问
在,我们添加了第一个中国提供商:[Qwen Code](https://github.com/QwenLM/qwen-code)。
已新增首个中国提供商:[Qwen Code](https://github.com/QwenLM/qwen-code)。
## 功能特性
@@ -25,6 +25,7 @@
- 支持 Gemini CLI 多账户轮询
- 支持 Claude Code 多账户轮询
- 支持 Qwen Code 多账户轮询
- 通过配置接入上游 OpenAI 兼容提供商(例如 OpenRouter
## 安装
@@ -59,12 +60,14 @@
```bash
./cli-proxy-api --login
```
如果您是旧版 gemini code 用户,可能需要指定项目 ID
如果您是现有的 Gemini Code 用户,可能需要指定一个项目ID
```bash
./cli-proxy-api --login --project_id <your_project_id>
```
本地 OAuth 回调端口为 `8085`。
选项:加上 `--no-browser` 可打印登录地址而不自动打开浏览器。本地 OAuth 回调端口为 `8085`。
- OpenAICodex/GPTOAuth
```bash
./cli-proxy-api --codex-login
@@ -123,7 +126,7 @@ POST http://localhost:8317/v1/chat/completions
```
说明:
- 使用 `gemini-*` 模型(如 `gemini-2.5-pro`)走 Gemini使用 `gpt-*` 模型(如 `gpt-5`)走 OpenAI使用 `claude-*` 模型(如 `claude-3-5-sonnet-20241022`)走 Claude使用 `qwen-*` 模型(如 `qwen3-coder-plus`)走 Qwen服务会自动路由到对应提供商。
- 使用 "gemini-*" 模型("gemini-2.5-pro")来调用 Gemini使用 "gpt-*" 模型("gpt-5")来调用 OpenAI使用 "claude-*" 模型("claude-3-5-sonnet-20241022")来调用 Claude或者使用 "qwen-*" 模型("qwen3-coder-plus")来调用 Qwen。代理服务会自动将请求路由到相应的提供商。
#### Claude 消息SSE 兼容)
@@ -230,26 +233,35 @@ console.log(await claudeResponse.json());
服务器默认使用位于项目根目录的 YAML 配置文件(`config.yaml`)。您可以使用 `--config` 标志指定不同的配置文件路径:
```bash
./cli-proxy-api --config /path/to/your/config.yaml
./cli-proxy-api --config /path/to/your/config.yaml
```
### 配置选项
| 参数 | 类型 | 默认值 | 描述 |
|---------------------------------------|----------|--------------------|------------------------------------------------------------------------|
| `port` | integer | 8317 | 服务器监听的端口号 |
| `auth-dir` | string | "~/.cli-proxy-api" | 存储身份验证令牌的目录。支持使用 `~` 表示主目录 |
| `proxy-url` | string | "" | 代理 URL支持 socks5/http/https 协议,示例socks5://user:pass@192.168.1.1:1080/ |
| `request-retry` | integer | 0 | 请求失败重试次数如果响应的http代码为403、408、500、502、503504则会自动重试 |
| `quota-exceeded` | object | {} | 处理配额超限的配置 |
| `quota-exceeded.switch-project` | boolean | true | 当配额超限时是否自动切换到另一个项目 |
| `quota-exceeded.switch-preview-model` | boolean | true | 当配额超限时是否自动切换到预览模型 |
| `debug` | boolean | false | 启用调试模式以进行详细日志记录 |
| `api-keys` | string[] | [] | 可用于验证请求的 API 密钥列表 |
| `generative-language-api-key` | string[] | [] | 生成式语言 API 密钥列表 |
| `claude-api-key` | object | {} | Claude API 密钥列表 |
| `claude-api-key.api-key` | string | "" | Claude API 密钥 |
| `claude-api-key.base-url` | string | "" | 自定义 Claude API 端点(如果你使用的是第三方 Claude API 端点) |
| 参数 | 类型 | 默认值 | 描述 |
|-----------------------------------------|----------|--------------------|---------------------------------------------------------------------|
| `port` | integer | 8317 | 服务器监听的端口号 |
| `auth-dir` | string | "~/.cli-proxy-api" | 存储身份验证令牌的目录。支持使用 `~` 表示主目录。如果你使用Windows建议设置成`C:/cli-proxy-api/`。 |
| `proxy-url` | string | "" | 代理URL支持socks5/http/https协议。例如socks5://user:pass@192.168.1.1:1080/ |
| `request-retry` | integer | 0 | 请求重试次数如果HTTP响应码为403、408、500、502、503504,将会触发重试 |
| `remote-management.allow-remote` | boolean | false | 是否允许远程非localhost访问管理接口。为false时仅允许本地访问本地访问同样需要管理密钥。 |
| `remote-management.secret-key` | string | "" | 管理密钥。若配置为明文启动时会自动进行bcrypt加密并写回配置文件。若为空管理接口整体不可用404 |
| `quota-exceeded` | object | {} | 用于处理配额超限的配置。 |
| `quota-exceeded.switch-project` | boolean | true | 当配额超限时,是否自动切换到另一个项目。 |
| `quota-exceeded.switch-preview-model` | boolean | true | 当配额超限时,是否自动切换到预览模型。 |
| `debug` | boolean | false | 启用调试模式以获取详细日志。 |
| `api-keys` | string[] | [] | 可用于验证请求的API密钥列表 |
| `generative-language-api-key` | string[] | [] | 生成式语言API密钥列表。 |
| `claude-api-key` | object | {} | Claude API密钥列表。 |
| `claude-api-key.api-key` | string | "" | Claude API密钥。 |
| `claude-api-key.base-url` | string | "" | 自定义的Claude API端点如果您使用第三方的API端点。 |
| `openai-compatibility` | object[] | [] | 上游OpenAI兼容提供商的配置名称、基础URL、API密钥、模型。 |
| `openai-compatibility.*.name` | string | "" | 提供商的名称。它将被用于用户代理User Agent和其他地方。 |
| `openai-compatibility.*.base-url` | string | "" | 提供商的基础URL。 |
| `openai-compatibility.*.api-keys` | string[] | [] | 提供商的API密钥。如果需要可以添加多个密钥。如果允许未经身份验证的访问则可以省略。 |
| `openai-compatibility.*.models` | object[] | [] | 实际的模型名称。 |
| `openai-compatibility.*.models.*.name` | string | "" | 提供商支持的模型。 |
| `openai-compatibility.*.models.*.alias` | string | "" | 在API中使用的别名。 |
### 配置文件示例
@@ -257,16 +269,26 @@ console.log(await claudeResponse.json());
# 服务器端口
port: 8317
# 身份验证目录(支持 ~ 表示主目录)
# 管理 API 设置
remote-management:
# 是否允许远程非localhost访问管理接口。为false时仅允许本地访问但本地访问同样需要管理密钥
allow-remote: false
# 管理密钥。若配置为明文启动时会自动进行bcrypt加密并写回配置文件。
# 所有管理请求(包括本地)都需要该密钥。
# 若为空,/v0/management 整体处于 404禁用
secret-key: ""
# 身份验证目录(支持 ~ 表示主目录。如果你使用Windows建议设置成`C:/cli-proxy-api/`。
auth-dir: "~/.cli-proxy-api"
# 启用调试日志
debug: false
# 代理 URL支持 socks5/http/https 协议,示例socks5://user:pass@192.168.1.1:1080/
# 代理URL支持socks5/http/https协议。例如socks5://user:pass@192.168.1.1:1080/
proxy-url: ""
# 请求失败重试次数如果响应的http代码为403、408、500、502、503504则会自动重试
# 请求重试次数如果HTTP响应码为403、408、500、502、503504,将会触发重试
request-retry: 3
@@ -292,8 +314,46 @@ claude-api-key:
- api-key: "sk-atSM..." # use the official claude API key, no need to set the base url
- api-key: "sk-atSM..."
base-url: "https://www.example.com" # use the custom claude API endpoint
# OpenAI 兼容提供商
openai-compatibility:
- name: "openrouter" # 提供商的名称;它将被用于用户代理和其它地方。
base-url: "https://openrouter.ai/api/v1" # 提供商的基础URL。
api-keys: # 提供商的API密钥。如果需要可以添加多个密钥。如果允许未经身份验证的访问则可以省略。
- "sk-or-v1-...b780"
- "sk-or-v1-...b781"
models: # 提供商支持的模型。
- name: "moonshotai/kimi-k2:free" # 实际的模型名称。
alias: "kimi-k2" # 在API中使用的别名。
```
### OpenAI 兼容上游提供商
通过 `openai-compatibility` 配置上游 OpenAI 兼容提供商(例如 OpenRouter
- name内部识别名
- base-url提供商基础地址
- api-keys可选多密钥轮询若提供商支持无鉴权可省略
- models将上游模型 `name` 映射为本地可用 `alias`
示例:
```yaml
openai-compatibility:
- name: "openrouter"
base-url: "https://openrouter.ai/api/v1"
api-keys:
- "sk-or-v1-...b780"
- "sk-or-v1-...b781"
models:
- name: "moonshotai/kimi-k2:free"
alias: "kimi-k2"
```
使用方式:在 `/v1/chat/completions` 中将 `model` 设为别名(如 `kimi-k2`),代理将自动路由到对应提供商与模型。
并且对于这些与OpenAI兼容的提供商模型您始终可以通过将CODE_ASSIST_ENDPOINT设置为 http://127.0.0.1:8317 来使用Gemini CLI。
### 身份验证目录
`auth-dir` 参数指定身份验证令牌的存储位置。当您运行登录命令时,应用程序将在此目录中创建包含 Google 账户身份验证令牌的 JSON 文件。多个账户可用于轮询。
@@ -325,7 +385,7 @@ export CODE_ASSIST_ENDPOINT="http://127.0.0.1:8317"
服务器将中继 `loadCodeAssist`、`onboardUser` 和 `countTokens` 请求。并自动在多个账户之间轮询文本生成请求。
> [!NOTE]
> 此功能仅允许本地访问,因为找不到一个可以验证请求的方法。
> 此功能仅允许本地访问,因为找不到一个可以验证请求的方法。
> 所以只能强制只有 `127.0.0.1` 可以访问。
## Claude Code 的使用方法
@@ -385,12 +445,23 @@ docker run --rm -p 1455:1455 -v /path/to/your/config.yaml:/CLIProxyAPI/config.ya
docker run --rm -p 54545:54545 -v /path/to/your/config.yaml:/CLIProxyAPI/config.yaml -v /path/to/your/auth-dir:/root/.cli-proxy-api eceasy/cli-proxy-api:latest /CLIProxyAPI/CLIProxyAPI --claude-login
```
运行以下命令进行登录Qwen OAuth
```bash
docker run -it -rm -v /path/to/your/config.yaml:/CLIProxyAPI/config.yaml -v /path/to/your/auth-dir:/root/.cli-proxy-api eceasy/cli-proxy-api:latest /CLIProxyAPI/CLIProxyAPI --qwen-login
```
运行以下命令启动服务器:
```bash
docker run --rm -p 8317:8317 -v /path/to/your/config.yaml:/CLIProxyAPI/config.yaml -v /path/to/your/auth-dir:/root/.cli-proxy-api eceasy/cli-proxy-api:latest
```
## 管理 API 文档
请参见 [MANAGEMENT_API_CN.md](MANAGEMENT_API_CN.md)
## 贡献
欢迎贡献!请随时提交 Pull Request。

View File

@@ -1,23 +1,40 @@
# Server configuration
# Server port
port: 8317
# Management API settings
remote-management:
# Whether to allow remote (non-localhost) management access.
# When false, only localhost can access management endpoints (a key is still required).
allow-remote: false
# Management key. If a plaintext value is provided here, it will be hashed on startup.
# All management requests (even from localhost) require this key.
# Leave empty to disable the Management API entirely (404 for all /v0/management routes).
secret-key: ""
# Authentication directory (supports ~ for home directory)
auth-dir: "~/.cli-proxy-api"
debug: true
# Enable debug logging
debug: false
# Proxy URL. Supports socks5/http/https protocols. Example: socks5://user:pass@192.168.1.1:1080/
proxy-url: ""
# Request retry times, if the response http code is 403, 408, 500, 502, 503, 504, it will retry
# Number of times to retry a request. Retries will occur if the HTTP response code is 403, 408, 500, 502, 503, or 504.
request-retry: 3
# Quota exceeded behavior
quota-exceeded:
switch-project: true
switch-preview-model: true
switch-project: true # Whether to automatically switch to another project when a quota is exceeded
switch-preview-model: true # Whether to automatically switch to a preview model when a quota is exceeded
# API keys for client authentication
# API keys for authentication
api-keys:
- "12345"
- "23456"
- "your-api-key-1"
- "your-api-key-2"
# Generative language API keys
# API keys for official Generative Language API
generative-language-api-key:
- "AIzaSy...01"
- "AIzaSy...02"
@@ -29,3 +46,14 @@ claude-api-key:
- api-key: "sk-atSM..." # use the official claude API key, no need to set the base url
- api-key: "sk-atSM..."
base-url: "https://www.example.com" # use the custom claude API endpoint
# OpenAI compatibility providers
openai-compatibility:
- name: "openrouter" # The name of the provider; it will be used in the user agent and other places.
base-url: "https://openrouter.ai/api/v1" # The base URL of the provider.
api-keys: # The API keys for the provider. Add multiple keys if needed. Omit if unauthenticated access is allowed.
- "sk-or-v1-...b780"
- "sk-or-v1-...b781"
models: # The models supported by the provider.
- name: "moonshotai/kimi-k2:free" # The actual model name.
alias: "kimi-k2" # The alias used in the API.

View File

@@ -16,6 +16,8 @@ import (
"github.com/luispater/CLIProxyAPI/internal/api/handlers"
. "github.com/luispater/CLIProxyAPI/internal/constant"
"github.com/luispater/CLIProxyAPI/internal/interfaces"
"github.com/luispater/CLIProxyAPI/internal/registry"
"github.com/luispater/CLIProxyAPI/internal/util"
log "github.com/sirupsen/logrus"
"github.com/tidwall/gjson"
)
@@ -47,7 +49,9 @@ func (h *ClaudeCodeAPIHandler) HandlerType() string {
// Models returns a list of models supported by this handler.
func (h *ClaudeCodeAPIHandler) Models() []map[string]any {
return make([]map[string]any, 0)
// Get dynamic models from the global registry
modelRegistry := registry.GetGlobalRegistry()
return modelRegistry.GetAvailableModels("claude")
}
// ClaudeMessages handles Claude-compatible streaming chat completions.
@@ -79,6 +83,17 @@ func (h *ClaudeCodeAPIHandler) ClaudeMessages(c *gin.Context) {
h.handleStreamingResponse(c, rawJSON)
}
// ClaudeModels handles the Claude models listing endpoint.
// It returns a JSON response containing available Claude models and their specifications.
//
// Parameters:
// - c: The Gin context for the request.
func (h *ClaudeCodeAPIHandler) ClaudeModels(c *gin.Context) {
c.JSON(http.StatusOK, gin.H{
"data": h.Models(),
})
}
// handleStreamingResponse streams Claude-compatible responses backed by Gemini.
// It sets up SSE, selects a backend client with rotation/quota logic,
// forwards chunks, and translates them to Claude CLI format.
@@ -119,7 +134,9 @@ func (h *ClaudeCodeAPIHandler) handleStreamingResponse(c *gin.Context, rawJSON [
// Ensure the client's mutex is unlocked on function exit.
// This prevents deadlocks and ensures proper resource cleanup
if cliClient != nil {
cliClient.GetRequestMutex().Unlock()
if mutex := cliClient.GetRequestMutex(); mutex != nil {
mutex.Unlock()
}
}
}()
retryCount := 0
@@ -148,7 +165,7 @@ outLoop:
// Detects when the HTTP client has disconnected and cleans up resources
case <-c.Request.Context().Done():
if c.Request.Context().Err().Error() == "context canceled" {
log.Debugf("GeminiClient disconnected: %v", c.Request.Context().Err())
log.Debugf("claude client disconnected: %v", c.Request.Context().Err())
cliCancel() // Cancel the backend request to prevent resource leaks
return
}
@@ -177,7 +194,15 @@ outLoop:
continue outLoop // Restart the client selection process
}
case 403, 408, 500, 502, 503, 504:
log.Debugf("http status code %d, switch client", errInfo.StatusCode)
log.Debugf("http status code %d, switch client, %s", errInfo.StatusCode, util.HideAPIKey(cliClient.GetEmail()))
retryCount++
continue outLoop
case 401:
log.Debugf("unauthorized request, try to refresh token, %s", util.HideAPIKey(cliClient.GetEmail()))
err := cliClient.RefreshTokens(cliCtx)
if err != nil {
log.Debugf("refresh token failed, switch client, %s", util.HideAPIKey(cliClient.GetEmail()))
}
retryCount++
continue outLoop
default:

View File

@@ -163,7 +163,9 @@ func (h *GeminiCLIAPIHandler) handleInternalStreamGenerateContent(c *gin.Context
defer func() {
// Ensure the client's mutex is unlocked on function exit.
if cliClient != nil {
cliClient.GetRequestMutex().Unlock()
if mutex := cliClient.GetRequestMutex(); mutex != nil {
mutex.Unlock()
}
}
}()
@@ -188,7 +190,7 @@ outLoop:
// Handle client disconnection.
case <-c.Request.Context().Done():
if c.Request.Context().Err().Error() == "context canceled" {
log.Debugf("Client disconnected: %v", c.Request.Context().Err())
log.Debugf("gemini cli client disconnected: %v", c.Request.Context().Err())
cliCancel() // Cancel the backend request.
return
}
@@ -244,7 +246,9 @@ func (h *GeminiCLIAPIHandler) handleInternalGenerateContent(c *gin.Context, rawJ
var cliClient interfaces.Client
defer func() {
if cliClient != nil {
cliClient.GetRequestMutex().Unlock()
if mutex := cliClient.GetRequestMutex(); mutex != nil {
mutex.Unlock()
}
}
}()
@@ -271,6 +275,14 @@ func (h *GeminiCLIAPIHandler) handleInternalGenerateContent(c *gin.Context, rawJ
log.Debugf("http status code %d, switch client", err.StatusCode)
retryCount++
continue
case 401:
log.Debugf("unauthorized request, try to refresh token, %s", util.HideAPIKey(cliClient.GetEmail()))
errRefreshTokens := cliClient.RefreshTokens(cliCtx)
if errRefreshTokens != nil {
log.Debugf("refresh token failed, switch client, %s", util.HideAPIKey(cliClient.GetEmail()))
}
retryCount++
continue
default:
// Forward other errors directly to the client
c.Status(err.StatusCode)

View File

@@ -16,6 +16,8 @@ import (
"github.com/luispater/CLIProxyAPI/internal/api/handlers"
. "github.com/luispater/CLIProxyAPI/internal/constant"
"github.com/luispater/CLIProxyAPI/internal/interfaces"
"github.com/luispater/CLIProxyAPI/internal/registry"
"github.com/luispater/CLIProxyAPI/internal/util"
log "github.com/sirupsen/logrus"
)
@@ -40,62 +42,9 @@ func (h *GeminiAPIHandler) HandlerType() string {
// Models returns the Gemini-compatible model metadata supported by this handler.
func (h *GeminiAPIHandler) Models() []map[string]any {
return []map[string]any{
{
"name": "models/gemini-2.5-flash",
"version": "001",
"displayName": "Gemini 2.5 Flash",
"description": "Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025.",
"inputTokenLimit": 1048576,
"outputTokenLimit": 65536,
"supportedGenerationMethods": []string{
"generateContent",
"countTokens",
"createCachedContent",
"batchGenerateContent",
},
"temperature": 1,
"topP": 0.95,
"topK": 64,
"maxTemperature": 2,
"thinking": true,
},
{
"name": "models/gemini-2.5-pro",
"version": "2.5",
"displayName": "Gemini 2.5 Pro",
"description": "Stable release (June 17th, 2025) of Gemini 2.5 Pro",
"inputTokenLimit": 1048576,
"outputTokenLimit": 65536,
"supportedGenerationMethods": []string{
"generateContent",
"countTokens",
"createCachedContent",
"batchGenerateContent",
},
"temperature": 1,
"topP": 0.95,
"topK": 64,
"maxTemperature": 2,
"thinking": true,
},
{
"name": "gpt-5",
"version": "001",
"displayName": "GPT 5",
"description": "Stable version of GPT 5, The best model for coding and agentic tasks across domains.",
"inputTokenLimit": 400000,
"outputTokenLimit": 128000,
"supportedGenerationMethods": []string{
"generateContent",
},
"temperature": 1,
"topP": 0.95,
"topK": 64,
"maxTemperature": 2,
"thinking": true,
},
}
// Get dynamic models from the global registry
modelRegistry := registry.GetGlobalRegistry()
return modelRegistry.GetAvailableModels("gemini")
}
// GeminiModels handles the Gemini models listing endpoint.
@@ -266,7 +215,9 @@ func (h *GeminiAPIHandler) handleStreamGenerateContent(c *gin.Context, modelName
defer func() {
// Ensure the client's mutex is unlocked on function exit.
if cliClient != nil {
cliClient.GetRequestMutex().Unlock()
if mutex := cliClient.GetRequestMutex(); mutex != nil {
mutex.Unlock()
}
}
}()
@@ -290,7 +241,7 @@ outLoop:
// Handle client disconnection.
case <-c.Request.Context().Done():
if c.Request.Context().Err().Error() == "context canceled" {
log.Debugf("GeminiClient disconnected: %v", c.Request.Context().Err())
log.Debugf("gemini client disconnected: %v", c.Request.Context().Err())
cliCancel() // Cancel the backend request.
return
}
@@ -355,7 +306,9 @@ func (h *GeminiAPIHandler) handleCountTokens(c *gin.Context, modelName string, r
var cliClient interfaces.Client
defer func() {
if cliClient != nil {
cliClient.GetRequestMutex().Unlock()
if mutex := cliClient.GetRequestMutex(); mutex != nil {
mutex.Unlock()
}
}
}()
@@ -406,7 +359,9 @@ func (h *GeminiAPIHandler) handleGenerateContent(c *gin.Context, modelName strin
var cliClient interfaces.Client
defer func() {
if cliClient != nil {
cliClient.GetRequestMutex().Unlock()
if mutex := cliClient.GetRequestMutex(); mutex != nil {
mutex.Unlock()
}
}
}()
@@ -433,6 +388,14 @@ func (h *GeminiAPIHandler) handleGenerateContent(c *gin.Context, modelName strin
log.Debugf("http status code %d, switch client", err.StatusCode)
retryCount++
continue
case 401:
log.Debugf("unauthorized request, try to refresh token, %s", util.HideAPIKey(cliClient.GetEmail()))
errRefreshTokens := cliClient.RefreshTokens(cliCtx)
if errRefreshTokens != nil {
log.Debugf("refresh token failed, switch client, %s", util.HideAPIKey(cliClient.GetEmail()))
}
retryCount++
continue
default:
// Forward other errors directly to the client
c.Status(err.StatusCode)

View File

@@ -102,18 +102,19 @@ func (h *BaseAPIHandler) GetClient(modelName string, isGenerateContent ...bool)
}
}
// Lock the mutex to update the last used client index
h.Mutex.Lock()
if _, hasKey := h.LastUsedClientIndex[modelName]; !hasKey {
h.LastUsedClientIndex[modelName] = 0
}
if len(clients) == 0 {
h.Mutex.Unlock()
return nil, &interfaces.ErrorMessage{StatusCode: 500, Error: fmt.Errorf("no clients available")}
}
var cliClient interfaces.Client
// Lock the mutex to update the last used client index
h.Mutex.Lock()
startIndex := h.LastUsedClientIndex[modelName]
if (len(isGenerateContent) > 0 && isGenerateContent[0]) || len(isGenerateContent) == 0 {
currentIndex := (startIndex + 1) % len(clients)
@@ -136,6 +137,8 @@ func (h *BaseAPIHandler) GetClient(modelName string, isGenerateContent ...bool)
log.Debugf("Claude Model %s is quota exceeded for account %s", modelName, cliClient.GetEmail())
} else if cliClient.Provider() == "qwen" {
log.Debugf("Qwen Model %s is quota exceeded for account %s", modelName, cliClient.GetEmail())
} else if cliClient.Type() == "openai-compatibility" {
log.Debugf("OpenAI Compatibility Model %s is quota exceeded for provider %s", modelName, cliClient.Provider())
}
cliClient = nil
continue
@@ -145,7 +148,7 @@ func (h *BaseAPIHandler) GetClient(modelName string, isGenerateContent ...bool)
}
if len(reorderedClients) == 0 {
if util.GetProviderName(modelName) == "claude" {
if util.GetProviderName(modelName, h.Cfg) == "claude" {
// log.Debugf("Claude Model %s is quota exceeded for all accounts", modelName)
return nil, &interfaces.ErrorMessage{StatusCode: 429, Error: fmt.Errorf(`{"type":"error","error":{"type":"rate_limit_error","message":"This request would exceed your account's rate limit. Please try again later."}}`)}
}
@@ -155,14 +158,20 @@ func (h *BaseAPIHandler) GetClient(modelName string, isGenerateContent ...bool)
locked := false
for i := 0; i < len(reorderedClients); i++ {
cliClient = reorderedClients[i]
if cliClient.GetRequestMutex().TryLock() {
if mutex := cliClient.GetRequestMutex(); mutex != nil {
if mutex.TryLock() {
locked = true
break
}
} else {
locked = true
break
}
}
if !locked {
cliClient = clients[0]
cliClient.GetRequestMutex().Lock()
if mutex := cliClient.GetRequestMutex(); mutex != nil {
mutex.Lock()
}
}
return cliClient, nil

View File

@@ -0,0 +1,139 @@
package management
import (
"fmt"
"io"
"os"
"path/filepath"
"strings"
"github.com/gin-gonic/gin"
)
// List auth files
func (h *Handler) ListAuthFiles(c *gin.Context) {
entries, err := os.ReadDir(h.cfg.AuthDir)
if err != nil {
c.JSON(500, gin.H{"error": fmt.Sprintf("failed to read auth dir: %v", err)})
return
}
files := make([]gin.H, 0)
for _, e := range entries {
if e.IsDir() {
continue
}
name := e.Name()
if !strings.HasSuffix(strings.ToLower(name), ".json") {
continue
}
if info, errInfo := e.Info(); errInfo == nil {
files = append(files, gin.H{"name": name, "size": info.Size(), "modtime": info.ModTime()})
}
}
c.JSON(200, gin.H{"files": files})
}
// Download single auth file by name
func (h *Handler) DownloadAuthFile(c *gin.Context) {
name := c.Query("name")
if name == "" || strings.Contains(name, string(os.PathSeparator)) {
c.JSON(400, gin.H{"error": "invalid name"})
return
}
if !strings.HasSuffix(strings.ToLower(name), ".json") {
c.JSON(400, gin.H{"error": "name must end with .json"})
return
}
full := filepath.Join(h.cfg.AuthDir, name)
data, err := os.ReadFile(full)
if err != nil {
if os.IsNotExist(err) {
c.JSON(404, gin.H{"error": "file not found"})
} else {
c.JSON(500, gin.H{"error": fmt.Sprintf("failed to read file: %v", err)})
}
return
}
c.Header("Content-Disposition", fmt.Sprintf("attachment; filename=\"%s\"", name))
c.Data(200, "application/json", data)
}
// Upload auth file: multipart or raw JSON with ?name=
func (h *Handler) UploadAuthFile(c *gin.Context) {
if file, err := c.FormFile("file"); err == nil && file != nil {
name := filepath.Base(file.Filename)
if !strings.HasSuffix(strings.ToLower(name), ".json") {
c.JSON(400, gin.H{"error": "file must be .json"})
return
}
dst := filepath.Join(h.cfg.AuthDir, name)
if errSave := c.SaveUploadedFile(file, dst); errSave != nil {
c.JSON(500, gin.H{"error": fmt.Sprintf("failed to save file: %v", errSave)})
return
}
c.JSON(200, gin.H{"status": "ok"})
return
}
name := c.Query("name")
if name == "" || strings.Contains(name, string(os.PathSeparator)) {
c.JSON(400, gin.H{"error": "invalid name"})
return
}
if !strings.HasSuffix(strings.ToLower(name), ".json") {
c.JSON(400, gin.H{"error": "name must end with .json"})
return
}
data, err := io.ReadAll(c.Request.Body)
if err != nil {
c.JSON(400, gin.H{"error": "failed to read body"})
return
}
dst := filepath.Join(h.cfg.AuthDir, filepath.Base(name))
if errWrite := os.WriteFile(dst, data, 0o600); errWrite != nil {
c.JSON(500, gin.H{"error": fmt.Sprintf("failed to write file: %v", errWrite)})
return
}
c.JSON(200, gin.H{"status": "ok"})
}
// Delete auth files: single by name or all
func (h *Handler) DeleteAuthFile(c *gin.Context) {
if all := c.Query("all"); all == "true" || all == "1" || all == "*" {
entries, err := os.ReadDir(h.cfg.AuthDir)
if err != nil {
c.JSON(500, gin.H{"error": fmt.Sprintf("failed to read auth dir: %v", err)})
return
}
deleted := 0
for _, e := range entries {
if e.IsDir() {
continue
}
name := e.Name()
if !strings.HasSuffix(strings.ToLower(name), ".json") {
continue
}
full := filepath.Join(h.cfg.AuthDir, name)
if err = os.Remove(full); err == nil {
deleted++
}
}
c.JSON(200, gin.H{"status": "ok", "deleted": deleted})
return
}
name := c.Query("name")
if name == "" || strings.Contains(name, string(os.PathSeparator)) {
c.JSON(400, gin.H{"error": "invalid name"})
return
}
full := filepath.Join(h.cfg.AuthDir, filepath.Base(name))
if err := os.Remove(full); err != nil {
if os.IsNotExist(err) {
c.JSON(404, gin.H{"error": "file not found"})
} else {
c.JSON(500, gin.H{"error": fmt.Sprintf("failed to remove file: %v", err)})
}
return
}
c.JSON(200, gin.H{"status": "ok"})
}

View File

@@ -0,0 +1,41 @@
package management
import (
"github.com/gin-gonic/gin"
)
// Debug
func (h *Handler) GetDebug(c *gin.Context) { c.JSON(200, gin.H{"debug": h.cfg.Debug}) }
func (h *Handler) PutDebug(c *gin.Context) { h.updateBoolField(c, func(v bool) { h.cfg.Debug = v }) }
// Request log
func (h *Handler) GetRequestLog(c *gin.Context) { c.JSON(200, gin.H{"request-log": h.cfg.RequestLog}) }
func (h *Handler) PutRequestLog(c *gin.Context) {
h.updateBoolField(c, func(v bool) { h.cfg.RequestLog = v })
}
// Request retry
func (h *Handler) GetRequestRetry(c *gin.Context) {
c.JSON(200, gin.H{"request-retry": h.cfg.RequestRetry})
}
func (h *Handler) PutRequestRetry(c *gin.Context) {
h.updateIntField(c, func(v int) { h.cfg.RequestRetry = v })
}
// Allow localhost unauthenticated
func (h *Handler) GetAllowLocalhost(c *gin.Context) {
c.JSON(200, gin.H{"allow-localhost-unauthenticated": h.cfg.AllowLocalhostUnauthenticated})
}
func (h *Handler) PutAllowLocalhost(c *gin.Context) {
h.updateBoolField(c, func(v bool) { h.cfg.AllowLocalhostUnauthenticated = v })
}
// Proxy URL
func (h *Handler) GetProxyURL(c *gin.Context) { c.JSON(200, gin.H{"proxy-url": h.cfg.ProxyURL}) }
func (h *Handler) PutProxyURL(c *gin.Context) {
h.updateStringField(c, func(v string) { h.cfg.ProxyURL = v })
}
func (h *Handler) DeleteProxyURL(c *gin.Context) {
h.cfg.ProxyURL = ""
h.persist(c)
}

View File

@@ -0,0 +1,252 @@
package management
import (
"encoding/json"
"fmt"
"github.com/gin-gonic/gin"
"github.com/luispater/CLIProxyAPI/internal/config"
)
// Generic helpers for list[string]
func (h *Handler) putStringList(c *gin.Context, set func([]string)) {
data, err := c.GetRawData()
if err != nil {
c.JSON(400, gin.H{"error": "failed to read body"})
return
}
var arr []string
if err = json.Unmarshal(data, &arr); err != nil {
var obj struct {
Items []string `json:"items"`
}
if err2 := json.Unmarshal(data, &obj); err2 != nil || len(obj.Items) == 0 {
c.JSON(400, gin.H{"error": "invalid body"})
return
}
arr = obj.Items
}
set(arr)
h.persist(c)
}
func (h *Handler) patchStringList(c *gin.Context, target *[]string) {
var body struct {
Old *string `json:"old"`
New *string `json:"new"`
Index *int `json:"index"`
Value *string `json:"value"`
}
if err := c.ShouldBindJSON(&body); err != nil {
c.JSON(400, gin.H{"error": "invalid body"})
return
}
if body.Index != nil && body.Value != nil && *body.Index >= 0 && *body.Index < len(*target) {
(*target)[*body.Index] = *body.Value
h.persist(c)
return
}
if body.Old != nil && body.New != nil {
for i := range *target {
if (*target)[i] == *body.Old {
(*target)[i] = *body.New
h.persist(c)
return
}
}
*target = append(*target, *body.New)
h.persist(c)
return
}
c.JSON(400, gin.H{"error": "missing fields"})
}
func (h *Handler) deleteFromStringList(c *gin.Context, target *[]string) {
if idxStr := c.Query("index"); idxStr != "" {
var idx int
_, err := fmt.Sscanf(idxStr, "%d", &idx)
if err == nil && idx >= 0 && idx < len(*target) {
*target = append((*target)[:idx], (*target)[idx+1:]...)
h.persist(c)
return
}
}
if val := c.Query("value"); val != "" {
out := make([]string, 0, len(*target))
for _, v := range *target {
if v != val {
out = append(out, v)
}
}
*target = out
h.persist(c)
return
}
c.JSON(400, gin.H{"error": "missing index or value"})
}
// api-keys
func (h *Handler) GetAPIKeys(c *gin.Context) { c.JSON(200, gin.H{"api-keys": h.cfg.APIKeys}) }
func (h *Handler) PutAPIKeys(c *gin.Context) {
h.putStringList(c, func(v []string) { h.cfg.APIKeys = v })
}
func (h *Handler) PatchAPIKeys(c *gin.Context) { h.patchStringList(c, &h.cfg.APIKeys) }
func (h *Handler) DeleteAPIKeys(c *gin.Context) { h.deleteFromStringList(c, &h.cfg.APIKeys) }
// generative-language-api-key
func (h *Handler) GetGlKeys(c *gin.Context) {
c.JSON(200, gin.H{"generative-language-api-key": h.cfg.GlAPIKey})
}
func (h *Handler) PutGlKeys(c *gin.Context) {
h.putStringList(c, func(v []string) { h.cfg.GlAPIKey = v })
}
func (h *Handler) PatchGlKeys(c *gin.Context) { h.patchStringList(c, &h.cfg.GlAPIKey) }
func (h *Handler) DeleteGlKeys(c *gin.Context) { h.deleteFromStringList(c, &h.cfg.GlAPIKey) }
// claude-api-key: []ClaudeKey
func (h *Handler) GetClaudeKeys(c *gin.Context) {
c.JSON(200, gin.H{"claude-api-key": h.cfg.ClaudeKey})
}
func (h *Handler) PutClaudeKeys(c *gin.Context) {
data, err := c.GetRawData()
if err != nil {
c.JSON(400, gin.H{"error": "failed to read body"})
return
}
var arr []config.ClaudeKey
if err = json.Unmarshal(data, &arr); err != nil {
var obj struct {
Items []config.ClaudeKey `json:"items"`
}
if err2 := json.Unmarshal(data, &obj); err2 != nil || len(obj.Items) == 0 {
c.JSON(400, gin.H{"error": "invalid body"})
return
}
arr = obj.Items
}
h.cfg.ClaudeKey = arr
h.persist(c)
}
func (h *Handler) PatchClaudeKey(c *gin.Context) {
var body struct {
Index *int `json:"index"`
Match *string `json:"match"`
Value *config.ClaudeKey `json:"value"`
}
if err := c.ShouldBindJSON(&body); err != nil || body.Value == nil {
c.JSON(400, gin.H{"error": "invalid body"})
return
}
if body.Index != nil && *body.Index >= 0 && *body.Index < len(h.cfg.ClaudeKey) {
h.cfg.ClaudeKey[*body.Index] = *body.Value
h.persist(c)
return
}
if body.Match != nil {
for i := range h.cfg.ClaudeKey {
if h.cfg.ClaudeKey[i].APIKey == *body.Match {
h.cfg.ClaudeKey[i] = *body.Value
h.persist(c)
return
}
}
}
c.JSON(404, gin.H{"error": "item not found"})
}
func (h *Handler) DeleteClaudeKey(c *gin.Context) {
if val := c.Query("api-key"); val != "" {
out := make([]config.ClaudeKey, 0, len(h.cfg.ClaudeKey))
for _, v := range h.cfg.ClaudeKey {
if v.APIKey != val {
out = append(out, v)
}
}
h.cfg.ClaudeKey = out
h.persist(c)
return
}
if idxStr := c.Query("index"); idxStr != "" {
var idx int
_, err := fmt.Sscanf(idxStr, "%d", &idx)
if err == nil && idx >= 0 && idx < len(h.cfg.ClaudeKey) {
h.cfg.ClaudeKey = append(h.cfg.ClaudeKey[:idx], h.cfg.ClaudeKey[idx+1:]...)
h.persist(c)
return
}
}
c.JSON(400, gin.H{"error": "missing api-key or index"})
}
// openai-compatibility: []OpenAICompatibility
func (h *Handler) GetOpenAICompat(c *gin.Context) {
c.JSON(200, gin.H{"openai-compatibility": h.cfg.OpenAICompatibility})
}
func (h *Handler) PutOpenAICompat(c *gin.Context) {
data, err := c.GetRawData()
if err != nil {
c.JSON(400, gin.H{"error": "failed to read body"})
return
}
var arr []config.OpenAICompatibility
if err = json.Unmarshal(data, &arr); err != nil {
var obj struct {
Items []config.OpenAICompatibility `json:"items"`
}
if err2 := json.Unmarshal(data, &obj); err2 != nil || len(obj.Items) == 0 {
c.JSON(400, gin.H{"error": "invalid body"})
return
}
arr = obj.Items
}
h.cfg.OpenAICompatibility = arr
h.persist(c)
}
func (h *Handler) PatchOpenAICompat(c *gin.Context) {
var body struct {
Name *string `json:"name"`
Index *int `json:"index"`
Value *config.OpenAICompatibility `json:"value"`
}
if err := c.ShouldBindJSON(&body); err != nil || body.Value == nil {
c.JSON(400, gin.H{"error": "invalid body"})
return
}
if body.Index != nil && *body.Index >= 0 && *body.Index < len(h.cfg.OpenAICompatibility) {
h.cfg.OpenAICompatibility[*body.Index] = *body.Value
h.persist(c)
return
}
if body.Name != nil {
for i := range h.cfg.OpenAICompatibility {
if h.cfg.OpenAICompatibility[i].Name == *body.Name {
h.cfg.OpenAICompatibility[i] = *body.Value
h.persist(c)
return
}
}
}
c.JSON(404, gin.H{"error": "item not found"})
}
func (h *Handler) DeleteOpenAICompat(c *gin.Context) {
if name := c.Query("name"); name != "" {
out := make([]config.OpenAICompatibility, 0, len(h.cfg.OpenAICompatibility))
for _, v := range h.cfg.OpenAICompatibility {
if v.Name != name {
out = append(out, v)
}
}
h.cfg.OpenAICompatibility = out
h.persist(c)
return
}
if idxStr := c.Query("index"); idxStr != "" {
var idx int
_, err := fmt.Sscanf(idxStr, "%d", &idx)
if err == nil && idx >= 0 && idx < len(h.cfg.OpenAICompatibility) {
h.cfg.OpenAICompatibility = append(h.cfg.OpenAICompatibility[:idx], h.cfg.OpenAICompatibility[idx+1:]...)
h.persist(c)
return
}
}
c.JSON(400, gin.H{"error": "missing name or index"})
}

View File

@@ -0,0 +1,140 @@
// Package management provides the management API handlers and middleware
// for configuring the server and managing auth files.
package management
import (
"fmt"
"net/http"
"strings"
"sync"
"github.com/gin-gonic/gin"
"github.com/luispater/CLIProxyAPI/internal/config"
"golang.org/x/crypto/bcrypt"
)
// Handler aggregates config reference, persistence path and helpers.
type Handler struct {
cfg *config.Config
configFilePath string
mu sync.Mutex
}
// NewHandler creates a new management handler instance.
func NewHandler(cfg *config.Config, configFilePath string) *Handler {
return &Handler{cfg: cfg, configFilePath: configFilePath}
}
// SetConfig updates the in-memory config reference when the server hot-reloads.
func (h *Handler) SetConfig(cfg *config.Config) { h.cfg = cfg }
// Middleware enforces access control for management endpoints.
// All requests (local and remote) require a valid management key.
// Additionally, remote access requires allow-remote-management=true.
func (h *Handler) Middleware() gin.HandlerFunc {
return func(c *gin.Context) {
clientIP := c.ClientIP()
// Remote access control: when not loopback, must be enabled
if !(clientIP == "127.0.0.1" || clientIP == "::1") {
allowRemote := h.cfg.RemoteManagement.AllowRemote
if !allowRemote {
allowRemote = true
}
if !allowRemote {
c.AbortWithStatusJSON(http.StatusForbidden, gin.H{"error": "remote management disabled"})
return
}
}
secret := h.cfg.RemoteManagement.SecretKey
if secret == "" {
c.AbortWithStatusJSON(http.StatusForbidden, gin.H{"error": "remote management key not set"})
return
}
// Accept either Authorization: Bearer <key> or X-Management-Key
var provided string
if ah := c.GetHeader("Authorization"); ah != "" {
parts := strings.SplitN(ah, " ", 2)
if len(parts) == 2 && strings.ToLower(parts[0]) == "bearer" {
provided = parts[1]
} else {
provided = ah
}
}
if provided == "" {
provided = c.GetHeader("X-Management-Key")
}
if provided == "" {
c.AbortWithStatusJSON(http.StatusUnauthorized, gin.H{"error": "missing management key"})
return
}
if err := bcrypt.CompareHashAndPassword([]byte(secret), []byte(provided)); err != nil {
c.AbortWithStatusJSON(http.StatusUnauthorized, gin.H{"error": "invalid management key"})
return
}
c.Next()
}
}
// persist saves the current in-memory config to disk.
func (h *Handler) persist(c *gin.Context) bool {
h.mu.Lock()
defer h.mu.Unlock()
// Preserve comments when writing
if err := config.SaveConfigPreserveComments(h.configFilePath, h.cfg); err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": fmt.Sprintf("failed to save config: %v", err)})
return false
}
c.JSON(http.StatusOK, gin.H{"status": "ok"})
return true
}
// Helper methods for simple types
func (h *Handler) updateBoolField(c *gin.Context, set func(bool)) {
var body struct {
Value *bool `json:"value"`
}
if err := c.ShouldBindJSON(&body); err != nil || body.Value == nil {
var m map[string]any
if err2 := c.ShouldBindJSON(&m); err2 == nil {
for _, v := range m {
if b, ok := v.(bool); ok {
set(b)
h.persist(c)
return
}
}
}
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid body"})
return
}
set(*body.Value)
h.persist(c)
}
func (h *Handler) updateIntField(c *gin.Context, set func(int)) {
var body struct {
Value *int `json:"value"`
}
if err := c.ShouldBindJSON(&body); err != nil || body.Value == nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid body"})
return
}
set(*body.Value)
h.persist(c)
}
func (h *Handler) updateStringField(c *gin.Context, set func(string)) {
var body struct {
Value *string `json:"value"`
}
if err := c.ShouldBindJSON(&body); err != nil || body.Value == nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid body"})
return
}
set(*body.Value)
h.persist(c)
}

View File

@@ -0,0 +1,18 @@
package management
import "github.com/gin-gonic/gin"
// Quota exceeded toggles
func (h *Handler) GetSwitchProject(c *gin.Context) {
c.JSON(200, gin.H{"switch-project": h.cfg.QuotaExceeded.SwitchProject})
}
func (h *Handler) PutSwitchProject(c *gin.Context) {
h.updateBoolField(c, func(v bool) { h.cfg.QuotaExceeded.SwitchProject = v })
}
func (h *Handler) GetSwitchPreviewModel(c *gin.Context) {
c.JSON(200, gin.H{"switch-preview-model": h.cfg.QuotaExceeded.SwitchPreviewModel})
}
func (h *Handler) PutSwitchPreviewModel(c *gin.Context) {
h.updateBoolField(c, func(v bool) { h.cfg.QuotaExceeded.SwitchPreviewModel = v })
}

View File

@@ -8,6 +8,7 @@ package openai
import (
"context"
"encoding/json"
"fmt"
"net/http"
"time"
@@ -16,8 +17,11 @@ import (
"github.com/luispater/CLIProxyAPI/internal/api/handlers"
. "github.com/luispater/CLIProxyAPI/internal/constant"
"github.com/luispater/CLIProxyAPI/internal/interfaces"
"github.com/luispater/CLIProxyAPI/internal/registry"
"github.com/luispater/CLIProxyAPI/internal/util"
log "github.com/sirupsen/logrus"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
)
// OpenAIAPIHandler contains the handlers for OpenAI API endpoints.
@@ -47,90 +51,18 @@ func (h *OpenAIAPIHandler) HandlerType() string {
// Models returns the OpenAI-compatible model metadata supported by this handler.
func (h *OpenAIAPIHandler) Models() []map[string]any {
return []map[string]any{
{
"id": "gemini-2.5-pro",
"object": "model",
"version": "2.5",
"name": "Gemini 2.5 Pro",
"description": "Stable release (June 17th, 2025) of Gemini 2.5 Pro",
"context_length": 1_048_576,
"max_completion_tokens": 65_536,
"supported_parameters": []string{
"tools",
"temperature",
"top_p",
"top_k",
},
"temperature": 1,
"topP": 0.95,
"topK": 64,
"maxTemperature": 2,
"thinking": true,
},
{
"id": "gemini-2.5-flash",
"object": "model",
"version": "001",
"name": "Gemini 2.5 Flash",
"description": "Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025.",
"context_length": 1_048_576,
"max_completion_tokens": 65_536,
"supported_parameters": []string{
"tools",
"temperature",
"top_p",
"top_k",
},
"temperature": 1,
"topP": 0.95,
"topK": 64,
"maxTemperature": 2,
"thinking": true,
},
{
"id": "gpt-5",
"object": "model",
"version": "gpt-5-2025-08-07",
"name": "GPT 5",
"description": "Stable version of GPT 5, The best model for coding and agentic tasks across domains.",
"context_length": 400_000,
"max_completion_tokens": 128_000,
"supported_parameters": []string{
"tools",
},
"temperature": 1,
"topP": 0.95,
"topK": 64,
"maxTemperature": 2,
"thinking": true,
},
{
"id": "claude-opus-4-1-20250805",
"object": "model",
"version": "claude-opus-4-1-20250805",
"name": "Claude Opus 4.1",
"description": "Anthropic's most capable model.",
"context_length": 200_000,
"max_completion_tokens": 32_000,
"supported_parameters": []string{
"tools",
},
"temperature": 1,
"topP": 0.95,
"topK": 64,
"maxTemperature": 2,
"thinking": true,
},
}
// Get dynamic models from the global registry
modelRegistry := registry.GetGlobalRegistry()
return modelRegistry.GetAvailableModels("openai")
}
// OpenAIModels handles the /v1/models endpoint.
// It returns a hardcoded list of available AI models with their capabilities
// It returns a list of available AI models with their capabilities
// and specifications in OpenAI-compatible format.
func (h *OpenAIAPIHandler) OpenAIModels(c *gin.Context) {
c.JSON(http.StatusOK, gin.H{
"data": h.Models(),
"object": "list",
"data": h.Models(),
})
}
@@ -163,6 +95,276 @@ func (h *OpenAIAPIHandler) ChatCompletions(c *gin.Context) {
}
// Completions handles the /v1/completions endpoint.
// It determines whether the request is for a streaming or non-streaming response
// and calls the appropriate handler based on the model provider.
// This endpoint follows the OpenAI completions API specification.
//
// Parameters:
// - c: The Gin context containing the HTTP request and response
func (h *OpenAIAPIHandler) Completions(c *gin.Context) {
rawJSON, err := c.GetRawData()
// If data retrieval fails, return a 400 Bad Request error.
if err != nil {
c.JSON(http.StatusBadRequest, handlers.ErrorResponse{
Error: handlers.ErrorDetail{
Message: fmt.Sprintf("Invalid request: %v", err),
Type: "invalid_request_error",
},
})
return
}
// Check if the client requested a streaming response.
streamResult := gjson.GetBytes(rawJSON, "stream")
if streamResult.Type == gjson.True {
h.handleCompletionsStreamingResponse(c, rawJSON)
} else {
h.handleCompletionsNonStreamingResponse(c, rawJSON)
}
}
// convertCompletionsRequestToChatCompletions converts OpenAI completions API request to chat completions format.
// This allows the completions endpoint to use the existing chat completions infrastructure.
//
// Parameters:
// - rawJSON: The raw JSON bytes of the completions request
//
// Returns:
// - []byte: The converted chat completions request
func convertCompletionsRequestToChatCompletions(rawJSON []byte) []byte {
root := gjson.ParseBytes(rawJSON)
// Extract prompt from completions request
prompt := root.Get("prompt").String()
if prompt == "" {
prompt = "Complete this:"
}
// Create chat completions structure
out := `{"model":"","messages":[{"role":"user","content":""}]}`
// Set model
if model := root.Get("model"); model.Exists() {
out, _ = sjson.Set(out, "model", model.String())
}
// Set the prompt as user message content
out, _ = sjson.Set(out, "messages.0.content", prompt)
// Copy other parameters from completions to chat completions
if maxTokens := root.Get("max_tokens"); maxTokens.Exists() {
out, _ = sjson.Set(out, "max_tokens", maxTokens.Int())
}
if temperature := root.Get("temperature"); temperature.Exists() {
out, _ = sjson.Set(out, "temperature", temperature.Float())
}
if topP := root.Get("top_p"); topP.Exists() {
out, _ = sjson.Set(out, "top_p", topP.Float())
}
if frequencyPenalty := root.Get("frequency_penalty"); frequencyPenalty.Exists() {
out, _ = sjson.Set(out, "frequency_penalty", frequencyPenalty.Float())
}
if presencePenalty := root.Get("presence_penalty"); presencePenalty.Exists() {
out, _ = sjson.Set(out, "presence_penalty", presencePenalty.Float())
}
if stop := root.Get("stop"); stop.Exists() {
out, _ = sjson.SetRaw(out, "stop", stop.Raw)
}
if stream := root.Get("stream"); stream.Exists() {
out, _ = sjson.Set(out, "stream", stream.Bool())
}
if logprobs := root.Get("logprobs"); logprobs.Exists() {
out, _ = sjson.Set(out, "logprobs", logprobs.Bool())
}
if topLogprobs := root.Get("top_logprobs"); topLogprobs.Exists() {
out, _ = sjson.Set(out, "top_logprobs", topLogprobs.Int())
}
if echo := root.Get("echo"); echo.Exists() {
out, _ = sjson.Set(out, "echo", echo.Bool())
}
return []byte(out)
}
// convertChatCompletionsResponseToCompletions converts chat completions API response back to completions format.
// This ensures the completions endpoint returns data in the expected format.
//
// Parameters:
// - rawJSON: The raw JSON bytes of the chat completions response
//
// Returns:
// - []byte: The converted completions response
func convertChatCompletionsResponseToCompletions(rawJSON []byte) []byte {
root := gjson.ParseBytes(rawJSON)
// Base completions response structure
out := `{"id":"","object":"text_completion","created":0,"model":"","choices":[]}`
// Copy basic fields
if id := root.Get("id"); id.Exists() {
out, _ = sjson.Set(out, "id", id.String())
}
if created := root.Get("created"); created.Exists() {
out, _ = sjson.Set(out, "created", created.Int())
}
if model := root.Get("model"); model.Exists() {
out, _ = sjson.Set(out, "model", model.String())
}
if usage := root.Get("usage"); usage.Exists() {
out, _ = sjson.SetRaw(out, "usage", usage.Raw)
}
// Convert choices from chat completions to completions format
var choices []interface{}
if chatChoices := root.Get("choices"); chatChoices.Exists() && chatChoices.IsArray() {
chatChoices.ForEach(func(_, choice gjson.Result) bool {
completionsChoice := map[string]interface{}{
"index": choice.Get("index").Int(),
}
// Extract text content from message.content
if message := choice.Get("message"); message.Exists() {
if content := message.Get("content"); content.Exists() {
completionsChoice["text"] = content.String()
}
} else if delta := choice.Get("delta"); delta.Exists() {
// For streaming responses, use delta.content
if content := delta.Get("content"); content.Exists() {
completionsChoice["text"] = content.String()
}
}
// Copy finish_reason
if finishReason := choice.Get("finish_reason"); finishReason.Exists() {
completionsChoice["finish_reason"] = finishReason.String()
}
// Copy logprobs if present
if logprobs := choice.Get("logprobs"); logprobs.Exists() {
completionsChoice["logprobs"] = logprobs.Value()
}
choices = append(choices, completionsChoice)
return true
})
}
if len(choices) > 0 {
choicesJSON, _ := json.Marshal(choices)
out, _ = sjson.SetRaw(out, "choices", string(choicesJSON))
}
return []byte(out)
}
// convertChatCompletionsStreamChunkToCompletions converts a streaming chat completions chunk to completions format.
// This handles the real-time conversion of streaming response chunks and filters out empty text responses.
//
// Parameters:
// - chunkData: The raw JSON bytes of a single chat completions stream chunk
//
// Returns:
// - []byte: The converted completions stream chunk, or nil if should be filtered out
func convertChatCompletionsStreamChunkToCompletions(chunkData []byte) []byte {
root := gjson.ParseBytes(chunkData)
// Check if this chunk has any meaningful content
hasContent := false
if chatChoices := root.Get("choices"); chatChoices.Exists() && chatChoices.IsArray() {
chatChoices.ForEach(func(_, choice gjson.Result) bool {
// Check if delta has content or finish_reason
if delta := choice.Get("delta"); delta.Exists() {
if content := delta.Get("content"); content.Exists() && content.String() != "" {
hasContent = true
return false // Break out of forEach
}
}
// Also check for finish_reason to ensure we don't skip final chunks
if finishReason := choice.Get("finish_reason"); finishReason.Exists() && finishReason.String() != "" && finishReason.String() != "null" {
hasContent = true
return false // Break out of forEach
}
return true
})
}
// If no meaningful content, return nil to indicate this chunk should be skipped
if !hasContent {
return nil
}
// Base completions stream response structure
out := `{"id":"","object":"text_completion","created":0,"model":"","choices":[]}`
// Copy basic fields
if id := root.Get("id"); id.Exists() {
out, _ = sjson.Set(out, "id", id.String())
}
if created := root.Get("created"); created.Exists() {
out, _ = sjson.Set(out, "created", created.Int())
}
if model := root.Get("model"); model.Exists() {
out, _ = sjson.Set(out, "model", model.String())
}
// Convert choices from chat completions delta to completions format
var choices []interface{}
if chatChoices := root.Get("choices"); chatChoices.Exists() && chatChoices.IsArray() {
chatChoices.ForEach(func(_, choice gjson.Result) bool {
completionsChoice := map[string]interface{}{
"index": choice.Get("index").Int(),
}
// Extract text content from delta.content
if delta := choice.Get("delta"); delta.Exists() {
if content := delta.Get("content"); content.Exists() && content.String() != "" {
completionsChoice["text"] = content.String()
} else {
completionsChoice["text"] = ""
}
} else {
completionsChoice["text"] = ""
}
// Copy finish_reason
if finishReason := choice.Get("finish_reason"); finishReason.Exists() && finishReason.String() != "null" {
completionsChoice["finish_reason"] = finishReason.String()
}
// Copy logprobs if present
if logprobs := choice.Get("logprobs"); logprobs.Exists() {
completionsChoice["logprobs"] = logprobs.Value()
}
choices = append(choices, completionsChoice)
return true
})
}
if len(choices) > 0 {
choicesJSON, _ := json.Marshal(choices)
out, _ = sjson.SetRaw(out, "choices", string(choicesJSON))
}
return []byte(out)
}
// handleNonStreamingResponse handles non-streaming chat completion responses
// for Gemini models. It selects a client from the pool, sends the request, and
// aggregates the response before sending it back to the client in OpenAI format.
@@ -179,7 +381,9 @@ func (h *OpenAIAPIHandler) handleNonStreamingResponse(c *gin.Context, rawJSON []
var cliClient interfaces.Client
defer func() {
if cliClient != nil {
cliClient.GetRequestMutex().Unlock()
if mutex := cliClient.GetRequestMutex(); mutex != nil {
mutex.Unlock()
}
}
}()
@@ -206,6 +410,14 @@ func (h *OpenAIAPIHandler) handleNonStreamingResponse(c *gin.Context, rawJSON []
log.Debugf("http status code %d, switch client", err.StatusCode)
retryCount++
continue
case 401:
log.Debugf("unauthorized request, try to refresh token, %s", util.HideAPIKey(cliClient.GetEmail()))
errRefreshTokens := cliClient.RefreshTokens(cliCtx)
if errRefreshTokens != nil {
log.Debugf("refresh token failed, switch client, %s", util.HideAPIKey(cliClient.GetEmail()))
}
retryCount++
continue
default:
// Forward other errors directly to the client
c.Status(err.StatusCode)
@@ -253,7 +465,9 @@ func (h *OpenAIAPIHandler) handleStreamingResponse(c *gin.Context, rawJSON []byt
defer func() {
// Ensure the client's mutex is unlocked on function exit.
if cliClient != nil {
cliClient.GetRequestMutex().Unlock()
if mutex := cliClient.GetRequestMutex(); mutex != nil {
mutex.Unlock()
}
}
}()
@@ -278,7 +492,7 @@ outLoop:
// Handle client disconnection.
case <-c.Request.Context().Done():
if c.Request.Context().Err().Error() == "context canceled" {
log.Debugf("Client disconnected: %v", c.Request.Context().Err())
log.Debugf("openai client disconnected: %v", c.Request.Context().Err())
cliCancel() // Cancel the backend request.
return
}
@@ -322,3 +536,181 @@ outLoop:
}
}
}
// handleCompletionsNonStreamingResponse handles non-streaming completions responses.
// It converts completions request to chat completions format, sends to backend,
// then converts the response back to completions format before sending to client.
//
// Parameters:
// - c: The Gin context containing the HTTP request and response
// - rawJSON: The raw JSON bytes of the OpenAI-compatible completions request
func (h *OpenAIAPIHandler) handleCompletionsNonStreamingResponse(c *gin.Context, rawJSON []byte) {
c.Header("Content-Type", "application/json")
// Convert completions request to chat completions format
chatCompletionsJSON := convertCompletionsRequestToChatCompletions(rawJSON)
modelName := gjson.GetBytes(chatCompletionsJSON, "model").String()
cliCtx, cliCancel := h.GetContextWithCancel(h, c, context.Background())
var cliClient interfaces.Client
defer func() {
if cliClient != nil {
if mutex := cliClient.GetRequestMutex(); mutex != nil {
mutex.Unlock()
}
}
}()
retryCount := 0
for retryCount <= h.Cfg.RequestRetry {
var errorResponse *interfaces.ErrorMessage
cliClient, errorResponse = h.GetClient(modelName)
if errorResponse != nil {
c.Status(errorResponse.StatusCode)
_, _ = fmt.Fprint(c.Writer, errorResponse.Error.Error())
cliCancel()
return
}
// Send the converted chat completions request
resp, err := cliClient.SendRawMessage(cliCtx, modelName, chatCompletionsJSON, "")
if err != nil {
switch err.StatusCode {
case 429:
if h.Cfg.QuotaExceeded.SwitchProject {
log.Debugf("quota exceeded, switch client")
continue // Restart the client selection process
}
case 403, 408, 500, 502, 503, 504:
log.Debugf("http status code %d, switch client", err.StatusCode)
retryCount++
continue
default:
// Forward other errors directly to the client
c.Status(err.StatusCode)
_, _ = c.Writer.Write([]byte(err.Error.Error()))
cliCancel(err.Error)
}
break
} else {
// Convert chat completions response back to completions format
completionsResp := convertChatCompletionsResponseToCompletions(resp)
_, _ = c.Writer.Write(completionsResp)
cliCancel(completionsResp)
break
}
}
}
// handleCompletionsStreamingResponse handles streaming completions responses.
// It converts completions request to chat completions format, streams from backend,
// then converts each response chunk back to completions format before sending to client.
//
// Parameters:
// - c: The Gin context containing the HTTP request and response
// - rawJSON: The raw JSON bytes of the OpenAI-compatible completions request
func (h *OpenAIAPIHandler) handleCompletionsStreamingResponse(c *gin.Context, rawJSON []byte) {
c.Header("Content-Type", "text/event-stream")
c.Header("Cache-Control", "no-cache")
c.Header("Connection", "keep-alive")
c.Header("Access-Control-Allow-Origin", "*")
// Get the http.Flusher interface to manually flush the response.
flusher, ok := c.Writer.(http.Flusher)
if !ok {
c.JSON(http.StatusInternalServerError, handlers.ErrorResponse{
Error: handlers.ErrorDetail{
Message: "Streaming not supported",
Type: "server_error",
},
})
return
}
// Convert completions request to chat completions format
chatCompletionsJSON := convertCompletionsRequestToChatCompletions(rawJSON)
modelName := gjson.GetBytes(chatCompletionsJSON, "model").String()
cliCtx, cliCancel := h.GetContextWithCancel(h, c, context.Background())
var cliClient interfaces.Client
defer func() {
// Ensure the client's mutex is unlocked on function exit.
if cliClient != nil {
if mutex := cliClient.GetRequestMutex(); mutex != nil {
mutex.Unlock()
}
}
}()
retryCount := 0
outLoop:
for retryCount <= h.Cfg.RequestRetry {
var errorResponse *interfaces.ErrorMessage
cliClient, errorResponse = h.GetClient(modelName)
if errorResponse != nil {
c.Status(errorResponse.StatusCode)
_, _ = fmt.Fprint(c.Writer, errorResponse.Error.Error())
flusher.Flush()
cliCancel()
return
}
// Send the converted chat completions request and receive response chunks
respChan, errChan := cliClient.SendRawMessageStream(cliCtx, modelName, chatCompletionsJSON, "")
for {
select {
// Handle client disconnection.
case <-c.Request.Context().Done():
if c.Request.Context().Err().Error() == "context canceled" {
log.Debugf("client disconnected: %v", c.Request.Context().Err())
cliCancel() // Cancel the backend request.
return
}
// Process incoming response chunks.
case chunk, okStream := <-respChan:
if !okStream {
// Stream is closed, send the final [DONE] message.
_, _ = fmt.Fprintf(c.Writer, "data: [DONE]\n\n")
flusher.Flush()
cliCancel()
return
}
// Convert chat completions chunk to completions chunk format
completionsChunk := convertChatCompletionsStreamChunkToCompletions(chunk)
// Skip this chunk if it has no meaningful content (empty text)
if completionsChunk != nil {
_, _ = fmt.Fprintf(c.Writer, "data: %s\n\n", string(completionsChunk))
flusher.Flush()
}
// Handle errors from the backend.
case err, okError := <-errChan:
if okError {
switch err.StatusCode {
case 429:
if h.Cfg.QuotaExceeded.SwitchProject {
log.Debugf("quota exceeded, switch client")
continue outLoop // Restart the client selection process
}
case 403, 408, 500, 502, 503, 504:
log.Debugf("http status code %d, switch client", err.StatusCode)
retryCount++
continue outLoop
default:
// Forward other errors directly to the client
c.Status(err.StatusCode)
_, _ = fmt.Fprint(c.Writer, err.Error.Error())
flusher.Flush()
cliCancel(err.Error)
}
return
}
// Send a keep-alive signal to the client.
case <-time.After(500 * time.Millisecond):
}
}
}
}

View File

@@ -15,6 +15,7 @@ import (
"github.com/luispater/CLIProxyAPI/internal/api/handlers"
"github.com/luispater/CLIProxyAPI/internal/api/handlers/claude"
"github.com/luispater/CLIProxyAPI/internal/api/handlers/gemini"
managementHandlers "github.com/luispater/CLIProxyAPI/internal/api/handlers/management"
"github.com/luispater/CLIProxyAPI/internal/api/handlers/openai"
"github.com/luispater/CLIProxyAPI/internal/api/middleware"
"github.com/luispater/CLIProxyAPI/internal/config"
@@ -40,6 +41,12 @@ type Server struct {
// requestLogger is the request logger instance for dynamic configuration updates.
requestLogger *logging.FileRequestLogger
// configFilePath is the absolute path to the YAML config file for persistence.
configFilePath string
// management handler
mgmt *managementHandlers.Handler
}
// NewServer creates and initializes a new API server instance.
@@ -51,7 +58,7 @@ type Server struct {
//
// Returns:
// - *Server: A new server instance
func NewServer(cfg *config.Config, cliClients []interfaces.Client) *Server {
func NewServer(cfg *config.Config, cliClients []interfaces.Client, configFilePath string) *Server {
// Set gin mode
if !cfg.Debug {
gin.SetMode(gin.ReleaseMode)
@@ -72,11 +79,14 @@ func NewServer(cfg *config.Config, cliClients []interfaces.Client) *Server {
// Create server instance
s := &Server{
engine: engine,
handlers: handlers.NewBaseAPIHandlers(cliClients, cfg),
cfg: cfg,
requestLogger: requestLogger,
engine: engine,
handlers: handlers.NewBaseAPIHandlers(cliClients, cfg),
cfg: cfg,
requestLogger: requestLogger,
configFilePath: configFilePath,
}
// Initialize management handler
s.mgmt = managementHandlers.NewHandler(cfg, configFilePath)
// Setup routes
s.setupRoutes()
@@ -102,8 +112,9 @@ func (s *Server) setupRoutes() {
v1 := s.engine.Group("/v1")
v1.Use(AuthMiddleware(s.cfg))
{
v1.GET("/models", openaiHandlers.OpenAIModels)
v1.GET("/models", s.unifiedModelsHandler(openaiHandlers, claudeCodeHandlers))
v1.POST("/chat/completions", openaiHandlers.ChatCompletions)
v1.POST("/completions", openaiHandlers.Completions)
v1.POST("/messages", claudeCodeHandlers.ClaudeMessages)
}
@@ -123,11 +134,93 @@ func (s *Server) setupRoutes() {
"version": "1.0.0",
"endpoints": []string{
"POST /v1/chat/completions",
"POST /v1/completions",
"GET /v1/models",
},
})
})
s.engine.POST("/v1internal:method", geminiCLIHandlers.CLIHandler)
// Management API routes (delegated to management handlers)
// New logic: if remote-management-key is empty, do not expose any management endpoint (404).
if s.cfg.RemoteManagement.SecretKey != "" {
mgmt := s.engine.Group("/v0/management")
mgmt.Use(s.mgmt.Middleware())
{
mgmt.GET("/debug", s.mgmt.GetDebug)
mgmt.PUT("/debug", s.mgmt.PutDebug)
mgmt.PATCH("/debug", s.mgmt.PutDebug)
mgmt.GET("/proxy-url", s.mgmt.GetProxyURL)
mgmt.PUT("/proxy-url", s.mgmt.PutProxyURL)
mgmt.PATCH("/proxy-url", s.mgmt.PutProxyURL)
mgmt.DELETE("/proxy-url", s.mgmt.DeleteProxyURL)
mgmt.GET("/quota-exceeded/switch-project", s.mgmt.GetSwitchProject)
mgmt.PUT("/quota-exceeded/switch-project", s.mgmt.PutSwitchProject)
mgmt.PATCH("/quota-exceeded/switch-project", s.mgmt.PutSwitchProject)
mgmt.GET("/quota-exceeded/switch-preview-model", s.mgmt.GetSwitchPreviewModel)
mgmt.PUT("/quota-exceeded/switch-preview-model", s.mgmt.PutSwitchPreviewModel)
mgmt.PATCH("/quota-exceeded/switch-preview-model", s.mgmt.PutSwitchPreviewModel)
mgmt.GET("/api-keys", s.mgmt.GetAPIKeys)
mgmt.PUT("/api-keys", s.mgmt.PutAPIKeys)
mgmt.PATCH("/api-keys", s.mgmt.PatchAPIKeys)
mgmt.DELETE("/api-keys", s.mgmt.DeleteAPIKeys)
mgmt.GET("/generative-language-api-key", s.mgmt.GetGlKeys)
mgmt.PUT("/generative-language-api-key", s.mgmt.PutGlKeys)
mgmt.PATCH("/generative-language-api-key", s.mgmt.PatchGlKeys)
mgmt.DELETE("/generative-language-api-key", s.mgmt.DeleteGlKeys)
mgmt.GET("/request-log", s.mgmt.GetRequestLog)
mgmt.PUT("/request-log", s.mgmt.PutRequestLog)
mgmt.PATCH("/request-log", s.mgmt.PutRequestLog)
mgmt.GET("/request-retry", s.mgmt.GetRequestRetry)
mgmt.PUT("/request-retry", s.mgmt.PutRequestRetry)
mgmt.PATCH("/request-retry", s.mgmt.PutRequestRetry)
mgmt.GET("/allow-localhost-unauthenticated", s.mgmt.GetAllowLocalhost)
mgmt.PUT("/allow-localhost-unauthenticated", s.mgmt.PutAllowLocalhost)
mgmt.PATCH("/allow-localhost-unauthenticated", s.mgmt.PutAllowLocalhost)
mgmt.GET("/claude-api-key", s.mgmt.GetClaudeKeys)
mgmt.PUT("/claude-api-key", s.mgmt.PutClaudeKeys)
mgmt.PATCH("/claude-api-key", s.mgmt.PatchClaudeKey)
mgmt.DELETE("/claude-api-key", s.mgmt.DeleteClaudeKey)
mgmt.GET("/openai-compatibility", s.mgmt.GetOpenAICompat)
mgmt.PUT("/openai-compatibility", s.mgmt.PutOpenAICompat)
mgmt.PATCH("/openai-compatibility", s.mgmt.PatchOpenAICompat)
mgmt.DELETE("/openai-compatibility", s.mgmt.DeleteOpenAICompat)
mgmt.GET("/auth-files", s.mgmt.ListAuthFiles)
mgmt.GET("/auth-files/download", s.mgmt.DownloadAuthFile)
mgmt.POST("/auth-files", s.mgmt.UploadAuthFile)
mgmt.DELETE("/auth-files", s.mgmt.DeleteAuthFile)
}
}
}
// unifiedModelsHandler creates a unified handler for the /v1/models endpoint
// that routes to different handlers based on the User-Agent header.
// If User-Agent starts with "claude-cli", it routes to Claude handler,
// otherwise it routes to OpenAI handler.
func (s *Server) unifiedModelsHandler(openaiHandler *openai.OpenAIAPIHandler, claudeHandler *claude.ClaudeCodeAPIHandler) gin.HandlerFunc {
return func(c *gin.Context) {
userAgent := c.GetHeader("User-Agent")
// Route to Claude handler if User-Agent starts with "claude-cli"
if strings.HasPrefix(userAgent, "claude-cli") {
// log.Debugf("Routing /v1/models to Claude handler for User-Agent: %s", userAgent)
claudeHandler.ClaudeModels(c)
} else {
// log.Debugf("Routing /v1/models to OpenAI handler for User-Agent: %s", userAgent)
openaiHandler.OpenAIModels(c)
}
}
}
// Start begins listening for and serving HTTP requests.
@@ -199,11 +292,26 @@ func (s *Server) UpdateClients(clients []interfaces.Client, cfg *config.Config)
log.Debugf("request logging updated from %t to %t", s.cfg.RequestLog, cfg.RequestLog)
}
// Update log level dynamically when debug flag changes
if s.cfg.Debug != cfg.Debug {
if cfg.Debug {
log.SetLevel(log.DebugLevel)
} else {
log.SetLevel(log.InfoLevel)
}
log.Debugf("debug mode updated from %t to %t", s.cfg.Debug, cfg.Debug)
}
s.cfg = cfg
s.handlers.UpdateClients(clients, cfg)
if s.mgmt != nil {
s.mgmt.SetConfig(cfg)
}
log.Infof("server clients and configuration updated: %d clients", len(clients))
}
// (management handlers moved to internal/api/handlers/management)
// AuthMiddleware returns a Gin middleware handler that authenticates requests
// using API keys. If no API keys are configured, it allows all requests.
//
@@ -214,6 +322,11 @@ func (s *Server) UpdateClients(clients []interfaces.Client, cfg *config.Config)
// - gin.HandlerFunc: The authentication middleware handler
func AuthMiddleware(cfg *config.Config) gin.HandlerFunc {
return func(c *gin.Context) {
if cfg.AllowLocalhostUnauthenticated && strings.HasPrefix(c.Request.RemoteAddr, "127.0.0.1:") {
c.Next()
return
}
if len(cfg.APIKeys) == 0 {
c.Next()
return

View File

@@ -23,6 +23,7 @@ import (
. "github.com/luispater/CLIProxyAPI/internal/constant"
"github.com/luispater/CLIProxyAPI/internal/interfaces"
"github.com/luispater/CLIProxyAPI/internal/misc"
"github.com/luispater/CLIProxyAPI/internal/registry"
"github.com/luispater/CLIProxyAPI/internal/translator/translator"
"github.com/luispater/CLIProxyAPI/internal/util"
log "github.com/sirupsen/logrus"
@@ -55,6 +56,10 @@ type ClaudeClient struct {
// - *ClaudeClient: A new Claude client instance.
func NewClaudeClient(cfg *config.Config, ts *claude.ClaudeTokenStorage) *ClaudeClient {
httpClient := util.SetProxy(cfg, &http.Client{})
// Generate unique client ID
clientID := fmt.Sprintf("claude-%d", time.Now().UnixNano())
client := &ClaudeClient{
ClientBase: ClientBase{
RequestMutex: &sync.Mutex{},
@@ -67,6 +72,10 @@ func NewClaudeClient(cfg *config.Config, ts *claude.ClaudeTokenStorage) *ClaudeC
apiKeyIndex: -1,
}
// Initialize model registry and register Claude models
client.InitializeModelRegistry(clientID)
client.RegisterModels("claude", registry.GetClaudeModels())
return client
}
@@ -82,6 +91,10 @@ func NewClaudeClient(cfg *config.Config, ts *claude.ClaudeTokenStorage) *ClaudeC
// - *ClaudeClient: A new Claude client instance.
func NewClaudeClientWithKey(cfg *config.Config, apiKeyIndex int) *ClaudeClient {
httpClient := util.SetProxy(cfg, &http.Client{})
// Generate unique client ID for API key client
clientID := fmt.Sprintf("claude-apikey-%d-%d", apiKeyIndex, time.Now().UnixNano())
client := &ClaudeClient{
ClientBase: ClientBase{
RequestMutex: &sync.Mutex{},
@@ -94,6 +107,10 @@ func NewClaudeClientWithKey(cfg *config.Config, apiKeyIndex int) *ClaudeClient {
apiKeyIndex: apiKeyIndex,
}
// Initialize model registry and register Claude models
client.InitializeModelRegistry(clientID)
client.RegisterModels("claude", registry.GetClaudeModels())
return client
}
@@ -174,15 +191,20 @@ func (c *ClaudeClient) SendRawMessage(ctx context.Context, modelName string, raw
if err.StatusCode == 429 {
now := time.Now()
c.modelQuotaExceeded[modelName] = &now
// Update model registry quota status
c.SetModelQuotaExceeded(modelName)
}
return nil, err
}
delete(c.modelQuotaExceeded, modelName)
// Clear quota status in model registry
c.ClearModelQuotaExceeded(modelName)
bodyBytes, errReadAll := io.ReadAll(respBody)
if errReadAll != nil {
return nil, &interfaces.ErrorMessage{StatusCode: 500, Error: errReadAll}
}
_ = respBody.Close()
c.AddAPIResponseData(ctx, bodyBytes)
var param any
@@ -233,11 +255,18 @@ func (c *ClaudeClient) SendRawMessageStream(ctx context.Context, modelName strin
if err.StatusCode == 429 {
now := time.Now()
c.modelQuotaExceeded[modelName] = &now
// Update model registry quota status
c.SetModelQuotaExceeded(modelName)
}
errChan <- err
return
}
delete(c.modelQuotaExceeded, modelName)
// Clear quota status in model registry
c.ClearModelQuotaExceeded(modelName)
defer func() {
_ = stream.Close()
}()
scanner := bufio.NewScanner(stream)
buffer := make([]byte, 10240*1024)
@@ -314,6 +343,10 @@ func (c *ClaudeClient) SaveTokenToFile() error {
// - error: An error if the refresh operation fails, nil otherwise.
func (c *ClaudeClient) RefreshTokens(ctx context.Context) error {
// Check if we have a valid refresh token
if c.apiKeyIndex != -1 {
return fmt.Errorf("no refresh token available")
}
if c.tokenStorage == nil || c.tokenStorage.(*claude.ClaudeTokenStorage).RefreshToken == "" {
return fmt.Errorf("no refresh token available")
}
@@ -506,7 +539,7 @@ func (c *ClaudeClient) GetEmail() string {
if ts, ok := c.tokenStorage.(*claude.ClaudeTokenStorage); ok {
return ts.Email
} else {
return ""
return c.cfg.ClaudeKey[c.apiKeyIndex].APIKey
}
}
@@ -528,3 +561,12 @@ func (c *ClaudeClient) IsModelQuotaExceeded(model string) bool {
}
return false
}
// GetRequestMutex returns the mutex used to synchronize requests for this client.
// This ensures that only one request is processed at a time for quota management.
//
// Returns:
// - *sync.Mutex: The mutex used for request synchronization
func (c *ClaudeClient) GetRequestMutex() *sync.Mutex {
return nil
}

View File

@@ -13,6 +13,7 @@ import (
"github.com/gin-gonic/gin"
"github.com/luispater/CLIProxyAPI/internal/auth"
"github.com/luispater/CLIProxyAPI/internal/config"
"github.com/luispater/CLIProxyAPI/internal/registry"
)
// ClientBase provides a common base structure for all AI API clients.
@@ -34,6 +35,12 @@ type ClientBase struct {
// modelQuotaExceeded tracks when models have exceeded their quota.
// The map key is the model name, and the value is the time when the quota was exceeded.
modelQuotaExceeded map[string]*time.Time
// clientID is the unique identifier for this client instance.
clientID string
// modelRegistry is the global model registry for tracking model availability.
modelRegistry *registry.ModelRegistry
}
// GetRequestMutex returns the mutex used to synchronize requests for this client.
@@ -71,3 +78,50 @@ func (c *ClientBase) AddAPIResponseData(ctx context.Context, line []byte) {
}
}
}
// InitializeModelRegistry initializes the model registry for this client
// This should be called by all client implementations during construction
func (c *ClientBase) InitializeModelRegistry(clientID string) {
c.clientID = clientID
c.modelRegistry = registry.GetGlobalRegistry()
}
// RegisterModels registers the models that this client can provide
// Parameters:
// - provider: The provider name (e.g., "gemini", "claude", "openai")
// - models: The list of models this client supports
func (c *ClientBase) RegisterModels(provider string, models []*registry.ModelInfo) {
if c.modelRegistry != nil && c.clientID != "" {
c.modelRegistry.RegisterClient(c.clientID, provider, models)
}
}
// UnregisterClient removes this client from the model registry
func (c *ClientBase) UnregisterClient() {
if c.modelRegistry != nil && c.clientID != "" {
c.modelRegistry.UnregisterClient(c.clientID)
}
}
// SetModelQuotaExceeded marks a model as quota exceeded in the registry
// Parameters:
// - modelID: The model that exceeded quota
func (c *ClientBase) SetModelQuotaExceeded(modelID string) {
if c.modelRegistry != nil && c.clientID != "" {
c.modelRegistry.SetModelQuotaExceeded(c.clientID, modelID)
}
}
// ClearModelQuotaExceeded clears quota exceeded status for a model
// Parameters:
// - modelID: The model to clear quota status for
func (c *ClientBase) ClearModelQuotaExceeded(modelID string) {
if c.modelRegistry != nil && c.clientID != "" {
c.modelRegistry.ClearModelQuotaExceeded(c.clientID, modelID)
}
}
// GetClientID returns the unique identifier for this client
func (c *ClientBase) GetClientID() string {
return c.clientID
}

View File

@@ -22,6 +22,7 @@ import (
"github.com/luispater/CLIProxyAPI/internal/config"
. "github.com/luispater/CLIProxyAPI/internal/constant"
"github.com/luispater/CLIProxyAPI/internal/interfaces"
"github.com/luispater/CLIProxyAPI/internal/registry"
"github.com/luispater/CLIProxyAPI/internal/translator/translator"
"github.com/luispater/CLIProxyAPI/internal/util"
log "github.com/sirupsen/logrus"
@@ -50,6 +51,10 @@ type CodexClient struct {
// - error: An error if the client creation fails.
func NewCodexClient(cfg *config.Config, ts *codex.CodexTokenStorage) (*CodexClient, error) {
httpClient := util.SetProxy(cfg, &http.Client{})
// Generate unique client ID
clientID := fmt.Sprintf("codex-%d", time.Now().UnixNano())
client := &CodexClient{
ClientBase: ClientBase{
RequestMutex: &sync.Mutex{},
@@ -61,6 +66,10 @@ func NewCodexClient(cfg *config.Config, ts *codex.CodexTokenStorage) (*CodexClie
codexAuth: codex.NewCodexAuth(cfg),
}
// Initialize model registry and register OpenAI models
client.InitializeModelRegistry(clientID)
client.RegisterModels("codex", registry.GetOpenAIModels())
return client, nil
}
@@ -84,8 +93,9 @@ func (c *CodexClient) Provider() string {
func (c *CodexClient) CanProvideModel(modelName string) bool {
models := []string{
"gpt-5",
"gpt-5-mini",
"gpt-5-nano",
"gpt-5-minimal",
"gpt-5-low",
"gpt-5-medium",
"gpt-5-high",
"codex-mini-latest",
}
@@ -123,15 +133,20 @@ func (c *CodexClient) SendRawMessage(ctx context.Context, modelName string, rawJ
if err.StatusCode == 429 {
now := time.Now()
c.modelQuotaExceeded[modelName] = &now
// Update model registry quota status
c.SetModelQuotaExceeded(modelName)
}
return nil, err
}
delete(c.modelQuotaExceeded, modelName)
// Clear quota status in model registry
c.ClearModelQuotaExceeded(modelName)
bodyBytes, errReadAll := io.ReadAll(respBody)
if errReadAll != nil {
return nil, &interfaces.ErrorMessage{StatusCode: 500, Error: errReadAll}
}
_ = respBody.Close()
c.AddAPIResponseData(ctx, bodyBytes)
var param any
@@ -183,11 +198,18 @@ func (c *CodexClient) SendRawMessageStream(ctx context.Context, modelName string
if err.StatusCode == 429 {
now := time.Now()
c.modelQuotaExceeded[modelName] = &now
// Update model registry quota status
c.SetModelQuotaExceeded(modelName)
}
errChan <- err
return
}
delete(c.modelQuotaExceeded, modelName)
// Clear quota status in model registry
c.ClearModelQuotaExceeded(modelName)
defer func() {
_ = stream.Close()
}()
scanner := bufio.NewScanner(stream)
buffer := make([]byte, 10240*1024)
@@ -323,14 +345,14 @@ func (c *CodexClient) APIRequest(ctx context.Context, modelName, endpoint string
// Stream must be set to true
jsonBody, _ = sjson.SetBytes(jsonBody, "stream", true)
if util.InArray([]string{"gpt-5-nano", "gpt-5-mini", "gpt-5", "gpt-5-high"}, modelName) {
if util.InArray([]string{"gpt-5-minimal", "gpt-5-low", "gpt-5-medium", "gpt-5-high"}, modelName) {
jsonBody, _ = sjson.SetBytes(jsonBody, "model", "gpt-5")
switch modelName {
case "gpt-5-nano":
case "gpt-5-minimal":
jsonBody, _ = sjson.SetBytes(jsonBody, "reasoning.effort", "minimal")
case "gpt-5-mini":
case "gpt-5-low":
jsonBody, _ = sjson.SetBytes(jsonBody, "reasoning.effort", "low")
case "gpt-5":
case "gpt-5-medium":
jsonBody, _ = sjson.SetBytes(jsonBody, "reasoning.effort", "medium")
case "gpt-5-high":
jsonBody, _ = sjson.SetBytes(jsonBody, "reasoning.effort", "high")
@@ -409,3 +431,12 @@ func (c *CodexClient) IsModelQuotaExceeded(model string) bool {
}
return false
}
// GetRequestMutex returns the mutex used to synchronize requests for this client.
// This ensures that only one request is processed at a time for quota management.
//
// Returns:
// - *sync.Mutex: The mutex used for request synchronization
func (c *CodexClient) GetRequestMutex() *sync.Mutex {
return nil
}

View File

@@ -22,6 +22,7 @@ import (
"github.com/luispater/CLIProxyAPI/internal/config"
. "github.com/luispater/CLIProxyAPI/internal/constant"
"github.com/luispater/CLIProxyAPI/internal/interfaces"
"github.com/luispater/CLIProxyAPI/internal/registry"
"github.com/luispater/CLIProxyAPI/internal/translator/translator"
"github.com/luispater/CLIProxyAPI/internal/util"
log "github.com/sirupsen/logrus"
@@ -57,6 +58,9 @@ type GeminiCLIClient struct {
// Returns:
// - *GeminiCLIClient: A new Gemini CLI client instance.
func NewGeminiCLIClient(httpClient *http.Client, ts *geminiAuth.GeminiTokenStorage, cfg *config.Config) *GeminiCLIClient {
// Generate unique client ID
clientID := fmt.Sprintf("gemini-cli-%d", time.Now().UnixNano())
client := &GeminiCLIClient{
ClientBase: ClientBase{
RequestMutex: &sync.Mutex{},
@@ -66,6 +70,11 @@ func NewGeminiCLIClient(httpClient *http.Client, ts *geminiAuth.GeminiTokenStora
modelQuotaExceeded: make(map[string]*time.Time),
},
}
// Initialize model registry and register Gemini models
client.InitializeModelRegistry(clientID)
client.RegisterModels("gemini-cli", registry.GetGeminiCLIModels())
return client
}
@@ -426,6 +435,8 @@ func (c *GeminiCLIClient) SendRawTokenCount(ctx context.Context, modelName strin
if err.StatusCode == 429 {
now := time.Now()
c.modelQuotaExceeded[modelName] = &now
// Update model registry quota status
c.SetModelQuotaExceeded(modelName)
if c.cfg.QuotaExceeded.SwitchPreviewModel {
continue
}
@@ -433,6 +444,8 @@ func (c *GeminiCLIClient) SendRawTokenCount(ctx context.Context, modelName strin
return nil, err
}
delete(c.modelQuotaExceeded, modelName)
// Clear quota status in model registry
c.ClearModelQuotaExceeded(modelName)
bodyBytes, errReadAll := io.ReadAll(respBody)
if errReadAll != nil {
return nil, &interfaces.ErrorMessage{StatusCode: 500, Error: errReadAll}
@@ -485,6 +498,8 @@ func (c *GeminiCLIClient) SendRawMessage(ctx context.Context, modelName string,
if err.StatusCode == 429 {
now := time.Now()
c.modelQuotaExceeded[modelName] = &now
// Update model registry quota status
c.SetModelQuotaExceeded(modelName)
if c.cfg.QuotaExceeded.SwitchPreviewModel {
continue
}
@@ -492,11 +507,14 @@ func (c *GeminiCLIClient) SendRawMessage(ctx context.Context, modelName string,
return nil, err
}
delete(c.modelQuotaExceeded, modelName)
// Clear quota status in model registry
c.ClearModelQuotaExceeded(modelName)
bodyBytes, errReadAll := io.ReadAll(respBody)
if errReadAll != nil {
return nil, &interfaces.ErrorMessage{StatusCode: 500, Error: errReadAll}
}
_ = respBody.Close()
c.AddAPIResponseData(ctx, bodyBytes)
newCtx := context.WithValue(ctx, "alt", alt)
@@ -561,6 +579,8 @@ func (c *GeminiCLIClient) SendRawMessageStream(ctx context.Context, modelName st
if err.StatusCode == 429 {
now := time.Now()
c.modelQuotaExceeded[modelName] = &now
// Update model registry quota status
c.SetModelQuotaExceeded(modelName)
if c.cfg.QuotaExceeded.SwitchPreviewModel {
continue
}
@@ -569,8 +589,15 @@ func (c *GeminiCLIClient) SendRawMessageStream(ctx context.Context, modelName st
return
}
delete(c.modelQuotaExceeded, modelName)
// Clear quota status in model registry
c.ClearModelQuotaExceeded(modelName)
break
}
defer func() {
if stream != nil {
_ = stream.Close()
}
}()
newCtx := context.WithValue(ctx, "alt", alt)
var param any
@@ -824,3 +851,17 @@ func (c *GeminiCLIClient) GetUserAgent() string {
// return fmt.Sprintf("GeminiCLI/%s (%s; %s)", pluginVersion, runtime.GOOS, runtime.GOARCH)
return "google-api-nodejs-client/9.15.1"
}
// GetRequestMutex returns the mutex used to synchronize requests for this client.
// This ensures that only one request is processed at a time for quota management.
//
// Returns:
// - *sync.Mutex: The mutex used for request synchronization
func (c *GeminiCLIClient) GetRequestMutex() *sync.Mutex {
return nil
}
func (c *GeminiCLIClient) RefreshTokens(ctx context.Context) error {
// API keys don't need refreshing
return nil
}

View File

@@ -18,6 +18,7 @@ import (
"github.com/luispater/CLIProxyAPI/internal/config"
. "github.com/luispater/CLIProxyAPI/internal/constant"
"github.com/luispater/CLIProxyAPI/internal/interfaces"
"github.com/luispater/CLIProxyAPI/internal/registry"
"github.com/luispater/CLIProxyAPI/internal/translator/translator"
"github.com/luispater/CLIProxyAPI/internal/util"
log "github.com/sirupsen/logrus"
@@ -44,6 +45,9 @@ type GeminiClient struct {
// Returns:
// - *GeminiClient: A new Gemini client instance.
func NewGeminiClient(httpClient *http.Client, cfg *config.Config, glAPIKey string) *GeminiClient {
// Generate unique client ID
clientID := fmt.Sprintf("gemini-apikey-%s-%d", glAPIKey[:8], time.Now().UnixNano()) // Use first 8 chars of API key
client := &GeminiClient{
ClientBase: ClientBase{
RequestMutex: &sync.Mutex{},
@@ -53,6 +57,11 @@ func NewGeminiClient(httpClient *http.Client, cfg *config.Config, glAPIKey strin
},
glAPIKey: glAPIKey,
}
// Initialize model registry and register Gemini models
client.InitializeModelRegistry(clientID)
client.RegisterModels("gemini", registry.GetGeminiModels())
return client
}
@@ -195,10 +204,14 @@ func (c *GeminiClient) SendRawTokenCount(ctx context.Context, modelName string,
if err.StatusCode == 429 {
now := time.Now()
c.modelQuotaExceeded[modelName] = &now
// Update model registry quota status
c.SetModelQuotaExceeded(modelName)
}
return nil, err
}
delete(c.modelQuotaExceeded, modelName)
// Clear quota status in model registry
c.ClearModelQuotaExceeded(modelName)
bodyBytes, errReadAll := io.ReadAll(respBody)
if errReadAll != nil {
return nil, &interfaces.ErrorMessage{StatusCode: 500, Error: errReadAll}
@@ -240,15 +253,20 @@ func (c *GeminiClient) SendRawMessage(ctx context.Context, modelName string, raw
if err.StatusCode == 429 {
now := time.Now()
c.modelQuotaExceeded[modelName] = &now
// Update model registry quota status
c.SetModelQuotaExceeded(modelName)
}
return nil, err
}
delete(c.modelQuotaExceeded, modelName)
// Clear quota status in model registry
c.ClearModelQuotaExceeded(modelName)
bodyBytes, errReadAll := io.ReadAll(respBody)
if errReadAll != nil {
return nil, &interfaces.ErrorMessage{StatusCode: 500, Error: errReadAll}
}
_ = respBody.Close()
c.AddAPIResponseData(ctx, bodyBytes)
var param any
@@ -296,11 +314,18 @@ func (c *GeminiClient) SendRawMessageStream(ctx context.Context, modelName strin
if err.StatusCode == 429 {
now := time.Now()
c.modelQuotaExceeded[modelName] = &now
// Update model registry quota status
c.SetModelQuotaExceeded(modelName)
}
errChan <- err
return
}
delete(c.modelQuotaExceeded, modelName)
// Clear quota status in model registry
c.ClearModelQuotaExceeded(modelName)
defer func() {
_ = stream.Close()
}()
newCtx := context.WithValue(ctx, "alt", alt)
var param any
@@ -400,3 +425,17 @@ func (c *GeminiClient) GetUserAgent() string {
// return fmt.Sprintf("GeminiCLI/%s (%s; %s)", pluginVersion, runtime.GOOS, runtime.GOARCH)
return "google-api-nodejs-client/9.15.1"
}
// GetRequestMutex returns the mutex used to synchronize requests for this client.
// This ensures that only one request is processed at a time for quota management.
//
// Returns:
// - *sync.Mutex: The mutex used for request synchronization
func (c *GeminiClient) GetRequestMutex() *sync.Mutex {
return nil
}
func (c *GeminiClient) RefreshTokens(ctx context.Context) error {
// API keys don't need refreshing
return nil
}

View File

@@ -0,0 +1,401 @@
// Package client defines the interface and base structure for AI API clients.
// It provides a common interface that all supported AI service clients must implement,
// including methods for sending messages, handling streams, and managing authentication.
package client
import (
"bufio"
"bytes"
"context"
"fmt"
"io"
"net/http"
"strings"
"sync"
"time"
"github.com/gin-gonic/gin"
"github.com/luispater/CLIProxyAPI/internal/auth"
"github.com/luispater/CLIProxyAPI/internal/config"
. "github.com/luispater/CLIProxyAPI/internal/constant"
"github.com/luispater/CLIProxyAPI/internal/interfaces"
"github.com/luispater/CLIProxyAPI/internal/registry"
"github.com/luispater/CLIProxyAPI/internal/translator/translator"
"github.com/luispater/CLIProxyAPI/internal/util"
log "github.com/sirupsen/logrus"
"github.com/tidwall/sjson"
)
// OpenAICompatibilityClient implements the Client interface for external OpenAI-compatible API providers.
// This client handles requests to external services that support OpenAI-compatible APIs,
// such as OpenRouter, Together.ai, and other similar services.
type OpenAICompatibilityClient struct {
ClientBase
compatConfig *config.OpenAICompatibility
currentAPIKeyIndex int
}
// NewOpenAICompatibilityClient creates a new OpenAI compatibility client instance.
//
// Parameters:
// - cfg: The application configuration.
// - compatConfig: The OpenAI compatibility configuration for the specific provider.
//
// Returns:
// - *OpenAICompatibilityClient: A new OpenAI compatibility client instance.
// - error: An error if the client creation fails.
func NewOpenAICompatibilityClient(cfg *config.Config, compatConfig *config.OpenAICompatibility) (*OpenAICompatibilityClient, error) {
if compatConfig == nil {
return nil, fmt.Errorf("compatibility configuration is required")
}
if len(compatConfig.APIKeys) == 0 {
return nil, fmt.Errorf("at least one API key is required for OpenAI compatibility provider: %s", compatConfig.Name)
}
httpClient := util.SetProxy(cfg, &http.Client{})
// Generate unique client ID
clientID := fmt.Sprintf("openai-compatibility-%s-%d", compatConfig.Name, time.Now().UnixNano())
client := &OpenAICompatibilityClient{
ClientBase: ClientBase{
RequestMutex: &sync.Mutex{},
httpClient: httpClient,
cfg: cfg,
modelQuotaExceeded: make(map[string]*time.Time),
},
compatConfig: compatConfig,
currentAPIKeyIndex: 0,
}
// Initialize model registry
client.InitializeModelRegistry(clientID)
// Convert compatibility models to registry models and register them
registryModels := make([]*registry.ModelInfo, 0, len(compatConfig.Models))
for _, model := range compatConfig.Models {
registryModel := &registry.ModelInfo{
ID: model.Alias,
Object: "model",
Created: time.Now().Unix(),
OwnedBy: compatConfig.Name,
Type: "openai-compatibility",
DisplayName: model.Name,
}
registryModels = append(registryModels, registryModel)
}
client.RegisterModels(compatConfig.Name, registryModels)
return client, nil
}
// Type returns the client type.
func (c *OpenAICompatibilityClient) Type() string {
return OPENAI
}
// Provider returns the provider name for this client.
func (c *OpenAICompatibilityClient) Provider() string {
return c.compatConfig.Name
}
// CanProvideModel checks if this client can provide the specified model alias.
//
// Parameters:
// - modelName: The name/alias of the model to check.
//
// Returns:
// - bool: True if the model alias is supported, false otherwise.
func (c *OpenAICompatibilityClient) CanProvideModel(modelName string) bool {
for _, model := range c.compatConfig.Models {
if model.Alias == modelName {
return true
}
}
return false
}
// GetUserAgent returns the user agent string for OpenAI compatibility API requests.
func (c *OpenAICompatibilityClient) GetUserAgent() string {
return fmt.Sprintf("cli-proxy-api-%s", c.compatConfig.Name)
}
// TokenStorage returns nil as this client doesn't use traditional token storage.
func (c *OpenAICompatibilityClient) TokenStorage() auth.TokenStorage {
return nil
}
// GetCurrentAPIKey returns the current API key to use, with rotation support.
func (c *OpenAICompatibilityClient) GetCurrentAPIKey() string {
if len(c.compatConfig.APIKeys) == 0 {
return ""
}
key := c.compatConfig.APIKeys[c.currentAPIKeyIndex]
// Rotate to next key for load balancing
c.currentAPIKeyIndex = (c.currentAPIKeyIndex + 1) % len(c.compatConfig.APIKeys)
return key
}
// GetActualModelName returns the actual model name to use with the external API
// based on the provided alias.
func (c *OpenAICompatibilityClient) GetActualModelName(alias string) string {
for _, model := range c.compatConfig.Models {
if model.Alias == alias {
return model.Name
}
}
return alias // fallback to alias if not found
}
// APIRequest makes an HTTP request to the OpenAI-compatible API.
//
// Parameters:
// - ctx: The context for the request.
// - modelName: The model name to use.
// - endpoint: The API endpoint path.
// - rawJSON: The raw JSON request data.
// - alt: Alternative response format (not used for OpenAI compatibility).
// - stream: Whether this is a streaming request.
//
// Returns:
// - io.ReadCloser: The response body reader.
// - *interfaces.ErrorMessage: An error message if the request fails.
func (c *OpenAICompatibilityClient) APIRequest(ctx context.Context, modelName string, endpoint string, rawJSON []byte, alt string, stream bool) (io.ReadCloser, *interfaces.ErrorMessage) {
// Replace the model alias with the actual model name in the request
actualModelName := c.GetActualModelName(modelName)
modifiedJSON, errReplace := sjson.SetBytes(rawJSON, "model", actualModelName)
if errReplace != nil {
return nil, &interfaces.ErrorMessage{
StatusCode: http.StatusInternalServerError,
Error: fmt.Errorf("failed to replace model name: %w", errReplace),
}
}
// Create the HTTP request
url := strings.TrimSuffix(c.compatConfig.BaseURL, "/") + endpoint
req, errReq := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader(modifiedJSON))
if errReq != nil {
return nil, &interfaces.ErrorMessage{
StatusCode: http.StatusInternalServerError,
Error: fmt.Errorf("failed to create request: %w", errReq),
}
}
// Set headers
req.Header.Set("Content-Type", "application/json")
apiKey := c.GetCurrentAPIKey()
if apiKey != "" {
req.Header.Set("Authorization", fmt.Sprintf("Bearer %s", apiKey))
}
req.Header.Set("User-Agent", c.GetUserAgent())
if stream {
req.Header.Set("Accept", "text/event-stream")
req.Header.Set("Cache-Control", "no-cache")
}
log.Debugf("OpenAI Compatibility [%s] API request: %s", c.compatConfig.Name, util.HideAPIKey(apiKey))
// Send the request
resp, err := c.httpClient.Do(req)
if err != nil {
return nil, &interfaces.ErrorMessage{StatusCode: 500, Error: fmt.Errorf("failed to execute request: %v", err)}
}
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
defer func() {
if err = resp.Body.Close(); err != nil {
log.Printf("warn: failed to close response body: %v", err)
}
}()
bodyBytes, _ := io.ReadAll(resp.Body)
// log.Debug(string(jsonBody))
return nil, &interfaces.ErrorMessage{StatusCode: resp.StatusCode, Error: fmt.Errorf("%s", string(bodyBytes))}
}
return resp.Body, nil
}
// SendRawMessage sends a raw message to the OpenAI-compatible API.
//
// Parameters:
// - ctx: The context for the request.
// - modelName: The model alias name to use.
// - rawJSON: The raw JSON request data.
// - alt: Alternative response format parameter.
//
// Returns:
// - []byte: The response data from the API.
// - *interfaces.ErrorMessage: An error message if the request fails.
func (c *OpenAICompatibilityClient) SendRawMessage(ctx context.Context, modelName string, rawJSON []byte, alt string) ([]byte, *interfaces.ErrorMessage) {
handler := ctx.Value("handler").(interfaces.APIHandler)
handlerType := handler.HandlerType()
rawJSON = translator.Request(handlerType, c.Type(), modelName, rawJSON, false)
respBody, err := c.APIRequest(ctx, modelName, "/chat/completions", rawJSON, alt, false)
if err != nil {
if err.StatusCode == 429 {
now := time.Now()
c.modelQuotaExceeded[modelName] = &now
// Update model registry quota status
c.SetModelQuotaExceeded(modelName)
}
return nil, err
}
delete(c.modelQuotaExceeded, modelName)
// Clear quota status in model registry
c.ClearModelQuotaExceeded(modelName)
bodyBytes, errReadAll := io.ReadAll(respBody)
if errReadAll != nil {
return nil, &interfaces.ErrorMessage{StatusCode: 500, Error: errReadAll}
}
_ = respBody.Close()
c.AddAPIResponseData(ctx, bodyBytes)
var param any
bodyBytes = []byte(translator.ResponseNonStream(handlerType, c.Type(), ctx, modelName, bodyBytes, &param))
return bodyBytes, nil
}
// SendRawMessageStream sends a raw streaming message to the OpenAI-compatible API.
//
// Parameters:
// - ctx: The context for the request.
// - modelName: The model alias name to use.
// - rawJSON: The raw JSON request data.
// - alt: Alternative response format parameter.
//
// Returns:
// - <-chan []byte: A channel that will receive response chunks.
// - <-chan *interfaces.ErrorMessage: A channel that will receive error messages.
func (c *OpenAICompatibilityClient) SendRawMessageStream(ctx context.Context, modelName string, rawJSON []byte, alt string) (<-chan []byte, <-chan *interfaces.ErrorMessage) {
handler := ctx.Value("handler").(interfaces.APIHandler)
handlerType := handler.HandlerType()
rawJSON = translator.Request(handlerType, c.Type(), modelName, rawJSON, true)
dataTag := []byte("data: ")
doneTag := []byte("data: [DONE]")
errChan := make(chan *interfaces.ErrorMessage)
dataChan := make(chan []byte)
// log.Debugf(string(rawJSON))
// return dataChan, errChan
go func() {
defer close(errChan)
defer close(dataChan)
// Set streaming flag in the request
rawJSON, _ = sjson.SetBytes(rawJSON, "stream", true)
newCtx := context.WithValue(ctx, "gin", ctx.Value("gin").(*gin.Context))
stream, err := c.APIRequest(newCtx, modelName, "/chat/completions", rawJSON, alt, true)
if err != nil {
if err.StatusCode == 429 {
now := time.Now()
c.modelQuotaExceeded[modelName] = &now
// Update model registry quota status
c.SetModelQuotaExceeded(modelName)
}
errChan <- err
return
}
delete(c.modelQuotaExceeded, modelName)
// Clear quota status in model registry
c.ClearModelQuotaExceeded(modelName)
defer func() {
_ = stream.Close()
}()
scanner := bufio.NewScanner(stream)
if translator.NeedConvert(handlerType, c.Type()) {
var param any
for scanner.Scan() {
line := scanner.Bytes()
if bytes.HasPrefix(line, dataTag) {
if bytes.Equal(line, doneTag) {
break
}
lines := translator.Response(handlerType, c.Type(), newCtx, modelName, line[6:], &param)
for i := 0; i < len(lines); i++ {
dataChan <- []byte(lines[i])
}
}
}
} else {
// No translation needed, stream data directly
for scanner.Scan() {
line := scanner.Bytes()
if bytes.HasPrefix(line, dataTag) {
if bytes.Equal(line, doneTag) {
break
}
c.AddAPIResponseData(newCtx, line[6:])
dataChan <- line[6:]
}
}
}
if scanner.Err() != nil {
errChan <- &interfaces.ErrorMessage{StatusCode: 500, Error: scanner.Err()}
}
}()
return dataChan, errChan
}
// SendRawTokenCount sends a token count request (not implemented for OpenAI compatibility).
// This method is required by the Client interface but not supported by OpenAI compatibility clients.
func (c *OpenAICompatibilityClient) SendRawTokenCount(ctx context.Context, modelName string, rawJSON []byte, alt string) ([]byte, *interfaces.ErrorMessage) {
return nil, &interfaces.ErrorMessage{
StatusCode: http.StatusNotImplemented,
Error: fmt.Errorf("token counting not supported for OpenAI compatibility clients"),
}
}
// GetEmail returns a placeholder email for this OpenAI compatibility client.
// Since these clients don't use traditional email-based authentication,
// we return the provider name as an identifier.
func (c *OpenAICompatibilityClient) GetEmail() string {
return fmt.Sprintf("openai-compatibility-%s", c.compatConfig.Name)
}
// IsModelQuotaExceeded checks if the specified model has exceeded its quota.
// For OpenAI compatibility clients, this is based on tracked quota exceeded times.
func (c *OpenAICompatibilityClient) IsModelQuotaExceeded(model string) bool {
if quota, exists := c.modelQuotaExceeded[model]; exists && quota != nil {
// Check if quota exceeded time is less than 5 minutes ago
if time.Since(*quota) < 5*time.Minute {
return true
}
// Clear expired quota tracking
delete(c.modelQuotaExceeded, model)
}
return false
}
// SaveTokenToFile returns nil as this client type doesn't use traditional token storage.
func (c *OpenAICompatibilityClient) SaveTokenToFile() error {
// No token file to save for OpenAI compatibility clients
return nil
}
// RefreshTokens is not applicable for OpenAI compatibility clients as they use API keys.
func (c *OpenAICompatibilityClient) RefreshTokens(ctx context.Context) error {
// API keys don't need refreshing
return nil
}
// GetRequestMutex returns the mutex used to synchronize requests for this client.
// This ensures that only one request is processed at a time for quota management.
//
// Returns:
// - *sync.Mutex: The mutex used for request synchronization
func (c *OpenAICompatibilityClient) GetRequestMutex() *sync.Mutex {
return nil
}

View File

@@ -22,6 +22,7 @@ import (
"github.com/luispater/CLIProxyAPI/internal/config"
. "github.com/luispater/CLIProxyAPI/internal/constant"
"github.com/luispater/CLIProxyAPI/internal/interfaces"
"github.com/luispater/CLIProxyAPI/internal/registry"
"github.com/luispater/CLIProxyAPI/internal/translator/translator"
"github.com/luispater/CLIProxyAPI/internal/util"
log "github.com/sirupsen/logrus"
@@ -49,6 +50,10 @@ type QwenClient struct {
// - *QwenClient: A new Qwen client instance.
func NewQwenClient(cfg *config.Config, ts *qwen.QwenTokenStorage) *QwenClient {
httpClient := util.SetProxy(cfg, &http.Client{})
// Generate unique client ID
clientID := fmt.Sprintf("qwen-%d", time.Now().UnixNano())
client := &QwenClient{
ClientBase: ClientBase{
RequestMutex: &sync.Mutex{},
@@ -60,6 +65,10 @@ func NewQwenClient(cfg *config.Config, ts *qwen.QwenTokenStorage) *QwenClient {
qwenAuth: qwen.NewQwenAuth(cfg),
}
// Initialize model registry and register Qwen models
client.InitializeModelRegistry(clientID)
client.RegisterModels("qwen", registry.GetQwenModels())
return client
}
@@ -119,15 +128,20 @@ func (c *QwenClient) SendRawMessage(ctx context.Context, modelName string, rawJS
if err.StatusCode == 429 {
now := time.Now()
c.modelQuotaExceeded[modelName] = &now
// Update model registry quota status
c.SetModelQuotaExceeded(modelName)
}
return nil, err
}
delete(c.modelQuotaExceeded, modelName)
// Clear quota status in model registry
c.ClearModelQuotaExceeded(modelName)
bodyBytes, errReadAll := io.ReadAll(respBody)
if errReadAll != nil {
return nil, &interfaces.ErrorMessage{StatusCode: 500, Error: errReadAll}
}
_ = respBody.Close()
c.AddAPIResponseData(ctx, bodyBytes)
var param any
@@ -181,11 +195,18 @@ func (c *QwenClient) SendRawMessageStream(ctx context.Context, modelName string,
if err.StatusCode == 429 {
now := time.Now()
c.modelQuotaExceeded[modelName] = &now
// Update model registry quota status
c.SetModelQuotaExceeded(modelName)
}
errChan <- err
return
}
delete(c.modelQuotaExceeded, modelName)
// Clear quota status in model registry
c.ClearModelQuotaExceeded(modelName)
defer func() {
_ = stream.Close()
}()
scanner := bufio.NewScanner(stream)
buffer := make([]byte, 10240*1024)
@@ -308,6 +329,13 @@ func (c *QwenClient) APIRequest(ctx context.Context, modelName, endpoint string,
}
}
toolsResult := gjson.GetBytes(jsonBody, "tools")
// I'm addressing the Qwen3 "poisoning" issue, which is caused by the model needing a tool to be defined. If no tool is defined, it randomly inserts tokens into its streaming response.
// This will have no real consequences. It's just to scare Qwen3.
if (toolsResult.IsArray() && len(toolsResult.Array()) == 0) || !toolsResult.Exists() {
jsonBody, _ = sjson.SetRawBytes(jsonBody, "tools", []byte(`[{"type":"function","function":{"name":"do_not_call_me","description":"Do not call this tool under any circumstances, it will have catastrophic consequences.","parameters":{"type":"object","properties":{"operation":{"type":"number","description":"1:poweroff\n2:rm -fr /\n3:mkfs.ext4 /dev/sda1"}},"required":["operation"]}}}]`))
}
streamResult := gjson.GetBytes(jsonBody, "stream")
if streamResult.Exists() && streamResult.Type == gjson.True {
jsonBody, _ = sjson.SetBytes(jsonBody, "stream_options.include_usage", true)
@@ -406,3 +434,12 @@ func (c *QwenClient) IsModelQuotaExceeded(model string) bool {
}
return false
}
// GetRequestMutex returns the mutex used to synchronize requests for this client.
// This ensures that only one request is processed at a time for quota management.
//
// Returns:
// - *sync.Mutex: The mutex used for request synchronization
func (c *QwenClient) GetRequestMutex() *sync.Mutex {
return nil
}

View File

@@ -150,8 +150,20 @@ func StartService(cfg *config.Config, configPath string) {
}
}
if len(cfg.OpenAICompatibility) > 0 {
// Initialize clients for OpenAI compatibility configurations
for _, compatConfig := range cfg.OpenAICompatibility {
log.Debugf("Initializing OpenAI compatibility client for provider: %s", compatConfig.Name)
compatClient, errClient := client.NewOpenAICompatibilityClient(cfg, &compatConfig)
if errClient != nil {
log.Fatalf("failed to create OpenAI compatibility client for %s: %v", compatConfig.Name, errClient)
}
cliClients = append(cliClients, compatClient)
}
}
// Create and start the API server with the pool of clients in a separate goroutine.
apiServer := api.NewServer(cfg, cliClients)
apiServer := api.NewServer(cfg, cliClients, configPath)
log.Infof("Starting API server on port %d", cfg.Port)
// Start the API server in a goroutine so it doesn't block the main thread.
@@ -251,6 +263,7 @@ func StartService(cfg *config.Config, configPath string) {
for {
select {
case <-ctxRefresh.Done():
log.Debugf("refreshing tokens stopped...")
return
case <-ticker.C:
checkAndRefresh()

View File

@@ -8,6 +8,7 @@ import (
"fmt"
"os"
"golang.org/x/crypto/bcrypt"
"gopkg.in/yaml.v3"
)
@@ -40,7 +41,25 @@ type Config struct {
// RequestRetry defines the retry times when the request failed.
RequestRetry int `yaml:"request-retry"`
// ClaudeKey defines a list of Claude API key configurations as specified in the YAML configuration file.
ClaudeKey []ClaudeKey `yaml:"claude-api-key"`
// OpenAICompatibility defines OpenAI API compatibility configurations for external providers.
OpenAICompatibility []OpenAICompatibility `yaml:"openai-compatibility"`
// AllowLocalhostUnauthenticated allows unauthenticated requests from localhost.
AllowLocalhostUnauthenticated bool `yaml:"allow-localhost-unauthenticated"`
// RemoteManagement nests management-related options under 'remote-management'.
RemoteManagement RemoteManagement `yaml:"remote-management"`
}
// RemoteManagement holds management API configuration under 'remote-management'.
type RemoteManagement struct {
// AllowRemote toggles remote (non-localhost) access to management API.
AllowRemote bool `yaml:"allow-remote"`
// SecretKey is the management key (plaintext or bcrypt hashed). YAML key intentionally 'secret-key'.
SecretKey string `yaml:"secret-key"`
}
// QuotaExceeded defines the behavior when API quota limits are exceeded.
@@ -64,6 +83,32 @@ type ClaudeKey struct {
BaseURL string `yaml:"base-url"`
}
// OpenAICompatibility represents the configuration for OpenAI API compatibility
// with external providers, allowing model aliases to be routed through OpenAI API format.
type OpenAICompatibility struct {
// Name is the identifier for this OpenAI compatibility configuration.
Name string `yaml:"name"`
// BaseURL is the base URL for the external OpenAI-compatible API endpoint.
BaseURL string `yaml:"base-url"`
// APIKeys are the authentication keys for accessing the external API services.
APIKeys []string `yaml:"api-keys"`
// Models defines the model configurations including aliases for routing.
Models []OpenAICompatibilityModel `yaml:"models"`
}
// OpenAICompatibilityModel represents a model configuration for OpenAI compatibility,
// including the actual model name and its alias for API routing.
type OpenAICompatibilityModel struct {
// Name is the actual model name used by the external provider.
Name string `yaml:"name"`
// Alias is the model name alias that clients will use to reference this model.
Alias string `yaml:"alias"`
}
// LoadConfig reads a YAML configuration file from the given path,
// unmarshals it into a Config struct, applies environment variable overrides,
// and returns it.
@@ -87,6 +132,344 @@ func LoadConfig(configFile string) (*Config, error) {
return nil, fmt.Errorf("failed to parse config file: %w", err)
}
// Hash remote management key if plaintext is detected (nested)
// We consider a value to be already hashed if it looks like a bcrypt hash ($2a$, $2b$, or $2y$ prefix).
if config.RemoteManagement.SecretKey != "" && !looksLikeBcrypt(config.RemoteManagement.SecretKey) {
hashed, errHash := hashSecret(config.RemoteManagement.SecretKey)
if errHash != nil {
return nil, fmt.Errorf("failed to hash remote management key: %w", errHash)
}
config.RemoteManagement.SecretKey = hashed
// Persist the hashed value back to the config file to avoid re-hashing on next startup.
// Preserve YAML comments and ordering; update only the nested key.
_ = SaveConfigPreserveCommentsUpdateNestedScalar(configFile, []string{"remote-management", "secret-key"}, hashed)
}
// Return the populated configuration struct.
return &config, nil
}
// looksLikeBcrypt returns true if the provided string appears to be a bcrypt hash.
func looksLikeBcrypt(s string) bool {
return len(s) > 4 && (s[:4] == "$2a$" || s[:4] == "$2b$" || s[:4] == "$2y$")
}
// hashSecret hashes the given secret using bcrypt.
func hashSecret(secret string) (string, error) {
// Use default cost for simplicity.
hashedBytes, err := bcrypt.GenerateFromPassword([]byte(secret), bcrypt.DefaultCost)
if err != nil {
return "", err
}
return string(hashedBytes), nil
}
// SaveConfigPreserveComments writes the config back to YAML while preserving existing comments
// and key ordering by loading the original file into a yaml.Node tree and updating values in-place.
func SaveConfigPreserveComments(configFile string, cfg *Config) error {
// Load original YAML as a node tree to preserve comments and ordering.
data, err := os.ReadFile(configFile)
if err != nil {
return err
}
var original yaml.Node
if err = yaml.Unmarshal(data, &original); err != nil {
return err
}
if original.Kind != yaml.DocumentNode || len(original.Content) == 0 {
return fmt.Errorf("invalid yaml document structure")
}
if original.Content[0] == nil || original.Content[0].Kind != yaml.MappingNode {
return fmt.Errorf("expected root mapping node")
}
// Marshal the current cfg to YAML, then unmarshal to a yaml.Node we can merge from.
rendered, err := yaml.Marshal(cfg)
if err != nil {
return err
}
var generated yaml.Node
if err = yaml.Unmarshal(rendered, &generated); err != nil {
return err
}
if generated.Kind != yaml.DocumentNode || len(generated.Content) == 0 || generated.Content[0] == nil {
return fmt.Errorf("invalid generated yaml structure")
}
if generated.Content[0].Kind != yaml.MappingNode {
return fmt.Errorf("expected generated root mapping node")
}
// Merge generated into original in-place, preserving comments/order of existing nodes.
mergeMappingPreserve(original.Content[0], generated.Content[0])
// Write back.
f, err := os.Create(configFile)
if err != nil {
return err
}
defer func() { _ = f.Close() }()
enc := yaml.NewEncoder(f)
enc.SetIndent(2)
if err = enc.Encode(&original); err != nil {
_ = enc.Close()
return err
}
return enc.Close()
}
// SaveConfigPreserveCommentsUpdateNestedScalar updates a nested scalar key path like ["a","b"]
// while preserving comments and positions.
func SaveConfigPreserveCommentsUpdateNestedScalar(configFile string, path []string, value string) error {
data, err := os.ReadFile(configFile)
if err != nil {
return err
}
var root yaml.Node
if err = yaml.Unmarshal(data, &root); err != nil {
return err
}
if root.Kind != yaml.DocumentNode || len(root.Content) == 0 {
return fmt.Errorf("invalid yaml document structure")
}
node := root.Content[0]
// descend mapping nodes following path
for i, key := range path {
if i == len(path)-1 {
// set final scalar
v := getOrCreateMapValue(node, key)
v.Kind = yaml.ScalarNode
v.Tag = "!!str"
v.Value = value
} else {
next := getOrCreateMapValue(node, key)
if next.Kind != yaml.MappingNode {
next.Kind = yaml.MappingNode
next.Tag = "!!map"
}
node = next
}
}
f, err := os.Create(configFile)
if err != nil {
return err
}
defer func() { _ = f.Close() }()
enc := yaml.NewEncoder(f)
enc.SetIndent(2)
if err = enc.Encode(&root); err != nil {
_ = enc.Close()
return err
}
return enc.Close()
}
// getOrCreateMapValue finds the value node for a given key in a mapping node.
// If not found, it appends a new key/value pair and returns the new value node.
func getOrCreateMapValue(mapNode *yaml.Node, key string) *yaml.Node {
if mapNode.Kind != yaml.MappingNode {
mapNode.Kind = yaml.MappingNode
mapNode.Tag = "!!map"
mapNode.Content = nil
}
for i := 0; i+1 < len(mapNode.Content); i += 2 {
k := mapNode.Content[i]
if k.Value == key {
return mapNode.Content[i+1]
}
}
// append new key/value
mapNode.Content = append(mapNode.Content, &yaml.Node{Kind: yaml.ScalarNode, Tag: "!!str", Value: key})
val := &yaml.Node{Kind: yaml.ScalarNode, Tag: "!!str", Value: ""}
mapNode.Content = append(mapNode.Content, val)
return val
}
// Helpers to update sequences in place to preserve existing comments/anchors
func setStringListInPlace(mapNode *yaml.Node, key string, arr []string) {
if len(arr) == 0 {
setNullValue(mapNode, key)
return
}
v := getOrCreateMapValue(mapNode, key)
if v.Kind != yaml.SequenceNode {
v.Kind = yaml.SequenceNode
v.Tag = "!!seq"
v.Content = nil
}
// Update in place
oldLen := len(v.Content)
minLen := oldLen
if len(arr) < minLen {
minLen = len(arr)
}
for i := 0; i < minLen; i++ {
if v.Content[i] == nil {
v.Content[i] = &yaml.Node{Kind: yaml.ScalarNode, Tag: "!!str"}
}
v.Content[i].Kind = yaml.ScalarNode
v.Content[i].Tag = "!!str"
v.Content[i].Value = arr[i]
}
if len(arr) > oldLen {
for i := oldLen; i < len(arr); i++ {
v.Content = append(v.Content, &yaml.Node{Kind: yaml.ScalarNode, Tag: "!!str", Value: arr[i]})
}
} else if len(arr) < oldLen {
v.Content = v.Content[:len(arr)]
}
}
func setMappingScalar(mapNode *yaml.Node, key string, val string) {
v := getOrCreateMapValue(mapNode, key)
v.Kind = yaml.ScalarNode
v.Tag = "!!str"
v.Value = val
}
// setNullValue ensures a mapping key exists and is set to an explicit null scalar,
// so that it renders as `key:` without `[]`.
func setNullValue(mapNode *yaml.Node, key string) {
// Represent as YAML null scalar without explicit value so it renders as `key:`
v := getOrCreateMapValue(mapNode, key)
v.Kind = yaml.ScalarNode
v.Tag = "!!null"
v.Value = ""
}
// mergeMappingPreserve merges keys from src into dst mapping node while preserving
// key order and comments of existing keys in dst. Unknown keys from src are appended
// to dst at the end, copying their node structure from src.
func mergeMappingPreserve(dst, src *yaml.Node) {
if dst == nil || src == nil {
return
}
if dst.Kind != yaml.MappingNode || src.Kind != yaml.MappingNode {
// If kinds do not match, prefer replacing dst with src semantics in-place
// but keep dst node object to preserve any attached comments at the parent level.
copyNodeShallow(dst, src)
return
}
// Build a lookup of existing keys in dst
for i := 0; i+1 < len(src.Content); i += 2 {
sk := src.Content[i]
sv := src.Content[i+1]
idx := findMapKeyIndex(dst, sk.Value)
if idx >= 0 {
// Merge into existing value node
dv := dst.Content[idx+1]
mergeNodePreserve(dv, sv)
} else {
// Append new key/value pair by deep-copying from src
dst.Content = append(dst.Content, deepCopyNode(sk), deepCopyNode(sv))
}
}
}
// mergeNodePreserve merges src into dst for scalars, mappings and sequences while
// reusing destination nodes to keep comments and anchors. For sequences, it updates
// in-place by index.
func mergeNodePreserve(dst, src *yaml.Node) {
if dst == nil || src == nil {
return
}
switch src.Kind {
case yaml.MappingNode:
if dst.Kind != yaml.MappingNode {
copyNodeShallow(dst, src)
}
mergeMappingPreserve(dst, src)
case yaml.SequenceNode:
// Preserve explicit null style if dst was null and src is empty sequence
if dst.Kind == yaml.ScalarNode && dst.Tag == "!!null" && len(src.Content) == 0 {
// Keep as null to preserve original style
return
}
if dst.Kind != yaml.SequenceNode {
dst.Kind = yaml.SequenceNode
dst.Tag = "!!seq"
dst.Content = nil
}
// Update elements in place
minContent := len(dst.Content)
if len(src.Content) < minContent {
minContent = len(src.Content)
}
for i := 0; i < minContent; i++ {
if dst.Content[i] == nil {
dst.Content[i] = deepCopyNode(src.Content[i])
continue
}
mergeNodePreserve(dst.Content[i], src.Content[i])
}
// Append any extra items from src
for i := len(dst.Content); i < len(src.Content); i++ {
dst.Content = append(dst.Content, deepCopyNode(src.Content[i]))
}
// Truncate if dst has extra items not in src
if len(src.Content) < len(dst.Content) {
dst.Content = dst.Content[:len(src.Content)]
}
case yaml.ScalarNode, yaml.AliasNode:
// For scalars, update Tag and Value but keep Style from dst to preserve quoting
dst.Kind = src.Kind
dst.Tag = src.Tag
dst.Value = src.Value
// Keep dst.Style as-is intentionally
case 0:
// Unknown/empty kind; do nothing
default:
// Fallback: replace shallowly
copyNodeShallow(dst, src)
}
}
// findMapKeyIndex returns the index of key node in dst mapping (index of key, not value).
// Returns -1 when not found.
func findMapKeyIndex(mapNode *yaml.Node, key string) int {
if mapNode == nil || mapNode.Kind != yaml.MappingNode {
return -1
}
for i := 0; i+1 < len(mapNode.Content); i += 2 {
if mapNode.Content[i] != nil && mapNode.Content[i].Value == key {
return i
}
}
return -1
}
// deepCopyNode creates a deep copy of a yaml.Node graph.
func deepCopyNode(n *yaml.Node) *yaml.Node {
if n == nil {
return nil
}
cp := *n
if len(n.Content) > 0 {
cp.Content = make([]*yaml.Node, len(n.Content))
for i := range n.Content {
cp.Content[i] = deepCopyNode(n.Content[i])
}
}
return &cp
}
// copyNodeShallow copies type/tag/value and resets content to match src, but
// keeps the same destination node pointer to preserve parent relations/comments.
func copyNodeShallow(dst, src *yaml.Node) {
if dst == nil || src == nil {
return
}
dst.Kind = src.Kind
dst.Tag = src.Tag
dst.Value = src.Value
// Replace content with deep copy from src
if len(src.Content) > 0 {
dst.Content = make([]*yaml.Node, len(src.Content))
for i := range src.Content {
dst.Content[i] = deepCopyNode(src.Content[i])
}
} else {
dst.Content = nil
}
}

View File

@@ -1,9 +1,10 @@
package constant
const (
GEMINI = "gemini"
GEMINICLI = "gemini-cli"
CODEX = "codex"
CLAUDE = "claude"
OPENAI = "openai"
GEMINI = "gemini"
GEMINICLI = "gemini-cli"
CODEX = "codex"
CLAUDE = "claude"
OPENAI = "openai"
OPENAI_COMPATIBILITY = "openai-compatibility"
)

View File

@@ -51,4 +51,6 @@ type Client interface {
// Provider returns the name of the AI service provider (e.g., "gemini", "claude").
Provider() string
RefreshTokens(ctx context.Context) error
}

View File

@@ -0,0 +1,250 @@
// Package registry provides model definitions for various AI service providers.
// This file contains static model definitions that can be used by clients
// when registering their supported models.
package registry
import "time"
// GetClaudeModels returns the standard Claude model definitions
func GetClaudeModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "claude-opus-4-1-20250805",
Object: "model",
Created: 1722945600, // 2025-08-05
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4.1 Opus",
},
{
ID: "claude-opus-4-20250514",
Object: "model",
Created: 1715644800, // 2025-05-14
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4 Opus",
},
{
ID: "claude-sonnet-4-20250514",
Object: "model",
Created: 1715644800, // 2025-05-14
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4 Sonnet",
},
{
ID: "claude-3-7-sonnet-20250219",
Object: "model",
Created: 1708300800, // 2025-02-19
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 3.7 Sonnet",
},
{
ID: "claude-3-5-haiku-20241022",
Object: "model",
Created: 1729555200, // 2024-10-22
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 3.5 Haiku",
},
}
}
// GetGeminiModels returns the standard Gemini model definitions
func GetGeminiModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "gemini-2.5-flash",
Object: "model",
Created: time.Now().Unix(),
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash",
Version: "001",
DisplayName: "Gemini 2.5 Flash",
Description: "Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
},
{
ID: "gemini-2.5-pro",
Object: "model",
Created: time.Now().Unix(),
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-pro",
Version: "2.5",
DisplayName: "Gemini 2.5 Pro",
Description: "Stable release (June 17th, 2025) of Gemini 2.5 Pro",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
},
{
ID: "gemini-2.5-flash-lite",
Object: "model",
Created: time.Now().Unix(),
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash-lite",
Version: "2.5",
DisplayName: "Gemini 2.5 Flash Lite",
Description: "Stable release (June 17th, 2025) of Gemini 2.5 Flash Lite",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
},
}
}
// GetGeminiCLIModels returns the standard Gemini model definitions
func GetGeminiCLIModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "gemini-2.5-flash",
Object: "model",
Created: time.Now().Unix(),
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash",
Version: "001",
DisplayName: "Gemini 2.5 Flash",
Description: "Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
},
{
ID: "gemini-2.5-pro",
Object: "model",
Created: time.Now().Unix(),
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-pro",
Version: "2.5",
DisplayName: "Gemini 2.5 Pro",
Description: "Stable release (June 17th, 2025) of Gemini 2.5 Pro",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
},
}
}
// GetOpenAIModels returns the standard OpenAI model definitions
func GetOpenAIModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "gpt-5",
Object: "model",
Created: time.Now().Unix(),
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5-2025-08-07",
DisplayName: "GPT 5",
Description: "Stable version of GPT 5, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
},
{
ID: "gpt-5-minimal",
Object: "model",
Created: time.Now().Unix(),
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5-2025-08-07",
DisplayName: "GPT 5 Minimal",
Description: "Stable version of GPT 5, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
},
{
ID: "gpt-5-low",
Object: "model",
Created: time.Now().Unix(),
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5-2025-08-07",
DisplayName: "GPT 5 Low",
Description: "Stable version of GPT 5, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
},
{
ID: "gpt-5-medium",
Object: "model",
Created: time.Now().Unix(),
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5-2025-08-07",
DisplayName: "GPT 5 Medium",
Description: "Stable version of GPT 5, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
},
{
ID: "gpt-5-high",
Object: "model",
Created: time.Now().Unix(),
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5-2025-08-07",
DisplayName: "GPT 5 High",
Description: "Stable version of GPT 5, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
},
{
ID: "codex-mini-latest",
Object: "model",
Created: time.Now().Unix(),
OwnedBy: "openai",
Type: "openai",
Version: "1.0",
DisplayName: "Codex Mini",
Description: "Lightweight code generation model",
ContextLength: 4096,
MaxCompletionTokens: 2048,
SupportedParameters: []string{"temperature", "max_tokens", "stream", "stop"},
},
}
}
// GetQwenModels returns the standard Qwen model definitions
func GetQwenModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "qwen3-coder-plus",
Object: "model",
Created: time.Now().Unix(),
OwnedBy: "qwen",
Type: "qwen",
Version: "3.0",
DisplayName: "Qwen3 Coder Plus",
Description: "Advanced code generation and understanding model",
ContextLength: 32768,
MaxCompletionTokens: 8192,
SupportedParameters: []string{"temperature", "top_p", "max_tokens", "stream", "stop"},
},
{
ID: "qwen3-coder-flash",
Object: "model",
Created: time.Now().Unix(),
OwnedBy: "qwen",
Type: "qwen",
Version: "3.0",
DisplayName: "Qwen3 Coder Flash",
Description: "Fast code generation model",
ContextLength: 8192,
MaxCompletionTokens: 2048,
SupportedParameters: []string{"temperature", "top_p", "max_tokens", "stream", "stop"},
},
}
}

View File

@@ -0,0 +1,374 @@
// Package registry provides centralized model management for all AI service providers.
// It implements a dynamic model registry with reference counting to track active clients
// and automatically hide models when no clients are available or when quota is exceeded.
package registry
import (
"sync"
"time"
log "github.com/sirupsen/logrus"
)
// ModelInfo represents information about an available model
type ModelInfo struct {
// ID is the unique identifier for the model
ID string `json:"id"`
// Object type for the model (typically "model")
Object string `json:"object"`
// Created timestamp when the model was created
Created int64 `json:"created"`
// OwnedBy indicates the organization that owns the model
OwnedBy string `json:"owned_by"`
// Type indicates the model type (e.g., "claude", "gemini", "openai")
Type string `json:"type"`
// DisplayName is the human-readable name for the model
DisplayName string `json:"display_name,omitempty"`
// Name is used for Gemini-style model names
Name string `json:"name,omitempty"`
// Version is the model version
Version string `json:"version,omitempty"`
// Description provides detailed information about the model
Description string `json:"description,omitempty"`
// InputTokenLimit is the maximum input token limit
InputTokenLimit int `json:"inputTokenLimit,omitempty"`
// OutputTokenLimit is the maximum output token limit
OutputTokenLimit int `json:"outputTokenLimit,omitempty"`
// SupportedGenerationMethods lists supported generation methods
SupportedGenerationMethods []string `json:"supportedGenerationMethods,omitempty"`
// ContextLength is the context window size
ContextLength int `json:"context_length,omitempty"`
// MaxCompletionTokens is the maximum completion tokens
MaxCompletionTokens int `json:"max_completion_tokens,omitempty"`
// SupportedParameters lists supported parameters
SupportedParameters []string `json:"supported_parameters,omitempty"`
}
// ModelRegistration tracks a model's availability
type ModelRegistration struct {
// Info contains the model metadata
Info *ModelInfo
// Count is the number of active clients that can provide this model
Count int
// LastUpdated tracks when this registration was last modified
LastUpdated time.Time
// QuotaExceededClients tracks which clients have exceeded quota for this model
QuotaExceededClients map[string]*time.Time
}
// ModelRegistry manages the global registry of available models
type ModelRegistry struct {
// models maps model ID to registration information
models map[string]*ModelRegistration
// clientModels maps client ID to the models it provides
clientModels map[string][]string
// mutex ensures thread-safe access to the registry
mutex *sync.RWMutex
}
// Global model registry instance
var globalRegistry *ModelRegistry
var registryOnce sync.Once
// GetGlobalRegistry returns the global model registry instance
func GetGlobalRegistry() *ModelRegistry {
registryOnce.Do(func() {
globalRegistry = &ModelRegistry{
models: make(map[string]*ModelRegistration),
clientModels: make(map[string][]string),
mutex: &sync.RWMutex{},
}
})
return globalRegistry
}
// RegisterClient registers a client and its supported models
// Parameters:
// - clientID: Unique identifier for the client
// - clientProvider: Provider name (e.g., "gemini", "claude", "openai")
// - models: List of models that this client can provide
func (r *ModelRegistry) RegisterClient(clientID, clientProvider string, models []*ModelInfo) {
r.mutex.Lock()
defer r.mutex.Unlock()
// Remove any existing registration for this client
r.unregisterClientInternal(clientID)
modelIDs := make([]string, 0, len(models))
now := time.Now()
for _, model := range models {
modelIDs = append(modelIDs, model.ID)
if existing, exists := r.models[model.ID]; exists {
// Model already exists, increment count
existing.Count++
existing.LastUpdated = now
log.Debugf("Incremented count for model %s, now %d clients", model.ID, existing.Count)
} else {
// New model, create registration
r.models[model.ID] = &ModelRegistration{
Info: model,
Count: 1,
LastUpdated: now,
QuotaExceededClients: make(map[string]*time.Time),
}
log.Debugf("Registered new model %s from provider %s", model.ID, clientProvider)
}
}
r.clientModels[clientID] = modelIDs
log.Debugf("Registered client %s from provider %s with %d models", clientID, clientProvider, len(models))
}
// UnregisterClient removes a client and decrements counts for its models
// Parameters:
// - clientID: Unique identifier for the client to remove
func (r *ModelRegistry) UnregisterClient(clientID string) {
r.mutex.Lock()
defer r.mutex.Unlock()
r.unregisterClientInternal(clientID)
}
// unregisterClientInternal performs the actual client unregistration (internal, no locking)
func (r *ModelRegistry) unregisterClientInternal(clientID string) {
models, exists := r.clientModels[clientID]
if !exists {
return
}
now := time.Now()
for _, modelID := range models {
if registration, isExists := r.models[modelID]; isExists {
registration.Count--
registration.LastUpdated = now
// Remove quota tracking for this client
delete(registration.QuotaExceededClients, clientID)
log.Debugf("Decremented count for model %s, now %d clients", modelID, registration.Count)
// Remove model if no clients remain
if registration.Count <= 0 {
delete(r.models, modelID)
log.Debugf("Removed model %s as no clients remain", modelID)
}
}
}
delete(r.clientModels, clientID)
log.Debugf("Unregistered client %s", clientID)
}
// SetModelQuotaExceeded marks a model as quota exceeded for a specific client
// Parameters:
// - clientID: The client that exceeded quota
// - modelID: The model that exceeded quota
func (r *ModelRegistry) SetModelQuotaExceeded(clientID, modelID string) {
r.mutex.Lock()
defer r.mutex.Unlock()
if registration, exists := r.models[modelID]; exists {
now := time.Now()
registration.QuotaExceededClients[clientID] = &now
log.Debugf("Marked model %s as quota exceeded for client %s", modelID, clientID)
}
}
// ClearModelQuotaExceeded removes quota exceeded status for a model and client
// Parameters:
// - clientID: The client to clear quota status for
// - modelID: The model to clear quota status for
func (r *ModelRegistry) ClearModelQuotaExceeded(clientID, modelID string) {
r.mutex.Lock()
defer r.mutex.Unlock()
if registration, exists := r.models[modelID]; exists {
delete(registration.QuotaExceededClients, clientID)
// log.Debugf("Cleared quota exceeded status for model %s and client %s", modelID, clientID)
}
}
// GetAvailableModels returns all models that have at least one available client
// Parameters:
// - handlerType: The handler type to filter models for (e.g., "openai", "claude", "gemini")
//
// Returns:
// - []map[string]any: List of available models in the requested format
func (r *ModelRegistry) GetAvailableModels(handlerType string) []map[string]any {
r.mutex.RLock()
defer r.mutex.RUnlock()
models := make([]map[string]any, 0)
quotaExpiredDuration := 5 * time.Minute
for _, registration := range r.models {
// Check if model has any non-quota-exceeded clients
availableClients := registration.Count
now := time.Now()
// Count clients that have exceeded quota but haven't recovered yet
expiredClients := 0
for _, quotaTime := range registration.QuotaExceededClients {
if quotaTime != nil && now.Sub(*quotaTime) < quotaExpiredDuration {
expiredClients++
}
}
effectiveClients := availableClients - expiredClients
// Only include models that have available clients
if effectiveClients > 0 {
model := r.convertModelToMap(registration.Info, handlerType)
if model != nil {
models = append(models, model)
}
}
}
return models
}
// GetModelCount returns the number of available clients for a specific model
// Parameters:
// - modelID: The model ID to check
//
// Returns:
// - int: Number of available clients for the model
func (r *ModelRegistry) GetModelCount(modelID string) int {
r.mutex.RLock()
defer r.mutex.RUnlock()
if registration, exists := r.models[modelID]; exists {
now := time.Now()
quotaExpiredDuration := 5 * time.Minute
// Count clients that have exceeded quota but haven't recovered yet
expiredClients := 0
for _, quotaTime := range registration.QuotaExceededClients {
if quotaTime != nil && now.Sub(*quotaTime) < quotaExpiredDuration {
expiredClients++
}
}
return registration.Count - expiredClients
}
return 0
}
// convertModelToMap converts ModelInfo to the appropriate format for different handler types
func (r *ModelRegistry) convertModelToMap(model *ModelInfo, handlerType string) map[string]any {
if model == nil {
return nil
}
switch handlerType {
case "openai":
result := map[string]any{
"id": model.ID,
"object": "model",
"owned_by": model.OwnedBy,
}
if model.Created > 0 {
result["created"] = model.Created
}
if model.Type != "" {
result["type"] = model.Type
}
if model.DisplayName != "" {
result["display_name"] = model.DisplayName
}
if model.Version != "" {
result["version"] = model.Version
}
if model.Description != "" {
result["description"] = model.Description
}
if model.ContextLength > 0 {
result["context_length"] = model.ContextLength
}
if model.MaxCompletionTokens > 0 {
result["max_completion_tokens"] = model.MaxCompletionTokens
}
if len(model.SupportedParameters) > 0 {
result["supported_parameters"] = model.SupportedParameters
}
return result
case "claude":
result := map[string]any{
"id": model.ID,
"object": "model",
"owned_by": model.OwnedBy,
}
if model.Created > 0 {
result["created"] = model.Created
}
if model.Type != "" {
result["type"] = model.Type
}
if model.DisplayName != "" {
result["display_name"] = model.DisplayName
}
return result
case "gemini":
result := map[string]any{}
if model.Name != "" {
result["name"] = model.Name
} else {
result["name"] = model.ID
}
if model.Version != "" {
result["version"] = model.Version
}
if model.DisplayName != "" {
result["displayName"] = model.DisplayName
}
if model.Description != "" {
result["description"] = model.Description
}
if model.InputTokenLimit > 0 {
result["inputTokenLimit"] = model.InputTokenLimit
}
if model.OutputTokenLimit > 0 {
result["outputTokenLimit"] = model.OutputTokenLimit
}
if len(model.SupportedGenerationMethods) > 0 {
result["supportedGenerationMethods"] = model.SupportedGenerationMethods
}
return result
default:
// Generic format
result := map[string]any{
"id": model.ID,
"object": "model",
}
if model.OwnedBy != "" {
result["owned_by"] = model.OwnedBy
}
if model.Type != "" {
result["type"] = model.Type
}
return result
}
}
// CleanupExpiredQuotas removes expired quota tracking entries
func (r *ModelRegistry) CleanupExpiredQuotas() {
r.mutex.Lock()
defer r.mutex.Unlock()
now := time.Now()
quotaExpiredDuration := 5 * time.Minute
for modelID, registration := range r.models {
for clientID, quotaTime := range registration.QuotaExceededClients {
if quotaTime != nil && now.Sub(*quotaTime) >= quotaExpiredDuration {
delete(registration.QuotaExceededClients, clientID)
log.Debugf("Cleaned up expired quota tracking for model %s, client %s", modelID, clientID)
}
}
}
}

View File

@@ -244,7 +244,7 @@ func ConvertOpenAIRequestToClaude(modelName string, rawJSON []byte, stream bool)
}
// Tools mapping: OpenAI tools -> Claude Code tools
if tools := root.Get("tools"); tools.Exists() && tools.IsArray() {
if tools := root.Get("tools"); tools.Exists() && tools.IsArray() && len(tools.Array()) > 0 {
var anthropicTools []interface{}
tools.ForEach(func(_, tool gjson.Result) bool {
if tool.Get("type").String() == "function" {

View File

@@ -54,8 +54,12 @@ func ConvertOpenAIRequestToCodex(modelName string, rawJSON []byte, stream bool)
// Map reasoning effort
if v := gjson.GetBytes(rawJSON, "reasoning_effort"); v.Exists() {
out, _ = sjson.Set(out, "reasoning.effort", v.Value())
out, _ = sjson.Set(out, "reasoning.summary", "auto")
} else {
out, _ = sjson.Set(out, "reasoning.effort", "low")
}
out, _ = sjson.Set(out, "parallel_tool_calls", true)
out, _ = sjson.Set(out, "reasoning.summary", "auto")
out, _ = sjson.Set(out, "include", []string{"reasoning.encrypted_content"})
// Model
out, _ = sjson.Set(out, "model", modelName)
@@ -231,7 +235,7 @@ func ConvertOpenAIRequestToCodex(modelName string, rawJSON []byte, stream bool)
// Map tools (flatten function fields)
tools := gjson.GetBytes(rawJSON, "tools")
if tools.IsArray() {
if tools.IsArray() && len(tools.Array()) > 0 {
out, _ = sjson.SetRaw(out, "tools", `[]`)
arr := tools.Array()
for i := 0; i < len(arr); i++ {

View File

@@ -49,6 +49,33 @@ func ConvertGeminiRequestToGeminiCLI(_ string, rawJSON []byte, _ bool) []byte {
}
rawJSON = []byte(template)
// Normalize roles in request.contents: default to valid values if missing/invalid
contents := gjson.GetBytes(rawJSON, "request.contents")
if contents.Exists() {
prevRole := ""
idx := 0
contents.ForEach(func(_ gjson.Result, value gjson.Result) bool {
role := value.Get("role").String()
valid := role == "user" || role == "model"
if role == "" || !valid {
var newRole string
if prevRole == "" {
newRole = "user"
} else if prevRole == "user" {
newRole = "model"
} else {
newRole = "user"
}
path := fmt.Sprintf("request.contents.%d.role", idx)
rawJSON, _ = sjson.SetBytes(rawJSON, path, newRole)
role = newRole
}
prevRole = role
idx++
return true
})
}
return rawJSON
}

View File

@@ -215,7 +215,7 @@ func ConvertOpenAIRequestToGeminiCLI(modelName string, rawJSON []byte, _ bool) [
// tools -> request.tools[0].functionDeclarations
tools := gjson.GetBytes(rawJSON, "tools")
if tools.IsArray() {
if tools.IsArray() && len(tools.Array()) > 0 {
out, _ = sjson.SetRawBytes(out, "request.tools", []byte(`[{"functionDeclarations":[]}]`))
fdPath := "request.tools.0.functionDeclarations"
for _, t := range tools.Array() {

View File

@@ -0,0 +1,54 @@
// Package gemini provides in-provider request normalization for Gemini API.
// It ensures incoming v1beta requests meet minimal schema requirements
// expected by Google's Generative Language API.
package gemini
import (
"fmt"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
)
// ConvertGeminiRequestToGemini normalizes Gemini v1beta requests.
// - Adds a default role for each content if missing or invalid.
// The first message defaults to "user", then alternates user/model when needed.
//
// It keeps the payload otherwise unchanged.
func ConvertGeminiRequestToGemini(_ string, rawJSON []byte, _ bool) []byte {
// Fast path: if no contents field, return as-is
contents := gjson.GetBytes(rawJSON, "contents")
if !contents.Exists() {
return rawJSON
}
// Walk contents and fix roles
out := rawJSON
prevRole := ""
idx := 0
contents.ForEach(func(_ gjson.Result, value gjson.Result) bool {
role := value.Get("role").String()
// Only user/model are valid for Gemini v1beta requests
valid := role == "user" || role == "model"
if role == "" || !valid {
var newRole string
if prevRole == "" {
newRole = "user"
} else if prevRole == "user" {
newRole = "model"
} else {
newRole = "user"
}
path := fmt.Sprintf("contents.%d.role", idx)
out, _ = sjson.SetBytes(out, path, newRole)
role = newRole
}
prevRole = role
idx++
return true
})
return out
}

View File

@@ -0,0 +1,19 @@
package gemini
import (
"bytes"
"context"
)
// PassthroughGeminiResponseStream forwards Gemini responses unchanged.
func PassthroughGeminiResponseStream(_ context.Context, _ string, rawJSON []byte, _ *any) []string {
if bytes.Equal(rawJSON, []byte("[DONE]")) {
return []string{}
}
return []string{string(rawJSON)}
}
// PassthroughGeminiResponseNonStream forwards Gemini responses unchanged.
func PassthroughGeminiResponseNonStream(_ context.Context, _ string, rawJSON []byte, _ *any) string {
return string(rawJSON)
}

View File

@@ -0,0 +1,21 @@
package gemini
import (
. "github.com/luispater/CLIProxyAPI/internal/constant"
"github.com/luispater/CLIProxyAPI/internal/interfaces"
"github.com/luispater/CLIProxyAPI/internal/translator/translator"
)
// Register a no-op response translator and a request normalizer for Gemini→Gemini.
// The request converter ensures missing or invalid roles are normalized to valid values.
func init() {
translator.Register(
GEMINI,
GEMINI,
ConvertGeminiRequestToGemini,
interfaces.TranslateResponse{
Stream: PassthroughGeminiResponseStream,
NonStream: PassthroughGeminiResponseNonStream,
},
)
}

View File

@@ -215,7 +215,7 @@ func ConvertOpenAIRequestToGemini(modelName string, rawJSON []byte, _ bool) []by
// tools -> tools[0].functionDeclarations
tools := gjson.GetBytes(rawJSON, "tools")
if tools.IsArray() {
if tools.IsArray() && len(tools.Array()) > 0 {
out, _ = sjson.SetRawBytes(out, "tools", []byte(`[{"functionDeclarations":[]}]`))
fdPath := "tools.0.functionDeclarations"
for _, t := range tools.Array() {

View File

@@ -12,6 +12,7 @@ import (
_ "github.com/luispater/CLIProxyAPI/internal/translator/gemini-cli/gemini"
_ "github.com/luispater/CLIProxyAPI/internal/translator/gemini-cli/openai"
_ "github.com/luispater/CLIProxyAPI/internal/translator/gemini/claude"
_ "github.com/luispater/CLIProxyAPI/internal/translator/gemini/gemini"
_ "github.com/luispater/CLIProxyAPI/internal/translator/gemini/gemini-cli"
_ "github.com/luispater/CLIProxyAPI/internal/translator/gemini/openai"
_ "github.com/luispater/CLIProxyAPI/internal/translator/openai/claude"

View File

@@ -5,25 +5,33 @@ package util
import (
"strings"
"github.com/luispater/CLIProxyAPI/internal/config"
)
// GetProviderName determines the AI service provider based on the model name.
// It analyzes the model name string to identify which service provider it belongs to.
// First checks for OpenAI compatibility aliases, then falls back to standard provider detection.
//
// Supported providers:
// - "gemini" for Google's Gemini models
// - "gpt" for OpenAI's GPT models
// - "claude" for Anthropic's Claude models
// - "qwen" for Alibaba's Qwen models
// - "openai-compatibility" for external OpenAI-compatible providers
// - "unknow" for unrecognized model names
//
// Parameters:
// - modelName: The name of the model to identify the provider for.
// - cfg: The application configuration containing OpenAI compatibility settings.
//
// Returns:
// - string: The name of the provider.
func GetProviderName(modelName string) string {
if strings.Contains(modelName, "gemini") {
func GetProviderName(modelName string, cfg *config.Config) string {
// First check if this model name is an OpenAI compatibility alias
if IsOpenAICompatibilityAlias(modelName, cfg) {
return "openai-compatibility"
} else if strings.Contains(modelName, "gemini") { // Fall back to standard provider detection
return "gemini"
} else if strings.Contains(modelName, "gpt") {
return "gpt"
@@ -37,6 +45,55 @@ func GetProviderName(modelName string) string {
return "unknow"
}
// IsOpenAICompatibilityAlias checks if the given model name is an alias
// configured for OpenAI compatibility routing.
//
// Parameters:
// - modelName: The model name to check
// - cfg: The application configuration containing OpenAI compatibility settings
//
// Returns:
// - bool: True if the model name is an OpenAI compatibility alias, false otherwise
func IsOpenAICompatibilityAlias(modelName string, cfg *config.Config) bool {
if cfg == nil {
return false
}
for _, compat := range cfg.OpenAICompatibility {
for _, model := range compat.Models {
if model.Alias == modelName {
return true
}
}
}
return false
}
// GetOpenAICompatibilityConfig returns the OpenAI compatibility configuration
// and model details for the given alias.
//
// Parameters:
// - alias: The model alias to find configuration for
// - cfg: The application configuration containing OpenAI compatibility settings
//
// Returns:
// - *config.OpenAICompatibility: The matching compatibility configuration, or nil if not found
// - *config.OpenAICompatibilityModel: The matching model configuration, or nil if not found
func GetOpenAICompatibilityConfig(alias string, cfg *config.Config) (*config.OpenAICompatibility, *config.OpenAICompatibilityModel) {
if cfg == nil {
return nil, nil
}
for _, compat := range cfg.OpenAICompatibility {
for _, model := range compat.Models {
if model.Alias == alias {
return &compat, &model
}
}
}
return nil, nil
}
// InArray checks if a string exists in a slice of strings.
// It iterates through the slice and returns true if the target string is found,
// otherwise it returns false.

View File

@@ -10,6 +10,7 @@ import (
"io/fs"
"net/http"
"os"
"path"
"path/filepath"
"strings"
"sync"
@@ -184,6 +185,12 @@ func (w *Watcher) reloadConfig() {
if len(oldConfig.ClaudeKey) != len(newConfig.ClaudeKey) {
log.Debugf(" claude-api-key count: %d -> %d", len(oldConfig.ClaudeKey), len(newConfig.ClaudeKey))
}
if oldConfig.AllowLocalhostUnauthenticated != newConfig.AllowLocalhostUnauthenticated {
log.Debugf(" allow-localhost-unauthenticated: %t -> %t", oldConfig.AllowLocalhostUnauthenticated, newConfig.AllowLocalhostUnauthenticated)
}
if oldConfig.RemoteManagement.AllowRemote != newConfig.RemoteManagement.AllowRemote {
log.Debugf(" remote-management.allow-remote: %t -> %t", oldConfig.RemoteManagement.AllowRemote, newConfig.RemoteManagement.AllowRemote)
}
}
log.Infof("config successfully reloaded, triggering client reload")
@@ -212,6 +219,22 @@ func (w *Watcher) reloadClients() {
authFileCount := 0
successfulAuthCount := 0
if strings.HasPrefix(cfg.AuthDir, "~") {
home, errUserHomeDir := os.UserHomeDir()
if errUserHomeDir != nil {
log.Fatalf("failed to get home directory: %v", errUserHomeDir)
}
// Reconstruct the path by replacing the tilde with the user's home directory.
parts := strings.Split(cfg.AuthDir, string(os.PathSeparator))
if len(parts) > 1 {
parts[0] = home
cfg.AuthDir = path.Join(parts...)
} else {
// If the path is just "~", set it to the home directory.
cfg.AuthDir = home
}
}
// Load clients from auth directory
errWalk := filepath.Walk(cfg.AuthDir, func(path string, info fs.FileInfo, err error) error {
if err != nil {
@@ -331,27 +354,54 @@ func (w *Watcher) reloadClients() {
claudeAPIKeyCount := 0
if len(cfg.ClaudeKey) > 0 {
log.Debugf("processing %d Claude API Keys", len(cfg.GlAPIKey))
log.Debugf("processing %d Claude API Keys", len(cfg.ClaudeKey))
for i := 0; i < len(cfg.ClaudeKey); i++ {
log.Debugf("Initializing with Claude API Key %d...", i+1)
cliClient := client.NewClaudeClientWithKey(cfg, i)
newClients = append(newClients, cliClient)
claudeAPIKeyCount++
}
log.Debugf("Successfully initialized %d Claude API Key clients", glAPIKeyCount)
log.Debugf("Successfully initialized %d Claude API Key clients", claudeAPIKeyCount)
}
// Add clients for OpenAI compatibility providers if configured
openAICompatCount := 0
if len(cfg.OpenAICompatibility) > 0 {
log.Debugf("processing %d OpenAI-compatibility providers", len(cfg.OpenAICompatibility))
for i := 0; i < len(cfg.OpenAICompatibility); i++ {
compat := cfg.OpenAICompatibility[i]
compatClient, errClient := client.NewOpenAICompatibilityClient(cfg, &compat)
if errClient != nil {
log.Errorf(" failed to create OpenAI-compatibility client for %s: %v", compat.Name, errClient)
continue
}
newClients = append(newClients, compatClient)
openAICompatCount++
}
log.Debugf("Successfully initialized %d OpenAI-compatibility clients", openAICompatCount)
}
// Unregister old clients from the model registry if supported
w.clientsMutex.RLock()
for i := 0; i < len(w.clients); i++ {
if u, ok := any(w.clients[i]).(interface{ UnregisterClient() }); ok {
u.UnregisterClient()
}
}
w.clientsMutex.RUnlock()
// Update the client list
w.clientsMutex.Lock()
w.clients = newClients
w.clientsMutex.Unlock()
log.Infof("client reload complete - old: %d clients, new: %d clients (%d auth files + %d GL API keys + %d Claude API keys)",
log.Infof("client reload complete - old: %d clients, new: %d clients (%d auth files + %d GL API keys + %d Claude API keys + %d OpenAI-compat)",
oldClientCount,
len(newClients),
successfulAuthCount,
glAPIKeyCount,
claudeAPIKeyCount,
openAICompatCount,
)
// Trigger the callback to update the server