llm-gateway/memory/2026-05-01.md
root bebe8c1bb5 docs: add LLM Gateway design document
Design a unified LLM Gateway with:
- Multi-format API support (OpenAI, Anthropic, Responses API)
- 5 provider adapters (OpenAI, Anthropic, Azure, Gemini, Bedrock)
- Model aliasing, routing, and load balancing
- RPM/TPM rate limiting and budget control (key/project level)
- Fallback/retry with circuit breaker
- Request logging and usage statistics
- Admin API for provider/key/model management

Tech stack: Python (FastAPI) + SQLite

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 14:52:53 +08:00

2026-05-01

Notes

  • The workspace has been initialized; the day's context can be recorded here as it develops.

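Decisions

LLM Gateway technology selection

  • Conclusion: implement the LLM Gateway on a Python (FastAPI) + SQLite stack
  • Rationale
    1. Python is fast to develop in; phase 1 has many features and needs to ship quickly
    2. SQLite needs zero configuration and is lightweight, which suits phase 1 validation
    3. FastAPI is natively async, generates API documentation automatically, and has a mature ecosystem
  • Impact: the stack can later be migrated smoothly to PostgreSQL + Redis
  • Alternatives: Go (high performance but higher development cost), Rust (maximum performance but a longer development cycle)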
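A minimal sketch of what the zero-configuration SQLite choice could look like for request logging and usage statistics, using only the standard-library `sqlite3` module. The `request_log` table, its columns, and the `UsageStore` class are hypothetical; these notes do not define a schema. Keeping storage behind a small class is one way to leave room for the later PostgreSQL migration.

```python
import sqlite3

# Hypothetical schema: the design notes do not specify table or column names.
SCHEMA = """
CREATE TABLE IF NOT EXISTS request_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    api_key TEXT NOT NULL,
    model TEXT NOT NULL,
    prompt_tokens INTEGER NOT NULL,
    completion_tokens INTEGER NOT NULL
)
"""


class UsageStore:
    """Thin storage wrapper so the backend could be swapped out later."""

    def __init__(self, path: str = ":memory:") -> None:
        # No server to provision: a file path (or ":memory:") is enough.
        self.conn = sqlite3.connect(path)
        self.conn.execute(SCHEMA)

    def log_request(self, api_key: str, model: str,
                    prompt_tokens: int, completion_tokens: int) -> None:
        # Using the connection as a context manager commits on success.
        with self.conn:
            self.conn.execute(
                "INSERT INTO request_log "
                "(api_key, model, prompt_tokens, completion_tokens) "
                "VALUES (?, ?, ?, ?)",
                (api_key, model, prompt_tokens, completion_tokens),
            )

    def total_tokens(self, api_key: str) -> int:
        # COALESCE makes the sum 0 for keys with no logged requests.
        row = self.conn.execute(
            "SELECT COALESCE(SUM(prompt_tokens + completion_tokens), 0) "
            "FROM request_log WHERE api_key = ?",
            (api_key,),
        ).fetchone()
        return row[0]
```

Callers only see `log_request` and `total_tokens`, so a key-level budget check could be built on `total_tokens` without depending on SQLite directly.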

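LLM Gateway architecture design

  • Conclusion: adopt a unified API entry point with multiple Provider Adapters
  • Rationale
    1. Supports three request formats: OpenAI-compatible, OpenAI Responses API, and Anthropic Messages API
    2. A Transformer layer converts between formats
    3. A Router layer handles model aliasing and routing
    4. Load Balancer + Circuit Breaker provide high availability
  • Phase 1 providers: OpenAI, Anthropic, Azure OpenAI, Google Gemini, AWS Bedrock
  • Phase 2 extensions: structured output, plugin system, billing and settlement, organization-level RBAC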
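The Router, Load Balancer, and Circuit Breaker layers could fit together roughly as sketched below. This is an illustration under assumptions, not the planned implementation: the `Router` and `CircuitBreaker` classes, the `provider/model` target strings, round-robin balancing, and the failure threshold and cooldown values are all hypothetical.

```python
import itertools
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class CircuitBreaker:
    """Opens after max_failures consecutive failures; allows trials after cooldown."""
    max_failures: int = 3      # assumed threshold
    cooldown: float = 30.0     # assumed cooldown in seconds
    failures: int = 0
    opened_at: Optional[float] = None

    def available(self, now: float) -> bool:
        if self.opened_at is None:
            return True
        # Once the cooldown has elapsed, trial requests are allowed again.
        return now - self.opened_at >= self.cooldown

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = now


class Router:
    """Resolves a model alias to one healthy target, round-robin."""

    def __init__(self, aliases: dict[str, list[str]]) -> None:
        self.aliases = aliases
        self.breakers = {
            target: CircuitBreaker()
            for targets in aliases.values()
            for target in targets
        }
        self._cursors = {
            alias: itertools.cycle(targets) for alias, targets in aliases.items()
        }

    def pick(self, alias: str, now: Optional[float] = None) -> str:
        now = time.monotonic() if now is None else now
        cursor = self._cursors[alias]
        # Try each target at most once, skipping those with an open breaker.
        for _ in range(len(self.aliases[alias])):
            candidate = next(cursor)
            if self.breakers[candidate].available(now):
                return candidate
        raise RuntimeError(f"no healthy provider for alias {alias!r}")
```

A caller would invoke `pick("chat-default")`, send the request through the matching Provider Adapter, and then report the outcome with `record_success()` or `record_failure(now)` on `router.breakers[target]`, so that failing targets are skipped until their cooldown elapses.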