Implement a unified LLM Gateway supporting multiple API formats and providers

Features:
- OpenAI Chat Completions, Responses API, and Anthropic Messages API
- Provider adapters for OpenAI, Anthropic, Azure OpenAI, Google Gemini, AWS Bedrock
- Model aliasing with weighted round-robin load balancing
- Virtual API keys with RPM/TPM rate limiting
- Budget control at key and project levels
- Request logging, usage statistics, and audit logs
- Fallback/retry with circuit breaker pattern
- Admin CRUD APIs for providers, projects, keys, models, usage
- Provider health checks

Tech stack:
- FastAPI with async SQLAlchemy 2.0
- SQLite with aiosqlite
- bcrypt for API key hashing, AES-256 for provider key encryption
- Docker containerization

Tests: 18 passing integration tests for admin API endpoints

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
30 lines
517 B
Plaintext
# Application
APP_NAME=LLM Gateway
DEBUG=false
LOG_LEVEL=INFO

# Server
HOST=0.0.0.0
PORT=8000

# Database
DATABASE_URL=sqlite:///data/gateway.db

# Security
# Generate with: python -c "import secrets; print(secrets.token_hex(32))"
MASTER_KEY=your-master-key-here-at-least-32-characters

# Rate Limiting
RATE_LIMIT_WINDOW_SECONDS=60
# GLOBAL_RPM_LIMIT=1000
# GLOBAL_TPM_LIMIT=1000000

# Retry
MAX_RETRIES=3
RETRY_INITIAL_DELAY=1.0
RETRY_MAX_DELAY=30.0

# Health Check
HEALTH_CHECK_INTERVAL=30
HEALTH_CHECK_TIMEOUT=10
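
The retry variables above suggest a capped backoff schedule. The following is a minimal sketch of how they might be read and turned into per-attempt delays, not the gateway's actual code. The names `RetrySettings`, `load_retry_settings`, and `backoff_delays` are hypothetical, and the doubling factor of 2.0 is an assumption: the file only defines the retry count, the initial delay, and the maximum delay.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrySettings:
    """Hypothetical container for the three retry variables in this file."""
    max_retries: int
    initial_delay: float  # seconds
    max_delay: float      # seconds


def load_retry_settings(env=None):
    """Read the retry variables, falling back to the values shown in this file."""
    env = os.environ if env is None else env
    return RetrySettings(
        max_retries=int(env.get("MAX_RETRIES", "3")),
        initial_delay=float(env.get("RETRY_INITIAL_DELAY", "1.0")),
        max_delay=float(env.get("RETRY_MAX_DELAY", "30.0")),
    )


def backoff_delays(settings, factor=2.0):
    """Return the delay before each retry, capped at max_delay.

    The exponential growth and the factor of 2.0 are assumptions; the
    gateway may use a different schedule or add jitter.
    """
    return [
        min(settings.initial_delay * factor ** attempt, settings.max_delay)
        for attempt in range(settings.max_retries)
    ]


if __name__ == "__main__":
    # Print the schedule derived from the current environment.
    print(backoff_delays(load_retry_settings()))
```

Under these assumptions, raising `MAX_RETRIES` beyond five has no effect on the longest wait, because `RETRY_MAX_DELAY` caps every later delay at 30 seconds.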