
Implement Centralized LLM API Key Pool Manager Service #6

@ruvasqm

Description:
This issue covers the development of a standalone backend service responsible for managing a pool of user-donated LLM API keys. This service will act as an intermediary between our Discord bot and various LLM providers (e.g., OpenAI, Anthropic). Its primary functions include securely storing keys, selecting an available key for any incoming request (regardless of the requesting user), tracking usage per key, enforcing daily/monthly limits per key, and proxying requests to the actual LLM APIs.

Key Design Principle: The key pool is not user-segmented for request fulfillment. Any valid request from the bot will consume tokens from an available key in the pool, chosen based on its current usage and limits. Users' keys are primarily linked to them for auditing their contributions and for setting their individual key's limits, not for exclusive usage by that user.

Acceptance Criteria:

  • The service can be deployed and run on the EC2 instance.
  • The service exposes a secure HTTP/HTTPS endpoint for the Discord bot to make LLM generation requests.
  • The service can securely store and retrieve encrypted LLM API keys in the SQLite database.
  • The service can dynamically select an active, non-exhausted key from the pool for each LLM request.
  • The service accurately tracks token usage (prompt + completion) for each individual key in the pool.
  • The service enforces configurable daily and monthly token limits for each key.
  • When a key reaches its limit, it is temporarily marked as 'exhausted' and is not selected for further requests until its limit resets.
  • The service gracefully handles cases where no keys are available (e.g., all exhausted, no keys added).
  • The service proxies requests to the actual LLM APIs (e.g., OpenAI, Anthropic) and returns their responses to the bot.
  • All LLM API calls and key management events are logged for auditing purposes (see Issue Add Repo Commands #2).
  • The service includes an endpoint for the bot to add/remove/update user-donated keys.

Tasks:

  1. Database Schema Definition (SQLite):

    • Define llm_api_keys table:
      • id (PK)
      • encrypted_key (BLOB/TEXT - stores the encrypted API key)
      • owner_discord_id (TEXT - Discord User ID of the key donor, for audit/attribution)
      • daily_limit_tokens (INTEGER - configurable daily limit for this key)
      • monthly_limit_tokens (INTEGER - configurable monthly limit for this key)
      • current_daily_usage_tokens (INTEGER - tracks usage for the current day, reset daily)
      • current_monthly_usage_tokens (INTEGER - tracks usage for the current month, reset monthly)
      • last_reset_day (TEXT - YYYY-MM-DD, for daily reset logic)
      • last_reset_month (TEXT - YYYY-MM, for monthly reset logic)
      • status (TEXT - e.g., 'active', 'paused_daily_limit', 'paused_monthly_limit', 'invalid', 'revoked')
      • added_at (DATETIME)
      • last_used_at (DATETIME)
    • (Note: key_usage_logs will be handled by Issue Add Repo Commands #2, but this service will write to it.)
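The table sketched above might be expressed as the following DDL (a sketch; exact types and defaults are open to change, and SQLite's type affinity rules apply):

```python
import sqlite3

# Hypothetical DDL matching the llm_api_keys columns listed above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS llm_api_keys (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    encrypted_key BLOB NOT NULL,
    owner_discord_id TEXT NOT NULL,
    daily_limit_tokens INTEGER NOT NULL,
    monthly_limit_tokens INTEGER NOT NULL,
    current_daily_usage_tokens INTEGER NOT NULL DEFAULT 0,
    current_monthly_usage_tokens INTEGER NOT NULL DEFAULT 0,
    last_reset_day TEXT,            -- YYYY-MM-DD
    last_reset_month TEXT,          -- YYYY-MM
    status TEXT NOT NULL DEFAULT 'active',
    added_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    last_used_at DATETIME
);
"""

def init_db(path: str = "keypool.db") -> sqlite3.Connection:
    """Open the database and create the table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```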
  2. Secure Key Storage & Encryption:

    • Implement a robust symmetric encryption scheme for API keys. Given SQLite, this will be application-level encryption.
    • Crucial: Securely manage the master encryption key. This key should not be stored in the database. Retrieve it from AWS SSM Parameter Store (as a Secure String) at application startup.
    • Implement functions for encrypting keys before storing them and decrypting them only in memory when needed for an LLM call.
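One possible shape for this, using Fernet from the `cryptography` package as the symmetric scheme (the SSM parameter name and helper names below are illustrative, not decided):

```python
from cryptography.fernet import Fernet

# In production the master key would be fetched once at startup, e.g.:
#   import boto3
#   ssm = boto3.client("ssm")
#   master_key = ssm.get_parameter(
#       Name="/keypool/master-key",  # hypothetical parameter name
#       WithDecryption=True,
#   )["Parameter"]["Value"].encode()
# It must never be written to the SQLite database.

def encrypt_api_key(master_key: bytes, api_key: str) -> bytes:
    """Encrypt an API key for storage in the encrypted_key column."""
    return Fernet(master_key).encrypt(api_key.encode())

def decrypt_api_key(master_key: bytes, blob: bytes) -> str:
    """Decrypt only in memory, immediately before making an LLM call."""
    return Fernet(master_key).decrypt(blob).decode()
```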
  3. Key Selection Logic:

    • Develop an algorithm to select the "best" available key for a given request.
    • Prioritize active keys.
    • Consider strategies like:
      • Least recently used among active keys below limits.
      • Round-robin among active keys below limits.
      • Keys with the most remaining quota.
    • Ensure atomic updates to current_daily_usage_tokens and current_monthly_usage_tokens during key selection and usage.
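As a sketch of one of the strategies above ("most remaining quota"), selection and usage reservation can be done in a single SQLite transaction so the counters stay consistent:

```python
import sqlite3

def select_key(conn: sqlite3.Connection, estimated_tokens: int):
    """Pick the active key with the most remaining daily quota and reserve
    the estimated tokens in the same transaction (illustrative strategy)."""
    with conn:  # selection + reservation commit or roll back together
        row = conn.execute(
            """
            SELECT id, encrypted_key
            FROM llm_api_keys
            WHERE status = 'active'
              AND current_daily_usage_tokens + ? <= daily_limit_tokens
              AND current_monthly_usage_tokens + ? <= monthly_limit_tokens
            ORDER BY (daily_limit_tokens - current_daily_usage_tokens) DESC
            LIMIT 1
            """,
            (estimated_tokens, estimated_tokens),
        ).fetchone()
        if row is None:
            return None  # no keys available / all exhausted
        conn.execute(
            """
            UPDATE llm_api_keys
            SET current_daily_usage_tokens = current_daily_usage_tokens + ?,
                current_monthly_usage_tokens = current_monthly_usage_tokens + ?,
                last_used_at = CURRENT_TIMESTAMP
            WHERE id = ?
            """,
            (estimated_tokens, estimated_tokens, row[0]),
        )
        return row
```

If multiple worker processes share the database, the transaction would want `BEGIN IMMEDIATE` semantics to avoid two workers claiming the same quota.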
  4. LLM API Proxy Implementation:

    • Create a generic interface for interacting with different LLM providers (e.g., OpenAIClient, AnthropicClient).
    • Handle request forwarding, response parsing, and token counting (prompt and completion tokens).
    • Implement robust error handling for LLM API responses (rate limits, invalid keys, server errors, etc.).
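The generic provider interface might look like the following (class and field names are placeholders; each concrete client maps the provider's usage metadata into a common result type):

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class LLMResult:
    """Normalized response shared across providers."""
    text: str
    prompt_tokens: int
    completion_tokens: int

class LLMClient(ABC):
    """Interface every provider client implements."""
    @abstractmethod
    def generate(self, api_key: str, prompt: str, **params) -> LLMResult: ...

class OpenAIClient(LLMClient):
    def generate(self, api_key: str, prompt: str, **params) -> LLMResult:
        # A real implementation would POST to the provider's API and map
        # its usage fields (prompt/completion token counts) into LLMResult.
        raise NotImplementedError
```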
  5. Usage Tracking & Quota Enforcement:

    • After each successful LLM call, update the current_daily_usage_tokens and current_monthly_usage_tokens for the used key.
    • Implement daily and monthly reset logic for current_daily_usage_tokens and current_monthly_usage_tokens (e.g., a scheduled task that runs at midnight UTC and on the 1st of the month UTC).
    • Update status field of keys when limits are hit or reset.
  6. API Endpoints for Bot:

    • /api/llm/generate: Accepts LLM request payload, returns LLM response.
    • /api/llm/key/add: Accepts owner_discord_id and the API key string; the service encrypts the key before storing it in the pool. (Consider a secure web form for key submission rather than a direct Discord command, so the key never appears in chat.)
    • /api/llm/key/remove: Accepts owner_discord_id, marks key as revoked.
    • /api/llm/key/update_limits: Allows changing limits for a specific owner_discord_id's key.
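Independent of the web framework chosen, the `/api/llm/generate` payload could be validated with a small schema object like this (field names and allowed providers are placeholders):

```python
from dataclasses import dataclass

@dataclass
class GenerateRequest:
    """Payload the bot POSTs to /api/llm/generate (a sketch)."""
    provider: str          # e.g. "openai" or "anthropic"
    model: str
    prompt: str
    max_tokens: int = 512

    def validate(self) -> None:
        if self.provider not in ("openai", "anthropic"):
            raise ValueError(f"unsupported provider: {self.provider}")
        if not self.prompt:
            raise ValueError("prompt must not be empty")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
```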
  7. Error Handling & Fallbacks:

    • Implement robust error handling for database operations, encryption/decryption, and external API calls.
    • Define behavior when no keys are available or all keys are exhausted.
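The no-keys case might be surfaced as a dedicated error type so the bot can show the user a clear message instead of a generic failure (names below are illustrative):

```python
class NoKeysAvailableError(Exception):
    """Raised when the pool has no active key with remaining quota."""

def handle_generate(select_key, proxy_call):
    """Top-level flow sketch: select_key returns a pooled key or None;
    proxy_call forwards the request to the LLM provider using that key."""
    key = select_key()
    if key is None:
        raise NoKeysAvailableError(
            "All pooled keys are exhausted or none have been donated yet; "
            "please try again after the next quota reset."
        )
    return proxy_call(key)
```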

Dependencies:

Labels:
enhancement (New feature or request)
