This repository was archived by the owner on Jul 22, 2025. It is now read-only.

Conversation

@SamSaffron
Member

@SamSaffron SamSaffron commented Jan 2, 2025

Adds a comprehensive quota management system for LLM models that allows:

  • Setting per-group token and usage limits with configurable durations
  • Tracking and enforcing token/usage limits across user groups
  • Quota reset periods (hourly, daily, weekly, or custom)
  • Admin UI for managing quotas with real-time updates
  • Full test coverage for quota models and controllers

This system provides granular control over LLM API usage by allowing admins
to define limits on both total tokens and number of requests per group.
Supports multiple concurrent quotas per model and automatically handles
quota resets.
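The enforcement and reset behavior described above can be sketched roughly as follows. This is a minimal illustration only, with hypothetical class and method names; the actual models and API live in the discourse-ai plugin.

```ruby
# Minimal sketch of per-group quota enforcement. A quota carries a token
# limit, a usage (request) limit, and a reset window in seconds.
# Names here are hypothetical, not the plugin's actual API.
class LlmQuotaTracker
  Quota = Struct.new(:max_tokens, :max_usages, :duration_seconds, keyword_init: true)

  def initialize(quota)
    @quota = quota
    reset!(Time.now)
  end

  # Record a request; returns false (rejecting it) if either limit is hit.
  # The window resets automatically once duration_seconds has elapsed.
  def record!(tokens:, now: Time.now)
    reset!(now) if now - @window_started_at >= @quota.duration_seconds
    return false if @used_tokens + tokens > @quota.max_tokens
    return false if @used_usages + 1 > @quota.max_usages

    @used_tokens += tokens
    @used_usages += 1
    true
  end

  private

  def reset!(now)
    @window_started_at = now
    @used_tokens = 0
    @used_usages = 0
  end
end
```

A daily quota would use `duration_seconds: 86_400`; supporting multiple concurrent quotas per model would mean running every matching tracker and rejecting the request if any of them declines it.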

[screenshots of the quota admin UI]

@SamSaffron SamSaffron marked this pull request as ready for review January 10, 2025 05:56
@SaifMurtaza

Are we able to provide guidelines for setting the max tokens? Some tooltips would be helpful here, i.e. for the non-technical admin, concepts such as tokens might be missed. They are probably left wondering "How many words does this actually mean?" or "How many times can I message the persona?"
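For a tooltip like the one suggested, a common rule of thumb is that English text averages roughly 0.75 words per token, though the real ratio depends on the tokenizer, the model, and the language. A hedged helper for such an estimate might look like:

```ruby
# Rough heuristic only: English averages about 0.75 words per token.
# The actual ratio varies by tokenizer and language, so any tooltip
# using this should say "approximately".
WORDS_PER_TOKEN = 0.75

def approx_words(tokens)
  (tokens * WORDS_PER_TOKEN).round
end
```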

Member

@keegangeorge keegangeorge left a comment


LGTM overall, just a few suggestions here and there

SamSaffron and others added 24 commits January 13, 2025 17:53
This introduces a new feature for per group quotas
we need to route through llm model for simplicity
Co-authored-by: Keegan George <[email protected]>
@SamSaffron
Member Author

OK @keegangeorge I think it is all addressed! thanks for the review

@SamSaffron
Member Author

thanks @SaifMurtaza, will add more clarity. I am on the fence about implementing an optional "absolute quota" that is shared across the group... then you could guarantee you would never spend more than $N on AI a day, even if you have very large groups.

@davidtaylorhq
Member

As mentioned on dev, the use of { i18n } from "discourse-i18n"; should be reverted until after the next core version bump.

Member

@keegangeorge keegangeorge left a comment


LGTM! Just two small fixes needed before merging: the linting issue and the .discourse-compatibility update 🚀

import { getOwner } from "@ember/owner";
import { service } from "@ember/service";
import I18n from "discourse-i18n";
import { i18n } from "discourse-i18n";
Member


See internal: /t/124366/17. We should pin discourse-ai for core < beta4 in .discourse-compatibility

@SamSaffron SamSaffron merged commit d07cf51 into main Jan 14, 2025
6 checks passed
@SamSaffron SamSaffron deleted the quotas2 branch January 14, 2025 04:54
