Commit 7b14a32

Merge pull request #794 from tisnik/lcore-510-docstring-about-quota-limiter
LCORE-510: docstring about quota limiter
2 parents 3318cde + 014a852 commit 7b14a32

2 files changed: +49 -2 lines changed

src/quota/__init__.py

Lines changed: 18 additions & 1 deletion
@@ -1 +1,18 @@
-"""Quota management."""
+"""Quota management.
+
+Tokens and token quota limits
+
+Tokens are small chunks of text, which can be as small as one character or as
+large as one word. Tokens are the units of measurement used to quantify the
+amount of text that the service sends to, or receives from, a large language
+model (LLM). Every interaction with the service and the LLM is counted in
+tokens.
+
+LLM providers typically charge for their services using a token-based pricing model.
+
+Token quota limits define the number of tokens that can be used in a certain
+timeframe. Implementing token quota limits helps control costs, encourage more
+efficient use of queries, and regulate demand on the system. In a multi-user
+configuration, token quota limits help provide equal access to all users,
+ensuring everyone has an opportunity to submit queries.
+"""

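The token accounting this docstring describes can be sketched in a few lines. This is only an illustration under stated assumptions: `TokenBudget`, `QuotaExceededError`, and `consume` are hypothetical names and do not come from the commit, and the budget is kept in memory rather than in whatever storage the real quota limiters use.

```python
from dataclasses import dataclass


class QuotaExceededError(Exception):
    """Raised when a request needs more tokens than remain (hypothetical name)."""


@dataclass
class TokenBudget:
    """In-memory token budget for one subject, e.g. a user (hypothetical class)."""

    available: int

    def consume(self, input_tokens: int, output_tokens: int) -> int:
        """Deduct tokens sent to and received from the LLM; return what is left."""
        used = input_tokens + output_tokens
        if used > self.available:
            # Refuse the request and leave the budget untouched.
            raise QuotaExceededError(
                f"needs {used} tokens, only {self.available} available"
            )
        self.available -= used
        return self.available


budget = TokenBudget(available=1_000)
remaining = budget.consume(input_tokens=300, output_tokens=450)
print(remaining)
```

Counting both the tokens sent and the tokens received follows the docstring's statement that every interaction with the service and the LLM is counted in tokens.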
src/quota/quota_limiter.py

Lines changed: 31 additions & 1 deletion
@@ -1,4 +1,34 @@
-"""Abstract class that is the parent for all quota limiter implementations."""
+"""Abstract class that is the parent for all quota limiter implementations.
+
+It is possible to limit quota usage per user or per service or services (that
+typically run in one cluster). Each limit is configured as a separate _quota
+limiter_. It can be of type `user_limiter` or `cluster_limiter` (which is a name
+that makes sense in an OpenShift deployment). There are three configuration
+options for each limiter:
+
+1. `period` is specified in a human-readable form, see
+https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
+for all possible options. When the end of the period is reached, quota is reset
+or increased
+1. `initial_quota` is set at the beginning of the period
+1. `quota_increase` (if specified) is used to increase quota when the end of the period is reached
+
+There are two basic use cases:
+
+1. When quota needs to be reset to a specific value periodically (for example on
+a weekly or monthly basis), set `initial_quota` to the required value
+1. When quota needs to be increased by a specific value periodically (for example
+on a daily basis), specify `quota_increase`
+
+Technically it is possible to specify both `initial_quota` and
+`quota_increase`. It means that at the end of the time period the quota will be
+*reset* to `initial_quota + quota_increase`.
+
+Please note that any number of quota limiters can be configured. For example,
+two user quota limiters can be set to:
+- increase quota by 100,000 tokens each day
+- reset quota to 10,000,000 tokens each month
+"""
 
 from abc import ABC, abstractmethod
 
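The end-of-period rules stated in the `quota_limiter.py` docstring can be written out as a small function. This is a minimal sketch, not the project's implementation: `quota_at_period_end` is a hypothetical helper, the dictionary layout of `limiters` is an assumed representation of the configuration, and leaving the quota unchanged when neither option is set is an assumption the docstring does not cover. Only the option names `period`, `initial_quota`, `quota_increase`, and the type `user_limiter` come from the docstring.

```python
from typing import Optional


def quota_at_period_end(
    current: int,
    initial_quota: Optional[int] = None,
    quota_increase: Optional[int] = None,
) -> int:
    """Return the quota that applies once a limiter's period has elapsed."""
    if initial_quota is not None and quota_increase is not None:
        # Both set: the docstring says the quota is reset to their sum.
        return initial_quota + quota_increase
    if initial_quota is not None:
        # Reset use case: unused quota does not carry over.
        return initial_quota
    if quota_increase is not None:
        # Increase use case: the increment is added to what is left.
        return current + quota_increase
    # Assumption: with neither option set, nothing changes.
    return current


# The two user limiters from the docstring's example (assumed layout).
limiters = [
    {"type": "user_limiter", "period": "1 day", "quota_increase": 100_000},
    {"type": "user_limiter", "period": "1 month", "initial_quota": 10_000_000},
]

daily = quota_at_period_end(2_500, quota_increase=limiters[0]["quota_increase"])
monthly = quota_at_period_end(2_500, initial_quota=limiters[1]["initial_quota"])
both = quota_at_period_end(2_500, initial_quota=1_000, quota_increase=500)
print(daily, monthly, both)
```

The `period` strings use PostgreSQL interval input syntax, as the docstring's link suggests; parsing them and scheduling the update are left out of this sketch.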
