Commit 014a852

Info about quotas
1 parent 763febd commit 014a852

1 file changed

+18
-1
lines changed

src/quota/__init__.py

Lines changed: 18 additions & 1 deletion
@@ -1 +1,18 @@
-"""Quota management."""
+"""Quota management.
+
+Tokens and token quota limits
+
+Tokens are small chunks of text, which can be as small as one character or as
+large as one word. Tokens are the units of measurement used to quantify the
+amount of text that the service sends to, or receives from, a large language
+model (LLM). Every interaction with the service and the LLM is counted in
+tokens.
+
+LLM providers typically charge for their services using a token-based pricing model.
+
+Token quota limits define the number of tokens that can be used in a certain
+timeframe. Implementing token quota limits helps control costs, encourage more
+efficient use of queries, and regulate demand on the system. In a multi-user
+configuration, token quota limits help provide equal access to all users,
+ensuring everyone has an opportunity to submit queries.
+"""
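
The docstring describes a token quota limit as a cap on the number of tokens that can be used in a certain timeframe, applied per user in a multi-user configuration. The following is a minimal sketch of that idea, not the implementation in this repository. The class name `TokenQuota`, its fields `limit` and `window_seconds`, and the fixed-window reset policy are all assumptions made for illustration; the commit itself only adds documentation.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class TokenQuota:
    """Hypothetical per-user token quota over a fixed time window."""

    limit: int             # assumed: max tokens each user may spend per window
    window_seconds: float  # assumed: length of the quota window in seconds
    # Maps user -> (window start time, tokens used in that window).
    _usage: Dict[str, Tuple[float, int]] = field(default_factory=dict, init=False)

    def _current(self, user: str, now: float) -> Tuple[float, int]:
        """Return the user's (window start, tokens used), resetting if expired."""
        start, used = self._usage.get(user, (now, 0))
        if now - start >= self.window_seconds:
            return now, 0  # the previous window has elapsed; start a new one
        return start, used

    def available(self, user: str, now: Optional[float] = None) -> int:
        """Tokens the user can still spend in the current window."""
        now = time.monotonic() if now is None else now
        _, used = self._current(user, now)
        return self.limit - used

    def consume(self, user: str, tokens: int, now: Optional[float] = None) -> bool:
        """Record usage if it fits within the quota; return False otherwise."""
        now = time.monotonic() if now is None else now
        start, used = self._current(user, now)
        if used + tokens > self.limit:
            return False  # request would exceed the user's quota
        self._usage[user] = (start, used + tokens)
        return True
```

Because usage is tracked per user, one user exhausting their quota does not reduce what another user can spend, which matches the equal-access goal stated in the docstring. A fixed window is only one possible policy; a sliding window or token bucket would smooth out bursts at window boundaries.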
