WIP: Add count_tokens for openAI models #3447
Conversation
```python
    return event_loop


def num_tokens_from_messages(
```
This is OpenAI specific so it should live in models/openai.py
```python
def num_tokens_from_messages(
    messages: list[ChatCompletionMessageParam] | list[ResponseInputItemParam],
    model: OpenAIModelName = 'gpt-4o-mini-2024-07-18',
```
We don't need a default value
```python
    else:
        raise NotImplementedError(
            f"""num_tokens_from_messages() is not implemented for model {model}."""
        )  # TODO: How to handle other models?
```
Are you able to reverse engineer the right formula for gpt-5?
As long as we document that this is a best-effort calculation and may not be accurate down to the exact token, we can have one branch of logic for "everything before gpt-5" and one for everything newer. If future models have different rules, we can update the logic then.
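A minimal sketch of how that branching could look; the helper name `_per_message_overheads` is made up, and the gpt-5 values are placeholders rather than reverse-engineered numbers:

```python
# (tokens_per_message, tokens_per_name) for pre-gpt-5 models, per the OpenAI cookbook.
_PRE_GPT5_OVERHEADS = (3, 1)
# Placeholder for gpt-5 and newer until the real overheads are reverse engineered;
# the estimate is documented as best effort, so being off by a token is acceptable.
_GPT5_OVERHEADS = (3, 1)


def _per_message_overheads(model: str) -> tuple[int, int]:
    """Best-effort per-message token overheads for an OpenAI model."""
    return _GPT5_OVERHEADS if model.startswith('gpt-5') else _PRE_GPT5_OVERHEADS
```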
```python
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        print('Warning: model not found. Using o200k_base encoding.')  # TODO: How to handle warnings?
```
No warnings please, let's just make a best effort
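For example, the fallback could simply be silent (the helper name `_encoding_for` is made up here):

```python
import tiktoken


def _encoding_for(model: str) -> tiktoken.Encoding:
    """Pick the tokenizer for a model, silently falling back to o200k_base."""
    try:
        return tiktoken.encoding_for_model(model)
    except KeyError:
        # Unknown or non-OpenAI model name: best effort, no warning printed.
        return tiktoken.get_encoding('o200k_base')
```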
```python
    ) -> usage.RequestUsage:
        """Make a request to the model for counting tokens."""
        openai_messages = await self._map_messages(messages, model_request_parameters)
        token_count = num_tokens_from_messages(openai_messages, self.model_name)
```
OpenAIChatModel and OpenAIResponsesModel can also be used with non-OpenAI models, so we should only use tiktoken if we're sure we're using an OpenAI model. We could check self.system == 'openai', but then it won't work right with OpenAI models via Azure OpenAI, OpenRouter, etc.
So the best option may be to add a new field on OpenAIModelProfile to specify whether a given model is supported by tiktoken, which we can then enable on openai_model_profile and leave false by default. That's a bit trickier to implement though, so feel free to start with the other changes I requested + the self.system check, and we can always add the model profile check later.
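A rough sketch of the interim `self.system` check; the method parameters and the `RequestUsage(input_tokens=...)` construction are assumptions, not the final API:

```python
# Inside OpenAIChatModel in models/openai.py (sketch, not the final implementation):
async def count_tokens(
    self,
    messages: list[ModelMessage],
    model_request_parameters: ModelRequestParameters,
) -> usage.RequestUsage:
    """Best-effort local token count using tiktoken."""
    # Azure OpenAI, OpenRouter, etc. reuse this class for non-OpenAI models,
    # so only trust tiktoken when the provider system is actually OpenAI.
    # A later OpenAIModelProfile flag could replace this check.
    if self.system != 'openai':
        raise NotImplementedError(f'count_tokens is not supported for system {self.system!r}')
    openai_messages = await self._map_messages(messages, model_request_parameters)
    token_count = num_tokens_from_messages(openai_messages, self.model_name)
    return usage.RequestUsage(input_tokens=token_count)
```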
Work for #3430
Took the method 1:1 from the OpenAI cookbook. However, the example only covers certain models. How should other models be handled? A quick test with gpt-5 showed a token count that differed from this method.
```
gpt-5
Warning: gpt-5 may update over time. Returning num tokens assuming gpt-5-2025-08-07.
110 prompt tokens counted by num_tokens_from_messages().
109 prompt tokens counted by the OpenAI API.
```
Test script
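For reference, the cookbook-style helper being ported looks roughly like this (a paraphrased sketch, not the exact diff):

```python
import tiktoken


def num_tokens_from_messages(messages: list[dict[str, str]], model: str) -> int:
    """Estimate prompt tokens the way the OpenAI cookbook does (best effort)."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        encoding = tiktoken.get_encoding('o200k_base')
    tokens_per_message = 3  # wrapper tokens around each message
    tokens_per_name = 1  # extra token when a 'name' field is present
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == 'name':
                num_tokens += tokens_per_name
    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>
    return num_tokens
```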