python, token, openai-api, langchain, qdrant

Calculating the price of tokens in OpenAI API calls


I'm trying to price the tokens used in a call to the OpenAI API. I have a txt file with plain text that was uploaded to Qdrant. When I ask the following question:

Who is Michael Jordan?

and use the get_openai_callback function to track the number of tokens and the cost of the operation, one of the values in the output doesn't make sense to me.

Tokens Used: 85
    Prompt Tokens: 68
    Completion Tokens: 17
Successful Requests: 1
Total Cost (USD): $0.00013600000000000003
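
For reference, the callback is used roughly like this (a sketch: `qdrant` stands for the vector store already loaded with the txt file, and the exact chain setup is assumed, not shown here):

from langchain.callbacks import get_openai_callback
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI

# `qdrant` stands for the Qdrant vector store already built from the txt file
qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(model_name='gpt-3.5-turbo-instruct'),
    retriever=qdrant.as_retriever(),
)

with get_openai_callback() as cb:
    answer = qa_chain.run('Who is Michael Jordan?')

print(cb)  # prints the Tokens Used / Prompt / Completion / Cost summary above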

Why does the Prompt Tokens value differ from the input? The number of tokens in the input text (which is what I understood Prompt Tokens to be) is:

import tiktoken

query = 'Who is Michael Jordan'

# Count the tokens in the raw question text
encoding = tiktoken.encoding_for_model('gpt-3.5-turbo-instruct')
print(f"Tokens: {len(encoding.encode(query))}")

4

but the callback reports 68. I considered the idea that Prompt Tokens might be the sum of the base tokens (the txt file) and the question tokens, but the math doesn't fit.

Number of tokens in the txt file: 17

txt file: 'Michael Jeffrey Jordan is an American businessman and former basketball player who played as a shooting guard'

query + file tokens: 21 (4 + 17)
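
Reproducing that sum with tiktoken (the counts in the comments are the ones measured above):

import tiktoken

encoding = tiktoken.encoding_for_model('gpt-3.5-turbo-instruct')

query = 'Who is Michael Jordan'
file_text = ('Michael Jeffrey Jordan is an American businessman and former '
             'basketball player who played as a shooting guard')

query_tokens = len(encoding.encode(query))     # 4
file_tokens = len(encoding.encode(file_text))  # 17
print(query_tokens + file_tokens)              # 21 -- still far from 68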

Could anyone help me understand the pricing calculation?

I tried searching OpenAI's own documentation, GitHub, and other forums, but the information is hard to find, if it's public at all. I want to understand whether I'm missing something or whether this is a calculation users don't have access to.

UPDATE: For future readers with the same question:

import langchain

# Print every request LangChain sends, including the full prompt string
# (newer versions also expose this as langchain.globals.set_debug(True))
langchain.debug = True

Then run your chain inside get_openai_callback() and the entire log appears on the screen. The value of the "prompts" key is a list containing a single string: the instructions telling the model how to answer, with the retrieved context and the question embedded in it. The token count of that string is the value that appears as Prompt Tokens.
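
To verify, count the tokens of that logged string directly (the variable below is a placeholder for whatever your own log shows):

import tiktoken

# Paste the string from the "prompts" key of the debug log here
prompt_from_debug_log = '...'

encoding = tiktoken.encoding_for_model('gpt-3.5-turbo-instruct')
print(f"Prompt tokens: {len(encoding.encode(prompt_from_debug_log))}")
# This count should match the Prompt Tokens value from get_openai_callback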


Solution

  • Prompt Tokens includes your question and any context provided, plus the instructions and formatting the chain adds before the request is sent. Completion Tokens counts only the tokens generated in the response.

    In your example:

    Visible query: Who is Michael Jordan? (4 tokens)

    Text from file: Michael Jeffrey Jordan is an American businessman and former basketball player who played as a shooting guard (17 tokens)

    Expected: 4 + 17 = 21 tokens.

    However, you see 68 prompt tokens because the request also carries the prompt template, role markers, and other formatting added before it reaches the API. To see the exact token count, log the full request payload (as in the UPDATE above) or count the final prompt string with OpenAI's tiktoken. This extra context explains why the prompt token count is higher than expected.
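
    A rough reconstruction, assuming the chain uses LangChain's default "stuff" QA prompt template (check your own debug log for the exact string your chain sends):

    import tiktoken

    # Assumption: LangChain's default "stuff" QA template; verify against the
    # debug log from the UPDATE above
    template = (
        "Use the following pieces of context to answer the question at the end. "
        "If you don't know the answer, just say that you don't know, "
        "don't try to make up an answer.\n\n"
        "{context}\n\n"
        "Question: {question}\n"
        "Helpful Answer:"
    )

    context = ("Michael Jeffrey Jordan is an American businessman and former "
               "basketball player who played as a shooting guard")
    question = "Who is Michael Jordan?"
    full_prompt = template.format(context=context, question=question)

    encoding = tiktoken.encoding_for_model("gpt-3.5-turbo-instruct")
    print(len(encoding.encode(full_prompt)))  # lands near the 68 reported

    # Cost, at the gpt-3.5-turbo-instruct rates in effect at the time
    # ($0.0015 per 1K prompt tokens, $0.002 per 1K completion tokens):
    cost = 68 / 1000 * 0.0015 + 17 / 1000 * 0.002
    print(cost)  # ~0.000136; the long trailing digits are a float artifact

    The pricing itself is just (prompt tokens × prompt rate) + (completion tokens × completion rate); the odd $0.00013600000000000003 is ordinary floating-point rounding, not a hidden fee.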