.net azure openai-api chatgpt-api chat-gpt-4

Azure OpenAI and token limit

I want to use GPT model to analyze my data. Data is a suite of records (e.g. 1000 records) with 10 or even more properties. I want to say GPT (or other model):

"please, analyze this data and find and exceptions, extremums etc. Anything, what is different than common"

I use Azure.AI.OpenAI nuget package https://github.com/Azure/azure-sdk-for-net/blob/Azure.AI.OpenAI_1.0.0-beta.12/sdk/openai/Azure.AI.OpenAI/README.md

When I try model "gpt-35-turbo with the following code:

        var chatCompletionsOptions = new ChatCompletionsOptions()
        {
            DeploymentName = "gpt-35-turbo", // Use DeploymentName for "model" with non-Azure clients
            Messages =
            {
                new ChatRequestSystemMessage("You are data specialist"),
                new ChatRequestUserMessage(@"Analyze this data and find exceptions"),
                new ChatRequestUserMessage(stringBuilder.ToString())
            }
        };

        Response<ChatCompletions> aiChatResponse = await _openAIClient.GetChatCompletionsAsync(chatCompletionsOptions);
        ChatResponseMessage responseChatMessage = aiChatResponse.Value.Choices[0].Message;

where stringBuilder has JSONL model with 1000 records and even 2 columns

I get

{
  "error": {
    "message": "This model's maximum context length is 8192 tokens. However, your messages resulted in 17901 tokens. Please reduce the length of the messages.",
    "type": "invalid_request_error",
    "param": "messages",
    "code": "context_length_exceeded"
  }
}

so, as we can see, limitation is small to analyze data via chat

When I try to use model text-embedding-ada-002:

        EmbeddingsOptions embeddingsOptions = new("text-embedding-ada-002", strings);
        Response<Embeddings> responseEmbeddings = await _openAIClient.GetEmbeddingsAsync(embeddingsOptions);

        EmbeddingItem eItem = responseEmbeddings.Value.Data[0];
        ReadOnlyMemory<float> embedding = eItem.Embedding;

but it being executed long time and I cancelled it for cost increasing :) with 10 records it returns only number list...

ADDED #1

e.g. I have list of the people and all of them from Chicago, except 2, which are from other cities. Or most of them has salary approx $100000 per year, but some of them has $10000 (much less) and $100000 (much more, than approx). Or any other different exceptions and deviations, I don't know which, because otherwise I can develop it directly. I want to have ability to analyze all data as model and find anything (probably, not only by one parameter, probably, linked parameters). And, even find relations inside data, between one parameter from another (e.g. salary in New York much more, that city X). There are only examples, main goal - I DON'T KNOW which concrete relations and exceptions, AI should point me it

How to solve my task?

Solution

I'll try to answer questions to the issues you are facing. Hopefully it will give you some ideas about how to proceed.

First, regarding text-embedding-ada-002 model, you cannot use it for chat completions. It is used when you want to create vector representation of the data and feed to a vector database.

Regarding the error you are getting for token length with GPT 3.5 model, it is expected as the max token sizes (also known as context size i.e. the amount of data that can be sent to Azure OpenAI) are fixed for a model and you cannot exceed them. A few things you can do to take care of this error:

Use a model that supports larger context length. For example, you can use gpt-35-turbo-16k which supports 16k context size (double the context size of the model you are currently using) or gpt-4-32k (four time the context size of the model you are currently using). This will allow you to pass larger text to the model for comprehension. However, please note that the cost for GPT-4 models are significantly higher than GPT-3.5 models so please also take that into consideration.
Use prompt patterns. One possible solution would be make use of Map-Reduce kind of pattern where you chunk your data in smaller pieces and send each chunk to LLM with your question. Once you get response for each chunk, send the responses back to LLM to comprehend and come up with a final answer (without knowing details about the business problem you are trying to solve, I am not 100% sure that if this would solve your problem though so you may want to try it out).