Tags: openai-api, chatgpt-api, completion, gpt-4, few-shot-learning

How to format a few-shot prompt for GPT4 Chat Completion API?


I'm trying to use GPT-4's Chat Completion API with the following prompt:

For each situation, describe the intent. Examples:


Situation 1: Devin gets the newspaper.

The intent of Situation 1: Devin intends to read the newspaper.

Situation 2: Jamie works all night.

The intent of Situation 2: Jamie intends to meet a deadline.

Situation 3: Sydney destroys Ryan.

The intent of Situation 3: Sydney intends to punish Ryan.

Situation 4: Lindsay clears her mind.

The intent of Situation 4: Lindsay intends to be ready for a new task.

Situation 5: Rowan wants to start a business.

The intent of Situation 5: Rowan intends to be self sufficient.

Situation 6: Lee ensures Ali’s safety.

The intent of Situation 6: Lee intends to be helpful.

Situation 7: Riley buys lottery tickets.

The intent of Situation 7: Riley intends to become rich.

Situation 8: Alex makes Chris wait.

The intent of Situation 8: Alex intends

As you can see, I want the model to complete the sentence ending in "Alex intends". This prompt was straightforward with GPT-3's Completion API, where you only had to send a single prompt containing all the few-shot examples.

However, I don't know the best practice for doing the same kind of prompting with GPT-4's ChatCompletion API. I've checked out https://github.com/openai/openai-cookbook/blob/main/examples/How_to_format_inputs_to_ChatGPT_models.ipynb, where they provide an example of few-shot prompting, but my prompt is not "conversational", as you can see.

I'm not even sure whether the "name" parameter affects the quality of the results. Does anybody have an answer to this?

What I've thought of so far is to format my prompt like this, following the instructions from the link above:

messages=[
        {"role": "system", "content": "For each situation, describe the intent. Examples:"},
        {"role": "system", "name":"Situation 1", "content": "Devin gets the newspaper."},
        {"role": "system", "name": "The intent of Situation 1", "content": "Devin intends to read the newspaper."},
        {"role": "system", "name":"Situation 2", "content": "Jamie works all night."},
        {"role": "system", "name": "The intent of Situation 2", "content": "Jamie intends to meet a deadline."},

...
        {"role": "system", "name":"Situation 8", "content": "Alex makes Chris wait."},
        {"role": "user", "name": "The intent of Situation 8", "content": ""},
    ]

Is this a proper way to do few-shot prompting with the GPT-4 ChatCompletion API? Please let me know if you have a better solution, or an explanation of why certain parts of my prompt need work.

So far, I've simply put the original prompt into a single user message, just as with GPT-3's Completion API:

messages=[
    {"role": "user", "content": """For each situation, describe the intent. Examples:

Situation 1: Devin gets the newspaper.
The intent of Situation 1: Devin intends to read the newspaper.

Situation 2: Jamie works all night.
The intent of Situation 2: Jamie intends to meet a deadline.

Situation 3: Sydney destroys Ryan.
The intent of Situation 3: Sydney intends to punish Ryan.

Situation 4: Lindsay clears her mind.
The intent of Situation 4: Lindsay intends to be ready for a new task.

Situation 5: Rowan wants to start a business.
The intent of Situation 5: Rowan intends to be self sufficient.

Situation 6: Lee ensures Ali’s safety.
The intent of Situation 6: Lee intends to be helpful.

Situation 7: Riley buys lottery tickets.
The intent of Situation 7: Riley intends to become rich.

Situation 8: Alex makes Chris wait.
The intent of Situation 8: Alex intends"""}
]

It does work, but I was wondering whether I can improve the API's output quality by following a particular practice.


Solution

  • You might want to check out our GPT best practices guide, which covers how to do prompting effectively. The reality is that prompt engineering for tasks like this is much more art than science today. I suggest trying both approaches, but my hunch is that the one with each example as a separate user message / assistant response pair will perform better, based on experiments I have seen.
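As a sketch of that suggested approach (the alternating user/assistant pattern is the one recommended above, not an official recipe; the situations and the "gpt-4" model name are taken from the question), the few-shot examples can be sent as pairs of messages, with the unanswered situation as the final user message:

```python
# Build a few-shot prompt as alternating user/assistant message pairs.
# Each (situation, intent) pair from the original prompt becomes one
# user message and one assistant message.
examples = [
    ("Devin gets the newspaper.", "Devin intends to read the newspaper."),
    ("Jamie works all night.", "Jamie intends to meet a deadline."),
    ("Sydney destroys Ryan.", "Sydney intends to punish Ryan."),
    ("Riley buys lottery tickets.", "Riley intends to become rich."),
]

messages = [
    {"role": "system", "content": "For each situation, describe the intent."}
]
for situation, intent in examples:
    messages.append({"role": "user", "content": f"Situation: {situation}"})
    messages.append({"role": "assistant", "content": f"Intent: {intent}"})

# The situation to be completed goes last, as a user message with no
# assistant reply; the model's response supplies the intent.
messages.append({"role": "user", "content": "Situation: Alex makes Chris wait."})

# Hypothetical call (requires the openai package and an API key):
# response = openai.ChatCompletion.create(model="gpt-4", messages=messages)
# print(response["choices"][0]["message"]["content"])
```

One advantage of this layout is that the assistant messages show the model exactly what a completed turn looks like, so the final response tends to match the "Intent: ..." format of the examples without extra instructions.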