OpenAI just announced that GPT-3.5 Turbo, the model behind ChatGPT, is now available for fine-tuning, and everyone is super excited! Oh wait, except for the developers who have to sort through all the details of what has changed and make updates.
And there are a lot of changes.
Fortunately, we’re a step ahead and happy to share our learnings.
Let’s start with that lovely JSONL file, which contains your training data and kicks each fine-tuning job off.
Let’s say we’re fine-tuning a model that tells you the capital of a state or country. Your JSONL file used to look like this, with one record per line:
{"prompt": "Nebraska -> ", "completion": " Lincoln\n\n###\n\n"}
{"prompt": "Colorado -> ", "completion": " Denver\n\n###\n\n"}
{"prompt": "Iceland -> ", "completion": " Reykjavík\n\n###\n\n"}
Well, you don’t need the `->` separator or the `\n\n###\n\n` stop sequence anymore. And most likely, you don’t need to prepend a space to the completion either. But you do need more boilerplate JSON to conform to the new chat-based paradigm:
{"messages": [{"role": "system", "content": ""}, {"role": "user", "content": "Nebraska"}, {"role": "assistant", "content": "Lincoln"}]}
{"messages": [{"role": "system", "content": ""}, {"role": "user", "content": "Colorado"}, {"role": "assistant", "content": "Denver"}]}
{"messages": [{"role": "system", "content": ""}, {"role": "user", "content": "Iceland"}, {"role": "assistant", "content": "Reykjavík"}]}
Now, don’t panic. You can still use your existing dataset here.
Just send your existing dataset’s prompt as the "user" and the completion as the "assistant."
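Here’s a minimal conversion sketch in Python, assuming the legacy format shown above and Python 3.9+ for `str.removesuffix`; the file names are hypothetical:

```python
import json

def convert_line(old_line):
    # Parse a legacy {"prompt": ..., "completion": ...} record.
    record = json.loads(old_line)
    prompt = record["prompt"].removesuffix(" -> ")  # drop the old separator
    # Drop the old stop sequence and the prepended space.
    completion = record["completion"].removesuffix("\n\n###\n\n").strip()
    return json.dumps({"messages": [
        {"role": "system", "content": ""},
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": completion},
    ]}, ensure_ascii=False)  # keep characters like the í in Reykjavík readable

with open("capitals_legacy.jsonl") as src, open("capitals_chat.jsonl", "w") as dst:
    for line in src:
        dst.write(convert_line(line) + "\n")
```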
The `system` role message provides an interesting opportunity to create a hybrid approach between prompt engineering and fine-tuning, where you do a little bit of both. We went ahead and passed an empty string for the system’s message content for now, until we can dig in and explore the possibilities there.
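For example (purely illustrative), you could bake a shared instruction into the system message so the user prompts can stay short:

```json
{"messages": [{"role": "system", "content": "Reply with only the capital city."}, {"role": "user", "content": "Nebraska"}, {"role": "assistant", "content": "Lincoln"}]}
```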
Here’s a fun change: all the fine-tuning endpoints are different if you want to use any of the new models (gpt-3.5-turbo, davinci-002, or babbage-002).
If you want to continue fine-tuning legacy models, even though they will be discontinued on January 4, 2024, you can still use the old endpoints. But that’s not why we’re here.
Make sure to use the new API endpoint URLs from the documentation. In general, they are now called “fine-tuning jobs” instead of “fine-tunes,” and their IDs start with “ftjob-” instead of just “ft-” to tell them apart.
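To make the naming concrete, here’s a sketch using the v0.x openai Python package, where the new endpoint maps to a `FineTuningJob` resource instead of the old `FineTune` one (the job ID below is hypothetical):

```python
import openai

# The new resource lives at /v1/fine_tuning/jobs; the old one was /v1/fine-tunes.
jobs = openai.FineTuningJob.list(limit=10)           # recent fine-tuning jobs
job = openai.FineTuningJob.retrieve("ftjob-abc123")  # note the new "ftjob-" prefix
print(job["status"])
```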
The biggest API changes are with the new fine-tuning job creation endpoint.
Here are a few of the biggest “gotchas” we found:
Even though the file upload endpoints did not change, training files are no longer ready instantly. They need to be processed first, and you will get an `invalid_file_status` error back when it’s not ready yet, with a pretty clear message.
If your workflow tries to create a fine-tuning job immediately after uploading a file, you’ll need to catch this error and retry until the file finishes processing, as in the sketch below.
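Here’s a minimal retry sketch with the v0.x openai Python package; the exact exception class and its `code` attribute are assumptions based on the error response we saw:

```python
import time
import openai

def create_job_when_ready(file_id, model="gpt-3.5-turbo", max_attempts=30):
    # The uploaded training file needs server-side processing, so keep
    # retrying while the API reports invalid_file_status.
    for _ in range(max_attempts):
        try:
            return openai.FineTuningJob.create(training_file=file_id, model=model)
        except openai.error.InvalidRequestError as e:
            if getattr(e, "code", None) != "invalid_file_status":
                raise  # a different problem; don't swallow it
            time.sleep(5)  # file still processing; wait and try again
    raise RuntimeError("Training file never finished processing")
```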
Another error code I hadn’t run into before with fine-tuning is `rate_limit_exceeded`. You’ll get this if you try to run too many fine-tunes at the same time. For gpt-3.5-turbo, the limit was a lowly one job at a time.
The hyperparameters are no longer at the top level of the request body; they now live in a `hyperparameters` object. And by “they,” I mean just one: you can only set `n_epochs` for now. Maybe the others are coming back soon.
The suffix limit is actually 18 characters, down from 40. The API documentation still says 40, but it lies: try a suffix longer than 18 characters and you’ll get an error telling you the real limit.
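Putting those two gotchas together, a job creation call might look like this (the file ID and suffix are just examples):

```python
job = openai.FineTuningJob.create(
    training_file="file-abc123",       # a processed training file ID
    model="gpt-3.5-turbo",
    hyperparameters={"n_epochs": 3},   # the only hyperparameter accepted for now
    suffix="state-capitals",           # 14 characters, safely under the 18-char limit
)
```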
For now, you can’t select one of your existing fine-tuned models to re-tune. You have to start with a blank slate each time. Hopefully they bring that back, because it was pretty cool.
To get completions from gpt-3.5-turbo, you need to use the chat completions endpoint. This endpoint has been around for a while; it was the first one that let users interact with ChatGPT through the API in a conversational format.
This is a pretty easy transition to make once you’re already generating those JSON objects for the new JSONL format. Most of the keys you could send for completions before will carry over, too, such as:
temperature
n (number of completions)
max_tokens
top_p
frequency_penalty
presence_penalty
stop
Let’s pretty-print that earlier JSON example, because we’ll use the same format for the chat completions endpoint:
{
  "messages": [
    {
      "role": "system",
      "content": ""
    },
    {
      "role": "user",
      "content": "Nebraska"
    },
    {
      "role": "assistant",
      "content": "Lincoln"
    }
  ]
}
Chat completion responses even arrive in the same `choices` array, although each choice now wraps a message object you have to parse instead of a plain text field.
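Here’s what querying a fine-tuned model might look like with the v0.x openai Python package; the fine-tuned model name is hypothetical:

```python
import openai

response = openai.ChatCompletion.create(
    model="ft:gpt-3.5-turbo-0613:my-org::abc123",  # your fine-tuned model name
    messages=[
        {"role": "system", "content": ""},
        {"role": "user", "content": "Nebraska"},
    ],
    temperature=0,
    max_tokens=5,
)

# Old completions endpoint: response.choices[0].text
# Chat completions: each choice wraps a message object instead.
print(response.choices[0].message.content)  # "Lincoln"
```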
Hey developers, I know you can handle all these updates. Obviously. But do you really want to? We’ve already implemented the new APIs at Entry Point and made fine-tuning GPT-3.5 Turbo easy breezy chicken peasy. And there’s a UI built on top of it so you could probably even convince your boss to get off your back and do some of the work too.
So how about it? Give yourself a break and check out our guide to fine-tuning GPT-3.5 with no code.
Or jump right in, try Entry Point today and make your fine-tuning workflow fly!