OpenAI just released GPT-3.5 Turbo for fine-tuning and it's a game-changer.
According to OpenAI, a fine-tuned GPT-3.5 model can easily match the quality of GPT-4 when it’s fine-tuned on high-quality examples—and you only need 50 to start seeing clear quality improvements versus few-shot learning.
Plus, it’s 90% cheaper to get completions from a fine-tuned GPT-3.5 model than a fine-tuned GPT-3 model.
Better quality and lower cost? Game on! 🤩
But hol' up a second.
OpenAI also made significant changes to the fine-tuning process for GPT-3.5 Turbo. Figuring out the nuances of formatting the new JSONL and making the API calls to fine-tune can feel like being stuck in an escape room you didn’t ask to be in.
That’s why we created this guide that will get a shiny new fine-tuned GPT-3.5 model in your hands much faster with no code — and you might even have fun doing it.
Don't have data at hand? No problem.
I have some for you, so you can fine-tune along this guide — Helpful AI Clerk.
Entry Point AI is a platform that helps you fine-tune AI models without writing a single line of code.
You can start for free, no credit card required.
Simply open the app and log in:
You'll land on your Dashboard.
Create a new project by clicking the (+) button.
You'll then get to select from a few presets.
Click on them to get familiar with typical project formats and examples.
Whenever you're uploading custom data, though, I suggest going with the Blank blueprint.
We'll name our project "AI Clerk".
After clicking "Create", you'll land on the Project Overview.
To navigate to the Data Import page either click "Import", or "Import CSV":
Now click "Choose .csv file":
Keep in mind that on the free plan, you can have a max of 50 examples in your organization at any time. According to OpenAI, that’s enough to start seeing clear improvements versus prompting alone.
Next upload the CSV of the Google Sheet I've shared with you before.
(Here's how to download your Google Sheet as a CSV:)
After your CSV is in Entry Point, select which columns in the Google Sheet should go to the prompt or completion:
This is essentially choosing your input variables and then what kind of output you expect to get back. In our case, we're giving the AI three shopping cart items and getting back a recommended item for our hypothetical customer, along with a rationalization for why that item would be good.
Note: Entry Point uses the term 'fields' instead of 'columns'. For all intents and purposes, they are the same.
Finally, you can dedicate a percentage of your examples to be Validation Examples.
Entry Point won't include these in your training data.
Instead, validation examples are used to automatically test your models after they're fine-tuned.
Click "Finish" and the import will start. Wait a few seconds and you should see your example count go up in the sidebar.
There we go — our examples:
Now we're going to prepare the example format for fine-tuning.
Open the Templates page.
You should see the default template that was created when you imported the data.
Essentially, the 'Shopping Cart' column is on the left (in the Prompt), and the 'Suggested Item' and 'Reasoning' columns are on the right (in the Completion).
The variables with double curly braces will be replaced with the row values from the Google Sheet. You can do whatever you want with this. Wrap them inside additional instructions…. Leave them blank, as is… Add little tags that help the model understand your task better (as I have in my Completion:)
Note: We all know GPT-3.5 Turbo has a chat interface, with alternating "User" and "Assistant" inputs and outputs. Yet the templates in Entry Point are tagged as "Prompt" and "Completion". This is the same. The "User" is the "Prompt", and the "Assistant" is the "Completion".
(The system message is left blank for now, although we'll add that option soon.)
Now, I'd like my model to keep the output text a bit more separated. So I'm going to add an empty line between my 'Suggested Item' and my 'Reasoning'.
After I press save, all my examples get updated.
And if you want to edit individual examples, you can hover the example and click the pen icon.
Then change the details as you wish.
In conclusion, Entry Point Templates eliminate the need to format your data using Python (phew).
We’re fine-tuning GPT-3.5 Turbo, which is a model made and hosted by OpenAI.
Meaning, we need to connect Entry Point to OpenAI.
To do this, open the Integrations page in Entry Point from the top navigation bar and click on OpenAI.
Now we need to get an API key from OpenAI.
Go to this page: https://platform.openai.com/account/api-keys
Then click 'Create new secret key'...
And name it something like “Entry Point AI” so you remember where you’re using it.
Once you have it, copy it.
And paste it into Entry Point here:
Now that we’re connected to OpenAI, we can fine-tune a model.
Sometimes though, we don't have enough examples to create a high-quality model.
Because creating new training examples by hand is a drag, we created Data Synthesis. This feature let's you expand your dataset automatically using AI.
Let’s take a quick peek at Data Synthesis.
You can choose which model you want to use to generate examples, add Alignment text to steer the kind of examples you want it to generate, and set how many to produce at a time.
You can even have it automatically save the examples, or manually add the best ones.
When you add a new example, you can edit it first to ensure it meets your standard of quality.
Okay, we already have plenty of data for our Helpful AI Clerk model. Let’s get back to fine-tuning GPT-3.5.
Go to Fine-tunes and click the plus button.
On the 'Start a fine-tune' page, you can select a base model.
Choose GPT-3.5 Turbo.
You can press "Show Advanced" to view and edit hyperparameters for the fine-tuning job.
The only hyperparameter available for GPT-3.5 Turbo at its time of release is N Epochs. If this is your first time fine-tuning, just leave the default. You can always learn more about hyperparameters and play with them later.
Note the estimated time to complete the fine-tuning job, because this can vary from a few minutes to several hours depending on how smoothly things are going at OpenAI and how many fine-tuning jobs are backed up in the queue.
Press Start and watch the magic happen.
Entry Point will show your fine-tuning job as “Preparing” initially. This is where it’s writing the JSONL files (both training and validation datasets, if applicable) and uploading them to OpenAI. Once the training data is uploaded to OpenAI, it will show the job as “Started.” That means it’s in the queue.
You’ll receive an email when your fine-tune is ready and the status will update to "Completed."
Congratulations, you successfully fine-tuned GPT-3.5!
Now you can use your model in a variety of ways, including directly from the OpenAI playground.
Entry Point also has a playground designed to work perfectly with your fine-tuned model that lets you leverage the structure of your fields and templates to input data more easily and with less room for formatting errors.