Commit 6204444

update
1 parent 15c1cdd commit 6204444

1 file changed, +4 -4 lines changed


articles/ai-services/openai/how-to/reinforcement-fine-tuning.md

Lines changed: 4 additions & 4 deletions
@@ -152,7 +152,7 @@ To evaluate how close the model-generated output is to the reference, scored wit

 ***Supported operations:***

-- `bleu` – computes bleu score between strings
+- `bleu` – computes BLEU score between strings
 - `Fuzzy_match` – fuzzy string match, using rapidfuzz
 - `gleu` – computes google BLEU score between strings
 - `meteor` – computes METEOR score between strings
@@ -209,7 +209,7 @@ A multigrader object combines the output of multiple graders to produce a single
 - `/` (division)
 - `^` (power)

-*Functions:*0
+*Functions:*
 - `min`
 - `max`
 - `abs`
@@ -219,7 +219,7 @@ A multigrader object combines the output of multiple graders to produce a single
 - `sqrt`
 - `log`

-When using the UX you're able to write a prompt and generate a valid grader and response format in json as needed. Grader is mandatory field to be entered while submitting a finetuning job. Response format is optional.
+When using the UX you're able to write a prompt and generate a valid grader and response format in json as needed. Grader is mandatory field to be entered while submitting a fine-tuning job. Response format is optional.

 > [!IMPORTANT]
 > Generating correct grader schema requires careful prompt authoring. You may find that your first few attempts generate invalid schemas or don't create a schema that will properly handle your training data. Grader is a mandatory field that must be entered while submitting a fine-tuning job. Response format is optional.
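
The two hunks above touch the operator and function lists for multigrader formulas. As a rough illustration of how such a formula could combine several grader outputs into one score, here is a minimal Python sketch. It is not the service's implementation: `combine_scores` and the grader names are hypothetical, and `+`, `-`, and `*` are assumed because the hunks show only `/` and `^` from the operator list. Only the five functions visible in the diff (`min`, `max`, `abs`, `sqrt`, `log`) are supported. The sketch reuses Python's parser, where `^` binds more loosely than `+` and `-`, so power terms should be parenthesized.

```python
import ast
import math
import operator

# Binary operators. Python parses `^` as BitXor; it is remapped to power here.
# `+`, `-`, `*` are assumptions: the diff only shows `/` and `^`.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.BitXor: operator.pow,
}

# Only the functions visible in the diff hunks.
_FUNCTIONS = {"min": min, "max": max, "abs": abs, "sqrt": math.sqrt, "log": math.log}


def combine_scores(formula, scores):
    """Evaluate `formula` over named grader scores (hypothetical helper)."""

    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return float(node.value)
        if isinstance(node, ast.Name):
            return float(scores[node.id])  # KeyError for unknown grader names
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand)
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCTIONS
            and not node.keywords
        ):
            return _FUNCTIONS[node.func.id](*(ev(arg) for arg in node.args))
        # Anything else (attribute access, other calls, etc.) is rejected.
        raise ValueError(f"unsupported expression: {ast.dump(node)}")

    return ev(ast.parse(formula, mode="eval"))


print(combine_scores("max(a, b) / 2", {"a": 1, "b": 3}))  # 1.5
```

Walking the syntax tree instead of calling `eval` keeps the accepted grammar limited to the listed operators and functions, which is the property the documentation's lists imply.
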
@@ -340,7 +340,7 @@ During the training you can view the logs and RFT metrics and pause the job as n

 ### Guardrails on training spending

-As a RFT job can lead to high training costs, we automatically pause jobs once they have hit $5K in total training costs (training + grading). Users may deploy the most recent checkpoint or resume the training job. If the the user decides to resume the job, billing will continue for the job and subsequently no further price limits would be placed on the training job.
+As an RFT job can lead to high training costs, we automatically pause jobs once they have hit $5K in total training costs (training + grading). Users may deploy the most recent checkpoint or resume the training job. If the user decides to resume the job, billing will continue for the job and subsequently no further price limits would be placed on the training job.

 ## Interpreting training results
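
The guardrail paragraph in the last hunk describes a one-time pause: the job stops once training plus grading costs reach $5K, and resuming lifts the limit for good. The following Python sketch models that behavior only as an illustration. `RftJob`, `bill`, and `resume` are hypothetical names, not part of any Azure SDK, and whether the real service pauses at exactly $5K or just above it is not stated in the diff.

```python
from dataclasses import dataclass

COST_CAP_USD = 5_000.0  # the "$5K" threshold from the changed paragraph


@dataclass
class RftJob:
    """Hypothetical stand-in for an RFT job's billing state."""

    training_cost: float = 0.0
    grading_cost: float = 0.0
    paused: bool = False
    cap_active: bool = True  # cleared once the user resumes past the cap

    @property
    def total_cost(self):
        # The cap applies to training + grading combined.
        return self.training_cost + self.grading_cost

    def bill(self, training=0.0, grading=0.0):
        """Accrue cost and auto-pause the first time the cap is reached."""
        if self.paused:
            return  # a paused job accrues no further cost in this model
        self.training_cost += training
        self.grading_cost += grading
        if self.cap_active and self.total_cost >= COST_CAP_USD:
            self.paused = True

    def resume(self):
        """Resume training; no further price limit applies afterwards."""
        self.paused = False
        self.cap_active = False
```

In this model, a job billed past the cap pauses once; after `resume()` it keeps accruing cost with no second pause, which matches the "no further price limits" wording.
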
