Commit 86003eb

Fix documentation to reference Files API (#312)
1 parent 12bc21e commit 86003eb

2 files changed: +10 -5 lines changed

clients/python/llmengine/file.py

Lines changed: 4 additions & 1 deletion
@@ -22,9 +22,12 @@ def upload(cls, file: BufferedReader) -> UploadFileResponse:
         """
         Uploads a file to LLM engine.

+        For use in [FineTune creation](./#llmengine.fine_tuning.FineTune.create), this should be a CSV file with two columns: `prompt` and `response`.
+        A maximum of 100,000 rows of data is currently supported.
+
         Args:
             file (`BufferedReader`):
-                A file opened with open(file_path, "r")
+                A local file opened with `open(file_path, "r")`

         Returns:
             UploadFileResponse: an object that contains the ID of the uploaded file
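
The updated docstring requires a CSV file with `prompt` and `response` columns and at most 100,000 rows. A minimal sketch of preparing and checking such a file locally before upload might look like the following. The helper names `write_finetune_csv` and `check_finetune_csv` are hypothetical and not part of the `llmengine` client; only the column names and the row limit come from the docstring.

```python
import csv
import io

MAX_ROWS = 100_000  # row limit stated in the docstring
REQUIRED_COLUMNS = ("prompt", "response")  # column names stated in the docstring


def write_finetune_csv(rows, stream):
    """Hypothetical helper: write (prompt, response) pairs as a two-column CSV."""
    rows = list(rows)
    if len(rows) > MAX_ROWS:
        raise ValueError(f"{len(rows)} rows exceeds the {MAX_ROWS}-row limit")
    writer = csv.DictWriter(stream, fieldnames=REQUIRED_COLUMNS)
    writer.writeheader()
    for prompt, response in rows:
        writer.writerow({"prompt": prompt, "response": response})
    return len(rows)


def check_finetune_csv(stream):
    """Hypothetical helper: return the data-row count or raise ValueError."""
    reader = csv.DictReader(stream)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    count = sum(1 for _ in reader)
    if count > MAX_ROWS:
        raise ValueError(f"{count} rows exceeds the {MAX_ROWS}-row limit")
    return count


# In-memory round trip; a real script would open a local file instead.
buffer = io.StringIO()
write_finetune_csv([("Translate 'hello' to French.", "bonjour")], buffer)
buffer.seek(0)
print(check_finetune_csv(buffer))
```

Checking the file locally is only a convenience; the service may apply its own validation, which this sketch does not attempt to reproduce.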

clients/python/llmengine/fine_tuning.py

Lines changed: 6 additions & 4 deletions
@@ -38,8 +38,10 @@ def create(
         This API can be used to fine-tune a model. The _model_ is the name of base model
         ([Model Zoo](../../model_zoo) for available models) to fine-tune. The training
         and validation files should consist of prompt and response pairs. `training_file`
-        and `validation_file` must be publicly accessible HTTP or HTTPS URLs to a CSV file
-        that includes two columns: `prompt` and `response`. A maximum of 100,000 rows of data is
+        and `validation_file` must be either publicly accessible HTTP or HTTPS URLs, or
+        file IDs of files uploaded to LLM Engine's [Files API](./#llmengine.File) (these
+        will have the `file-` prefix). The referenced files must be CSV files that include
+        two columns: `prompt` and `response`. A maximum of 100,000 rows of data is
         currently supported. At least 200 rows of data is recommended to start to see benefits from
         fine-tuning. For sequences longer than the native `max_seq_length` of the model, the sequences
         will be truncated.
@@ -52,10 +54,10 @@ def create(
                 The name of the base model to fine-tune. See [Model Zoo](../../model_zoo) for the list of available models to fine-tune.

             training_file (`str`):
-                Publicly accessible URL to a CSV file for training. When no validation_file is provided, one will automatically be created using a 10% split of the training_file data.
+                Publicly accessible URL or file ID referencing a CSV file for training. When no validation_file is provided, one will automatically be created using a 10% split of the training_file data.

             validation_file (`Optional[str]`):
-                Publicly accessible URL to a CSV file for validation. The validation file is used to compute metrics which let LLM Engine pick the best fine-tuned checkpoint, which will be used for inference when fine-tuning is complete.
+                Publicly accessible URL or file ID referencing a CSV file for validation. The validation file is used to compute metrics which let LLM Engine pick the best fine-tuned checkpoint, which will be used for inference when fine-tuning is complete.

             hyperparameters (`Optional[Dict[str, str]]`):
                 A dict of hyperparameters to customize fine-tuning behavior.
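
With this change, `training_file` and `validation_file` accept either a publicly accessible HTTP(S) URL or a file ID carrying the `file-` prefix. As a rough illustration, a caller could tell the two forms apart before submitting a job. The function `classify_file_reference` below is a hypothetical client-side helper, not part of `llmengine`, and how the server actually distinguishes the two forms is not shown in this commit.

```python
from urllib.parse import urlparse

FILE_ID_PREFIX = "file-"  # prefix mentioned in the updated docstring


def classify_file_reference(reference: str) -> str:
    """Hypothetical helper: label a file reference as "file_id" or "url"."""
    if reference.startswith(FILE_ID_PREFIX):
        return "file_id"
    parsed = urlparse(reference)
    # Only HTTP and HTTPS URLs are described as acceptable in the docstring.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    raise ValueError(
        f"{reference!r} is neither an HTTP(S) URL nor a '{FILE_ID_PREFIX}' file ID"
    )


# The file ID and URL below are made-up placeholders.
print(classify_file_reference("file-abc123"))
print(classify_file_reference("https://example.com/train.csv"))
```

A value that passes this check could then be supplied as `training_file` or `validation_file` to `FineTune.create`. The check says nothing about whether the URL is reachable or the file ID exists.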
