
Commit b9894fc

Update README.md (#165)
Fixed inconsistent use of full stops
1 parent: 3cea11b

1 file changed (+2, -2 lines)

basic_pitch/data/README.md

2 additions & 2 deletions

```diff
@@ -1,11 +1,11 @@
 # Data / Training
 The code and scripts in this section deal with training basic pitch on your own. Scripts in the `datasets` folder allow one to download and process a selection of the datasets used to train the original model. Each of these download scripts has the following keyword arguments:
-* **--source**: Source directory to download raw data to. It defaults to `$HOME/mir_datasets/{dataset_name}`
+* **--source**: Source directory to download raw data to. It defaults to `$HOME/mir_datasets/{dataset_name}`.
 * **--destination**: Directory to write processed data to. It defaults to `$HOME/data/basic_pitch/{dataset_name}`.
 * **--runner**: The method used to run the Beam Pipeline for processing the dataset. Options include `DirectRunner`, running directly in the code process running the pipeline, `PortableRunner`, which can be used to run the pipeline in a docker container locally, and `DataflowRunner`, which can be used to run the pipeline in a docker container on Dataflow.
 * **--timestamped**: If passed, the dataset will be put into a timestamp directory instead of 'splits'.
 * **--batch-size**: Number of examples per tfrecord when partitioning the dataset.
-* **--sdk_container_image**: The Docker container image used to process the data if using `PortableRunner` or `DirectRunner` .
+* **--sdk_container_image**: The Docker container image used to process the data if using `PortableRunner` or `DirectRunner`.
 * **--job_endpoint**: the endpoint where the job is running. It defaults to `embed` which works for `PortableRunner`.
 
 Additional arguments that work with Beam in general can be used as well, and will be passed along and used by the pipeline. If using `DataflowRunner`, you will be required to pass `--temp_location={Path to GCS Bucket}`, `--staging_location={Path to GCS Bucket}`, `--project={Name of GCS Project}` and `--region={GCS region}`.
```
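
For context, the README text above describes the CLI of the dataset download scripts. As a minimal usage sketch of those flags, an invocation might look like the following; the script path `basic_pitch/data/datasets/guitarset.py` and the dataset name are assumptions for illustration, not taken from this commit:

```bash
# Hypothetical invocation of a dataset download script; the script path
# and dataset name are illustrative, the flags are from the README above.
python basic_pitch/data/datasets/guitarset.py \
    --source "$HOME/mir_datasets/guitarset" \
    --destination "$HOME/data/basic_pitch/guitarset" \
    --runner DirectRunner \
    --batch-size 100
```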
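
Similarly, a sketch of the `DataflowRunner` case from the closing paragraph, with the Beam arguments the README says are required; the bucket, project, and region values are placeholders:

```bash
# Placeholder GCS values; replace bucket, project, and region with your own.
python basic_pitch/data/datasets/guitarset.py \
    --runner DataflowRunner \
    --temp_location gs://my-bucket/tmp \
    --staging_location gs://my-bucket/staging \
    --project my-gcp-project \
    --region us-central1
```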
