Skip to content

Commit 20fcef8

Browse files
committed
Add example for generating Parquet files
1 parent 33cf9ab commit 20fcef8

File tree

1 file changed

+6
-0
lines changed

1 file changed

+6
-0
lines changed

README.md

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -130,6 +130,12 @@ To get a complete list of the arguments, pass `--help` to the JAR file:
130130
./tools/run.py ./target/ldbc_snb_datagen_${PLATFORM_VERSION}-${DATAGEN_VERSION}.jar -- --format csv --scale-factor 0.003 --mode raw --output-dir sf0.003-raw
131131
```
132132

133+
* Generating Parquet files:
134+
135+
```bash
136+
./tools/run.py ./target/ldbc_snb_datagen_${PLATFORM_VERSION}-${DATAGEN_VERSION}.jar -- --format parquet --scale-factor 0.003 --mode bi
137+
```
138+
133139
* For the `interactive` and `bi` formats, the `--format-options` argument allows passing formatting options such as timestamp/date formats, the presence/abscence of headers (see the [Spark formatting options](https://spark.apache.org/docs/2.4.8/api/scala/index.html#org.apache.spark.sql.DataFrameWriter) for details), and whether quoting the fields in the CSV required:
134140

135141
```bash

0 commit comments

Comments
 (0)