Options preceded by `*` in the above output are required to run the app. Multiple input paths can be given to read files from, for example `/topicAndroidNew/topic1 /topicAndroidNew/topic2 ...`. At least one input path is required.
By default, this will output the data in CSV format. If JSON format is preferred, use the following instead:
Another option is to output the data in compressed form. All files will get the `gz` suffix, and can be decompressed with a GZIP decoder. Note that for a very small number of records, this may actually increase the file size.
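The size caveat comes from GZIP's fixed framing: every GZIP stream carries a 10-byte header and an 8-byte trailer on top of the compressed payload. The following is a minimal sketch using only the JDK's `java.util.zip` classes. The class name `GzipOverheadDemo` and the sample record are made up for illustration and are not part of this project.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of Restructure-HDFS-topic.
class GzipOverheadDemo {
    /** Compresses the given bytes into a single GZIP stream. */
    static byte[] gzip(byte[] input) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so read the buffer afterwards.
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Decompresses a GZIP stream back into the original bytes. */
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A made-up, very small CSV file: one header line and one record.
        byte[] tiny = "key,value\n1,2\n".getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(tiny);
        System.out.println("original bytes:   " + tiny.length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("round trip ok:    " + Arrays.equals(tiny, gunzip(packed)));
    }
}
```

For inputs this small the compressed output is larger than the original. With many records the repeated CSV or JSON structure usually compresses well enough to outweigh the fixed overhead.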
Finally, by default, file records are not deduplicated after writing. To enable this behaviour, specify the option `-Dorg.radarcns.deduplicate=true`. This is set to false by default because of an issue with Biovotion data. Please see [issue #16](https://github.com/RADAR-base/Restructure-HDFS-topic/issues/16) before enabling it.
Finally, by default, file records are not deduplicated after writing. To enable this behaviour, specify the option `--deduplicate` or `-d`. This is set to false by default because of an issue with Biovotion data. Please see [issue #16](https://github.com/RADAR-base/Restructure-HDFS-topic/issues/16) before enabling it.
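To make the trade-off concrete, here is a rough sketch of line-based deduplication. It is an illustration only and assumes records are compared as whole text lines. The class `DeduplicateSketch`, its method, and the sample records are hypothetical and are not taken from this project's code. It shows one way deduplication can drop data: two records that are legitimately identical collapse into one. Whether that is the exact mechanism behind issue #16 is not stated here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical sketch, not the project's actual deduplication code.
class DeduplicateSketch {
    /** Removes repeated lines while keeping the first occurrence and the original order. */
    static List<String> deduplicate(List<String> lines) {
        // LinkedHashSet drops duplicates but preserves insertion order.
        return new ArrayList<>(new LinkedHashSet<>(lines));
    }

    public static void main(String[] args) {
        // Made-up records: the first and third lines are textually identical.
        List<String> records = Arrays.asList(
                "t=1,heartRate=60",
                "t=2,heartRate=61",
                "t=1,heartRate=60");
        List<String> unique = deduplicate(records);
        System.out.println(records.size() + " records in, " + unique.size() + " records out");
    }
}
```

If a source can emit identical lines for distinct measurements, this kind of deduplication silently reduces the record count, which is a reason to leave it off unless the data is known to be safe.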
@Parameter(names = { "-f", "--format" }, description = "Format to use when converting the files. JSON and CSV is available.")
public String format = "csv";

@Parameter(names = { "-c", "--compression" }, description = "Compression to use when converting the files. Gzip is available.")
public String compression = "none";

// Default set to false because causes loss of records from Biovotion data. https://github.com/RADAR-base/Restructure-HDFS-topic/issues/16
@Parameter(names = { "-d", "--deduplicate" }, description = "Boolean to define if to use deduplication or not.")
public boolean deduplicate;

@Parameter(names = { "-u", "--hdfs-uri" }, description = "The HDFS uri to connect to. Eg - 'hdfs://<HOST>:<RPC_PORT>/<PATH>'.", required = true, validateWith = { HdfsUriValidator.class, PathValidator.class })
public String hdfsUri;

@Parameter(names = { "-o", "--output-directory"}, description = "The output folder where the files are to be extracted.", required = true, validateWith = PathValidator.class)
public String outputDirectory;

@Parameter(names = { "-h", "--help"}, help = true, description = "Display the usage of the program with available options.")