Commit 24ecfcd

Set deduplicate to false by default
1 parent 9186678 commit 24ecfcd

3 files changed: +6 -4 lines changed
README.md

Lines changed: 1 addition & 1 deletion
@@ -31,4 +31,4 @@ Another option is to output the data in compressed form. All files will get the
 java -Dorg.radarcns.compress=gzip -jar restructurehdfs-0.3.1-all.jar <webhdfs_url> <hdfs_topic_path> <output_folder>
 ```
 
-Finally, files records are deduplicated after writing. To disable this behaviour, specify the option `-Dorg.radarcns.deduplicate=false`.
+Finally, by default, files records are not deduplicated after writing. To enable this behaviour, specify the option `-Dorg.radarcns.deduplicate=true`. This set to false by default because of an issue with Biovotion data. Please see - [issue #16](https://github.com/RADAR-base/Restructure-HDFS-topic/issues/16) before enabling it.
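
The README change above refers to deduplicating records after writing, but the deduplication code itself is not part of this diff. As a rough illustration only, the sketch below assumes deduplication means dropping repeated lines from an already written file while keeping the first occurrence in order. The class name `LineDeduplication`, the method `deduplicate`, and the sample records are hypothetical and not taken from the repository.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical sketch: not the project's implementation.
class LineDeduplication {
    // Keeps the first occurrence of each line and preserves the original order.
    static List<String> deduplicate(List<String> lines) {
        return new ArrayList<>(new LinkedHashSet<>(lines));
    }

    public static void main(String[] args) {
        // Made-up records in "timestamp,value" form.
        List<String> records = Arrays.asList(
                "1500000000.0,72",
                "1500000000.0,72",
                "1500000001.0,73");
        System.out.println(deduplicate(records));
    }
}
```

Under this assumption, two genuinely separate measurements that serialize to the same line would collapse into one. Whether that is the mechanism behind the Biovotion record loss is not stated in this commit; issue #16 is the place to check.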

build.gradle

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@ apply plugin: 'java'
 apply plugin: 'application'
 
 group 'org.radarcns.restructurehdfs'
-version '0.3.1'
+version '0.3.2-SNAPSHOT'
 mainClassName = 'org.radarcns.RestructureAvroRecords'
 
 run {

src/main/java/org/radarcns/RestructureAvroRecords.java

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -71,7 +71,9 @@ public class RestructureAvroRecords {
 private long processedFileCount;
 private long processedRecordsCount;
 private static final boolean USE_GZIP = "gzip".equalsIgnoreCase(System.getProperty("org.radarcns.compression"));
-private static final boolean DO_DEDUPLICATE = "true".equalsIgnoreCase(System.getProperty("org.radarcns.deduplicate", "true"));
+
+// Default set to false because causes loss of records from Biovotion data. https://github.com/RADAR-base/Restructure-HDFS-topic/issues/16
+private static final boolean DO_DEDUPLICATE = "true".equalsIgnoreCase(System.getProperty("org.radarcns.deduplicate", "false"));
 
 public static void main(String [] args) throws Exception {
 if (args.length != 3) {
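
To make the effect of the new default concrete, the following sketch evaluates the same expression as `DO_DEDUPLICATE` against an explicit `Properties` object instead of the JVM-wide system properties. The class `DeduplicateFlag` and its method `isEnabled` are hypothetical helpers introduced for illustration. Only the property key and the `"true".equalsIgnoreCase(...)` expression come from the diff.

```java
import java.util.Properties;

// Hypothetical helper: mirrors the DO_DEDUPLICATE expression from the diff.
class DeduplicateFlag {
    static final String KEY = "org.radarcns.deduplicate";

    // An unset property falls back to "false"; only a case-insensitive "true" enables the flag.
    static boolean isEnabled(Properties props) {
        return "true".equalsIgnoreCase(props.getProperty(KEY, "false"));
    }

    public static void main(String[] args) {
        // Reads the real system properties, e.g. when started with -Dorg.radarcns.deduplicate=true.
        System.out.println(isEnabled(System.getProperties()));
    }
}
```

Because the comparison is against the literal `"true"`, any other value (for example `yes` or `1`) leaves deduplication off, the same as omitting the option.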
@@ -179,7 +181,7 @@ public void start(String directoryName) throws IOException {
 for (Map.Entry<String, List<Path>> entry : topicPaths.entrySet()) {
 try (FileCacheStore cache = new FileCacheStore(converterFactory, 100, USE_GZIP, DO_DEDUPLICATE)) {
 for (Path filePath : entry.getValue()) {
-// If Json Mapping exception occurs log error and continue with other files
+// If JsonMappingException occurs, log the error and continue with other files
 try {
 this.processFile(filePath, entry.getKey(), cache, offsets);
 } catch (JsonMappingException exc) {
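
The hunk above is cut off inside the `catch` block, so the actual handling is not visible here. The pattern the reworded comment describes (log the failure for one file, then continue with the rest) can be sketched as below. This is a minimal sketch under stated assumptions: `MappingException` is a hypothetical stand-in for Jackson's `JsonMappingException`, and `processFile` and `processAll` are made-up helpers, not methods from `RestructureAvroRecords`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of "log and continue" per-file error handling.
class SkipBadFiles {
    // Stand-in for Jackson's JsonMappingException (assumption, to keep the sketch dependency-free).
    static class MappingException extends Exception {
        MappingException(String message) {
            super(message);
        }
    }

    // Made-up processing step: fails for any file name ending in ".bad".
    static void processFile(String name) throws MappingException {
        if (name.endsWith(".bad")) {
            throw new MappingException("cannot map " + name);
        }
    }

    // Processes every file; a mapping failure is logged and the loop moves on.
    static List<String> processAll(List<String> files) {
        List<String> processed = new ArrayList<>();
        for (String file : files) {
            try {
                processFile(file);
                processed.add(file);
            } catch (MappingException exc) {
                System.err.println("Skipping " + file + ": " + exc.getMessage());
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(processAll(Arrays.asList("a.avro", "b.bad", "c.avro")));
    }
}
```

Placing the `try` inside the loop is what limits the failure to a single file. A `try` wrapped around the whole loop would stop at the first bad file.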
