[fix]: Handle duplicates when file names include more than 4 digits #97

@signekb

Description

Sometimes the six digits in a file name are meaningful: they indicate that the data represent different points in time. @Aastedet gave the example of an address that remains unchanged.

Currently, we ignore the source_file name (and thereby the year) and deduplicate all rows that have the same values in every column except source_file. But Anders mentioned that this isn't always the desired behaviour.
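To make the problem concrete, here is a minimal sketch in plain Python. It is not the project's code: the `deduplicate` function, the `ignore_source_file` flag, the column names, and the file names are all hypothetical and only illustrate the behaviour described above. With the flag on, an unchanged address that appears in two dated files collapses to a single row, so the information that it was valid at both times is lost.

```python
def deduplicate(rows, ignore_source_file=True):
    """Keep the first occurrence of each row, preserving input order.

    Hypothetical helper: when ignore_source_file is True, the
    "source_file" column is left out of the comparison key, mimicking
    the current behaviour described in this issue.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(
            (column, value)
            for column, value in row.items()
            if not (ignore_source_file and column == "source_file")
        ))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Made-up rows: person 1 has the same address in two dated files.
rows = [
    {"id": 1, "address": "Main Street 1", "source_file": "addresses_202001.csv"},
    {"id": 1, "address": "Main Street 1", "source_file": "addresses_202007.csv"},
    {"id": 2, "address": "Oak Road 5", "source_file": "addresses_202001.csv"},
]

collapsed = deduplicate(rows)                            # source_file ignored
preserved = deduplicate(rows, ignore_source_file=False)  # source_file compared

print(len(collapsed), len(preserved))
```

Including source_file in the comparison is only one possible direction and may not be the right fix, since it would also keep rows that really are duplicates across files.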

@Aastedet do you mind elaborating below? :)

Metadata

Projects

Status

To do
