The workflow shall accommodate custom data to be used for validation.
The custom data could use the same format as the cleaned data, or one very close to it.
This would help for several reasons:
- it is easy to integrate
- since the format would match the cleaned data format, we'd encourage users to parse their data into that format up front, which makes them more likely to contribute their cleaning scripts to the codebase
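
To make the "same format" idea concrete, here is a minimal sketch of how the workflow might check custom validation data against the cleaned data format. It assumes both are CSV files with a header row; the column names and the helper names `read_header` and `compare_to_cleaned_format` are hypothetical and not taken from the codebase.

```python
import csv
import io


def read_header(csv_text: str) -> list[str]:
    """Return the column names from the first row of CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


def compare_to_cleaned_format(cleaned_header, custom_header):
    """Return (missing, unexpected) columns of the custom data vs. the cleaned format."""
    missing = [c for c in cleaned_header if c not in custom_header]
    unexpected = [c for c in custom_header if c not in cleaned_header]
    return missing, unexpected


# Hypothetical column names, for illustration only.
CLEANED_EXAMPLE = "country,year,value\nDE,2020,1.5\n"
CUSTOM_EXAMPLE = "country,year,val\nFR,2021,2.0\n"

missing, unexpected = compare_to_cleaned_format(
    read_header(CLEANED_EXAMPLE), read_header(CUSTOM_EXAMPLE)
)
if missing or unexpected:
    print(f"Custom data does not match the cleaned format: "
          f"missing={missing}, unexpected={unexpected}")
```

If the formats are only meant to be similar rather than identical, the check could be relaxed to require just the `missing` list to be empty.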
Opinions are welcome :)