Resources on data validation in the full data science pipeline #10

@chendaniely

Description

So far in the course we've used unit tests to validate things, but mostly as single assert or testthat cases in a script, or as part of a package's unit testing workflow.

We should also use tools built specifically for data validation to test the assumptions we make about our data (e.g., no missing values, values follow a specific distribution, etc.).
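To make the distinction concrete, here is a rough sketch of what a data-level check could look like, as opposed to a unit test on code. It uses only the Python standard library so it stays self-contained; a dedicated validation tool would replace most of it. The function name `validate_column`, the `tolerance` parameter, and the sample heights are all made up for illustration, and the "distribution" check is deliberately crude (it only compares the sample mean and standard deviation to expected values).

```python
import math
import statistics


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def validate_column(values, expected_mean, expected_sd, tolerance=0.25):
    """Check two assumptions about a numeric column.

    Returns a list of failure messages; an empty list means both checks passed.
    Hypothetical helper for illustration, not part of any library.
    """
    failures = []

    # Assumption 1: no missing values.
    present = [v for v in values if not is_missing(v)]
    n_missing = len(values) - len(present)
    if n_missing:
        failures.append(f"{n_missing} missing value(s)")

    # Assumption 2 (crude): mean and standard deviation are close to what we expect.
    if len(present) < 2:
        failures.append("too few observed values to check the distribution")
        return failures

    mean = statistics.fmean(present)
    sd = statistics.stdev(present)
    allowed = tolerance * expected_sd
    if abs(mean - expected_mean) > allowed:
        failures.append(f"mean {mean:.2f} is not within {allowed:.2f} of {expected_mean}")
    if abs(sd - expected_sd) > allowed:
        failures.append(f"sd {sd:.2f} is not within {allowed:.2f} of {expected_sd}")

    return failures


# Made-up sample data: heights in cm.
heights = [170.0, 165.5, 180.2, 175.1, 168.3]
clean_report = validate_column(heights, expected_mean=170, expected_sd=6, tolerance=0.5)

# Same data with two missing entries added.
dirty = heights + [None, float("nan")]
dirty_report = validate_column(dirty, expected_mean=170, expected_sd=6, tolerance=0.5)

print(clean_report)
print(dirty_report)
```

The point of returning a report rather than raising on the first failure is that data problems tend to come in groups, and it is usually more useful to see all of them at once than to fix them one assert at a time.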

Posit conf this past year had 2 workshops, in R and Python, that might be nice reference material as well:
