Test gen for serializers #2

@thorwhalen

Proposed approach to testing:

  • Seeds: Make 1-3 dataframes covering the data types and structures we want to test.
  • Serializers: Make a list of several pandas serializers. Each can be built as a partial (curry) that combines a pandas serializer with its parameters.
  • Go through the cartesian product of seeds and serializers and serialize each seed inside a try/except, collecting the combinations that succeed.
  • Save the results (as test files, when applicable), encoding the serializer and its parametrization in the filename somehow (use the extension plus whatever is needed for the parameters, for example key=val naming).
  • Save the seeds (in files or as code definitions).
  • Go through all the serialized data and check whether the deserializers work.
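
The first four steps could be sketched roughly as follows. This is only an illustration under assumptions: the seed frames, the `make_serializer` helper, the `key=val.ext` naming scheme, and the temporary output directory are all hypothetical choices, not something settled in this issue.

```python
import tempfile
from functools import partial
from itertools import product
from pathlib import Path

import pandas as pd

# Hypothetical seeds covering a few dtypes; real seeds would be broader.
seeds = {
    "mixed": pd.DataFrame({"i": [1, 2], "f": [0.5, 1.5], "s": ["a", "b"]}),
    "dates": pd.DataFrame({"t": pd.to_datetime(["2020-01-01", "2020-01-02"])}),
}


def make_serializer(method, ext, **params):
    """Return (name, func); name encodes params as key=val plus the extension."""
    prefix = "".join(f"{k}={v}." for k, v in sorted(params.items()))
    # Curry the unbound DataFrame method with its parameters.
    func = partial(getattr(pd.DataFrame, method), **params)
    return f"{prefix}{ext}", func


serializers = dict(
    [
        make_serializer("to_csv", "csv"),
        make_serializer("to_csv", "csv", index=False),
        make_serializer("to_json", "json", orient="records"),
        make_serializer("to_pickle", "pkl"),
        # May fail if no parquet engine is installed; the try/except handles it.
        make_serializer("to_parquet", "parquet"),
    ]
)

out_dir = Path(tempfile.mkdtemp())  # assumed location for the test files
valid, failed = {}, {}
for (seed_name, df), (ser_name, serialize) in product(
    seeds.items(), serializers.items()
):
    path = out_dir / f"{seed_name}.{ser_name}"
    try:
        serialize(df, path)
    except Exception as e:  # collect failures instead of stopping
        failed[seed_name, ser_name] = e
    else:
        valid[seed_name, ser_name] = path
```

With this naming, the `mixed` seed serialized by `to_csv(index=False)` lands in a file called `mixed.index=False.csv`, so the parametrization can be recovered from the filename later.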

Listed below are the pandas serializers (to_* methods of DataFrame) that have a corresponding deserializer (read_* function in pandas) following the same name pattern.

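import pandas as pd

# Print each DataFrame.to_* method that has a matching pd.read_* function.
for a in filter(lambda a: a.startswith('to_'), dir(pd.DataFrame)):
    aa = a.replace('to_', 'read_', 1)  # replace only the leading prefix
    if hasattr(pd, aa):
        print(f"{a:16} {aa}")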
to_clipboard     read_clipboard
to_csv           read_csv
to_excel         read_excel
to_feather       read_feather
to_gbq           read_gbq
to_hdf           read_hdf
to_html          read_html
to_json          read_json
to_parquet       read_parquet
to_pickle        read_pickle
to_sql           read_sql
to_stata         read_stata
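
The last step of the approach (checking that deserializers work) could lean on this to_/read_ pairing. Below is a minimal sketch, assuming a file-based round trip and `DataFrame.equals` as the comparison; the `reader_for` and `roundtrips` helpers are hypothetical names, and a real test would probably need a more forgiving comparison for formats that do not preserve dtypes.

```python
import tempfile
from pathlib import Path

import pandas as pd


def reader_for(writer_name):
    """Map 'to_xxx' to the pandas function 'read_xxx', or None if absent."""
    return getattr(pd, "read_" + writer_name[len("to_"):], None)


def roundtrips(df, writer_name, path, write_kwargs=None, read_kwargs=None):
    """Write df with df.<writer_name>, read it back, and compare with equals."""
    reader = reader_for(writer_name)
    if reader is None:
        return False
    try:
        getattr(df, writer_name)(path, **(write_kwargs or {}))
        restored = reader(path, **(read_kwargs or {}))
        return bool(df.equals(restored))
    except Exception:
        # Any serialization or deserialization error counts as a failed round trip.
        return False


tmp = Path(tempfile.mkdtemp())
df = pd.DataFrame({"i": [1, 2], "f": [0.5, 1.5]})

pickle_ok = roundtrips(df, "to_pickle", tmp / "seed.pkl")
# Default to_csv also writes the index, so the restored frame gains a column.
csv_default_ok = roundtrips(df, "to_csv", tmp / "seed.csv")
csv_noindex_ok = roundtrips(
    df, "to_csv", tmp / "seed.index=False.csv", write_kwargs={"index": False}
)
```

The CSV case shows why the parametrization matters: the same seed fails the round trip with the default parameters but passes once the index is left out.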
