JSON-serialization of Blocking Rules #62

@jstammers


I have a workflow for detecting duplicates in a dataset that I am looking to deploy.

I'd like to be able to configure the blocking rules used to generate candidate pairs as part of the CLI that runs my workflow.

I can add parameters to do this, e.g. `--key-blockers="foo,bar" --coordinate_distance_km=0.01`, but it feels somewhat fragile to me.

It would be useful to be able to reconstruct a blocker from a JSON representation, e.g.

{ 
    "type": "Key",
    "parameters": { 
        "key":  ["foo", "bar"],
        "name": "foo and bar"
    }
} # de-serializes to KeyBlocker(key=("foo", "bar"), name="foo and bar")
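
To make the idea concrete, here is a rough sketch of how a spec like the one above might be turned back into an object. The `KeyBlocker` dataclass here is a hypothetical stand-in with only the two fields shown in the example, not the library's actual class, and `BLOCKER_TYPES` and `blocker_from_dict` are names made up for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional, Tuple


# Hypothetical stand-in for the real blocker class; the actual
# constructor signature may differ.
@dataclass(frozen=True)
class KeyBlocker:
    key: Tuple[str, ...]
    name: Optional[str] = None


# Illustrative mapping from the "type" field to a class.
BLOCKER_TYPES = {"Key": KeyBlocker}


def blocker_from_dict(spec: dict):
    """Build a blocker from a {"type": ..., "parameters": {...}} dict."""
    cls = BLOCKER_TYPES[spec["type"]]
    params = dict(spec.get("parameters", {}))
    # JSON has no tuple type, so a list of keys is converted back to a tuple.
    if isinstance(params.get("key"), list):
        params["key"] = tuple(params["key"])
    return cls(**params)


raw = '{"type": "Key", "parameters": {"key": ["foo", "bar"], "name": "foo and bar"}}'
blocker = blocker_from_dict(json.loads(raw))
```

With something like this, the CLI would only need a single option pointing at a JSON file or string, rather than one flag per blocker parameter.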

This could be handled by implementing `.to_dict()` and `.from_dict()` methods for the existing blocker classes, although I'm not sure how this would deal with `Callable` or `Deferred` objects.
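
One possible way around the `Callable` problem would be to serialize functions by a registered name rather than by value. The sketch below is only an assumption about how that might work: `FunctionBlocker`, `CALLABLES`, and `register` are hypothetical names, not part of the library, and `Deferred` objects are not covered here.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical registry: callables are looked up by name on deserialization.
CALLABLES: Dict[str, Callable] = {}


def register(name: str):
    """Decorator that records a function under a stable, serializable name."""
    def decorator(fn: Callable) -> Callable:
        CALLABLES[name] = fn
        fn.registered_name = name
        return fn
    return decorator


@register("lowercase")
def lowercase(value: str) -> str:
    return value.lower()


# Hypothetical blocker that wraps a callable; for illustration only.
@dataclass
class FunctionBlocker:
    func: Callable
    name: str

    def to_dict(self) -> dict:
        registered = getattr(self.func, "registered_name", None)
        if registered is None:
            # Lambdas and unregistered functions have no stable name to store.
            raise ValueError("only registered callables can be serialized")
        return {"type": "Function",
                "parameters": {"func": registered, "name": self.name}}

    @classmethod
    def from_dict(cls, spec: dict) -> "FunctionBlocker":
        params = spec["parameters"]
        return cls(func=CALLABLES[params["func"]], name=params["name"])


original = FunctionBlocker(func=lowercase, name="lowercased key")
restored = FunctionBlocker.from_dict(original.to_dict())
```

The trade-off is that arbitrary lambdas can't be serialized this way; they would have to be registered under a name first, or `.to_dict()` would raise.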
