County business patterns is described incorrectly #151

@Maegereg

The economic census County Business Patterns demonstration dataset has its privacy parameters described incorrectly. The registry entry describes it as zCDP with a privacy unit of a business establishment, but that is wrong. Both the Census and the registry entry note that this release uses per-record differential privacy, a DP variant defined here. The mechanism is described in more detail in the paper and in the linked webinar, but in brief:

  • Thresholds were defined for each attribute of a business establishment. For example (these are not real numbers), number of employees might have a threshold of 100, and payroll might have a threshold of $50,000.
  • Each establishment was evaluated against the thresholds. If an establishment exceeded any threshold, it would be split into two or more duplicate establishments whose values were all at or below the thresholds. Following our example above, an establishment with 150 employees and $60,000 payroll would be split into one with 100 employees and $50,000 payroll, and another with 50 employees and $10,000 payroll.
  • This new split dataset had zCDP applied with the specified budgets.
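To make the splitting step concrete, here is a minimal Python sketch that reproduces the made-up example above. The function name `split_establishment` and the greedy "fill each piece up to the threshold" rule are my assumptions for illustration only. The actual splitting rule used for the release may differ, and the thresholds are the same invented numbers as in the bullets.

```python
def split_establishment(values, thresholds):
    """Split one establishment into pieces whose attributes are all
    at or below the corresponding threshold.

    Hypothetical rule: fill each piece up to the threshold, greedily,
    until nothing remains. Assumes integer values and that `values`
    and `thresholds` share the same keys.
    """
    # Number of pieces is driven by the attribute that overshoots the most
    # (integer ceiling division), with at least one piece.
    n_pieces = max(1, max(-(-values[k] // thresholds[k]) for k in thresholds))

    remaining = dict(values)
    pieces = []
    for _ in range(n_pieces):
        # Each piece takes as much as the threshold allows from what is left.
        piece = {k: min(remaining[k], thresholds[k]) for k in thresholds}
        for k in thresholds:
            remaining[k] -= piece[k]
        pieces.append(piece)
    return pieces


# Made-up thresholds from the example above (not real numbers).
thresholds = {"employees": 100, "payroll": 50_000}
print(split_establishment({"employees": 150, "payroll": 60_000}, thresholds))
# [{'employees': 100, 'payroll': 50000}, {'employees': 50, 'payroll': 10000}]
```

An establishment already under every threshold comes back as a single unchanged piece, so only large establishments are affected by the split.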

It strikes me that this could be described as its own DP flavor, or it could be described as zCDP with a privacy unit of a volume of contribution, similar to (but more complicated than) https://registry.opendp.org/deployments-registry/#historical-pageviews-wikimedia-foundation-2023

Also, I'm not sure whether the splitting thresholds were ever released - the Census site mentions a forthcoming paper which I would have expected to include them, but I don't see any evidence that that paper was ever released, and it doesn't look like (from skimming the slides) the thresholds were announced in the webinar. Without them, the raw budget numbers are pretty incomplete.
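A rough sketch of why the thresholds matter for reading the budgets: if the zCDP guarantee is per split record, then an establishment that was split into k records is only covered by the standard zCDP group-privacy bound, which degrades ρ to k²·ρ. The helper names, the ρ value, and the thresholds below are all hypothetical, and this uses ordinary zCDP group privacy rather than whatever accounting the per-record DP paper defines.

```python
import math


def pieces_needed(value, threshold):
    """Number of split records for one attribute (at least one)."""
    return max(1, math.ceil(value / threshold))


def establishment_level_rho(rho_per_record, k):
    """Standard zCDP group privacy: rho-zCDP for one record implies
    (k**2 * rho)-zCDP for a group of k records."""
    return k ** 2 * rho_per_record


rho = 0.5  # hypothetical per-record budget, not a published number
for threshold in (100, 1_000):  # hypothetical employee thresholds
    k = pieces_needed(5_000, threshold)
    print(threshold, k, establishment_level_rho(rho, k))
```

The same published ρ gives very different protection to a 5,000-employee establishment depending on the threshold, which is why the budgets alone do not pin down the guarantee.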
