Skip to content

Proposal: refactor nzshm_common.location.location.py around reading from CSV files #39

@DrCuriosity

Description

@DrCuriosity

There is work going in for get_locations(locations, resolution) on issue #36 (PR #37) that checks incoming values against the filesystem to see if they might be a CSV file to load from.

This might be better off split into a separate get_locations_csv(file, resolution) function for a few reasons:

  • Speculatively calling Path(location_id).exists() on every incoming location ID might cause performance issues when a large number of locations are being requested
  • There's the possibility that files could exist that have the same name as a location ID string, leading to unexpected errors or erroneous results
  • Security-wise, it's good to guard against people probing a system via unexpected avenues. If for example I was calling this library function with input passed through from a web form, errors from something like getLocations(["CHC", "AKL, ".dockerenv"]) could tell an attacker whether the code was running in a Docker container.

A separate get_locations_csv(file_location, resolution) function would make interacting with the host file system more intentional. It also provides us with an opportunity to take in either a Path or a file-like object (with a type under typing.TextIO) to read from, which would allow users to read CSV data in from non-traditional sources (e.g. S3 buckets) as well as from their local filesystems.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions