Skip to content

Conversation

Tyler-g-hudson
Copy link
Contributor

Added the data search command line argument to wigwam. This command searches a file (default workflowdata.json in the root of the project) for all of the data items that fit the names or tags given as search terms.



def data_search(
data_file: Path,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why Path specifically here instead of str | os.PathLike, which is more abstract. Usually we prefer more abstract types as inputs in order to avoid overly constraining users.

names: Iterable[str],
fields: Iterable[str],
all: bool = False,
print_output: bool = True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

print_output is missing a description in the docstring

Comment on lines 1 to 2
{
"data": [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure there's any reason for this "data" key. How about removing this outer layer? i.e. basically replace this:

{"data": [<actual contents>]}

with just this:

[<actual contents>]
Suggested change
{
"data": [
[

wigwam/search.py Outdated
json_dict = json.load(file)

# Get everything held underneath the "data" key on the dictionary.
data: List[Dict[str, str | Dict[str, str]]] = json_dict["data"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List[Dict[str, str | Dict[str, str]]]

This type description is kind of a mouthful and it doesn't really tell us much about the actual structure of the data contained within data.

Instead of passing around objects with type annotations that look like this, let's create a dataclass to represent the contents of the JSON file in a more structured way, e.g.

from dataclasses import dataclass

@dataclass
class TestFile:
    name: str
    checksum: str

@dataclass
class TestDataset:
    name: str
    tags: list[str]
    url: str
    files: list[TestFile]

Then we could convert data to a list[TestDataset].

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants