-
Notifications
You must be signed in to change notification settings - Fork 2
Let find() take format as argument to allow returning as list #10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a format parameter to the Client.find() method, allowing users to specify the return type (str, list, set, or tuple) when retrieving IDs. Previously, IDs were always returned as a space-separated string, requiring users to manually split them into lists. The default behavior remains unchanged (returning a string) for backward compatibility.
Key Changes:
- Added
formatparameter tofind()method with support for str, list, set, and tuple return types - Updated
_build_urls()to accept strings in addition to lists/tuples forid_list - Bumped version from 0.3.0 to 0.3.1
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| hepdata_cli/api.py | Added format parameter to find() method with logic to return results in different formats; updated _build_urls() to handle string input; fixed typo in assertion message |
| tests/test_search.py | Updated test parameters and assertions to validate the new format parameter functionality |
| tests/test_download.py | Added tests for different input formats (str, list, set, tuple) and new integration test combining find() with download() |
| README.md | Updated documentation to explain the new format parameter and showed example usage with format=list |
| hepdata_cli/version.py | Bumped version to 0.3.1 |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| return find_results | ||
| else: | ||
| return ' '.join(find_results) | ||
| if format==str: |
Copilot
AI
Oct 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing spaces around the equality operator. Should be if format == str: to follow PEP 8 style guidelines.
| if format==str: | |
| if format == str: |
| elif format in (set, tuple): | ||
| return format(find_results) | ||
| else: | ||
| raise TypeError(f"Cannot return results in specfied format: {format}. Allowed formats are: {str}, {list}, {set}, {tuple}.") |
Copilot
AI
Oct 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected spelling of 'specfied' to 'specified'.
| raise TypeError(f"Cannot return results in specfied format: {format}. Allowed formats are: {str}, {list}, {set}, {tuple}.") | |
| raise TypeError(f"Cannot return results in specified format: {format}. Allowed formats are: {str}, {list}, {set}, {tuple}.") |
GraemeWatt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! I agree that it is useful to add the format option without changing the default behaviour, which comes from the arxiv-cli package that originally inspired hepdata-cli. However, the arxiv-cli package has not been maintained for 7 years and is now difficult to install. In a recent commit 3e8a9df I replaced its use in Example 6 with arxiv.py, but then I had to insert an additional line id_list = id_list.split() for it to work. The new format option makes it simpler. I'd be happy to merge after you address my two comments (and the two comments from Copilot).
| elif format in (set, tuple): | ||
| return format(find_results) | ||
| else: | ||
| raise TypeError(f"Cannot return results in specfied format: {format}. Allowed formats are: {str}, {list}, {set}, {tuple}.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is not covered by the tests, resulting in a -0.4% decrease in test coverage for the PR. Please add a test that raises the TypeError exception.
| from click.testing import CliRunner | ||
|
|
||
| from hepdata_cli.api import Client | ||
| from hepdata_cli.api import Client, MAX_MATCHES, MATCHES_PER_PAGE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why import MAX_MATCHES and MATCHES_PER_PAGE here? They're not used in this file.
|
@codecov-ai-reviewer review |
This comment has been minimized.
This comment has been minimized.
| else: | ||
| raise TypeError(f"Cannot return results in specfied format: {format}. Allowed formats are: {str}, {list}, {set}, {tuple}.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in error message: 'specfied' should be 'specified'. This is a user-facing error message that will appear unprofessional with the typo.
Did we get this right? 👍 / 👎 to inform future reviews.
Hi @GraemeWatt,
Is there a specific reason why
Client.find()returns a string if theidskeyword is given? I see why that is helpful when callingfindfrom the command line but if using the python interface, it seems unnecessary. In particular, if the list entries are first joined into a string by' '.join(find_results)only to be broken up into a list again in user's scripts. (I have one where exactly this is the case.)Insofar, if there are no principal objections, this PR resolves the issue by adding a
formatargument toClient.find(). To leave the default behaviour offind()unchanged, the default format remainsstr. It can also be set tolist,setortuplenow, though, if desired.Matching tests have been added, the readme updated and the version bumped.