Support expressions when filtering products by numeric columns #3365

Open: wants to merge 5 commits into base: main

@snbianco (Contributor) commented Jul 2, 2025

Enhanced filter_products methods in MastMissionsClass and ObservationsClass to support advanced filtering expressions for numeric columns. Users can now filter using single values, value ranges (e.g., "100..1000"), comparison operators (e.g., ">=500"), or lists combining multiple expressions (e.g., [100, "500..1000", ">=1500"]). This provides greater flexibility when filtering mission data products.
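As a rough illustration of the expression forms described above, here is a minimal pure-Python sketch. It is not the code from this PR: the names `matches` and `filter_mask` are hypothetical, and two behaviours are assumptions made for the sketch, namely that range endpoints are inclusive and that the entries of a list are combined with OR.

```python
import operator

# Hypothetical helper names for illustration; not astroquery's implementation.
_COMPARATORS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt}


def matches(value, expr):
    """Return True if `value` satisfies a single filter expression."""
    if isinstance(expr, (int, float)):
        return value == expr
    text = str(expr).strip()
    if ".." in text:
        # Range such as "100..1000"; endpoints assumed inclusive.
        low, high = text.split("..", 1)
        return float(low) <= value <= float(high)
    # Check two-character operators before their one-character prefixes.
    for symbol in (">=", "<=", ">", "<"):
        if text.startswith(symbol):
            return _COMPARATORS[symbol](value, float(text[len(symbol):]))
    # Plain numeric string; float() raises ValueError on unparseable input.
    return value == float(text)


def filter_mask(values, criteria):
    """Return one boolean per value: True if it matches any expression (assumed OR)."""
    if not isinstance(criteria, (list, tuple)):
        criteria = [criteria]
    return [any(matches(v, expr) for expr in criteria) for v in values]


sizes = [50, 100, 700, 1200, 1500, 2000]
print(filter_mask(sizes, [100, "500..1000", ">=1500"]))
# [False, True, True, False, True, True]
```

In a table-based implementation the same logic would more naturally build NumPy boolean masks per column; plain lists are used here only to keep the sketch dependency-free.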

Also includes tests and documentation updates. Fixed some unrelated documentation tests that were out-of-date.

codecov bot commented Jul 2, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.16%. Comparing base (c522e2a) to head (dd38fbd).
⚠️ Report is 21 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3365      +/-   ##
==========================================
+ Coverage   70.07%   70.16%   +0.08%     
==========================================
  Files         232      232              
  Lines       19893    19918      +25     
==========================================
+ Hits        13940    13975      +35     
+ Misses       5953     5943      -10     

@snbianco added the mast label Jul 2, 2025
@snbianco marked this pull request as ready for review July 2, 2025 19:24
@snbianco requested a review from bsipocz July 2, 2025 19:24
@bsipocz added this to the v0.4.11 milestone Jul 2, 2025
@bsipocz (Member) left a comment

My main concern here is the use of warnings instead of letting the exception through when garbage input is provided. Could you elaborate on why you prefer warnings, which most users will just ignore?

col_mask = np.isin(products[colname], vals)
col_data = products[colname]
# If the column is an integer or float, accept numeric filters
if col_data.dtype.kind in 'if':
Member

in 'if'? I'm not sure I get that

Contributor Author

It's checking whether the kind code of the column is i (integer) or f (float).

Member

Can you add that as a comment? Or even better, rephrase the line as

Suggested change
- if col_data.dtype.kind in 'if':
+ if col_data.dtype.kind in ["i", "f"]:
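For readers unfamiliar with NumPy kind codes, the check discussed in this thread can be sketched as follows. The helper name `is_numeric_kind` is hypothetical and only wraps the comparison for demonstration. NumPy reports a one-character `dtype.kind`: `'i'` for signed integers, `'f'` for floats, `'u'` for unsigned integers, and `'U'` for Unicode strings.

```python
import numpy as np


def is_numeric_kind(column):
    """Return True if the column's dtype kind is signed integer ('i') or float ('f')."""
    # A tuple makes the membership test explicit, unlike the substring test `in 'if'`.
    return np.asarray(column).dtype.kind in ("i", "f")


print(np.array([1, 2, 3]).dtype.kind)   # i
print(np.array([1.5, 2.5]).dtype.kind)  # f
print(np.array(["a", "b"]).dtype.kind)  # U
print(is_numeric_kind(np.array([1, 2], dtype=np.uint8)))  # False
```

One consequence worth noting: unsigned integer columns have kind `'u'`, so a check limited to `'i'` and `'f'` would not treat them as numeric.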

Comment on lines 519 to 521
except ValueError:
    warnings.warn(f"Could not parse numeric filter '{vals}' for column '{colname}'.", InputWarning)
    continue
Member

why not allow the exception to be raised here?

Contributor Author

Agreed that this should just raise an exception.
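The difference between the two approaches can be sketched with the standard library alone. Both function names below are hypothetical, and `UserWarning` stands in for the `InputWarning` seen in the snippet under review. The lenient version warns and signals "skip this filter", while the strict version lets the failure surface with the offending input named.

```python
import warnings


def parse_threshold_lenient(text):
    """Warn and return None when `text` is not numeric (the pattern questioned above)."""
    try:
        return float(text)
    except ValueError:
        warnings.warn(f"Could not parse numeric filter '{text}'.", UserWarning)
        return None


def parse_threshold_strict(text):
    """Raise ValueError when `text` is not numeric, keeping the original error as context."""
    try:
        return float(text)
    except ValueError as exc:
        raise ValueError(f"Could not parse numeric filter '{text}'.") from exc


print(parse_threshold_strict("500"))  # 500.0
```

With the lenient version, a caller who ignores warnings gets an unfiltered or partially filtered result without noticing; with the strict version, the call fails at the point of the bad input.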

Comment on lines 600 to 602
if colname not in products.colnames:
    warnings.warn(f"Column '{colname}' not found in product table.", InputWarning)
    continue
Member

should this be an exception instead?

Contributor Author

My reasoning here was that the schema of the product table returned by our API may change in the future. Only issuing a warning would prevent a user's code from breaking unexpectedly, but I suppose the output of the function would not be what the user expects either. This is also the precedent set by the MastMissions class.

Member

Yep, but if the API changes, then either their code will have to change anyway, or the astroquery module will have to change, in which case they have to update their astroquery version to keep the same user code.

Contributor Author

I see where you're coming from! It probably would be better to fully alert the user if the column they have been filtering on no longer exists. The latest commit raises an error in both Observations and MastMissions if a nonexistent column filter is passed.
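The agreed behaviour, failing loudly on an unknown column, might look roughly like the sketch below. This is an assumption-laden illustration rather than the merged code: `validate_filter_columns` is a hypothetical helper, and `ValueError` is a placeholder for whichever exception type the library actually raises.

```python
def validate_filter_columns(colnames, filters):
    """Raise if any requested filter names a column that is absent from the table."""
    missing = [name for name in filters if name not in colnames]
    if missing:
        # Report every unknown column at once so the user can fix them in one pass.
        raise ValueError(f"Column(s) not found in product table: {', '.join(missing)}")


colnames = ["size", "extension", "filename"]
validate_filter_columns(colnames, {"size": ">=500", "extension": "fits"})  # passes silently
```

Validating all filter names before applying any of them means a misspelled or removed column stops the call before a misleading result is returned.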

Comment on lines 374 to 376
# Filter by extension
filtered = mast.MastMissions.filter_products(products,
                                             extension='fits')
Member

Nitpick, but we allow line lengths of up to 120 characters, so there is really no need for the line break here or below.
