-
-
Notifications
You must be signed in to change notification settings - Fork 424
Refactor WFAU image retrieval to reject deprecated images #2809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
bsipocz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some comments, but overall I turned this to a draft and plan to come back once it's finished, CI passes and it's out of the draft status
astroquery/wfau/core.py
Outdated
| image_width=1 * u.arcmin, image_height=None, | ||
| radius=None, database=None, | ||
| programme_id=None, get_query_payload=False): | ||
| def get_image_list(self, coordinates, *, radius=None, ignore_deprecated=True, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spell out kwargs, we try to get rid of this way of hiding them so we should not introduce more of these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, ok. That's a lot of duplicated text, which I don't love, but fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, but if it's public API, one would need useful docstrings, too. dumping kwarg out there is not exactly useful. OTOH, if it's an option of not getting slightly overlapping public API then certainly I would be in favour of not duplicating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_image_list is more of an internal-pointing method. It is technically usable, and useful, to users, but get_image_table is more useful and I prefer to point people there. I'd actually lean toward making _get_image_list private before duplicating its docstring and keyword arguments. Let me know which you prefer
astroquery/wfau/core.py
Outdated
| self._penultimate_response = response | ||
| response = self._check_page(response.url, "row") | ||
| self._last_response = response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this really necessary?
Also, try to avoid overwriting variables to help future debugging, the second response could be response_row, or whatever makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, this was for debugging, I'll remove it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the _last_response addition has historically been very useful in debugging other tools, so I prefer to keep it, but I agree about the variable renaming
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeap, keeping _last_response is fine, I really just meant to spot _penultimate_response ;) and the rename.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair enough, I could drop that. But when else do you get to use penultimate in the real world?!
|
Re "solves" - it's not a "magic" word, won't auto-close the issue. Please use |
|
local tests passing: pytest astroquery/ukidss -x --pdb --remote-data
astroquery/ukidss/tests/test_ukidss.py ........
astroquery/ukidss/tests/test_ukidss_remote.py ........ |
astroquery/wfau/core.py
Outdated
| for row in html_in.split("\n")]) | ||
| return ascii.read(html, format='html') | ||
|
|
||
| def extract_urls(self, html_in): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I commented this before in another module: I think methods like this should not be class methods, definitely not public ones, but instead private functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I agree - happy to make this, and get_image_list, both private
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can't make get_image_list private, at least not without a deprecation period.
astroquery/wfau/core.py
Outdated
| self._penultimate_response = response | ||
| response = self._check_page(response.url, "row") | ||
| self._last_response = response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeap, keeping _last_response is fine, I really just meant to spot _penultimate_response ;) and the rename.
|
The last failure I see here is in cadc, not wfau - is this real? |
Opened #2811 for it. It looks consistent in CI, but I cannot yet reproduce it locally. |
bsipocz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This totally slipped through while I was prepping for the release.
I wonder if we could triage out these drafts and see how much they would require to get merged; or to get closed.
| try: | ||
| from bs4 import BeautifulSoup, XMLParsedAsHTMLWarning | ||
| except ImportError: | ||
| # workaround: older versions of bs4, which we still support, didn't have this warning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how old bs4 we are talking about? We could certainly bump the minimum version if it's not something very recent (e.g. I would say within the last 2-2.5 years)
This refactor also adds a new method to retrieve the very useful metadata tables rather than just scrape them for download URLs.
Solves #2808