-
Notifications
You must be signed in to change notification settings - Fork 137
Eula error handling #1166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eula error handling #1166
Conversation
|
I will automatically update this comment whenever this PR is modified
|
|
User ana-sher does not have permission to run integration tests. A maintainer must perform a security review of the code changes in this pull request and re-run the failed integration tests jobs, if the code is deemed safe. |
chuckwondo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ana-sher for picking this up!
earthaccess/exceptions.py
Outdated
| pass | ||
|
|
||
|
|
||
| class EulaException(Exception): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@betolink, what do you think about making this a subclass of DownloadFailure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe? one thing I forgot is that this error will also arise when we open the file with .open() would that be semactically download error too? @chuckwondo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say that it would be semantically the same. When opening, we still end up attempting to "download" data. The only difference is how much data is being downloaded (either all of it, or some of it).
In the case of open, a user likely won't care or notice whether or not EulaException is a DownloadFailure, but in the case of download, it might be nice to be able to catch DownloadFailure and have that also catch EulaException.
Also, now that I think about it, I suggest we rename EulaException to EulaNotAccepted to make it a bit more clear (note also the intentional lack of the suffix Error or Exception).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are having same issue on .open(), will try to reproduce and add test on that as well, thank you @betolink.
Might need to intercept and raise in _open_urls_https or _open_files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
testing:
when calling earthaccess.open for
"S5P_L2__CH4____HiR"file-like objects were created (maybe further xarray processing would give error)."sentinel-1c_raw"401 returned that was wrapped inFileNotFoundErrorfor the file byfsspec. By opening that file url in browser - automatically redirected me to accept eula, so giving data url if fsspec returnsFileNotFoundErrormight be a good idea. But even fsspec returns it:
raise FileNotFoundError(url) from exc
E FileNotFoundError: https://datapool.asf.alaska.edu/RAW/SC/S1C_IW_RAW__0SDV_20250328T182304_20250328T182337_001645_002A8D_2E72.zip
C:\Users\dmitr\miniconda3\envs\earthaccess\Lib\site-packages\fsspec\implementations\http.py:451: FileNotFoundError
"ALOS2_L1_PSR2"(also has eula 403 ondownload) file-like objects were created without error."ERS-1_L1"-FileNotFoundError"ALOS_PSR_RTC_LOW"-FileNotFoundError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggesting to move earthaccess.open EULA error processing to another PR since we are not sure yet how to infer it from returned FileNotFoundError
earthaccess/store.py
Outdated
| if r.status_code in [401, 403]: | ||
| text = (r.text or "").lower() | ||
| if "eula" in text: | ||
| raise EulaException( | ||
| "You must accept the EULA to download this data." | ||
| ) | ||
|
|
||
| r.raise_for_status() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the response text contains "eula" (ignoring case), we should use that text as the error message, rather than making up our own message.
Also, if the response is not successful, but the text does not include "eula", we want to raise a general DownloadFailure (which seems to have been accidentally left out as part of the previous PR that added the DownloadFailure exception class -- i.e., the class was added, but wasn't used here, as it should have been).
| if r.status_code in [401, 403]: | |
| text = (r.text or "").lower() | |
| if "eula" in text: | |
| raise EulaException( | |
| "You must accept the EULA to download this data." | |
| ) | |
| r.raise_for_status() | |
| if not r: | |
| if "eula" in (text := r.text or "").lower() | |
| raise EulaException(text) | |
| raise DownloadFailure(text) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to include full html page that was returned in text to exception message? I could strip it down to the closest ">" message "<" if we are sure it's always HTML.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right. I forgot it's returning HTML.
I don't think it's worth trying to parse out the message at runtime. Can you reproduce the response and just manually copy the text of the message and then paste it into the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about logging the content and telling the users that the full response is in the error logs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm leaning towards not logging the full response, because it is a bunch of HTML, and would likely be annoying and probably too verbose and messy for a user to bother to read to find the error message.
If we can reproduce the HTML response without much effort, I'd say we should do that (like you did during our hack session the other day), find the error message in the HTML response, and copy the error message to put into the code. We may want to slightly adjust the wording, and include the data URL in the message too, so the user knows exactly what failed to be downloaded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was reproducing this issue again and got 403 instead of 401 with json, not sure why. Was testing for S5P_L2__CH4____HiR (same collection as for hackdays) and same credentials. Also tested for sentinel-1c_raw and ALOS2_L1_PSR2 - same json with eula error was returned. For ALOS_PSR_RTC_LOW and ERS-1_L1 got 401 with text HTTP Basic: Access denied, no EULA mentioned...


I can check content-type in headers and output error_description + resolution_url + data url if json. If not - fallback to "Eula Acceptance Failure" + data url?
tests/unit/test_store.py
Outdated
| ) | ||
| store = Store(self.auth) | ||
| # test | ||
| with self.assertRaises(EulaException): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| with self.assertRaises(EulaException): | |
| with self.assertRaises(EulaException, msg=response): |
Co-authored-by: Chuck Daniels <[email protected]>
|
User ana-sher does not have permission to run integration tests. A maintainer must perform a security review of the code changes in this pull request and re-run the failed integration tests jobs, if the code is deemed safe. |
|
@ana-sher, looks like pre-commit failed on one of my suggestions: https://results.pre-commit.ci/run/github/399867529/1765488551.Ty6xULlSTtagswoCnbUJyQ |
|
User ana-sher does not have permission to run integration tests. A maintainer must perform a security review of the code changes in this pull request and re-run the failed integration tests jobs, if the code is deemed safe. |
1 similar comment
|
User ana-sher does not have permission to run integration tests. A maintainer must perform a security review of the code changes in this pull request and re-run the failed integration tests jobs, if the code is deemed safe. |
|
User ana-sher does not have permission to run integration tests. A maintainer must perform a security review of the code changes in this pull request and re-run the failed integration tests jobs, if the code is deemed safe. |
|
User ana-sher does not have permission to run integration tests. A maintainer must perform a security review of the code changes in this pull request and re-run the failed integration tests jobs, if the code is deemed safe. |
chuckwondo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Thanks @ana-sher!
betolink
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! thanks for the PR @ana-sher!
Resolves #36
Contains:
response.text, with html that includes "eula" error inside)_download_file, because if catching error further it won't containtextorcontent, will only raise error for given status code.Pull Request (PR) draft checklist - click to expand
contributing documentation
before getting started.
title such as "Add testing details to the contributor section of the README".
Example PRs: #763
example
closes #1. SeeGitHub docs - Linking a pull request to an issue.
CHANGELOG.mdwith details about your change in a section titled## Unreleased. If such a section does not exist, please create one. FollowCommon Changelog for your additions.
Example PRs: #763
README.mdwith details of changes to theearthaccess interface, if any. Consider new environment variables, function names,
decorators, etc.
Click the "Ready for review" button at the bottom of the "Conversation" tab in GitHub
once these requirements are fulfilled. Don't worry if you see any test failures in
GitHub at this point!
Pull Request (PR) merge checklist - click to expand
Please do your best to complete these requirements! If you need help with any of these
requirements, you can ping the
@nsidc/earthaccess-supportteam in a comment and wewill help you out!
Request containing "pre-commit.ci autofix" to automate this.
📚 Documentation preview 📚: https://earthaccess--1166.org.readthedocs.build/en/1166/