-
Notifications
You must be signed in to change notification settings - Fork 0
feat: Pythonize Landsat Tile workflow #13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
|
||
| def test_create_manifest(mock_binaries: Path, mock_config: EnvConfig) -> None: | ||
| """Test surface reflectance manifest creation.""" | ||
| output_basename = f"HLS.L30.T{MGRS}.2020001T101010.v2.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be configured (i.e., v2.0) as a parameter instead of hard coded for future HLS v3 productions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 this could be refactored, though I don't think it'll have benefit for a v3 since I would hope we can delete a lot of these tasks as our overall workflow becomes simpler (e.g., avoiding having like 7 format translations from TIF -> binary -> TIF -> HDF -> HDF -> COG)
Funny enough a lot of the other HLS code is still referencing v1.5 ^_^. As with the code in those other projects, the version here doesn't really matter for the purpose of the test
| s3.create_bucket(Bucket=BUCKET_OUT) | ||
| s3.create_bucket(Bucket=BUCKET_GIBS) | ||
|
|
||
| granule_id = f"HLS.L30.T{MGRS}.2020001T101010.v2.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am curious why it is being hardcoded here for date '2020001T101010'? Is it something that can be configured using 'MOCKED_SCENE_TIME'?
Probably the docstrings of the function can be further enhanced to better explain what specific behavior is being tested (beyond just "generation").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably the docstrings of the function can be further enhanced to better explain what specific behavior is being tested (beyond just "generation").
Generally -1 to this sort of comment, especially for tests. If the test code isn't straightforward enough to serve as the source of truth, then it's more likely that variable names are unclear, the scope of the test is too large, or some of the test setup could be refactored.
Writing a long paragraph explaining what the code does is not a great use of time. Especially over time as the code changes it's common for such lengthy documentation to become out of date, making it misleading and worse than if it didn't exist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am curious why it is being hardcoded here for date '2020001T101010'? Is it something that can be configured using 'MOCKED_SCENE_TIME'?
We could be using MOCKED_SCENE_TIME here, but it doesn't really matter for this test.
The bigger problem is that this has a hard coded date instead of using the date value from mock_config. I'll fix that
| assert cfg.working_dir.exists() | ||
|
|
||
|
|
||
| def test_download_pathrows(mock_config: EnvConfig, mock_aws_s3: S3Client) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be beneficial to have an error handling for this function to test handling of missing input files or server errors, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
error handling for this function
Assuming you mean for the DownloadPathRows task, +1 this is a good idea! One caveat is right now there is no error condition in the task itself for something like missing input files.
In the original shell script there was no error checking for the equivalent step. We would just download whatever we found, and any errors associated with incomplete outputs would rise up later in the pipeline. This sort of error would be confusing, since the actual issue happened in previous steps
I think we might be able to check if we have all the expected data and raise an error at the appropriate time (during the download task)
|
|
||
|
|
||
| def process_path_rows(mock_binaries: Path, mock_config: EnvConfig) -> None: | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docstrings could be further improved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there something specific that is unclear about the code or the single line docstring? Typically tests are self explanatory and don't require extensive docstrings
If anything this test is missing the test_ prefix to the function, so pytest probably isn't recognizing it
What I am changing
Adds workflow for Landsat tile
How I did it
$@to pytest from test scriptHow you can test it