improved retrievals and adaptions to next pyslk release#48
improved retrievals and adaptions to next pyslk release#48observingClouds merged 57 commits intoobservingClouds:mainfrom neumannd:main
Conversation
|
@observingClouds Can I prevent somehow that tests are generated automatically for all pyslk functions? Setting up all SlkMock functions properly will take a lot of time. |
|
@neumannd can you point me to some tests? I think I cannot completely follow you. You just need to mock those functions that are actually used within slkspec. |
That's the main issue I have. It is quite time consuming to set up proper mock functions for this purpose. I am aware that this would be the proper proceedure. But, I am lacking time to finished this task in December. Would you merge anyway? |
|
I have to say no here as I believe it's crucial for all code additions to pass the tests to maintain the integrity of the project. |
|
@observingClouds : The tests succeed now. @doguskbilir Updated the mock functions. Do you have any comments on the current code version? Additionally, @doguskbilir is working on setting up a |
|
Amazing! Great work! Yes, please do the |
observingClouds
left a comment
There was a problem hiding this comment.
Thanks @neumannd and @doguskbilir for this hard work! I really appreciate it. My main comment is about the complexity of _retrieve_item(). Please restructure it so that flake8 does no longer complain about C901. I think with a few extra functions and classes it will become much more readable and help us in the long run to maintain this package better.
slkspec/core.py
Outdated
| """ | ||
| {'SKIPPED': {'SKIPPED_TARGET_EXISTS': ['/arch/bm0146/k204221/iow/INDEX.txt']}, | ||
| 'FILES': {'/arch/bm0146/k204221/iow/INDEX.txt': '/home/k204221/tmp/INDEX.txt'}} | ||
|
|
||
| # dry run | ||
| {'ENVISAGED': {'ENVISAGED': ['/arch/bm0146/k204221/iow/INDEX.txt']}, | ||
| 'FILES': {'/arch/bm0146/k204221/iow/INDEX.txt': '/home/k204221/tmp/abcdef2/INDEX.txt'}} | ||
|
|
||
| # after successful retrieval | ||
| {'ENVISAGED': {'ENVISAGED': []}, 'FILES': {'/arch/bm0146/k204221/iow/INDEX.txt': | ||
| '/home/k204221/tmp/INDEX.txt'}, 'SUCCESS': {'SUCCESS': ['/arch/bm0146/k204221/iow/INDEX.txt']}} | ||
|
|
||
| {'FAILED': {'FAILED_NOT_CACHED': ['/arch/bm0146/k204221/iow/iow_data5_001.tar']}, | ||
| 'FILES': {'/arch/bm0146/k204221/iow/iow_data5_001.tar': '/home/k204221/tmp/iow_data5_001.tar'}} |
There was a problem hiding this comment.
This could go into the docstring of a new function?!
There was a problem hiding this comment.
This docstring contains possible output of the function call four lines above. The following lines contain multiple if-clauses to react on the output. Therefore, I think it is reasonable to please this docstring here and not into the docstring of the function's header.
|
@observingClouds The tests do not start running. Any idea what causes this? |
|
@observingClouds There is are for loops in the code in which recalls are started and their status is checked. E.g. one of the for loops runs as long as there are unfinished recalls. Takes some time to implement this. Would it be possible to skip the tests for now and add them later? I know that not the ideal way but at some point there should be a working slkspec again. |
|
Wouldn't that whole recall/retrieve loop logic be something that would have a good place in pyslk, so any client can use it? As a client, I could simply call |
|
@florianziemen In principle, I agree with you. But ... ;-) . This approach is very inefficient when it is done for one or two files. Then, it is better to run the classical retrieval command (or something else). This approach is very useful when the users want to retrieve many files at once. For the latter purpose we have already a nice script setup with automatic submission of SLURM jobs (https://docs.dkrz.de/doc/datastorage/hsm/retrievals.html#retrieve-more-than-a-handful-of-files). Additionally, the approach taken here is pseudo-parallelized and hard to debug if something failes. Providing it to the broad community for individual usage might be a bit risky because nobody understands what is going on. We would need to invest much more time to make it mass-compatible. Therefore, I do not think that it was very useful in pyslk -- even counter-productive if users start using it for getting one file after each other. However, if this was set up from the scratch with a central thread running (server-like) then this approach might improve the situation of the users. The thread could "accept" newly requested files, put single requests together and organized recalls/retrievals. This approach would match the usage pattern in Python when users work with catalogs. I think that this is too much work in the moment -- particularly with a HSM proxy system at the horizon. |
|
Hi @neumannd, |
|
@observingClouds The tests of Dogus' PR finished. For this PR, the tests already failed on Friday. I think there is an infinite loop when I run the core functionality with the pyslk Mock functions. At least it looks like this when I run the tests locally. |
This does not sound like we'd want to have it in a package like slkspec that @observingClouds provides to everybody without resources for further support. |
|
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files☔ View full report in Codecov by Sentry. |
|
@observingClouds Mocks updated. All current tests succeed. On paper, I have prepared two additional tests which would automatically test the new code part in more detail. These are tests which I ran manually. I need to add these -- but not this and not the next week. I'd prefer if this PR could be merged already before. |
|
@observingClouds What is this codecov thing? Do I need to care about this? |
@florianziemen @observingClouds I formulated it unclear. The process of getting data from tape itself is hard to debug if a tools does the recall and retrieval for the user in a black box. I don't mean that my code in particular has this issue. After a meeting with Flo, we found two bugs which I will fix today or tomorrow. There is one general thing to discuss: If a file cannot be recalled or retrieved, should an exception be thrown directly? Or is it OK to try to get all files first, collect "bad files" (recall or retrieval failed) and announce this in the end. Currently, I do the letter. Additionally, I write a list of all failed recalls and retrievals into one file and provide the path to the file in the error message. I think that it reasonable because then the user directly knows about all tapes/files which are problematic. @observingClouds Would you agree with this procedure? |
|
Open issues add files on tapes with errorstate to list of failed recalls and proper exception in the end of _retrieve_item() fixed. Tests succeeded locally ... . |
|
@observingClouds Tests succeed. Code is running also in real use cases. Do you have any more comments? |
|
@observingClouds @florianziemen |
slkspec/core.py
Outdated
| return self._file | ||
| return self._url | ||
|
|
||
| # flake8: noqa: C901 |
There was a problem hiding this comment.
@neumannd can we remove this now that the function got a bit simplified?
|
@observingClouds Is there anything which needs to be adapted from your point of view? |
|
@neumannd, my only remaining comment is to get rid of |
|
@observingClouds Removed it and updated the code. |
|
@observingClouds adapted CHANGELOG.md |
|
It is done!! 🎉 Thanks for being patient with all my comments! |
|
If you like I can immediately do the |
@observingClouds Sounds good! |
pyslkversion 2.x.y