Skip to content

Prevent task failures by checking for missing SSH keys before launch on Userkey Data Providers#1598

Closed
RafsanNeloy wants to merge 11 commits intoaces:devfrom
RafsanNeloy:SSH
Closed

Prevent task failures by checking for missing SSH keys before launch on Userkey Data Providers#1598
RafsanNeloy wants to merge 11 commits intoaces:devfrom
RafsanNeloy:SSH

Conversation

@RafsanNeloy
Copy link
Contributor

Currently, tasks using UserkeyFlatDirSshDataProvider fail on the cluster with a cryptic RuntimeError if the user hasn't pushed their SSH key locally. This PR adds early validation in TasksController : it shows a clear warning on the task form (directing users to the SSH Panel in "My Account") and adds a hard block on submission if the key is still missing, preventing wasted resources and blocked synchronization.
Closes #1562

@RafsanNeloy
Copy link
Contributor Author

@natacha-beck as per your comment (#1597 (comment)) I have made a PR in dev branch.

@MontrealSergiy
Copy link
Contributor

thanks, please rebase it on dev, to make review and merge easier

@RafsanNeloy
Copy link
Contributor Author

@MontrealSergiy can you please suggest me how to rebase it. I saw that dev branch is not synced with master

@MontrealSergiy
Copy link
Contributor

MontrealSergiy commented Mar 5, 2026

Yes the problem is that you PR contain commits from master which are not yet in dev.

If you are not familiar with git, I think the easiest workaround is to 1) branch out the feature branch from dev 2) copy paste your new code and 3) forcibly overwrite the present branch.

However there is a chance that some error happens during the copying, so the proper recommended way is to rebase.

If you feel up to the challenge check a good tutorial on git

git rebase might be not most straightforward command. Therefore many developers cheat and use cherry-picking, patching and/or GUI tools instead. I believe RubyMine IDE might allow one to drop commits in GUI though I never tried it myself.

UPDATED: I am sorry I just tried RubyMine commit delete, unfortunately it only work in only in the simpler cases, do not use it

@MontrealSergiy
Copy link
Contributor

MontrealSergiy commented Mar 5, 2026

Sorry do not use RubyMine to drop commits from history log. Unfortunately GUI commits drops do not always work well. Otherwise most of times this IDE is friendly and reliable

@MontrealSergiy
Copy link
Contributor

I wonder did you test your PR?

@MontrealSergiy
Copy link
Contributor

MontrealSergiy commented Mar 5, 2026

BTW I synced the dev to master. Unfortunately it does not affect the pr change list on github immediately, perhaps, due to some caching on github side. Hopefully you would not need to rebase

@RafsanNeloy
Copy link
Contributor Author

I wonder did you test your PR?

Our SSH key check at line 275 is never reached, as I'm testing in local environment. owner.ssh_key(:ok_no_files => true).valid? , returns false when keys are missing.

I tested the fix locally by creating a UserkeyFlatDirSshDataProvider
and temporarily removing the owner's SSH key files. When launching a task with that file, the portal now correctly intercepts the request and shows the warning message shown in the attached screenshot, redirecting the user to their account page.

I don't know any other way to actually test it with server.
IMG-20260306-WA0005

@prioux
Copy link
Member

prioux commented Mar 6, 2026

It's a good feature but there is a problem with it. A user's SSH key is never missing on a portal, so the check will never report that the key is not available. What we need is a check on whether or not the owner of the data provider has pushed their key to a Bourreau, which is typically a distinct, remote system.

Currently, the way CBRAIN records whether or not a key has been pushed is through a clumsy MetaData registry (to check on bourreau 1234, we fetch a date from user.meta[:ssh_key_install_date_1234] ). I call it clumsy because instead of using the meta data directly, we should have a set of OO methods to check the status of the keys, e.g. maybe something like User.is_ssh_key_installed?( bourreau ) or the reverse Bourreau.has_user_ssh_key?( user ) . These could currently be implemented to check the meta data store, and then one day re-implemented when we have a better way of recording the status of the key (the meta data store mechanism has always been just a quick temporary solution).

@prioux
Copy link
Member

prioux commented Mar 6, 2026

I think in fact, we should have a more general mechanism to verify that a given file on a given data provider will be able to be fetched from a bourreau, and the verification mechanism would provide an explanation if not. That way we could make checks not just for user SSH keys, but for other situations, such as a DP that is not allowed to be synced to a particular bourreau.

So I think maybe what we would need is new general methods on the DPs that can be overloaded in subclasses, called something like:

error_message = dp.can_read_from_userfile(userfile, bourreau, user)  # not sure about args and method name or even receiver

in the case of a UserKeySshDataProvider, the method would call super and also check that the user's SSH key was pushed on the bourreau.

Signed-off-by: rafsanneloy <rafsanneloy@gmail.com>
@RafsanNeloy
Copy link
Contributor Author

Implemented the userfile_syncing_issue hook to prevent task failures by preemptively checking if a user's SSH key is pushed to the remote Bourreau before launch. If it's missing, it now gracefully blocks the task and alerts the user.

@prioux
Copy link
Member

prioux commented Mar 9, 2026

I'm closing this PR as it contains way too much code for what it should be.

I made the changes myself that implements the feature. See my three commits here:

  1. Refactor all the code associated with user SSH keys, so that everything is now neatly packaged in three methods all in the same file: 2b87eb4

  2. Modify the Userkey SSH data provider to extend the behavior of the rr_allow_syncing? method: 2d2fd96

  3. Add the check and the message in the tasks controller. It's really clumsy, but it will be good enough for the moment: ddcff63

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Warn user if input's data provider is not allowed to use particular server

3 participants