
[FEATURE REQUEST]: The ability to use an external identity provider instead of the MongoDB database #1

@jpellman

Hi nextPYP Team,

I'm a systems administrator at an institution (NYSBC/SEMC) that is interested in using nextPYP with some regularity. We recently performed a test installation of nextPYP and came across a few architectural decisions that make it particularly unwieldy from an operational perspective.

One particular issue that we have with the software design is that SLURM jobs run as a service account instead of as the user who actually submitted the job. The main problem with this setup is that SLURM comes with an accounting system that we can use to gather statistics about usage. This accounting system isn't just used for reporting purposes, however; the database where job metrics are kept can also influence scheduling decisions (via fairshare) and resource limits. When all users run jobs as the same service account, usage from all users is blurred together, which makes it impossible to use SLURM's accounting functionality in any meaningful way.

At a small institution such as NYSBC, not being able to fully leverage the accounting functionality would not be too big of a deal, but at larger institutions (universities, the NIH, etc.) this would likely be a bit of a showstopper. For some individual labs within a larger university, a potential workaround could be to use an individual grad student's or postdoc's account as the service account, but this would come with the caveat that that individual's fairshare factor would be affected and their own jobs would be unjustly deprioritized by the activities of their colleagues.

Ideally, instead of using a service account, researchers would be able to log in as themselves using the same identity provider as the SLURM cluster. In this way, authentication (which currently seems to be handled by MongoDB entries) would be delegated to the identity provider and decoupled from authorization. One could, for instance, use LDAP as an identity provider and store authorization information (i.e., which identities can access which projects) in the MongoDB database.
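
To make the split concrete, here is a rough Python sketch of what I have in mind. It is only an illustration, not nextPYP code: `ProjectPermissions`, `open_project`, and the `authenticate` callable are all hypothetical names, and the in-memory dictionary stands in for whatever authorization records would actually live in MongoDB.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set

# Hypothetical: a callable that asks the external identity provider
# (LDAP, PAM, ...) whether a username/password pair is valid.
Authenticate = Callable[[str, str], bool]


@dataclass
class ProjectPermissions:
    """Stand-in for authorization records that would be stored in MongoDB."""

    # Maps a project name to the set of usernames allowed to access it.
    members: Dict[str, Set[str]] = field(default_factory=dict)

    def can_access(self, username: str, project: str) -> bool:
        return username in self.members.get(project, set())


def open_project(
    username: str,
    password: str,
    project: str,
    authenticate: Authenticate,
    permissions: ProjectPermissions,
) -> str:
    """Return the identity a job for this project should be submitted as."""
    # Step 1: authentication is delegated to the identity provider.
    if not authenticate(username, password):
        raise PermissionError("authentication failed")
    # Step 2: authorization is an application-side lookup keyed on that identity.
    if not permissions.can_access(username, project):
        raise PermissionError(f"{username} may not access {project}")
    return username
```

The point is that the application never stores or checks passwords itself; it only maps an already-verified identity to projects, and that same identity is what SLURM would see.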

A concrete example of this can be seen in how JupyterHub currently handles authentication (see here), where there are multiple authenticators that can be used with different identity providers. Multiple authenticator classes are probably overkill for nextPYP, but at the same time I feel like it would be useful to at least be able to offload authentication to PAM.
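
As a very rough sketch of the pluggable approach (loosely inspired by the JupyterHub idea, but not its actual API), something like the following would be enough. Every name here (`Authenticator`, `StaticAuthenticator`, `make_authenticator`, the `"static"` key) is hypothetical, and the only backend shown is an in-memory one for testing; a PAM backend would be another subclass that wraps a PAM binding, which I have left out.

```python
import hmac
from abc import ABC, abstractmethod
from typing import Dict, Optional, Type


class Authenticator(ABC):
    """Hypothetical interface that every authentication backend implements."""

    @abstractmethod
    def authenticate(self, username: str, password: str) -> Optional[str]:
        """Return the canonical username on success, or None on failure."""


class StaticAuthenticator(Authenticator):
    """Test-only backend that checks an in-memory password table."""

    def __init__(self, passwords: Dict[str, str]):
        self._passwords = passwords

    def authenticate(self, username: str, password: str) -> Optional[str]:
        expected = self._passwords.get(username)
        if expected is None:
            return None
        # Constant-time comparison to avoid leaking password contents via timing.
        if hmac.compare_digest(expected.encode(), password.encode()):
            return username
        return None


# Registry keyed by the name an administrator would put in a config file.
# A PAM- or LDAP-backed class would be registered here in the same way.
AUTHENTICATORS: Dict[str, Type[Authenticator]] = {
    "static": StaticAuthenticator,
}


def make_authenticator(name: str, **kwargs) -> Authenticator:
    """Instantiate the backend selected in the configuration."""
    try:
        cls = AUTHENTICATORS[name]
    except KeyError:
        raise ValueError(f"unknown authenticator: {name!r}") from None
    return cls(**kwargs)
```

With this shape, switching identity providers is a configuration change rather than a code change, and a single PAM backend would already cover most sites.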

I don't know how difficult what I described would be to implement on your end, but would be happy to chat some more if you find this feedback useful (or alternatively, if you would like some clarification about what I'm describing / asking for).

--John
