
Issue in cron when same user has multiple sis_names #1559

@jonespm

Description

Describe the bug (Tell us what happens instead of the expected behavior) :

Unizin noted that when running their cron, some user_ids can come back with multiple sis_names set. This could affect application behavior and could open up impersonation possibilities. I didn't see anything like this on our local instance, but that doesn't mean it can't happen. This behavior was introduced in the UDP switch in #1443 because Kaltura only stores the email address and not other identifiers, and the UDP isn't storing non-roster user info.

So for example these users

user_id            sis_name                course_id        enrollment_type
10100000001234567  matt.jones@gmail.com    101000000000001  StudentEnrollment
10160000001234567  matt.jones@example.edu  101000000000001  StudentEnrollment

are both changed by the cron to be this:

user_id            sis_name    course_id        enrollment_type
10100000001234567  matt.jones  101000000000001  StudentEnrollment
10160000001234567  matt.jones  101000000000001  StudentEnrollment
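To make the collision concrete, here is a minimal Python sketch. It assumes the cron effectively drops everything from the "@" onward, which is what the example rows suggest; `strip_domain` and `find_collisions` are hypothetical helper names, not functions from the codebase.

```python
from collections import defaultdict

# The two example rows from this issue: (user_id, sis_name, course_id, enrollment_type)
rows = [
    (10100000001234567, "matt.jones@gmail.com", 101000000000001, "StudentEnrollment"),
    (10160000001234567, "matt.jones@example.edu", 101000000000001, "StudentEnrollment"),
]


def strip_domain(sis_name: str) -> str:
    """Assumed cron behavior: keep only the part before the first '@'."""
    return sis_name.split("@", 1)[0]


def find_collisions(rows):
    """Return stripped sis_names that end up shared by more than one user_id."""
    users_by_name = defaultdict(set)
    for user_id, sis_name, _course_id, _enrollment_type in rows:
        users_by_name[strip_domain(sis_name)].add(user_id)
    return {
        name: sorted(user_ids)
        for name, user_ids in users_by_name.items()
        if len(user_ids) > 1
    }


collisions = find_collisions(rows)
print(collisions)
```

A check like this could be run against the cron output to flag any sis_name that maps to more than one user_id.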

Duplicate rows were also noted in this table somehow. Rows in this table really should be distinct across all 4 columns.
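As a rough illustration of the distinctness requirement, the sketch below drops rows that repeat across all four columns. The tuple layout and the `distinct_rows` helper are assumptions for illustration; in the real cron this would more likely be a `select distinct` or a unique constraint on the table.

```python
def distinct_rows(rows):
    """Drop exact duplicates across all four columns, keeping first-seen order."""
    # dict keys are unique and preserve insertion order
    return list(dict.fromkeys(rows))


# Hypothetical rows: (user_id, sis_name, course_id, enrollment_type)
rows = [
    (10100000001234567, "matt.jones", 101000000000001, "StudentEnrollment"),
    (10100000001234567, "matt.jones", 101000000000001, "StudentEnrollment"),
    (10160000001234567, "matt.jones", 101000000000001, "StudentEnrollment"),
]

unique = distinct_rows(rows)
has_duplicates = len(unique) != len(rows)
print(len(rows), len(unique), has_duplicates)
```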

So I think we need a better fix here. I believe the best solution might be to store both the sis_name "as-is" and the email_address "as-is" rather than trying to manipulate it in the query to support Kaltura. Then in the code we can handle the special case for Kaltura if one of the values is blank.
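A minimal sketch of what the lookup side of that proposal could look like, assuming both values are stored unmodified and either may be blank. `UserRecord` and `find_user` are hypothetical names, and raising on an ambiguous match is one possible design choice rather than existing behavior.

```python
from typing import NamedTuple, Optional, Sequence


class UserRecord(NamedTuple):
    user_id: int
    sis_name: Optional[str]       # stored as-is, may be blank
    email_address: Optional[str]  # stored as-is, may be blank


def find_user(records: Sequence[UserRecord], identifier: str) -> Optional[UserRecord]:
    """Match an identifier against sis_name or email_address, case-insensitively.

    Blank stored values never match. If the identifier resolves to more than
    one user_id, raise instead of guessing, since a wrong match could act as
    the wrong user.
    """
    wanted = identifier.strip().lower()
    if not wanted:
        return None
    matches = [
        r for r in records
        if wanted in ((r.sis_name or "").lower(), (r.email_address or "").lower())
    ]
    if len({r.user_id for r in matches}) > 1:
        raise ValueError(f"identifier {identifier!r} matches more than one user_id")
    return matches[0] if matches else None


records = [
    UserRecord(1, "jonespm", "jonespm@example.edu"),
    UserRecord(2, None, "jonespm@gmail.com"),  # no sis_name, email only
]
print(find_user(records, "jonespm"))
print(find_user(records, "JONESPM@gmail.com"))
```

Failing loudly on ambiguity keeps the impersonation concern visible rather than silently picking one of the candidates.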

Kaltura will either have
{"id": "https://aakaf.mivideo.it.umich.edu/caliper/info/user/jonespm", "type": "Person", "name": "Matthew Jones", "dateCreated": "2023-11-30T22:03:05.000Z", "dateModified": "2023-12-12T15:29:38.000Z"}
Or
{"id": "https://aakaf.mivideo.it.umich.edu/caliper/info/user/jonespm+gmail.com", "type": "Person", "name": "Matthew Jones", "dateCreated": "2023-11-30T22:03:05.000Z", "dateModified": "2023-12-12T15:29:38.000Z"}
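For reference, the two identifier shapes above can be pulled out of the actor `id` with something like the following sketch. `kaltura_user_key` and `key_to_email` are hypothetical helpers, and treating the last "+" as a stand-in for "@" is an assumption based on the second example, not something confirmed in this issue.

```python
import json
from typing import Optional
from urllib.parse import urlparse


def kaltura_user_key(actor_json: str) -> str:
    """Return the last path segment of the Caliper actor id."""
    actor = json.loads(actor_json)
    path = urlparse(actor["id"]).path
    return path.rstrip("/").rsplit("/", 1)[-1]


def key_to_email(key: str) -> Optional[str]:
    """ASSUMPTION: a key like 'user+domain' encodes 'user@domain'.

    Returns None for plain usernames with no '+'. The last '+' is used as
    the separator because a domain cannot contain '+'.
    """
    local, sep, domain = key.rpartition("+")
    if not sep or not local or not domain:
        return None
    return f"{local}@{domain}"


plain = '{"id": "https://aakaf.mivideo.it.umich.edu/caliper/info/user/jonespm", "type": "Person"}'
with_domain = '{"id": "https://aakaf.mivideo.it.umich.edu/caliper/info/user/jonespm+gmail.com", "type": "Person"}'

for payload in (plain, with_domain):
    key = kaltura_user_key(payload)
    print(key, key_to_email(key))
```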

A temporary fix identified for this was to change the query in the cron to not strip out the email.

case
   when pe.email_address is not null then lower(pe.email_address)
   else p2.sis_ext_id end as sis_name,
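To show what that expression returns, here is a small self-contained check using an in-memory SQLite database from Python. Only the `case` expression and the `pe`/`p2` aliases come from the snippet above; the `person` and `person_email` tables, their columns, and the join are made up for illustration and do not reflect the actual UDP schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in tables; the real UDP schema is different.
conn.executescript(
    """
    create table person (person_id integer primary key, sis_ext_id text);
    create table person_email (person_id integer, email_address text);
    insert into person values (1, 'jonespm'), (2, 'mjones2');
    insert into person_email values (1, 'Matt.Jones@Example.edu');
    """
)

rows = conn.execute(
    """
    select p2.person_id,
           case
              when pe.email_address is not null then lower(pe.email_address)
              else p2.sis_ext_id end as sis_name
    from person p2
    left join person_email pe on pe.person_id = p2.person_id
    order by p2.person_id
    """
).fetchall()
conn.close()

for person_id, sis_name in rows:
    print(person_id, sis_name)
```

With this version the full lowercased email is kept when one exists, and the sis_ext_id is used only as a fallback.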

We need to fully test this and identify where this ID is used to ensure there are no regressions.

Steps to Reproduce :

  1. TBD - Will probably involve adding a user with multiple sis_names and enrolling them in a class to try this.
