TipzCM
Collaborator

@TipzCM TipzCM commented Oct 1, 2025

closes #7281

@robogary
Contributor

robogary commented Oct 1, 2025

This Pull Request has failed the formatting check

Please run mvn spotless:apply or mvn clean install -DskipTests to fix the formatting issues.

You can automate this auto-formatting process to execute on the git pre-push hook, by installing pre-commit and then calling pre-commit install --hook-type pre-push. This will cause formatting to run automatically whenever you push.

Collaborator

@tadgh tadgh left a comment

I only got about halfway through, but with what I've seen, I question the approach, due to:

  1. New version of the bulk export job.
  2. Unclear expansion semantics
  3. Change in behaviour of group export (I'm pretty sure, at least).

Probably easier to have a working session to discuss this. I may be wrong about certain aspects, but I've got enough concerns about the approach that it warrants a call.

// of fetching mdm linked patients as well as converting all of them to
// JpaPid
Set<JpaPid> resolvedAndMdmExpanded = myMdmExpandersHolder
.getBulkExportMDMResourceExpanderInstance()
Collaborator

question: How is bulk export MDM PID expansion different from normal MDM PID expansion?

Collaborator Author

This is related to the issue Luis found.

Depending on whose code merges first, it'll likely be updated there.

Internally, it's doing some check it shouldn't do. But this code was just refactored from another place, so I didn't delve into what it was doing internally.


// use those maps to get the patient ids we care about
List<JpaPid> pids =
getPatientPidsUsingSearchMaps(maps, theParams.getGroupId(), null, theParams.getRequestPartitionId());
Collaborator

thought: I'm confused here. A group export exports for all patients in the group. Why would not all patients in the group be MDM-expanded? Don't we care about all the patient IDs?

.map(pid -> (JpaPid) pid)
.toList();
return new LinkedHashSet<>(existingMembers);
}
Collaborator

thought: I don't understand the purpose of this method. The first thing it does is check if it's already expanded? Why do it at all? The caller should know whether expansion has occurred. Are we conflating MDM expansion with group expansion? It may be worthwhile to once-over the variable names to disambiguate the word.

Collaborator Author

This is kinda where we get into trouble with having shared logic.

The "expanded patient ids" are for v3.

V2 will get here and not have this, so it will continue to do the expansion just as before (per resource).

This is because "FetchIds" happens in different places.

The only other solution would be to manually put in a parameter that states what version it is and use that, or just copy-paste the code all over the place.

I'm OK with either, if you'd prefer.

Contributor

@michaelabuckley michaelabuckley Oct 9, 2025

If there are different versions of the job, we should fork the code. It is unsafe for us to think we can be backwards compatible by being tricky like this.
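
To make the forking suggestion concrete, here is a minimal sketch, assuming each job version gets its own step class. Every name in it (`ForkedFetchIdsSketch`, `FetchIdsV2`, `FetchIdsV3`, `mdmExpand`) is hypothetical and is not taken from this PR or from HAPI FHIR; it only illustrates that neither version needs an "is it already expanded?" check once the code is forked.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: class and method names are illustrative, not HAPI FHIR code.
final class ForkedFetchIdsSketch {

    /** v2 behaviour: the step performs the expansion itself, per call. */
    static final class FetchIdsV2 {
        Set<String> patientIds(List<String> members, Function<String, Set<String>> mdmExpand) {
            Set<String> result = new LinkedHashSet<>();
            for (String member : members) {
                result.add(member);
                result.addAll(mdmExpand.apply(member));
            }
            return result;
        }
    }

    /** v3 behaviour: ids arrive already expanded from an earlier step, so no expansion here. */
    static final class FetchIdsV3 {
        Set<String> patientIds(Set<String> alreadyExpanded) {
            // Defensive copy only; this class has no expansion logic at all.
            return new LinkedHashSet<>(alreadyExpanded);
        }
    }
}
```

The cost is some duplicated code; the benefit is that a change to one version cannot silently alter the other.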

public void addExpandedPatientId(PatientIdAndPidJson theId) {
getExpandedPatientIds().add(theId);
}
}
Collaborator

I don't think this needs to exist. Why not move the responsibility for expansion out of the job itself, expand before job initiation, and let the job remain as it is?

pid.setAssociatedResourceId(theFhirContext.getVersion().newIdType(getResourceId()));
return pid;
}
}
Collaborator

question: Have you checked for the existence of a similar class? This is a whole data class that just represents a tuple.

Collaborator Author

It's actually just because I needed the real resource ID as well as what's in the pid.

We store these objects in the DB in batch jobs, and I need the real resource ID to be serializable.

I could've added the value to TypedPidJson. But that object is used in other places, and putting this value on the base class would (imo) be confusing, since users in other areas might actually expect it to be populated.

The other option is to make a completely new object that is like TypedPidJson but has the actual resource ID, but... this feels 'wrong' somehow, since it duplicates a lot of what TypedPidJson already does.

And TypedPidJson is already used in over 100 places.

(Another downside of all the shared logic between the BulkExport, Reindex, and BulkModify jobs.)


return submissionCount;
}
}
Collaborator

issue (blocking): This class seems to have extracted functionality just to support reuse between v2 and v3 of this job. Also, fetchIds does wayyyyyy more than fetch IDs. The ownership of consumption has moved out of the step and into this Service, which receives a consumer.
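
One way to read the ownership concern, as a rough sketch only: the fetching code returns batches of IDs and nothing else, while the step decides what to do with each batch and keeps the submission count. The names below (`ChunkOwnershipSketch`, `fetchIdBatches`, `runStep`) are hypothetical and do not correspond to classes in this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: names are illustrative, not HAPI FHIR code.
final class ChunkOwnershipSketch {

    /** Fetching only: splits ids into batches and knows nothing about where they go. */
    static List<List<String>> fetchIdBatches(List<String> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += batchSize) {
            batches.add(List.copyOf(ids.subList(i, Math.min(i + batchSize, ids.size()))));
        }
        return batches;
    }

    /** The step owns consumption: it hands each batch to its sink and counts what it submitted. */
    static int runStep(List<String> ids, int batchSize, Consumer<List<String>> sink) {
        int submissionCount = 0;
        for (List<String> batch : fetchIdBatches(ids, batchSize)) {
            sink.accept(batch);
            submissionCount += batch.size();
        }
        return submissionCount;
    }
}
```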

Collaborator Author

I did just copy-paste this code into a service because I basically wanted "the exact same logic".

But you're right: it might make sense to actually copy-paste the code (and leave it in the old step) so that the job itself has unique code, unshared with previous versions.

"Expand out patient ids if necessary",
MdmExpandedPatientIds.class,
mdmExpansionStep())
// load in (all) ids and create id chunks of 1000 each
Collaborator

question (repeated): why not just perform an expansion before the job starts, and run a patient/group export with the pre-expanded list of IDs, on the existing job def?
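
A minimal sketch of the pre-expansion idea, under stated assumptions: `MdmLinkLookup`, `ExportRequest`, `expandPatientIds`, and `buildRequest` are all hypothetical stand-ins, not HAPI FHIR APIs, and the lookup here is a single hop rather than whatever the real MDM link resolution does. The point is only that expansion happens once, before the job is submitted, so the existing job definition just receives a longer list of patient IDs.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these types exist in HAPI FHIR.
final class PreExpansionSketch {

    /** Stand-in for whatever service resolves MDM links for a patient. */
    interface MdmLinkLookup {
        Set<String> linkedPatientIds(String patientId);
    }

    /** Stand-in for the parameters an unchanged job definition would accept. */
    record ExportRequest(List<String> patientIds) {}

    /** Expand once, up front; keeps first-seen order and drops duplicates. */
    static Set<String> expandPatientIds(List<String> requested, MdmLinkLookup lookup) {
        Set<String> expanded = new LinkedHashSet<>();
        for (String id : requested) {
            expanded.add(id);
            expanded.addAll(lookup.linkedPatientIds(id));
        }
        return expanded;
    }

    /** Build the request for the existing job from the pre-expanded ids. */
    static ExportRequest buildRequest(List<String> requested, MdmLinkLookup lookup) {
        return new ExportRequest(List.copyOf(expandPatientIds(requested, lookup)));
    }
}
```

With this shape the job needs no new step and no new version; whether it fits depends on whether the expanded ID list is small enough to pass as job parameters.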

Development

Successfully merging this pull request may close these issues.

Add MDM expansion to bulk export
5 participants