Merging and uuids #340

Open
EvanDietzMorris wants to merge 4 commits into main from merging-and-uuids

Conversation

@EvanDietzMorris
Collaborator

This bumps ORION to a new version which has a significantly improved node/edge merging algorithm. It's quite a bit faster (2-3x in benchmarks so far) and consumes less memory (roughly half, though that isn't well tested).

It also adds a couple flags that tell ORION to add UUID edge ids to every edge during merging, overwriting existing ones.
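For illustration only, here is a minimal sketch of what "add a UUID edge id to every edge, overwriting existing ones" could look like. The helper name `assign_uuid_edge_ids`, the dict-based edge shape, and the use of random (version 4) UUIDs are all assumptions made for this example; it is not ORION's actual code or flag handling.

```python
import uuid


def assign_uuid_edge_ids(edges):
    """Return copies of the edges, each with a fresh UUID string as its 'id'.

    Hypothetical helper: any existing 'id' value is overwritten, and the
    input edges are left unmodified.
    """
    return [{**edge, "id": str(uuid.uuid4())} for edge in edges]


edges = [
    {"id": "ingest-edge-1", "subject": "A", "predicate": "related_to", "object": "B"},
    {"subject": "B", "predicate": "related_to", "object": "C"},  # no id set upstream
]
merged = assign_uuid_edge_ids(edges)
```

After this runs, every edge in `merged` carries a new UUID, including the one that already had an id from its ingest, which is the overwrite behaviour discussed in the comments that follow.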

The pinned version of robokop-orion was removed from the pyproject.toml recently but I'm not sure why, so this also sets one again (was that intentional @sierra-moxon?).

@sierra-moxon
Member

Just a tiny bit worried about the overwrite of edge ids. I know ctkp likes those to be set internally?

@EvanDietzMorris
Collaborator Author

EvanDietzMorris commented Mar 19, 2026 via email

@RichardBruskiewich
Collaborator

Just to clarify: Pydantic validates that association 'id' fields are present, so the ingests need to set them, but they are otherwise irrelevant, in that they are simply reassigned (overwritten, idempotently and with globally unique values) as the various knowledge-specific ingest data are pulled into the merge?

@EvanDietzMorris
Collaborator Author

Based on our MUTT discussion today:

@RichardBruskiewich the edge ids assigned in ingests won't be completely irrelevant, we'll use them to track edges through our system, and to better track merges, but most of them might also be overwritten during normalization or merging.

I'll also implement a new section of merge metadata that includes the specific edge id mappings when merges occur, and a way to let valid UUIDs from upstream "pass through" sources persist through the ingest, and even through normalization and merging if nothing changes. Let's hold off on this PR until then I guess.
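A rough sketch of how those two ideas might fit together, under stated assumptions: edges are plain dicts, edges merge when they share the same (subject, predicate, object), and the names `is_valid_uuid`, `merge_edges`, `merged_id`, and `source_ids` are hypothetical. None of this is taken from ORION itself.

```python
import uuid


def is_valid_uuid(value):
    """Return True if value is a string that parses as a UUID."""
    if not isinstance(value, str):
        return False
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def merge_edges(edges):
    """Merge edges that share (subject, predicate, object).

    Returns (merged_edges, id_mappings). An edge that is not merged with
    any other and already has a valid UUID keeps that id ("pass through").
    Every other group gets a new UUID, and the original ids are recorded
    in id_mappings.
    """
    groups = {}
    for edge in edges:
        key = (edge["subject"], edge["predicate"], edge["object"])
        groups.setdefault(key, []).append(edge)

    merged_edges, id_mappings = [], []
    for group in groups.values():
        source_ids = [edge.get("id") for edge in group]
        if len(group) == 1 and is_valid_uuid(source_ids[0]):
            merged_edges.append(dict(group[0]))  # nothing changed, keep the id
            continue
        new_id = str(uuid.uuid4())
        merged_edges.append({**group[0], "id": new_id})
        id_mappings.append({"merged_id": new_id, "source_ids": source_ids})
    return merged_edges, id_mappings


upstream_id = "123e4567-e89b-12d3-a456-426614174000"
edges = [
    {"id": upstream_id, "subject": "A", "predicate": "related_to", "object": "B"},
    {"id": "ingest-1", "subject": "B", "predicate": "related_to", "object": "C"},
    {"id": "ingest-2", "subject": "B", "predicate": "related_to", "object": "C"},
]
merged, mappings = merge_edges(edges)
```

In this example the first edge keeps its upstream UUID, while the two B-to-C edges collapse into one edge with a new UUID and a single mapping entry listing `ingest-1` and `ingest-2` as its sources.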

@RichardBruskiewich
Collaborator

> Based on our MUTT discussion today:
>
> @RichardBruskiewich the edge ids assigned in ingests won't be completely irrelevant, we'll use them to track edges through our system, and to better track merges,

Roger that @EvanDietzMorris ... No change in any of the ingests (within which Pydantic enforces the setting of Association.id fields)

> ...Let's hold off on this PR until then I guess.

It's your PR so... this ball is in your court, LOL.

4 participants