Optimize post_bsos (postgres) #1930

@data-sync-user

The initial version of post_bsos is a simple repeated call to put_bso. Ideally this should be the inverse (like in the Spanner impl): put_bso should be implemented as a call to post_bsos with a vec of a single bso.

This could be tricky on Postgres using the diesel ORM as the collection of bsos could be a mix of different selective updates.

E.g. a request could ask bso0 to update solely its payload and bso1 to update solely its sortindex. A plain multi-row upsert applies a single ON CONFLICT DO UPDATE SET clause to every row, so it would end up setting both payload and sortindex on both bsos.

The python impl handles this by grouping all bsos w/ the same set of updated fields into separate batches. This would result in 2 separate upserts in the example above: one payload-only, one sortindex-only. This sounds difficult to emulate in the strongly typed diesel ORM.
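For reference, the grouping step itself is simple to express in Rust; the hard part is turning each group into a differently shaped diesel query. A minimal std-only sketch of the grouping, where `PostBso`, `FieldMask`, `field_mask` and `group_by_fields` are hypothetical names and not the actual syncstorage types:

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the request type.
/// `None` means "leave this column unchanged".
#[derive(Debug, Clone, PartialEq)]
struct PostBso {
    id: String,
    payload: Option<String>,
    sortindex: Option<i32>,
    ttl: Option<u32>,
}

/// Which optional fields a bso sets: (payload, sortindex, ttl).
type FieldMask = (bool, bool, bool);

fn field_mask(bso: &PostBso) -> FieldMask {
    (
        bso.payload.is_some(),
        bso.sortindex.is_some(),
        bso.ttl.is_some(),
    )
}

/// Group bsos so that every batch updates the same set of columns and
/// can therefore share one ON CONFLICT DO UPDATE SET clause.
fn group_by_fields(bsos: Vec<PostBso>) -> BTreeMap<FieldMask, Vec<PostBso>> {
    let mut groups: BTreeMap<FieldMask, Vec<PostBso>> = BTreeMap::new();
    for bso in bsos {
        groups.entry(field_mask(&bso)).or_default().push(bso);
    }
    groups
}
```

Each resulting batch would still need its own upsert statement, so the number of queries per request grows with the number of distinct field combinations.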

The batch commit handles a similar situation via a COALESCE on the old vs new values, which might be a better option.
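A rough sketch of what the COALESCE approach could look like as a single multi-row statement, built here as a plain SQL string rather than through diesel's query builder. The table name `bsos`, the column names, the conflict target and the `build_upsert_sql` helper are all assumptions for illustration, not the actual schema:

```rust
/// Build a multi-row Postgres upsert for `rows` bsos (hypothetical helper).
/// Unset fields are bound as NULL, and COALESCE keeps the stored value
/// whenever the incoming one is NULL.
fn build_upsert_sql(rows: usize) -> String {
    // Assumed column layout; the real schema may differ.
    const COLS: [&str; 5] = ["user_id", "collection_id", "bso_id", "payload", "sortindex"];

    // One "($1, $2, ...)" tuple per bso, numbering parameters consecutively.
    let mut placeholders = Vec::with_capacity(rows);
    for row in 0..rows {
        let params: Vec<String> = (1..=COLS.len())
            .map(|col| format!("${}", row * COLS.len() + col))
            .collect();
        placeholders.push(format!("({})", params.join(", ")));
    }

    format!(
        "INSERT INTO bsos ({}) VALUES {} \
         ON CONFLICT (user_id, collection_id, bso_id) DO UPDATE SET \
         payload = COALESCE(EXCLUDED.payload, bsos.payload), \
         sortindex = COALESCE(EXCLUDED.sortindex, bsos.sortindex)",
        COLS.join(", "),
        placeholders.join(", ")
    )
}
```

Two caveats with this sketch: COALESCE cannot distinguish "leave unchanged" from "explicitly set to NULL", and newly inserted rows would still need suitable values for any NOT NULL columns, which is not handled here.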

┆Issue is synchronized with this Jira Task
