Support global aggregations #1072

Open
rgruener wants to merge 1 commit into airbnb:main from rgruener:global-aggregations
@rgruener
Contributor

Summary

This adds support for keys=None to signify a global aggregation. A global aggregation can be added to a join; because it has no keys, its result is joined with every record on the left side.

Why / Goal

Global aggregations are extremely useful for computing sensible default values that can later be used in a join coalesce.
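
To make the intended semantics concrete, here is a minimal plain-Python sketch. It does not use the Chronon API: the `aggregate` and `coalesce` helpers, the row layout, and the `avg_price` column are all hypothetical and exist only for illustration. It shows a keyless aggregation producing a single value that is attached to every left-side row, then used as a fallback when the per-key value is missing.

```python
from statistics import mean

# Hypothetical data; none of these names come from the PR itself.
left_rows = [{"user": "a"}, {"user": "b"}, {"user": "c"}]
events = [
    {"user": "a", "price": 10.0},
    {"user": "a", "price": 30.0},
    {"user": "b", "price": 50.0},
]


def aggregate(rows, keys):
    """Average `price` per key tuple.

    With keys=None every row falls into one group under the empty tuple,
    which is the "global aggregation" case.
    """
    groups = {}
    for row in rows:
        group_key = tuple(row[k] for k in keys) if keys else ()
        groups.setdefault(group_key, []).append(row["price"])
    return {k: mean(v) for k, v in groups.items()}


def coalesce(*values):
    """Return the first value that is not None, or None if all are."""
    return next((v for v in values if v is not None), None)


per_user = aggregate(events, keys=["user"])
global_avg = aggregate(events, keys=None)[()]

# The global value has no keys, so it is available to every left row;
# here it only fills in where the keyed aggregation has no result.
joined = [
    {**row, "avg_price": coalesce(per_user.get((row["user"],)), global_avg)}
    for row in left_rows
]
```

In this sketch user "c" has no events, so its `avg_price` falls back to the global average, while users "a" and "b" keep their own per-user averages.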

Test Plan

  • Added Unit Tests
  • Covered by existing CI
  • Integration tested

Checklist

  • Documentation update

Reviewers

def flatSchema: StructType = if (withTime) StructType(baseFlatSchema :+ timeField) else baseFlatSchema
def flatZSchema: api.StructType = flatSchema.toChrononSchema("Flat")
lazy val keyToBytes: Any => Array[Byte] = AvroConversions.encodeBytes(keyZSchema, GenericRowHandler.func)
// For global aggregations, use plain UTF-8 bytes for a dummy key (no need for an avro schema)
Contributor Author

I honestly think this is a bit of a hack, but it is necessary because most KVStore implementations require a non-empty key.

Global aggregations also pretty much require the user to cache the stored results, since they will be fetched with every online request. It would be helpful to be more opinionated about how they are stored, but I don't want to broadly affect the code base.
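
As a rough illustration of both points, the following Python sketch stores a global aggregate under a fixed dummy key and caches it on the read path. Everything here is an assumption made for the example: the `GLOBAL_KEY` value, the `InMemoryKVStore` stand-in, and the `CachedGlobalFetcher` class with its TTL are hypothetical and are not part of this PR or of any real KVStore implementation.

```python
import time

# Hypothetical sentinel: the PR only states that the dummy key is plain UTF-8 bytes.
GLOBAL_KEY = "__global__".encode("utf-8")


class InMemoryKVStore:
    """Stand-in for a real store that, like many, rejects empty keys."""

    def __init__(self):
        self._data = {}
        self.reads = 0  # counts lookups so the effect of caching is visible

    def put(self, key: bytes, value: bytes) -> None:
        if not key:
            raise ValueError("key must be non-empty")
        self._data[key] = value

    def get(self, key: bytes):
        if not key:
            raise ValueError("key must be non-empty")
        self.reads += 1
        return self._data.get(key)


class CachedGlobalFetcher:
    """Reads the global aggregate at most once per TTL window."""

    def __init__(self, store, ttl_seconds=60.0, clock=time.monotonic):
        self._store = store
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None  # None means nothing has been cached yet

    def fetch(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._store.get(GLOBAL_KEY)
            self._expires_at = now + self._ttl
        return self._value


store = InMemoryKVStore()
store.put(GLOBAL_KEY, b"30.0")
fetcher = CachedGlobalFetcher(store, ttl_seconds=60.0)
first = fetcher.fetch()
second = fetcher.fetch()  # served from the cache, no second store read
```

A time-based cache is only one option; how stale a global value may be is a per-use-case decision, which is part of why the storage strategy is left open here.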