Thought experiment: index partitions #921

@pudo

Imagine a yente index in which each dataset carries a partition ID (`default` if unset). The dataset's partition ID is then baked into the index alias name, so instead of `yente-entities`, the alias is now called `yente-entities-default`. The indexer reflects this in how it manages the index lifecycle (rollover, GC). When a collection is used as the main query scope, the partition it is linked to is used to build the alias name for all queries.
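
To make the naming scheme concrete, here is a minimal sketch of how the alias name could be derived from a partition ID. The function and constant names are hypothetical and not taken from the yente codebase; only the `yente-entities` base name and the `default` fallback come from the description above.

```python
from typing import Optional

# Alias base name and fallback partition as described in this issue.
INDEX_PREFIX = "yente-entities"
DEFAULT_PARTITION = "default"


def alias_name(partition: Optional[str] = None, prefix: str = INDEX_PREFIX) -> str:
    """Build the alias name for a partition (hypothetical helper).

    An unset or empty partition falls back to DEFAULT_PARTITION.
    """
    return f"{prefix}-{partition or DEFAULT_PARTITION}"
```

A query scoped to a collection would look up that collection's partition and pass it to `alias_name` to decide which alias to search.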

Concerns:

  • Overall complexity
  • Do the dataset indexes then also need to carry that prefix? We're running into the ES index name length limit (64): `yente-entities-default-default-xxxx` is probably OK, but `yente-entities-default-icij_offshoreleaks-xxxx` is pushing it.
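
The length concern can be checked mechanically. The sketch below assumes dataset index names follow the `prefix-partition-dataset-version` pattern of the two examples above; the helper names are hypothetical, and the limit of 64 is the figure quoted in this issue, kept as a parameter so it can be checked against the Elasticsearch documentation for the version in use.

```python
# Limit as quoted in this issue; an assumption, not verified here.
MAX_INDEX_NAME_LEN = 64


def dataset_index_name(
    partition: str, dataset: str, version: str, prefix: str = "yente-entities"
) -> str:
    """Compose a partition-prefixed dataset index name (hypothetical helper)."""
    return f"{prefix}-{partition}-{dataset}-{version}"


def fits_limit(name: str, max_len: int = MAX_INDEX_NAME_LEN) -> bool:
    """Check the name's length in UTF-8 bytes against the limit."""
    return len(name.encode("utf-8")) <= max_len


def remaining_budget(name: str, max_len: int = MAX_INDEX_NAME_LEN) -> int:
    """Bytes still available before the name hits the limit (negative if over)."""
    return max_len - len(name.encode("utf-8"))
```

With the placeholder version `xxxx`, both example names fit under 64, but a real version string is longer than four characters, so `remaining_budget` is the number to watch for long dataset names such as `icij_offshoreleaks`.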
