Partition / RowKey Schema + Performance Efficiency #16

@aarondcoleman

The current version stores all errors in a single partition, so performance deteriorates as the row count grows. A better approach would be to keep each partition small, ideally just a few hundred rows.

From the guidelines on designing a scalable table solution: https://msdn.microsoft.com/en-us/library/azure/hh508997.aspx

"A highly uneven distribution of entities across partitions may limit the performance of the larger and more active partitions"

A better solution might be either of the following:

  • Partition by day or by hour, using a sortable numeric representation such as 20150616 so that partitions can also be selected with a range query.
  • Use fixed-size partitions keyed by a counter (000001, 000002, and so on), plus an additional table with pointer info recording which dates fall into which partition bucket.
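
To make the two ideas concrete, here is a minimal sketch in plain Python. It is not code from this project. The function names, the 500-row bucket size, and the choice of UTC are assumptions for illustration only. It covers key generation and the filter string for a partition range query. The pointer table from the second idea is not shown.

```python
from datetime import datetime, timezone


def day_partition_key(ts: datetime) -> str:
    """Day-based key such as '20150616' (expects a timezone-aware datetime)."""
    return ts.astimezone(timezone.utc).strftime("%Y%m%d")


def hour_partition_key(ts: datetime) -> str:
    """Hour-based key such as '2015061613' for tables with heavier write volume."""
    return ts.astimezone(timezone.utc).strftime("%Y%m%d%H")


def partition_range_filter(start: datetime, end: datetime) -> str:
    """Build an OData filter that selects every day partition from start to end.

    Because the keys are zero-padded and fixed width, string comparison
    orders them the same way as the dates they represent.
    """
    return (
        f"PartitionKey ge '{day_partition_key(start)}' "
        f"and PartitionKey le '{day_partition_key(end)}'"
    )


def bucket_partition_key(entity_index: int, bucket_size: int = 500) -> str:
    """Counter-based key: the first bucket_size entities go to '000001', etc.

    entity_index is a hypothetical zero-based running count of stored errors;
    keeping that count (and the date-to-bucket mapping) is left to the caller.
    """
    return f"{entity_index // bucket_size + 1:06d}"
```

With day-based keys, a lookup for one date touches a single small partition, and a date range becomes a bounded scan over a known set of partitions rather than over the whole table.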

But those are just two ideas. We're using this in production, and now that our table has grown, performance has become dramatically worse (lookups take up to 30 seconds!).

Thoughts?
