Commit 607e93f

feat(grouping): Add options to control grouphash caching (#103943)
This adds three new options, `grouping.use_ingest_grouphash_caching`, `grouping.ingest_grouphash_existence_cache_expiry`, and `grouping.ingest_grouphash_object_cache_expiry`, to control caching of both the secondary grouphash existence check and actual `GroupHash` objects during ingest. The two expiry times are controlled separately because caching a boolean takes less memory than caching a whole Django ORM object (albeit a quite simple one), so we can probably let entries stay in the existence cache longer than in the object cache. (We'll know after we turn caching on and see what the respective hit rates are.)
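As a rough illustration of the memory argument, the sketch below compares the serialized size of a boolean with that of a small record. It assumes the cache backend pickles values before storing them, as Django's built-in cache backends commonly do. The plain dict and its field names are hypothetical stand-ins for a `GroupHash` row; a real model instance carries additional state, so the dict is best read as a lower bound.

```python
import pickle


def pickled_size(value: object) -> int:
    """Return the number of bytes `value` occupies once pickled."""
    return len(pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL))


# What the existence cache would hold: the result of an `.exists()` check.
existence_entry = True

# Hypothetical stand-in for the fields of a simple ORM row. These field names
# are assumptions for illustration, not the real `GroupHash` schema.
object_entry = {"id": 1, "project_id": 42, "hash": "a" * 32, "group_id": 7}

bool_size = pickled_size(existence_entry)
object_size = pickled_size(object_entry)

print(f"boolean entry: {bool_size} bytes, object entry: {object_size} bytes")
```

The absolute numbers depend on the pickle protocol and the record's contents; the point is only that the per-entry cost differs, which is the reason the two expiry times are tunable independently.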
1 parent 8c744ce commit 607e93f

1 file changed

+33
-0
lines changed

src/sentry/options/defaults.py

Lines changed: 33 additions & 0 deletions
@@ -2863,6 +2863,39 @@
     flags=FLAG_AUTOMATOR_MODIFIABLE,
 )
 
+# When handling grouphashes during ingest, use the cache to reduce postgres load. With secondary
+# grouphashes we want to use ones which are already there but not create new ones, so we track the
+# boolean result of their `.exists()` check. For all existing grouphashes, secondary or not, we know
+# that if they already have a group assigned they won't be modified, so in that case we also cache
+# the full `GroupHash` object. The killswitch below controls both caches, but they have separate
+# expiry times because the secondary grouphash existence cache is used less frequently and has a
+# lighter memory footprint, so we can afford to cache things there for longer.
+#
+# TODO: Check hit/miss rates for both caches and adjust the two expiry options accordingly.
+register(
+    "grouping.use_ingest_grouphash_caching",
+    type=Bool,
+    default=True,
+    flags=FLAG_AUTOMATOR_MODIFIABLE,
+)
+
+# How long to cache a boolean indicating whether or not a grouphash exists for a given secondary
+# hash value
+register(
+    "grouping.ingest_grouphash_existence_cache_expiry",
+    type=Int,
+    default=60,  # seconds
+    flags=FLAG_AUTOMATOR_MODIFIABLE,
+)
+
+# How long to cache actual `GroupHash` objects
+register(
+    "grouping.ingest_grouphash_object_cache_expiry",
+    type=Int,
+    default=60,  # seconds
+    flags=FLAG_AUTOMATOR_MODIFIABLE,
+)
+
 
 # Sample rate for double writing to experimental dsn
 register(
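
The comments in the diff describe one killswitch gating two caches with separate expiry times. A minimal sketch of how calling code might consume the three options follows. It is not the implementation from this commit: `OPTIONS`, `TTLCache`, `FakeGroupHash`, the cache key formats, and both lookup functions are hypothetical names introduced for illustration, and the database is replaced by plain callables.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Optional

# Hypothetical stand-in for the options store; keys and defaults mirror the diff.
OPTIONS = {
    "grouping.use_ingest_grouphash_caching": True,
    "grouping.ingest_grouphash_existence_cache_expiry": 60,  # seconds
    "grouping.ingest_grouphash_object_cache_expiry": 60,  # seconds
}


class TTLCache:
    """Minimal in-memory cache with per-key expiry (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Any:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # entry has expired
            return None
        return value

    def set(self, key: str, value: Any, timeout: int) -> None:
        self._data[key] = (self._clock() + timeout, value)


@dataclass
class FakeGroupHash:
    """Hypothetical stand-in for the ORM model."""

    hash: str
    group_id: Optional[int] = None


def secondary_grouphash_exists(cache: TTLCache, hash_value: str, db_exists) -> bool:
    """Check existence of a secondary grouphash, caching the boolean result."""
    if not OPTIONS["grouping.use_ingest_grouphash_caching"]:
        return db_exists(hash_value)  # killswitch off: always query
    key = f"grouphash_exists:{hash_value}"
    cached = cache.get(key)
    if cached is not None:  # a cached False is still a hit
        return cached
    result = db_exists(hash_value)
    cache.set(key, result, OPTIONS["grouping.ingest_grouphash_existence_cache_expiry"])
    return result


def get_grouphash(cache: TTLCache, hash_value: str, db_fetch) -> Optional[FakeGroupHash]:
    """Fetch a grouphash, caching it only once it has a group assigned."""
    if not OPTIONS["grouping.use_ingest_grouphash_caching"]:
        return db_fetch(hash_value)  # killswitch off: always query
    key = f"grouphash_object:{hash_value}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    grouphash = db_fetch(hash_value)
    # Per the diff's comment, a grouphash with a group assigned won't be
    # modified, so only that case is treated as safe to cache.
    if grouphash is not None and grouphash.group_id is not None:
        cache.set(key, grouphash, OPTIONS["grouping.ingest_grouphash_object_cache_expiry"])
    return grouphash
```

Comparing the cached value against `None` rather than testing its truthiness matters for the existence cache, since a cached `False` is a valid hit that should still skip the database query.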
