55 | 55 | /** |
56 | 56 | * This is a cache for {@link BitSet} instances that are used with the {@link DocumentSubsetReader}. |
57 | 57 | * It is bounded by memory size and access time. |
58 | | - * |
| 58 | + * <p> |
59 | 59 | * DLS uses {@link BitSet} instances to track which documents should be visible to the user ("live") and which should not ("dead"). |
60 | 60 | * This means that there is a bit for each document in a Lucene index (ES shard). |
61 | 61 | * Consequently, an index with 10 million documents will use more than 1Mb of bitset memory for every unique DLS query, and an index |
62 | 62 | * with 1 billion documents will use more than 100Mb of memory per DLS query. |
63 | 63 | * Because DLS supports templating queries based on user metadata, there may be many distinct queries in use for each index, even if |
64 | 64 | * there is only a single active role. |
65 | | - * |
| 65 | + * <p> |
66 | 66 | * The primary benefit of the cache is to avoid recalculating the "live docs" (visible documents) when a user performs multiple |
67 | 67 | * consecutive queries across one or more large indices. Given the memory examples above, the cache is only useful if it can hold at |
68 | 68 | * least 1 large (100Mb or more) {@code BitSet} during a user's active session, and ideally should be capable of supporting multiple |
69 | 69 | * simultaneous users with distinct DLS queries. |
70 | | - * |
| 70 | + * <p> |
71 | 71 | * For this reason the default memory usage (weight) for the cache is set to 10% of JVM heap ({@link #CACHE_SIZE_SETTING}), so that it |
72 | 72 | * automatically scales with the size of the Elasticsearch deployment, and can provide benefit to most use cases without needing |
73 | 73 | * customisation. On a 32Gb heap, a 10% cache would be 3.2Gb which is large enough to store BitSets representing 25 billion docs. |
74 | | - * |
| 74 | + * <p> |
75 | 75 | * However, because queries can be templated by user metadata and that metadata can change frequently, it is common for the |
76 | 76 | * effective lifetime of a single DLS query to be relatively short. We do not want to sacrifice 10% of heap to a cache that is storing |
77 | 77 | * BitSets that are no longer needed, so we set the TTL on this cache to be 2 hours ({@link #CACHE_TTL_SETTING}). This time has been |
|
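The memory figures quoted in the Javadoc above can be checked with a little arithmetic. The sketch below is illustrative only: `BitSetSizing` and its methods are hypothetical names, not part of the change, and it assumes a dense representation of one bit per document stored in 64-bit words, ignoring object overhead and any sparse encoding the real cache might use. It also treats the 3.2Gb cache as 3.2 billion bytes.

```java
public final class BitSetSizing {
    private BitSetSizing() {}

    /** Bytes needed for one bit per document, rounded up to whole 64-bit words. */
    public static long bitsetBytes(long docCount) {
        long words = (docCount + 63) / 64; // ceiling division
        return words * Long.BYTES;
    }

    /** Number of documents whose bits fit into a cache of the given size. */
    public static long docsCoveredBy(long cacheBytes) {
        return cacheBytes * 8; // eight bits (documents) per byte
    }

    public static void main(String[] args) {
        // 10 million documents: a little over 1Mb per distinct DLS query.
        System.out.println(bitsetBytes(10_000_000L));        // 1250000
        // 1 billion documents: more than 100Mb per distinct DLS query.
        System.out.println(bitsetBytes(1_000_000_000L));     // 125000000
        // A 3.2Gb cache (10% of a 32Gb heap, decimal units assumed).
        System.out.println(docsCoveredBy(3_200_000_000L));   // 25600000000
    }
}
```

Under these assumptions the numbers line up with the comment: roughly 1.25Mb and 125Mb per query for the two index sizes, and about 25 billion documents' worth of bits in a 3.2Gb cache.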