@@ -49,6 +49,8 @@ these areas (A is relatively good, F is relatively bad):
 
 |https://facebook.github.io/zstd/[Zstd] |`ZstdCompressor` | A- | A- | A+ | `>= 4.0`
 
+|https://facebook.github.io/zstd/[Zstd with Dictionary] |`ZstdDictionaryCompressor` | A- | A- | A++ | `>= 6.0`
+
 |http://google.github.io/snappy/[Snappy] |`SnappyCompressor` | A- | A | C | `>= 1.0`
 
 |https://zlib.net[Deflate (zlib)] |`DeflateCompressor` | C | C | A | `>= 1.0`
@@ -60,13 +62,112 @@ cycle spent. This is why it is the default choice in Cassandra.
 
 For storage critical applications (disk footprint), however, `Zstd` may
 be a better choice as it can get significant additional ratio to `LZ4` .
+For workloads with highly repetitive or similar data patterns,
+`ZstdDictionaryCompressor` can achieve even better compression ratios by
+training a compression dictionary on representative data samples.
 
 `Snappy` is kept for backwards compatibility and `LZ4` will typically be
 preferable.
 
 `Deflate` is kept for backwards compatibility and `Zstd` will typically
 be preferable.
 
+== ZSTD Dictionary Compression
+
+The `ZstdDictionaryCompressor` extends standard ZSTD compression by using
+trained compression dictionaries to achieve superior compression ratios,
+particularly for workloads with repetitive or similar data patterns.
+
+=== How Dictionary Compression Works
+
+Dictionary compression improves upon standard compression by training a
+compression dictionary on representative samples of your data. The
+dictionary captures common patterns, repeated strings, and data structures,
+allowing the compressor to reference them directly instead of
+rediscovering them in every compression chunk.
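The mechanism can be sketched with Python's `zlib`, which supports a preset dictionary. This is a simplified stand-in for the trained Zstd dictionaries Cassandra actually uses, and the record and template below are invented for illustration:

```python
import zlib

# One record typical of the workload, plus a "dictionary" built from a
# representative template (invented data; real Zstd dictionaries are
# trained from many samples rather than a single template).
record = b'{"user_id": 42137, "status": "active", "region": "us-east"}'
zdict  = b'{"user_id": 00000, "status": "active", "region": "us-east"}'

# Plain compression must discover every pattern inside the one chunk.
plain = zlib.compress(record, 6)

# With a preset dictionary, shared patterns become back-references into
# the dictionary instead of being re-encoded, so the output shrinks.
comp = zlib.compressobj(level=6, zdict=zdict)
with_dict = comp.compress(record) + comp.flush()

# Decompression needs the same dictionary the data was compressed with.
decomp = zlib.decompressobj(zdict=zdict)
assert decomp.decompress(with_dict) == record

print(len(plain), len(with_dict))  # dictionary output is much smaller
```

Note that decompression requires the same dictionary that was used to compress, which is why historical dictionaries must be retained for older SSTables.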
+
+=== When to Use Dictionary Compression
+
+Dictionary compression is most effective for:
+
+* *Tables with similar row structures*: JSON documents, XML data, or
+repeated data schemas benefit significantly from dictionary compression.
+* *Storage-critical workloads*: When disk space savings justify the
+additional operational overhead of dictionary training and management.
+* *Large datasets with repetitive patterns*: The more similar your data,
+the better the compression ratio improvement.
+
+Dictionary compression may not be ideal for:
+
+* *Highly random or unique data*: Already-compressed data or cryptographic
+data will see minimal benefit.
+* *Small tables*: The overhead of dictionary management may outweigh the
+storage savings.
+* *Frequently changing schemas*: Schema changes may require retraining
+dictionaries to maintain optimal compression ratios.
+
+=== Dictionary Training
+
+Before dictionary compression can provide optimal results, a compression
+dictionary must be trained on representative data samples. Cassandra
+supports both manual and automatic training approaches.
+
+==== Manual Dictionary Training
+
+Use the `nodetool compressiondictionary train` command to manually train
+a compression dictionary:
+
+[source,bash]
+----
+nodetool compressiondictionary train <keyspace> <table>
+----
+
+The command trains a dictionary by sampling from existing SSTables. If no
+SSTables are available on disk (e.g., all data is in memtables), the command
+will automatically flush the memtable before sampling.
+
+The training process completes synchronously and displays progress information
+including sample count, sample size, and elapsed time. Training typically
+completes within minutes for most workloads.
+
+By default, training will only proceed if enough samples have been collected.
+To force training even with insufficient samples, use the `--force` or `-f` option:
+
+[source,bash]
+----
+nodetool compressiondictionary train --force <keyspace> <table>
+----
+
+This can be useful for testing or when you want to train a dictionary from
+limited data during initial setup.
+
+==== Automatic Dictionary Training
+
+Enable automatic training in `cassandra.yaml`:
+
+[source,yaml]
+----
+compression_dictionary_training_auto_train_enabled: true
+compression_dictionary_training_sampling_rate: 100 # 1% of writes
+----
+
+When enabled, Cassandra automatically samples write operations and
+trains dictionaries in the background based on the configured sampling
+rate (range: 1-10000, where 100 = 1% of writes).
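The rate semantics can be sketched as follows. This is an illustrative model, not Cassandra's implementation; `should_sample` is a hypothetical helper mapping the 1-10000 scale to a sampled fraction of rate/10000:

```python
import random

def should_sample(rate: int) -> bool:
    """Sample a write with probability rate/10000 (hypothetical helper).

    rate=1 -> 0.01% of writes, rate=100 -> 1%, rate=10000 -> every write.
    """
    if not 1 <= rate <= 10000:
        raise ValueError("rate must be in 1..10000")
    return random.randrange(10000) < rate

# With rate=100, roughly 1 in 100 writes is sampled for training.
random.seed(0)
sampled = sum(should_sample(100) for _ in range(100_000))
```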
+
+=== Dictionary Storage and Distribution
+
+Compression dictionaries are stored cluster-wide in the
+`system_distributed.compression_dictionaries` table. Each table can
+maintain multiple dictionary versions: the current dictionary for
+compressing new SSTables, plus historical dictionaries needed for
+reading older SSTables.
+
+Dictionaries are identified by `dict_id`, with higher IDs representing
+newer dictionaries. Cassandra automatically refreshes dictionaries
+across the cluster based on configured intervals, and caches them
+locally to minimize lookup overhead.
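The versioning rule can be modeled in a few lines. This is a conceptual sketch: the names and in-memory layout are invented, and only the "higher `dict_id` = newer" ordering comes from the behavior described above:

```python
# Conceptual sketch: dictionaries keyed by dict_id (hypothetical data).
dictionaries = {1: b"dict-v1", 2: b"dict-v2", 3: b"dict-v3"}

def current_dict_id(dicts: dict[int, bytes]) -> int:
    # New SSTables are compressed with the newest dictionary.
    return max(dicts)

def dict_for_sstable(dicts: dict[int, bytes], recorded_id: int) -> bytes:
    # Older SSTables are read with the historical dictionary they were
    # written with, so old versions must stay available until compaction
    # rewrites those SSTables.
    return dicts[recorded_id]
```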
+
 == Configuring Compression
 
 Compression is configured on a per-table basis as an optional argument
@@ -105,6 +206,17 @@ should be used with caution, as they require more memory. The default of
 `3` is a good choice for competing with `Deflate` ratios and `1` is a
 good choice for competing with `LZ4`.
 
+The `ZstdDictionaryCompressor` supports the same options as
+`ZstdCompressor`:
+
+* `compression_level` (default `3`): Same range and behavior as
+`ZstdCompressor`. Dictionary compression typically improves ratios at
+any compression level compared to standard ZSTD.
+
+NOTE: `ZstdDictionaryCompressor` requires a trained compression
+dictionary to achieve optimal results. See the ZSTD Dictionary
+Compression section above for training instructions.
+
 Users can set compression using the following syntax:
 
 [source,cql]
@@ -121,6 +233,25 @@ ALTER TABLE keyspace.table
  WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 64};
 ----
 
+For dictionary compression:
+
+[source,cql]
+----
+CREATE TABLE keyspace.table (id int PRIMARY KEY)
+ WITH compression = {'class': 'ZstdDictionaryCompressor'};
+----
+
+Or with a specific compression level:
+
+[source,cql]
+----
+ALTER TABLE keyspace.table
+ WITH compression = {
+   'class': 'ZstdDictionaryCompressor',
+   'compression_level': '3'
+ };
+----
+
 Once enabled, compression can be disabled with `ALTER TABLE` setting
 `enabled` to `false`:
 
@@ -140,6 +271,63 @@ immediately, the operator can trigger an SSTable rewrite using
 `nodetool scrub` or `nodetool upgradesstables -a`, both of which will
 rebuild the SSTables on disk, re-compressing the data in the process.
 
+== Dictionary Compression Configuration
+
+When using `ZstdDictionaryCompressor`, several additional configuration
+options are available in `cassandra.yaml` to control dictionary
+management, caching, and training behavior.
+
+=== Dictionary Refresh Settings
+
+* `compression_dictionary_refresh_interval` (default: `3600`): How often
+(in seconds) to check for and refresh compression dictionaries
+cluster-wide. Newly trained dictionaries will be picked up by all nodes
+within this interval.
+* `compression_dictionary_refresh_initial_delay` (default: `10`): Initial
+delay (in seconds) before the first dictionary refresh check after node
+startup.
+
+=== Dictionary Caching
+
+* `compression_dictionary_cache_size` (default: `10`): Maximum number of
+compression dictionaries to cache per table. Higher values reduce lookup
+overhead but increase memory usage.
+* `compression_dictionary_cache_expire` (default: `3600`): Dictionary
+cache entry TTL in seconds. Expired entries are evicted and reloaded on
+next access.
+
+=== Training Configuration
+
+* `compression_dictionary_training_max_dictionary_size` (default: `65536`):
+Maximum size of trained dictionaries in bytes. Larger dictionaries can
+capture more patterns but increase memory overhead.
+* `compression_dictionary_training_max_total_sample_size` (default:
+`10485760`): Maximum total size of sample data to collect for training,
+approximately 10MB.
+* `compression_dictionary_training_auto_train_enabled` (default: `false`):
+Enable automatic background dictionary training. When enabled, Cassandra
+samples writes and trains dictionaries automatically.
+* `compression_dictionary_training_sampling_rate` (default: `100`):
+Sampling rate for automatic training, range 1-10000 where 100 = 1% of
+writes. Lower values reduce training overhead but may miss data patterns.
+
+Example configuration:
+
+[source,yaml]
+----
+# Dictionary refresh and caching
+compression_dictionary_refresh_interval: 3600
+compression_dictionary_refresh_initial_delay: 10
+compression_dictionary_cache_size: 10
+compression_dictionary_cache_expire: 3600
+
+# Automatic training
+compression_dictionary_training_auto_train_enabled: false
+compression_dictionary_training_sampling_rate: 100
+compression_dictionary_training_max_dictionary_size: 65536
+compression_dictionary_training_max_total_sample_size: 10485760
+----
+
 == Other options
 
 * `crc_check_chance` (default: `1.0`): determines how likely Cassandra
@@ -186,6 +374,39 @@ correctness of data on disk, compressed tables allow the user to set
 probabilistically validate chunks on read to verify bits on disk are not
 corrupt.
 
+=== Dictionary Compression Operational Considerations
+
+When using `ZstdDictionaryCompressor`, additional operational factors
+apply:
+
+* *Dictionary Storage*: Compression dictionaries are stored in the
+`system_distributed.compression_dictionaries` table and replicated
+cluster-wide. Each table maintains current and historical dictionary
+versions.
+* *Dictionary Cache Memory*: Dictionaries are cached locally on each node
+according to `compression_dictionary_cache_size`. Memory overhead is
+typically minimal (default 64KB per dictionary × cache size).
+* *Dictionary Training Overhead*: Manual training via
+`nodetool compressiondictionary train` samples SSTable chunk data and
+performs CPU-intensive dictionary training. Consider running training
+during off-peak hours.
+* *Automatic Training Impact*: When
+`compression_dictionary_training_auto_train_enabled` is true, write
+operations are sampled based on `compression_dictionary_training_sampling_rate`.
+This adds minimal overhead but should be monitored in write-intensive
+workloads.
+* *Dictionary Refresh*: The dictionary refresh process
+(`compression_dictionary_refresh_interval`) checks for new dictionaries
+cluster-wide. The default 1-hour interval balances freshness with
+overhead.
+* *SSTable Compatibility*: Each SSTable is compressed with a specific
+dictionary version. Historical dictionaries must be retained to read
+older SSTables until they are compacted with new dictionaries.
+* *Schema Changes*: Significant schema changes or data pattern shifts may
+require retraining dictionaries to maintain optimal compression ratios.
+Monitor the `SSTable Compression Ratio` via `nodetool tablestats` to
+detect degradation.
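Using the defaults above, the worst-case per-table dictionary cache footprint is easy to bound (a back-of-envelope sketch assuming every cached dictionary reaches the maximum trained size):

```python
# Back-of-envelope bound using the documented defaults (assumes every
# cached dictionary reaches the maximum trained size).
cache_size = 10          # compression_dictionary_cache_size
max_dict_bytes = 65536   # compression_dictionary_training_max_dictionary_size

worst_case_bytes = cache_size * max_dict_bytes
print(worst_case_bytes)  # 655360 bytes, i.e. 640 KiB per table
```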
+
 == Advanced Use
 
 Advanced users can provide their own compression class by implementing