-
It's clear from the docs and various forum posts that streams which are not regularly published to will not have their segment files cleaned up according to the retention rules. This is likely to become an issue for our use case.

Users create their own streams in the system, and we promise only 7 days' retention of the messages in their stream. We have defined our streams with a 10MiB segment size and a 7D max-age retention policy. If one of our users suddenly stops publishing to their stream, the segment files will not be cleaned up, so we could end up with significant wasted disk space. Ideally, I'd like to have just one 10MiB file on disk for each of these dormant streams.

I had thought that we could set up a background process to drop a small 'null' message into every stream once every 24 hours to trigger a cleanup, but on closer reading it looks like the segment files are only cleaned up when the current segment file fills up and RabbitMQ goes looking to allocate a new file.

In the absence of the RabbitMQ server periodically scanning for out-of-date segment files, what I'm proposing is a command (ideally accessible through the Management API) to trigger the segment file cleanup process, which we could call periodically.

I also considered monitoring for publishing activity and purging the stream from application code via the Management API, but I'm not sure that DELETE /api/queues/vhost/name/contents is available for streams. (The Management Web UI doesn't appear to have a Purge function for streams.)
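For context, here is a rough sketch of how streams with the settings described above (10MiB segments, 7D max-age) might be declared from Python. The helper names `stream_arguments` and `declare_stream` are made up for illustration, and `channel` is assumed to be an already-open AMQP 0-9-1 channel from a client such as pika; our real code may differ.

```python
# Sketch only: queue arguments for a stream with 10 MiB segments and 7-day retention.
SEGMENT_SIZE_BYTES = 10 * 1024 * 1024  # 10 MiB


def stream_arguments(max_age="7D", segment_size_bytes=SEGMENT_SIZE_BYTES):
    """Build the optional arguments used when declaring a stream."""
    return {
        "x-queue-type": "stream",
        "x-max-age": max_age,
        "x-stream-max-segment-size-bytes": segment_size_bytes,
    }


def declare_stream(channel, name):
    """Declare a durable stream on an open channel (assumed pika-style API)."""
    channel.queue_declare(queue=name, durable=True, arguments=stream_arguments())
```

With these arguments, retention is still only evaluated per segment, so a dormant stream keeps whatever segments it had when publishing stopped.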
Replies: 2 comments 5 replies
-
I think periodically re-evaluating streams with a time-based retention setting makes sense. We'd always need to keep at least one segment; that cannot change. But we should at least be able to clear any other surplus segments automatically. Retention evaluation does have a cost, especially for larger streams, but doing it hourly would probably be perfectly fine.
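To make the rule concrete, a minimal sketch of a time-based retention check that always keeps the newest segment could look like the following. This is an illustration in Python, not the actual streams implementation; the `Segment` type, its fields, and `expired_segments` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str              # hypothetical segment file name
    last_timestamp: float  # timestamp of the newest message, in epoch seconds


def expired_segments(segments, max_age_seconds, now):
    """Return the segments that a max-age rule would allow deleting.

    `segments` must be ordered oldest to newest. The newest (current)
    segment is never returned, so at least one segment always remains.
    """
    cutoff = now - max_age_seconds
    expired = []
    for seg in segments[:-1]:  # skip the current segment
        if seg.last_timestamp >= cutoff:
            break  # this and all later segments are still within retention
        expired.append(seg)
    return expired
```

Running a check like this on a timer (for example hourly) would let a dormant stream shrink to its single current segment without waiting for a new segment to be rolled.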
-
I've created this issue to implement this functionality in the core streams component.