Description
Problem
Currently, when using the RQ engine in docling-serve, the results_ttl (Time To Live) for RQ job results is hardcoded to 4 hours (3_600 * 4 seconds) in the RQOrchestratorConfig class. This value cannot be configured by users, which limits flexibility for different use cases:
- Long-running workflows: Users may need results to persist longer than 4 hours
- Short-lived results: Users may want to reduce Redis memory usage by setting shorter TTLs
- Compliance requirements: Some environments may require specific retention policies
Current Implementation Analysis
1. RQOrchestratorConfig (docling-jobkit)
The results_ttl parameter exists in RQOrchestratorConfig with a default value:
# docling_jobkit/orchestrators/rq/orchestrator.py
class RQOrchestratorConfig(BaseModel):
    redis_url: str = "redis://localhost:6379/"
    results_ttl: int = 3_600 * 4  # 4 hours default
    results_prefix: str = "docling:results"
    sub_channel: str = "docling:updates"
    scratch_dir: Optional[Path] = None
The results_ttl is correctly passed to the RQ Queue constructor:
rq_queue = Queue(
    "convert",
    connection=conn,
    default_timeout=14400,
    result_ttl=config.results_ttl,  # ✅ Used here
)
2. Orchestrator Factory (docling-serve)
However, in orchestrator_factory.py, when creating the RQOrchestratorConfig, the results_ttl parameter is not passed, causing it to always use the default:
# docling_serve/orchestrator_factory.py (lines 310-315)
rq_config = RQOrchestratorConfig(
    redis_url=docling_serve_settings.eng_rq_redis_url,
    results_prefix=docling_serve_settings.eng_rq_results_prefix,
    sub_channel=docling_serve_settings.eng_rq_sub_channel,
    scratch_dir=get_scratch(),
    # ❌ Missing: results_ttl parameter
)
3. Settings (docling-serve)
The DoclingServeSettings class does not include a setting for results_ttl:
# docling_serve/settings.py (lines 79-82)
# RQ engine
eng_rq_redis_url: str = ""
eng_rq_results_prefix: str = "docling:results"
eng_rq_sub_channel: str = "docling:updates"
# ❌ Missing: eng_rq_results_ttl setting
Proposed Solution
Add a configurable eng_rq_results_ttl setting to DoclingServeSettings that can be set via the DOCLING_SERVE_ENG_RQ_RESULTS_TTL environment variable, following the existing pattern for other RQ engine settings.
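To illustrate the intended behavior, here is a minimal standard-library sketch of how the environment variable would resolve to the setting, falling back to the current 4-hour default when unset. It is not the docling-serve implementation: `RQSettingsSketch` and `load_rq_settings` are hypothetical stand-ins for the settings class, and the `DOCLING_SERVE_` prefix is inferred from the variable name given above.

```python
import os
from dataclasses import dataclass

# Prefix inferred from the proposed variable name (assumption, not read from the code base).
ENV_PREFIX = "DOCLING_SERVE_"
# 4 hours, matching the current hardcoded default.
DEFAULT_RESULTS_TTL = 3_600 * 4


@dataclass(frozen=True)
class RQSettingsSketch:
    """Hypothetical stand-in for the RQ section of DoclingServeSettings."""

    eng_rq_results_ttl: int = DEFAULT_RESULTS_TTL


def load_rq_settings(environ=None) -> RQSettingsSketch:
    """Resolve eng_rq_results_ttl from the environment, else use the default."""
    env = os.environ if environ is None else environ
    raw = env.get(ENV_PREFIX + "ENG_RQ_RESULTS_TTL")
    if raw is None:
        return RQSettingsSketch()
    # int() raises ValueError on non-numeric input, surfacing a bad value early.
    return RQSettingsSketch(eng_rq_results_ttl=int(raw))
```

Passing a plain dict instead of `os.environ` keeps the lookup easy to exercise without touching the real process environment.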
Required Changes
Change 1: Add Setting to DoclingServeSettings
File: docling_serve/settings.py
Add the new setting after line 82 (in the RQ engine section):
# RQ engine
eng_rq_redis_url: str = ""
eng_rq_results_prefix: str = "docling:results"
eng_rq_sub_channel: str = "docling:updates"
eng_rq_results_ttl: int = 3_600 * 4  # 4 hours default (matches RQOrchestratorConfig default)
Change 2: Pass Setting to RQOrchestratorConfig
File: docling_serve/orchestrator_factory.py
Update the RQOrchestratorConfig instantiation (around line 310) to include the results_ttl parameter:
rq_config = RQOrchestratorConfig(
    redis_url=docling_serve_settings.eng_rq_redis_url,
    results_prefix=docling_serve_settings.eng_rq_results_prefix,
    sub_channel=docling_serve_settings.eng_rq_sub_channel,
    results_ttl=docling_serve_settings.eng_rq_results_ttl,  # ✅ Add this line
    scratch_dir=get_scratch(),
)
Usage Examples
Environment Variable
export DOCLING_SERVE_ENG_RQ_RESULTS_TTL=7200 # 2 hours
docling-serve.env File
DOCLING_SERVE_ENG_RQ_RESULTS_TTL=86400  # 24 hours
Docker/Kubernetes
env:
  - name: DOCLING_SERVE_ENG_RQ_RESULTS_TTL
    value: "3600"  # 1 hour
Benefits
- Flexibility: Users can configure TTL based on their specific needs
- Consistency: Follows the existing pattern for RQ engine configuration
- Backward Compatible: The default matches current behavior (4 hours), so existing deployments keep working with no breaking changes
- Minimal Changes: Only two small code changes are required
Testing Considerations
- Verify default value (4 hours) is used when not set
- Verify custom value is respected when set via environment variable
- Verify value is correctly passed to RQ Queue
- Verify TTL behavior in Redis matches configured value
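The first three checks could be sketched as pytest-style tests along these lines. Everything here is a hypothetical stand-in: `FakeSettings`, `FakeRQConfig`, `FakeQueue`, and `build_queue` only mimic the settings-to-config-to-queue hand-off described above, so the tests run without Redis or the real docling packages. The fourth check (actual TTL in Redis) needs a live Redis instance and is not covered.

```python
from dataclasses import dataclass

DEFAULT_TTL = 3_600 * 4  # 4 hours, the current default


@dataclass
class FakeSettings:
    """Stand-in for DoclingServeSettings (only the proposed field)."""

    eng_rq_results_ttl: int = DEFAULT_TTL


@dataclass
class FakeRQConfig:
    """Stand-in for RQOrchestratorConfig (only results_ttl)."""

    results_ttl: int = DEFAULT_TTL


class FakeQueue:
    """Records the TTL it was constructed with, in place of an RQ Queue."""

    def __init__(self, name: str, result_ttl: int) -> None:
        self.name = name
        self.result_ttl = result_ttl


def build_queue(settings: FakeSettings) -> FakeQueue:
    """Mimic the factory: settings -> config -> queue."""
    config = FakeRQConfig(results_ttl=settings.eng_rq_results_ttl)
    return FakeQueue("convert", result_ttl=config.results_ttl)


def test_default_ttl_is_four_hours():
    assert build_queue(FakeSettings()).result_ttl == 14_400


def test_custom_ttl_reaches_queue():
    settings = FakeSettings(eng_rq_results_ttl=7_200)
    assert build_queue(settings).result_ttl == 7_200
```

In a real test suite the fakes would be replaced by the actual settings class and factory, with the environment variable set through a fixture such as pytest's `monkeypatch`.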