By default LMs in DSPy are cached. If you repeat the same call, you will get the same outputs. But you can turn off caching by setting `cache=False`.

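Conceptually, the cache behaves like a memoization layer keyed on the call's inputs. The sketch below is a toy stand-in (not DSPy's actual implementation) that shows what disabling the cache changes:

```python linenums="1"
class ToyLM:
    """Toy stand-in for an LM client with optional response caching."""

    def __init__(self, cache: bool = True):
        self.cache_enabled = cache
        self._cache: dict[str, str] = {}
        self.calls = 0  # counts real "provider" requests

    def _request(self, prompt: str) -> str:
        self.calls += 1
        return f"response to: {prompt}"

    def __call__(self, prompt: str) -> str:
        if not self.cache_enabled:
            return self._request(prompt)
        if prompt not in self._cache:
            self._cache[prompt] = self._request(prompt)
        return self._cache[prompt]

cached = ToyLM()
cached("Say this is a test!")
cached("Say this is a test!")
assert cached.calls == 1  # second call served from cache

uncached = ToyLM(cache=False)
uncached("Say this is a test!")
uncached("Say this is a test!")
assert uncached.calls == 2  # every call hits the provider
```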
If you want to keep caching enabled but force a new request (for example, to obtain diverse outputs),
pass a unique `rollout_id` in your call. DSPy hashes both the inputs and the `rollout_id` when
looking up a cache entry, so different values force a new LM request while
still caching future calls with the same inputs and `rollout_id`. The ID is also recorded in
`lm.history`, which makes it easy to track or compare different rollouts during experiments.

```python linenums="1"
lm("Say this is a test!", rollout_id=1)
```

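The lookup described above can be pictured as deriving the cache key from a hash of the inputs together with the `rollout_id`. This is a simplified illustration of that idea, not DSPy's internal code:

```python linenums="1"
import hashlib
import json

def cache_key(prompt: str, rollout_id=None) -> str:
    # Hash the inputs together with rollout_id, so the same prompt
    # under a different rollout_id maps to a different cache entry.
    payload = json.dumps({"prompt": prompt, "rollout_id": rollout_id}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

# Same inputs and same rollout_id -> same key (repeat calls hit the cache).
assert cache_key("Say this is a test!", rollout_id=1) == cache_key("Say this is a test!", rollout_id=1)

# Different rollout_id -> different key (forces a fresh LM request).
assert cache_key("Say this is a test!", rollout_id=1) != cache_key("Say this is a test!", rollout_id=2)
```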
You can pass these LM kwargs directly to DSPy modules as well. Supplying them at
initialization sets the defaults for every call:

```python linenums="1"
predict = dspy.Predict("question -> answer", rollout_id=1)
```

To override them for a single invocation, provide a `config` dictionary when
calling the module:

```python linenums="1"
predict = dspy.Predict("question -> answer")
predict(question="What is 1 + 52?", config={"rollout_id": 5})
```

In both cases, `rollout_id` is forwarded to the underlying LM, affects
its caching behavior, and is stored alongside each response so you can
replay or analyze specific rollouts later.

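Because each recorded response carries its rollout ID, you can group or filter a run's outputs afterwards. A minimal sketch, assuming history entries are dicts with a `rollout_id` field (the exact shape of `lm.history` entries may differ):

```python linenums="1"
# Hypothetical history entries; real lm.history entries contain more fields.
history = [
    {"prompt": "Say this is a test!", "rollout_id": 1, "response": "This is a test!"},
    {"prompt": "Say this is a test!", "rollout_id": 2, "response": "Sure, this is a test!"},
    {"prompt": "What is 1 + 52?", "rollout_id": 5, "response": "53"},
]

def by_rollout(entries: list, rollout_id: int) -> list:
    """Collect every recorded call made under a given rollout ID."""
    return [entry for entry in entries if entry.get("rollout_id") == rollout_id]

assert len(by_rollout(history, 1)) == 1
assert by_rollout(history, 5)[0]["response"] == "53"
```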
## Inspecting output and usage metadata