docs: Update documentation for transform_on_write functionality (feast-dev#5286)

franciscojavierarceo · devin-ai-integration[bot] · web-flow · commit d60e414e975c · 2025-04-21T14:58:01.000-04:00
Update documentation for transform_on_write functionality

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
diff --git a/docs/getting-started/architecture/write-patterns.md b/docs/getting-started/architecture/write-patterns.md
@@ -42,7 +42,7 @@ There are two ways the client can write *feature values* to the online store:
 Precomputed transformations can happen outside of Feast (e.g., via some batch job or streaming application) or inside of the Feast feature server when writing to the online store via the `push` or `write-to-online-store` api. 
 
 ### 2. Computing Transformations On Demand
-On Demand transformations can only happen inside of Feast at either (1) the time of the client's request or (2) when the data producer writes to the online store.
+On Demand transformations can only happen inside of Feast at either (1) the time of the client's request or (2) when the data producer writes to the online store. The `transform_on_write` parameter controls whether transformations are applied during write operations: setting it to `False` skips them for data that has already been transformed, while transformations during API calls remain available.
 
 ### 3. Hybrid (Precomputed + On Demand)
 The hybrid approach allows for precomputed transformations to happen inside or outside of Feast and have the On Demand transformations happen at client request time. This is particularly convenient for "Time Since Last" types of features (e.g., time since purchase).
diff --git a/docs/getting-started/concepts/data-ingestion.md b/docs/getting-started/concepts/data-ingestion.md
@@ -24,6 +24,8 @@ Ingesting from batch sources is only necessary to power real-time models. This i
 
 A key command to use in Feast is the `materialize_incremental` command, which fetches the _latest_ values for all entities in the batch source and ingests these values into the online store.
 
+When working with On Demand Feature Views with `write_to_online_store=True`, you can also control whether transformations are applied during ingestion by using the `transform_on_write` parameter. Setting `transform_on_write=False` allows you to materialize pre-transformed features without reapplying transformations, which is particularly useful for large batch datasets that have already been processed.
+
 Materialization can be called programmatically or through the CLI:
 
 <details>
diff --git a/docs/reference/beta-on-demand-feature-view.md b/docs/reference/beta-on-demand-feature-view.md
@@ -236,6 +236,42 @@ online_response = store.get_online_features(
 ).to_dict()
 ```
 
+### Materializing Pre-transformed Data
+
+In some scenarios, you may have already transformed your data in batch (e.g., using Spark or another batch processing framework) and want to directly materialize the pre-transformed features without applying transformations during ingestion. Feast supports this through the `transform_on_write` parameter.
+
+When using `write_to_online_store=True` with On Demand Feature Views, you can set `transform_on_write=False` to skip transformations during the write operation. This is particularly useful for optimizing performance when working with large pre-transformed datasets.
+
+```python
+from feast import FeatureStore
+import pandas as pd
+
+store = FeatureStore(repo_path=".")
+
+# Pre-transformed data (transformations already applied)
+pre_transformed_data = pd.DataFrame({
+    "driver_id": [1001],
+    "event_timestamp": [pd.Timestamp.now()],
+    "conv_rate": [0.5],
+    # Pre-calculated values for the transformed features
+    "conv_rate_adjusted": [0.55],  # Already contains the adjusted value
+})
+
+# Write to online store, skipping transformations
+store.write_to_online_store(
+    feature_view_name="transformed_conv_rate",
+    df=pre_transformed_data,
+    transform_on_write=False  # Skip transformation during write
+)
+```
+
+This approach allows for a hybrid workflow where you can:
+1. Transform data in batch using powerful distributed processing tools
+2. Materialize the pre-transformed data without reapplying transformations
+3. Still use the Feature Server to execute transformations during API calls when needed
+
+Even when features are materialized with transformations skipped (`transform_on_write=False`), the feature server can still apply transformations during API calls for any missing values or for features that require real-time computation.
+
 ## CLI Commands
 There are new CLI commands to manage on demand feature views: