Merged
2 changes: 1 addition & 1 deletion .github/workflows/mkdocs-test.yml
@@ -42,4 +42,4 @@ jobs:
kill -9 $SERVER_PID

- name: mike deploy docs
- run: mike deploy 3.2-SNAPSHOT dev -u
+ run: mike deploy test-version
3 changes: 2 additions & 1 deletion docs/user_guides/fs/feature_group/index.md
@@ -8,4 +8,5 @@ This section serves to provide guides and examples for the common usage of abstr
- [Data Types and Schema management](data_types.md)
- [Statistics](statistics.md)
- [Data Validation](data_validation.md)
- - [Feature Monitoring](feature_monitoring.md)
+ - [Feature Monitoring](feature_monitoring.md)
+ - [Time-To-Live (TTL)](ttl.md)
143 changes: 143 additions & 0 deletions docs/user_guides/fs/feature_group/ttl.md
@@ -0,0 +1,143 @@
## Feature Group TTL Usage Guide

Time-To-Live (TTL) automatically expires data in feature groups after a specified time period.
This guide explains when and how to use TTL in your feature groups.

### Use Case: When to Use TTL

TTL is particularly useful for feature groups that contain time-sensitive data that becomes stale or irrelevant after a certain period.
Common use cases include:

- **Regulatory compliance**: Data that must be automatically purged after a retention period for privacy or compliance reasons (e.g., GDPR, HIPAA)
- **Cost optimization**: Reducing storage costs by automatically removing outdated data that is no longer needed for model inference
- **Data freshness**: Ensuring that only recent, relevant data is available for online serving, preventing models from using stale features

For example, if you're building a recommendation system, you might want user interaction features (like "items viewed in the last hour") to automatically expire after 1 hour, ensuring your model only uses current, relevant data.

---

## Getting Started

### Creating a Feature Group with TTL

When creating a new feature group, you can enable TTL by specifying the `ttl` parameter.
The TTL value determines how long data remains in the feature group before it expires.
Expiry is measured from each row's `event_time` value.
Rows whose `event_time` is older than the TTL period are removed automatically.

```python
from datetime import datetime, timezone
import pandas as pd

# Assume you already have a feature store handle
# fs = ...

now = datetime.now(timezone.utc)
df = pd.DataFrame(
    {
        "id": [0, 1, 2],
        "timestamp": [now, now, now],
        "feature1": [10, 20, 30],
        "feature2": ["a", "b", "c"],
    }
)

# Create a feature group with TTL enabled (60 seconds)
fg = fs.create_feature_group(
    name="fg_ttl_example",
    version=1,
    primary_key=["id"],
    event_time="timestamp",
    online_enabled=True,
    ttl=60,  # TTL in seconds - data will expire after 60 seconds
)

fg.insert(
    df,
    write_options={
        "start_offline_materialization": False,
        "wait_for_online_ingestion": True,
    },
)

# After 60 seconds, reading online will return empty data
fg.read(online=True) # Returns empty DataFrame after TTL expires
```

For the full API reference, including the accepted types of TTL values, see the [FeatureStore.create_feature_group API documentation](https://docs.hopsworks.ai/hopsworks-api/{{{ hopsworks_version }}}/generated/api/feature_store_api/#create_feature_group).
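
The expiry rule described above can be sketched in plain Python.
This is only an illustration of the comparison between `event_time` and the TTL period, not the Hopsworks implementation: the helper name `is_expired` is hypothetical, and the exact boundary behaviour and the moment at which expired rows are physically removed are assumptions.

```python
from datetime import datetime, timedelta, timezone


def is_expired(event_time: datetime, ttl_seconds: float, now: datetime) -> bool:
    """Return True if the row's event_time is older than the TTL period (illustrative rule)."""
    return now - event_time > timedelta(seconds=ttl_seconds)


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
fresh = now - timedelta(seconds=30)  # 30 seconds old, inside a 60-second TTL
stale = now - timedelta(seconds=90)  # 90 seconds old, outside a 60-second TTL

print(is_expired(fresh, 60, now))  # False
print(is_expired(stale, 60, now))  # True
```

Because the rule depends on `event_time` rather than on insertion time, a row inserted with an old `event_time` may already be outside the TTL period when it is written.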

---

## Managing TTL on Existing Feature Groups

### Updating the TTL Value

You can change the TTL value for an existing feature group at any time.
This is useful when you need to adjust the retention period based on changing requirements.

```python
# Get your existing feature group
fg = fs.get_feature_group(
    name="fg_ttl_example",
    version=1,
)

# Update TTL to a new value (120 seconds = 2 minutes)
fg.enable_ttl(ttl=120)
```

After updating the TTL, the new retention period will apply to all future data insertions and will affect when existing data expires.
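
The examples in this guide pass the TTL as a number of seconds, which becomes hard to read for longer retention periods.
A minimal sketch of deriving that number with the standard library is shown below; `ttl_in_seconds` is a hypothetical helper, not part of the Hopsworks API.

```python
from datetime import timedelta


def ttl_in_seconds(**duration) -> int:
    """Convert timedelta keyword arguments (days, hours, minutes, ...) to whole seconds."""
    return int(timedelta(**duration).total_seconds())


print(ttl_in_seconds(minutes=2))  # 120
print(ttl_in_seconds(hours=1))    # 3600
print(ttl_in_seconds(days=30))    # 2592000
```

The result can then be passed wherever this guide uses a literal, for example `fg.enable_ttl(ttl=ttl_in_seconds(minutes=2))`.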

---

### Disabling and Re-enabling TTL

You can temporarily disable TTL on a feature group if you need to retain data indefinitely, and then re-enable it later.

#### Disabling TTL

```python
# Disable TTL - data will no longer expire automatically
fg.disable_ttl()
```

#### Re-enabling TTL

When re-enabling TTL, you have two options:

1. **Re-enable with the previous TTL value**: If you don't specify a TTL value, the feature group will use the last TTL value that was set.

    ```python
    # Re-enable TTL using the previous TTL value
    fg.enable_ttl()
    ```

2. **Re-enable with a new TTL value**: Specify a new TTL value when re-enabling.

    ```python
    # Re-enable TTL with a new value (90 seconds)
    fg.enable_ttl(ttl=90)
    ```

**Important**: If TTL was never set on the feature group before, you must provide a TTL value when enabling it.
Otherwise, TTL cannot be enabled.
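
The enable and disable rules above can be summarised with a small toy model.
It is a sketch for reasoning about the behaviour only: the class `TtlState` and the `ValueError` it raises are assumptions made for illustration, and the real client may report the missing value differently.

```python
from typing import Optional


class TtlState:
    """Toy model of the enable/disable rules (not the Hopsworks implementation)."""

    def __init__(self) -> None:
        self.enabled = False
        self.last_ttl: Optional[int] = None  # last TTL value that was set, if any

    def enable_ttl(self, ttl: Optional[int] = None) -> None:
        if ttl is None and self.last_ttl is None:
            # No previous value to fall back on, so a TTL must be supplied.
            raise ValueError("TTL was never set; provide a ttl value")
        if ttl is not None:
            self.last_ttl = ttl
        self.enabled = True

    def disable_ttl(self) -> None:
        # Disabling keeps the last value so that it can be reused later.
        self.enabled = False


state = TtlState()
state.enable_ttl(ttl=60)
state.disable_ttl()
state.enable_ttl()  # no value given, so the previous 60 seconds is reused
print(state.enabled, state.last_ttl)  # True 60
```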

---

### Enabling TTL on an Existing Feature Group

If you created a feature group without TTL initially, you can enable it later:

```python
# Get an existing feature group that was created without TTL
fg = fs.get_feature_group(
    name="fg_existing_no_ttl",
    version=1,
)

# Enable TTL for the first time (60 seconds)
fg.enable_ttl(ttl=60)
```

Once enabled, TTL will apply to all data in the feature group based on the `event_time` column.
For the full API reference, including the accepted types of TTL values and additional options, see the [FeatureGroup.enable_ttl API documentation](https://docs.hopsworks.ai/hopsworks-api/{{{ hopsworks_version }}}/generated/api/feature_group_api/#enable_ttl).
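
To check that TTL is taking effect, one option is to poll the online read until it comes back empty.
The helper below is a hedged sketch: `wait_until_empty` is a hypothetical name, it accepts any zero-argument callable whose result supports `len()`, and this guide does not state how soon after the TTL period expired rows disappear, so the timeout should be generous.

```python
import time
from typing import Callable, Sized


def wait_until_empty(read: Callable[[], Sized], timeout: float, interval: float = 1.0) -> bool:
    """Poll `read` until it returns an empty result or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if len(read()) == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in reader that returns rows twice and then an empty list.
responses = iter([[1, 2], [1], []])
print(wait_until_empty(lambda: next(responses), timeout=5, interval=0.01))  # True
```

With a real feature group, the callable could be `lambda: fg.read(online=True)`, as used earlier in this guide.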
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -87,6 +87,7 @@ nav:
- Notification: user_guides/fs/feature_group/notification.md
- On-Demand Transformations: user_guides/fs/feature_group/on_demand_transformations.md
- Online Ingestion Observability: user_guides/fs/feature_group/online_ingestion_observability.md
+ - Time-To-Live (TTL): user_guides/fs/feature_group/ttl.md
- Feature View:
- user_guides/fs/feature_view/index.md
- Overview: user_guides/fs/feature_view/overview.md