
Commit 2fef7d9

Ambient Code Bot and Claude committed
docs: Add ARCHITECTURE.md and Architecture Decision Records
Add top-level architecture documentation and ADRs for the Feast project to improve codebase understanding for contributors and AI agents.

ARCHITECTURE.md covers:
- System overview with architecture diagram
- Component overview table (Registry, Provider, Offline/Online Store, Compute Engine)
- Core concepts (Entity, FeatureView, FeatureService, DataSource, Permission)
- Key abstractions (FeatureStore, Provider, OfflineStore, OnlineStore, Registry, ComputeEngine)
- Feature Server endpoints
- Permissions and authorization system
- CLI commands
- Kubernetes Operator structure
- Protobuf definitions layout
- Multi-language SDK overview
- Directory structure map
- Data flow diagrams (Training, Serving, Push-Based)
- Extension points guide

ADRs (docs/adr/):
- 001: Pluggable Offline and Online Store Architecture
- 002: Registry as Serialized Protobuf Metadata Store
- 003: PassthroughProvider as Universal Provider
- 004: Compute Engine Abstraction for Materialization
- 005: Push-Based Feature Ingestion
- 006: Fine-Grained Permissions and Authorization
- 007: Kubernetes Operator for Feast Deployment

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ambient Code Bot <bot@ambient-code.local>
1 parent c3332dc commit 2fef7d9

9 files changed: +580 −0 lines changed

ARCHITECTURE.md

Lines changed: 248 additions & 0 deletions
@@ -0,0 +1,248 @@
# Feast Architecture

This document describes the high-level architecture of Feast, the open-source feature store for machine learning. It is intended for contributors, AI agents, and anyone who needs to understand how the codebase is organized.

## System Overview

Feast manages the lifecycle of ML features: from batch data sources through offline storage, materialization into online stores, and low-latency serving for real-time inference. The system is designed around pluggable backends—every storage layer, compute engine, and registry can be swapped independently.

![Feast Architecture](docs/.gitbook/assets/feast-marchitecture-211014.png)

### Component Overview

| Layer | Component | Implementations |
|-------|-----------|-----------------|
| **SDK / CLI** | `feast.feature_store` | Python, Go, Java |
| **Registry** | Metadata catalog | File (S3/GCS), SQL (Postgres/MySQL/SQLite) |
| **Provider** | Orchestrator | PassthroughProvider |
| **Offline Store** | Historical retrieval | BigQuery, Snowflake, Redshift, Spark, DuckDB, Postgres, Trino, Athena |
| **Online Store** | Low-latency serving | Redis, DynamoDB, Bigtable, Postgres, SQLite, Cassandra, Milvus, Qdrant |
| **Compute Engine** | Materialization jobs | Local, Spark, Kubernetes, Ray, Snowflake, AWS Lambda |
## Core Concepts

| Concept | Description | Definition File |
|---------|-------------|-----------------|
| **Entity** | A real-world object (user, product) that features describe | `sdk/python/feast/entity.py` |
| **FeatureView** | A group of features sourced from a single data source | `sdk/python/feast/feature_view.py` |
| **OnDemandFeatureView** | Features computed at request time via transformations | `sdk/python/feast/on_demand_feature_view.py` |
| **StreamFeatureView** | Features derived from streaming data sources | `sdk/python/feast/stream_feature_view.py` |
| **FeatureService** | A named collection of feature views for a use case | `sdk/python/feast/feature_service.py` |
| **DataSource** | Connection to raw data (file, warehouse, stream) | `sdk/python/feast/data_source.py` |
| **Permission** | Authorization policy controlling access to resources | `sdk/python/feast/permissions/permission.py` |
## Key Abstractions

### FeatureStore (`sdk/python/feast/feature_store.py`)

The main entry point for all SDK operations. Users interact with Feast through this class:

- `apply()` — register feature definitions in the registry
- `get_historical_features()` — point-in-time correct feature retrieval for training
- `get_online_features()` — low-latency feature retrieval for inference
- `materialize()` / `materialize_incremental()` — copy features from offline to online store
- `push()` — push features directly to the online store
- `teardown()` — remove infrastructure

### Provider (`sdk/python/feast/infra/provider.py`)

Orchestrates the offline store, online store, and compute engine. All cloud providers (GCP, AWS, Azure, local) use `PassthroughProvider`, which delegates directly to the configured store implementations.
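The delegation pattern can be sketched in a few lines of plain Python. This is an illustration of the idea, not Feast's actual classes or signatures — the store and method names here are simplified stand-ins:

```python
# Illustrative sketch of a passthrough provider: every call is forwarded
# to whatever store implementation the configuration selected.

class InMemoryOnlineStore:
    """Toy online store backend (stand-in for Redis, DynamoDB, etc.)."""

    def __init__(self):
        self._data = {}

    def online_write_batch(self, rows):
        for key, features in rows:
            self._data[key] = features

    def online_read(self, keys):
        return [self._data.get(k) for k in keys]


class PassthroughProvider:
    """Delegates directly to the configured store — no cloud logic of its own."""

    def __init__(self, online_store):
        self.online_store = online_store

    def online_write_batch(self, rows):
        self.online_store.online_write_batch(rows)

    def online_read(self, keys):
        return self.online_store.online_read(keys)


provider = PassthroughProvider(InMemoryOnlineStore())
provider.online_write_batch([("user:1", {"clicks": 7})])
print(provider.online_read(["user:1", "user:2"]))  # [{'clicks': 7}, None]
```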
### OfflineStore (`sdk/python/feast/infra/offline_stores/offline_store.py`)

Abstract base class for historical feature retrieval. Key methods:

- `get_historical_features()` — point-in-time join of features with entity timestamps
- `pull_latest_from_table_or_query()` — extract latest entity rows for materialization
- `pull_all_from_table_or_query()` — extract all rows in a time range
- `offline_write_batch()` — write features to the offline store

Implementations: BigQuery, Snowflake, Redshift, Spark, Dask, DuckDB, Postgres, Trino, Athena, and more under `infra/offline_stores/contrib/`.
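The point-in-time join semantics can be illustrated with plain Python (a sketch of the behavior, not Feast's implementation — real backends push this join down into SQL or a dataframe engine): for each entity row, pick the latest feature value whose event timestamp does not exceed the entity timestamp.

```python
# Point-in-time join sketch over toy data.
feature_rows = [  # (entity_id, event_timestamp, features)
    ("u1", 10, {"amount": 5.0}),
    ("u1", 20, {"amount": 9.0}),
    ("u2", 15, {"amount": 3.0}),
]

entity_df = [("u1", 15), ("u1", 25), ("u2", 10)]  # (entity_id, timestamp)

def point_in_time_join(entity_df, feature_rows):
    joined = []
    for entity_id, ts in entity_df:
        # Only feature values observed at or before the entity timestamp
        # are eligible — this is what prevents training/serving leakage.
        candidates = [
            (ev_ts, feats)
            for eid, ev_ts, feats in feature_rows
            if eid == entity_id and ev_ts <= ts
        ]
        feats = max(candidates)[1] if candidates else None
        joined.append((entity_id, ts, feats))
    return joined

print(point_in_time_join(entity_df, feature_rows))
# [('u1', 15, {'amount': 5.0}), ('u1', 25, {'amount': 9.0}), ('u2', 10, None)]
```

Note that `u1` at timestamp 15 gets the value observed at 10, not the later value from 20 — future data never leaks into training rows.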
### OnlineStore (`sdk/python/feast/infra/online_stores/online_store.py`)

Abstract base class for low-latency feature serving. Key methods:

- `online_read()` — read features by entity keys
- `online_write_batch()` — write materialized features
- `update()` — create/update cloud resources
- `retrieve_online_documents()` — vector similarity search (for embedding stores)

Implementations: Redis, DynamoDB, Bigtable, Snowflake, SQLite, Postgres, Cassandra, MongoDB, MySQL, Elasticsearch, Milvus, Qdrant, and a HybridOnlineStore that combines multiple backends.
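The shape of the contract can be sketched with `abc` — method names follow the list above, but the signatures are simplified stand-ins (the real base class takes richer arguments such as config and feature view objects):

```python
import abc

class OnlineStore(abc.ABC):
    """Simplified sketch of the online store contract."""

    @abc.abstractmethod
    def online_write_batch(self, table, rows): ...

    @abc.abstractmethod
    def online_read(self, table, entity_keys): ...


class DictOnlineStore(OnlineStore):
    """Toy backend: one dict per feature view table."""

    def __init__(self):
        self._tables = {}

    def online_write_batch(self, table, rows):
        self._tables.setdefault(table, {}).update(rows)

    def online_read(self, table, entity_keys):
        t = self._tables.get(table, {})
        return [t.get(k) for k in entity_keys]


store = DictOnlineStore()
store.online_write_batch("driver_stats", {"d1": {"trips": 12}})
print(store.online_read("driver_stats", ["d1"]))  # [{'trips': 12}]
```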
### Registry (`sdk/python/feast/infra/registry/`)

The metadata catalog that stores all feature definitions (entities, feature views, feature services, permissions). Two main implementations:

- **FileRegistry** (`registry.py`) — serializes the entire registry as a single protobuf file, stored on local disk, S3, GCS, or Azure Blob. Uses `RegistryStore` backends for storage.
- **SqlRegistry** (`sql.py`) — stores metadata in a SQL database (PostgreSQL, MySQL, SQLite).
### ComputeEngine (`sdk/python/feast/infra/compute_engines/base.py`)

Abstract base class for materialization — the process of copying features from the offline store to the online store. Key method:

- `materialize()` — execute materialization tasks, each representing a (feature_view, time_range) pair

Implementations: Local (single-machine), Spark, Kubernetes (K8s Jobs), Ray, Snowflake (SQL-based), AWS Lambda.

The compute engine also includes a DAG module (`compute_engines/dag/`) for building execution plans with nodes, values, and contexts.
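The core of a materialization task can be sketched in a few lines (toy data and simplified logic — the real engines do this at scale, partitioned and parallelized): take the latest row per entity key within the task's time range, ready to be written to the online store.

```python
# Materialization sketch: reduce an "offline" event log to the latest
# value per entity key within a (feature_view, time_range) task.

offline_rows = [  # (entity_id, event_timestamp, features)
    ("u1", 10, {"clicks": 1}),
    ("u1", 20, {"clicks": 4}),
    ("u2", 5, {"clicks": 2}),
]

def materialize(offline_rows, start, end):
    latest = {}
    for entity_id, ts, feats in offline_rows:
        in_range = start <= ts <= end
        if in_range and (entity_id not in latest or ts > latest[entity_id][0]):
            latest[entity_id] = (ts, feats)
    # The result is what online_write_batch() would receive.
    return {eid: feats for eid, (ts, feats) in latest.items()}

print(materialize(offline_rows, start=0, end=15))
# {'u1': {'clicks': 1}, 'u2': {'clicks': 2}}
```

Running the same task with `end=30` would instead pick up `u1`'s later row — which is exactly what `materialize_incremental()` automates by tracking the last materialized timestamp.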
### Feature Server (`sdk/python/feast/feature_server.py`)

A FastAPI application that exposes Feast operations over HTTP:

- `POST /get-online-features` — retrieve online features
- `POST /push` — push features to online/offline stores
- `POST /materialize` — trigger materialization
- `POST /materialize-incremental` — incremental materialization

Started via the `feast serve` CLI command.
## Permissions and Authorization

The permissions system (`sdk/python/feast/permissions/`) provides fine-grained access control:

| Component | File | Purpose |
|-----------|------|---------|
| `Permission` | `permission.py` | Policy definition (resource type + action + roles) |
| `SecurityManager` | `security_manager.py` | Runtime permission enforcement |
| `AuthManager` | `auth/auth_manager.py` | Token extraction and parsing |
| `AuthConfig` | `auth_model.py` | Auth configuration (OIDC, Kubernetes, NoAuth) |

Auth flow: Client sends token → AuthManager extracts identity → SecurityManager checks Permission policies → access granted or denied.
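The final policy check in that flow reduces to a role-intersection test, sketched below (names and data shapes are illustrative, not Feast's actual signatures):

```python
# Illustrative permission check: a policy grants an action on a resource
# type to a set of roles; access requires a non-empty role intersection.

permissions = [
    {"resource": "FeatureView", "action": "read", "roles": {"analyst", "admin"}},
    {"resource": "FeatureView", "action": "write", "roles": {"admin"}},
]

def is_allowed(user_roles, resource, action):
    return any(
        p["resource"] == resource
        and p["action"] == action
        and p["roles"] & user_roles  # shared role → grant
        for p in permissions
    )

print(is_allowed({"analyst"}, "FeatureView", "read"))   # True
print(is_allowed({"analyst"}, "FeatureView", "write"))  # False
```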
Server-side enforcement is implemented for REST (`permissions/server/rest.py`), gRPC (`permissions/server/grpc.py`), and Arrow Flight protocols. Client-side interceptors handle token injection for each transport.

## CLI

The Feast CLI (`sdk/python/feast/cli/cli.py`) is built with Click and provides commands for:

- `feast apply` — register feature definitions
- `feast materialize` / `feast materialize-incremental` — run materialization
- `feast serve` — start the feature server
- `feast plan` — preview changes before applying
- `feast teardown` — remove infrastructure
- `feast init` — scaffold a new feature repository
## Kubernetes Operator

The Feast Operator (`infra/feast-operator/`) is a Go-based Kubernetes operator built with controller-runtime (Kubebuilder):

| Component | Location | Purpose |
|-----------|----------|---------|
| CRD (`FeatureStore`) | `api/v1/featurestore_types.go` | Custom Resource Definition |
| Reconciler | `internal/controller/featurestore_controller.go` | Main control loop |
| Service handlers | `internal/controller/services/` | Manage Deployments, Services, ConfigMaps |
| AuthZ | `internal/controller/authz/` | RBAC/authorization setup |

The operator watches `FeatureStore` custom resources and reconciles Deployments, Services, ConfigMaps, Secrets, CronJobs, and HPAs to run Feast components in Kubernetes.

- **Phases**: Ready, Pending, Failed
- **Conditions**: ClientReady, OfflineStoreReady, OnlineStoreReady, RegistryReady, UIReady, AuthorizationReady, CronJobReady
## Protobuf Definitions

All cross-language data models and service interfaces are defined in Protocol Buffers (`protos/feast/`):

```
protos/feast/
├── core/     # Data models: Entity, FeatureView, FeatureService, Permission, Registry
├── serving/  # ServingService, TransformationService gRPC APIs
├── registry/ # RegistryServer gRPC API
├── storage/  # Redis storage format
└── types/    # Primitive types: Value, EntityKey, Field
```

Protos are compiled to Python (`make compile-protos-python`), Go (`make compile-protos-go`), and Java.
## Multi-Language SDKs

| SDK | Location | Purpose |
|-----|----------|---------|
| **Python** | `sdk/python/` | Primary SDK — full feature store implementation |
| **Go** | `go/` | Embedded online feature retrieval |
| **Java** | `java/` | Serving client and feature server |

The Python SDK is the canonical implementation. The Go and Java SDKs provide serving capabilities and client libraries.
## Directory Structure

```
feast/
├── sdk/python/feast/          # Python SDK (primary implementation)
│   ├── cli/                   # CLI commands (Click)
│   ├── infra/                 # Infrastructure abstractions
│   │   ├── offline_stores/    # Offline store implementations
│   │   ├── online_stores/     # Online store implementations
│   │   ├── compute_engines/   # Materialization engines
│   │   ├── registry/          # Registry implementations
│   │   ├── feature_servers/   # Feature server deployments
│   │   └── common/            # Shared infra code
│   ├── permissions/           # Authorization system
│   ├── transformation/        # Feature transformations
│   ├── templates/             # Project templates
│   └── feature_store.py       # Main FeatureStore class
├── go/                        # Go SDK
├── java/                      # Java SDK (serving + client)
├── protos/                    # Protocol Buffer definitions
├── ui/                        # React/TypeScript web UI
├── infra/                     # Infrastructure and deployment
│   ├── feast-operator/        # Kubernetes operator (Go)
│   ├── charts/                # Helm charts
│   ├── scripts/               # Build and release scripts
│   ├── terraform/             # Cloud infrastructure (IaC)
│   └── templates/             # Configuration templates
├── docs/                      # Documentation (GitBook)
├── examples/                  # Example feature repositories
└── Makefile                   # Build targets (80+ targets)
```
## Data Flow

### Training (Offline)

```
Data Source → OfflineStore.get_historical_features() → Point-in-Time Join → Training DataFrame
```

1. User defines `FeatureView` + `Entity` + `DataSource`
2. User calls `store.get_historical_features(entity_df, features)`
3. OfflineStore performs point-in-time join against the data source
4. Returns a `RetrievalJob` that materializes to a DataFrame or Arrow table
### Serving (Online)

```
OfflineStore → ComputeEngine.materialize() → OnlineStore → FeatureServer → Inference
```

1. `feast materialize` triggers the compute engine
2. ComputeEngine reads latest values from the offline store
3. Values are written to the online store via `OnlineStore.online_write_batch()`
4. Feature server or SDK reads from online store via `OnlineStore.online_read()`
### Push-Based Ingestion

```
Application → FeatureStore.push() → OnlineStore (+ optionally OfflineStore)
```

Features can be pushed directly without materialization, which is useful for streaming or real-time features.
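The push path can be sketched as follows (toy in-memory stores, illustrative names — the real `push()` routes through the configured store implementations and accepts a push mode selecting online, offline, or both):

```python
# Push-based ingestion sketch: write a feature row straight into the
# online view, optionally also appending to the offline log, skipping
# the batch materialization path entirely.

online_view = {}
offline_log = []

def push(entity_id, features, to_offline=False):
    online_view[entity_id] = features              # immediately servable
    if to_offline:
        offline_log.append((entity_id, features))  # retained for training

push("u1", {"last_txn_amount": 42.0}, to_offline=True)
print(online_view["u1"])  # {'last_txn_amount': 42.0}
print(len(offline_log))   # 1
```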
## Extension Points

Feast is designed for extensibility. To add a new backend:

1. **Offline Store**: Subclass `OfflineStore` and `OfflineStoreConfig` in `infra/offline_stores/contrib/`
2. **Online Store**: Subclass `OnlineStore` and `OnlineStoreConfig` in `infra/online_stores/`
3. **Compute Engine**: Subclass `ComputeEngine` in `infra/compute_engines/`
4. **Registry Store**: Subclass `RegistryStore` in `infra/registry/`

Register the new implementation in `RepoConfig` (see `repo_config.py` for the class resolution logic).
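The class resolution mechanism is, in spirit, a dotted-path import at runtime. The sketch below uses a stdlib class so it runs anywhere; the helper name is illustrative, not the actual function in `repo_config.py`:

```python
import importlib

def get_class_from_type(class_path: str):
    """Resolve a dotted 'module.ClassName' string to the class object."""
    module_name, _, class_name = class_path.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)

# A config string like "my_pkg.MyOnlineStore" resolves the same way.
cls = get_class_from_type("collections.OrderedDict")
print(cls.__name__)  # OrderedDict
```

This is why a new backend needs no changes to core code: once the class is importable, a configuration string is enough to select it.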
## Related Documents

- [Development Guide](docs/project/development-guide.md) — build, test, and debug instructions
- [ADR Index](docs/adr/README.md) — architecture decision records
- [Operator README](infra/feast-operator/README.md) — Kubernetes operator documentation
- [Helm Charts](infra/charts/) — deployment configuration
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
# ADR-001: Pluggable Offline and Online Store Architecture

## Status

Accepted (January 2021)

## Context

Feast needs to support a wide variety of data infrastructure backends. Different organizations use different data warehouses (BigQuery, Snowflake, Redshift, Spark) for historical feature storage and different databases (Redis, DynamoDB, Bigtable, Postgres) for low-latency serving. A monolithic approach would require every user to install dependencies for all backends and would make it difficult for the community to contribute new integrations.

## Decision

Define abstract base classes `OfflineStore` and `OnlineStore` that declare the interface each backend must implement. Each backend is a separate module that can be selected via `RepoConfig`. Contributed backends live under `infra/offline_stores/contrib/` to keep them separate from core-maintained implementations.

Key interface methods:

- **OfflineStore**: `get_historical_features()`, `pull_latest_from_table_or_query()`, `pull_all_from_table_or_query()`, `offline_write_batch()`
- **OnlineStore**: `online_read()`, `online_write_batch()`, `update()`, `teardown()`, `retrieve_online_documents()`

Backend selection is done via string identifiers in `feature_store.yaml` (e.g., `offline_store: bigquery`), which are resolved to Python classes at runtime through `RepoConfig`.
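For instance, a `feature_store.yaml` along these lines selects backends by identifier (values are illustrative; consult each store's documentation for its exact config fields):

```yaml
project: my_project
registry: data/registry.db
provider: local
offline_store:
  type: bigquery
online_store:
  type: redis
  connection_string: localhost:6379
```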
## Consequences

**Positive:**

- Users only install dependencies for their chosen backends
- New backends can be added without modifying core code
- Community contributions are isolated under `contrib/`
- Configuration-driven backend selection simplifies deployment

**Negative:**

- Interface changes require updates across all implementations
- Testing matrix grows with each new backend
- Contributed backends may have inconsistent quality or maintenance levels

## References

- `sdk/python/feast/infra/offline_stores/offline_store.py` — OfflineStore base class
- `sdk/python/feast/infra/online_stores/online_store.py` — OnlineStore base class
- `sdk/python/feast/repo_config.py` — Backend resolution logic

docs/adr/002-registry-design.md

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
# ADR-002: Registry as Serialized Protobuf Metadata Store

## Status

Accepted (January 2021)

## Context

Feast needs a metadata catalog to store feature definitions (entities, feature views, feature services, data sources, permissions). This registry must be accessible from multiple environments (local development, CI/CD, production serving) and should not require heavy infrastructure for simple deployments.

## Decision

Implement the registry as a single serialized Protocol Buffer file (`Registry.proto`) that can be stored on local disk or cloud object storage (S3, GCS, Azure Blob). This is the `FileRegistry` implementation, backed by pluggable `RegistryStore` classes.

For production deployments needing concurrent access and transactional updates, provide `SqlRegistry` as an alternative that stores metadata in a SQL database (PostgreSQL, MySQL, SQLite).

Both implementations share the `BaseRegistry` abstract interface, ensuring consistent behavior regardless of backend.

**Registry store backends:**

- `FileRegistryStore` — local filesystem
- `S3RegistryStore` — Amazon S3
- `GCSRegistryStore` — Google Cloud Storage
- `AzureRegistryStore` — Azure Blob Storage
- `HDFSRegistryStore` — Hadoop HDFS
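The single-file pattern can be sketched as follows. JSON stands in for the Registry protobuf here, and the two functions stand in for the `RegistryStore` role — this is an illustration of the design, not Feast's code:

```python
import json
import os
import tempfile

# Single-file registry sketch: the whole catalog is one serialized blob,
# read and written as a unit by a pluggable storage backend.

def registry_store_write(path, registry):
    """Stand-in for a RegistryStore backend's write (e.g. local filesystem)."""
    with open(path, "w") as f:
        json.dump(registry, f)

def registry_store_read(path):
    """Stand-in for a RegistryStore backend's read."""
    with open(path) as f:
        return json.load(f)

path = os.path.join(tempfile.mkdtemp(), "registry.json")
registry = {"entities": ["driver"], "feature_views": ["driver_stats"]}
registry_store_write(path, registry)
print(registry_store_read(path)["entities"])  # ['driver']
```

Swapping local disk for S3 or GCS only changes where the blob lives — which is exactly why the design scales from laptop to cloud, and also why concurrent writers get last-writer-wins semantics (see Consequences below).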
## Consequences

**Positive:**

- Zero-infrastructure setup for local development (SQLite file)
- Cloud-native storage for production (S3/GCS)
- SQL backend provides transactional semantics for concurrent access
- Protobuf serialization ensures cross-language compatibility

**Negative:**

- FileRegistry has no built-in concurrency control (last-writer-wins)
- Full registry serialization/deserialization on every read (mitigated by TTL-based caching)
- Two distinct implementations to maintain (File and SQL)

## References

- `sdk/python/feast/infra/registry/registry.py` — FileRegistry implementation
- `sdk/python/feast/infra/registry/sql.py` — SqlRegistry implementation
- `sdk/python/feast/infra/registry/base_registry.py` — BaseRegistry interface
- `protos/feast/core/Registry.proto` — Registry protobuf definition
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
# ADR-003: PassthroughProvider as Universal Provider

## Status

Accepted (June 2021)

## Context

The `Provider` abstraction was originally intended to encapsulate cloud-specific logic. Separate providers existed for GCP, AWS, Azure, and local deployments. However, the pluggable offline/online store architecture (ADR-001) already handles backend-specific logic, making separate providers redundant. Maintaining multiple providers with nearly identical code increased maintenance burden.

## Decision

Collapse all cloud-specific providers into a single `PassthroughProvider` that delegates all operations to the configured offline store, online store, and compute engine. The provider string in configuration (`gcp`, `aws`, `azure`, `local`) still exists for backward compatibility, but all values resolve to `PassthroughProvider`.

```python
PROVIDERS_CLASS_FOR_TYPE = {
    "gcp": "feast.infra.passthrough_provider.PassthroughProvider",
    "aws": "feast.infra.passthrough_provider.PassthroughProvider",
    "local": "feast.infra.passthrough_provider.PassthroughProvider",
    "azure": "feast.infra.passthrough_provider.PassthroughProvider",
}
```
## Consequences

**Positive:**

- Single provider implementation to maintain
- Backend-specific logic lives where it belongs (in store implementations)
- Reduced code duplication across providers
- Simpler mental model for contributors

**Negative:**

- The `Provider` abstraction is now a thin orchestration layer with minimal logic
- The provider config field is still required but functionally meaningless (any value maps to the same class)

## References

- `sdk/python/feast/infra/provider.py` — Provider base class and type mapping
- `sdk/python/feast/infra/passthrough_provider.py` — PassthroughProvider implementation
