Commit e04d7d5

chore: Minor language improvements  (feast-dev#4878)
Minor english updates

Signed-off-by: Gilad Leifman <[email protected]>
1 parent 170e2f0 commit e04d7d5

1 file changed

+15
-15
lines changed

docs/README.md

Lines changed: 15 additions & 15 deletions
@@ -11,21 +11,21 @@ for historical feature extraction used in model training and an (2) [online stor
 for serving features at low-latency in production systems and applications.
 
 Feast is a configurable operational data system that re-uses existing infrastructure to manage and serve machine learning
-features to realtime models. For more details please review our [architecture](getting-started/architecture/overview.md).
+features to realtime models. For more details, please review our [architecture](getting-started/architecture/overview.md).
 
 Concretely, Feast provides:
 
-* A python SDK for programtically defining features, entities, sources, and (optionally) transformations
-* A python SDK for for reading and writing features to configured offline and online data stores
+* A Python SDK for programmatically defining features, entities, sources, and (optionally) transformations
+* A Python SDK for reading and writing features to configured offline and online data stores
 * An [optional feature server](reference/feature-servers/README.md) for reading and writing features (useful for non-python languages)
 * A [UI](reference/alpha-web-ui.md) for viewing and exploring information about features defined in the project
 * A [CLI tool](reference/feast-cli-commands.md) for viewing and updating feature information
 
 Feast allows ML platform teams to:
 
 * **Make features consistently available for training and low-latency serving** by managing an _offline store_ (to process historical data for scale-out batch scoring or model training), a low-latency _online store_ (to power real-time prediction)_,_ and a battle-tested _feature server_ (to serve pre-computed features online).
-* **Avoid data leakage** by generating point-in-time correct feature sets so data scientists can focus on feature engineering rather than debugging error-prone dataset joining logic. This ensure that future feature values do not leak to models during training.
-* **Decouple ML from data infrastructure** by providing a single data access layer that abstracts feature storage from feature retrieval, ensuring models remain portable as you move from training models to serving models, from batch models to realtime models, and from one data infra system to another.
+* **Avoid data leakage** by generating point-in-time correct feature sets so data scientists can focus on feature engineering rather than debugging error-prone dataset joining logic. This ensures that future feature values do not leak to models during training.
+* **Decouple ML from data infrastructure** by providing a single data access layer that abstracts feature storage from feature retrieval, ensuring models remain portable as you move from training models to serving models, from batch models to real-time models, and from one data infra system to another.
 
 {% hint style="info" %}
 **Note:** Feast today primarily addresses _timestamped structured data_.
@@ -44,26 +44,26 @@ serving system must make a request to the feature store to retrieve feature valu
 
 Feast helps ML platform/MLOps teams with DevOps experience productionize real-time models. Feast also helps these teams build a feature platform that improves collaboration between data engineers, software engineers, machine learning engineers, and data scientists.
 
-* *For Data Scientists*: Feast is a a tool where you can easily define, store, and retrieve your features for both model development and model deployment. By using Feast, you can focus on what you do best: build features that power your AI/ML models and maximize the value of your data.
-
+* *For Data Scientists*: Feast is a tool where you can easily define, store, and retrieve your features for both model development and model deployment. By using Feast, you can focus on what you do best: build features that power your AI/ML models and maximize the value of your data.
+   
 * *For MLOps Engineers*: Feast is a library that allows you to connect your existing infrastructure (e.g., online database, application server, microservice, analytical database, and orchestration tooling) that enables your Data Scientists to ship features for their models to production using a friendly SDK without having to be concerned with software engineering challenges that occur from serving real-time production systems. By using Feast, you can focus on maintaining a resilient system, instead of implementing features for Data Scientists.
-
-* *For Data Engineers*: Feast provides a centralized catalog for storing feature definitions allowing one to maintain a single source of truth for feature data. It provides the abstraction for reading and writing to many different types of offline and online data stores. Using either the provided python SDK or the feature server service, users can write data to the online and/or offline stores and then read that data out again in either low-latency online scenarios for model inference, or in batch scenarios for model training.
+   
+* *For Data Engineers*: Feast provides a centralized catalog for storing feature definitions, allowing one to maintain a single source of truth for feature data. It provides the abstraction for reading and writing to many different types of offline and online data stores. Using either the provided Python SDK or the feature server service, users can write data to the online and/or offline stores and then read that data out again in either low-latency online scenarios for model inference, or in batch scenarios for model training.
 
 * *For AI Engineers*: Feast provides a platform designed to scale your AI applications by enabling seamless integration of richer data and facilitating fine-tuning. With Feast, you can optimize the performance of your AI models while ensuring a scalable and efficient data pipeline.
 
 ## What Feast is not?
 
 ### Feast is not
 
-* **an** [**ETL**](https://en.wikipedia.org/wiki/Extract,\_transform,\_load) / [**ELT**](https://en.wikipedia.org/wiki/Extract,\_load,\_transform) **system.** Feast is not a general purpose data pipelining system. Users often leverage tools like [dbt](https://www.getdbt.com) to manage upstream data transformations. Feast does support some [transformations](getting-started/architecture/feature-transformetion.md).
-* **a data orchestration tool:** Feast does not manage or orchestrate complex workflow DAGs. It relies on upstream data pipelines to produce feature values and integrations with tools like [Airflow](https://airflow.apache.org) to make features consistently available.
-* **a data warehouse:** Feast is not a replacement for your data warehouse or the source of truth for all transformed data in your organization. Rather, Feast is a light-weight downstream layer that can serve data from an existing data warehouse (or other data sources) to models in production.
-* **a database:** Feast is not a database, but helps manage data stored in other systems (e.g. BigQuery, Snowflake, DynamoDB, Redis) to make features consistently available at training / serving time
+* **An** [**ETL**](https://en.wikipedia.org/wiki/Extract,\_transform,\_load) / [**ELT**](https://en.wikipedia.org/wiki/Extract,\_load,\_transform) **system.** Feast is not a general purpose data pipelining system. Users often leverage tools like [dbt](https://www.getdbt.com) to manage upstream data transformations. Feast does support some [transformations](getting-started/architecture/feature-transformetion.md).
+* **A data orchestration tool:** Feast does not manage or orchestrate complex workflow DAGs. It relies on upstream data pipelines to produce feature values and integrations with tools like [Airflow](https://airflow.apache.org) to make features consistently available.
+* **A data warehouse:** Feast is not a replacement for your data warehouse or the source of truth for all transformed data in your organization. Rather, Feast is a lightweight downstream layer that can serve data from an existing data warehouse (or other data sources) to models in production.
+* **A database:** Feast is not a database, but helps manage data stored in other systems (e.g. BigQuery, Snowflake, DynamoDB, Redis) to make features consistently available at training / serving time
 
 ### Feast does not _fully_ solve
 * **reproducible model training / model backtesting / experiment management**: Feast captures feature and model metadata, but does not version-control datasets / labels or manage train / test splits. Other tools like [DVC](https://dvc.org/), [MLflow](https://www.mlflow.org/), and [Kubeflow](https://www.kubeflow.org/) are better suited for this.
-* **batch feature engineering**: Feast supports on demand and streaming transformations. Feast is also investing in supporting batch transformations.
+* **batch feature engineering**: Feast supports on-demand and streaming transformations. Feast is also investing in supporting batch transformations.
 * **native streaming feature integration:** Feast enables users to push streaming features, but does not pull from streaming sources or manage streaming pipelines.
 * **lineage:** Feast helps tie feature values to model versions, but is not a complete solution for capturing end-to-end lineage from raw data sources to model versions. Feast also has community contributed plugins with [DataHub](https://datahubproject.io/docs/generated/ingestion/sources/feast/) and [Amundsen](https://github.com/amundsen-io/amundsen/blob/4a9d60176767c4d68d1cad5b093320ea22e26a49/databuilder/databuilder/extractor/feast\_extractor.py).
 * **data quality / drift detection**: Feast has experimental integrations with [Great Expectations](https://greatexpectations.io/), but is not purpose built to solve data drift / data quality issues. This requires more sophisticated monitoring across data pipelines, served feature values, labels, and model versions.
@@ -75,7 +75,7 @@ Many companies have used Feast to power real-world ML use cases such as:
 * Personalizing online recommendations by leveraging pre-computed historical user or item features.
 * Online fraud detection, using features that compare against (pre-computed) historical transaction patterns
 * Churn prediction (an offline model), generating feature values for all users at a fixed cadence in batch
-* Credit scoring, using pre-computed historical features to compute probability of default
+* Credit scoring, using pre-computed historical features to compute the probability of default
 
 ## How can I get started?
 