Update index.md Predictions Data Requirements update

lizkane222 · web-flow · commit c96724aca76e · 2024-06-11T19:15:27.000-07:00
diff --git a/src/unify/Traits/predictions/index.md b/src/unify/Traits/predictions/index.md
@@ -54,11 +54,34 @@ When you build a Custom Predictive Goal, you'll first need to select a cohort, o
 
 The target event is the Segment event that you want to predict. In creating a prediction, Segment determines the likelihood of the user performing the target event. Segment lets you include up to two target events and an event property in your prediction.
 
-#### Data requirements
-
-Segment doesn't enforce data requirements for predictions. In machine learning, however, data quality and quantity are critical. Segment recommends that you make predictions for at least 50,000 users and choose a target event that at least 5,000 users have performed in the last 30 days. 
-
-You can create predictions outside of these suggestions, but your results may vary.
+### Data requirements
+
+As with all machine learning, better data produces better predictions. Because trust and performance are Segment's top priorities, Segment runs a number of data checks to ensure that the Predictions it makes are high quality and reliable.
+Where possible, Segment provides this guidance in the UI before you create a trait, but some checks can only run once model training begins. If a trait fails, the UI shows an error message and description. The sections below list Segment's best practices, data requirements, and service limits for Predictions.
+
+#### Definitions
+- **Feature Window**: The past time period whose data the model uses for training.
+- **Target Window**: The time horizon over which you want to make a prediction. You can select this in the UI for each trait.
+- **Target Event**: The event that you want to predict the likelihood of a customer performing.
+  For example, to create a propensity to purchase over the next 30 days, set the Target Window to 30 days and the Target Event to `Order Completed` (or whichever purchase event you track).
+
+#### To get access to Predictions, you must:
+- Track fewer than 100 million users in the Engage Space.
+- Track fewer than 5,000 event types. _The number of event types is the total number of distinct events seen across all users in an Engage Space within the past 15 days._
+  - If you track more than 5,000 distinct events, stop tracking enough events to drop below this limit, then wait about 15 days before you try to create your first prediction.
+  - An event becomes inactive once it has not been sent to an Engage Space for 15 days.
+  - To prevent events from reaching your Engage Space, set `integrations.Personas` to `false` in your event payloads.
+    - For more information on the integrations object, see [Spec: Common Fields](https://segment.com/docs/connections/spec/common/#context:~:text=In%20more%20detail%20these%20common%20fields,Destinations%20field%20docs%20for%20more%20details.), [Integrations](https://segment.com/docs/connections/spec/common/#context:~:text=Kotlin-,Integrations,be%20sent%20to%20rest%20of%20the%20destinations%20that%20can%20accept%20it.,-Timestamps), and [Filtering with the Integrations object](https://segment.com/docs/guides/filtering-data/#filtering-with-the-integrations-object).
+    - Analytics.js example: `analytics.track("Button Clicked", {button: "submit form"}, {integrations: {Personas: false}})`.
+- Track more than 1 event type.
+
+#### To have a trait compute successfully, you must:
+- Have at least 5 different event types tracked in the Feature Window.
+- Have data for these 5 event types that spans at least 1.5x the length of the Target Window in the past.
+  - For example, to create a propensity to purchase in the next 60 days, you need at least 90 days of historical data.
+- Have more than 1 non-anonymous user in the audience, if you're making a prediction for a smaller subset audience.
+- Have at least 100 users who performed the Target Event.
+- Have at least 100 users who did not perform the Target Event.
 
 > info "Predictive Traits and anonymous events"
 > Predictive Traits are limited to non-anonymous events, which means you'll need to include an additional `external_id` other than `anonymousId` in the targeted events. If want to create Predictive Traits based on anonymous events, reach out to your CSM with your use case for creating an anonymous Predictive Trait and the conditions for trait.
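
The compute requirements added in this diff lend themselves to a quick pre-flight check before creating a trait. The sketch below is a hypothetical illustration only: `checkPredictionReadiness` and the fields of its `stats` argument are invented names, not part of any Segment API, and it assumes you have already gathered the counts yourself (for example, from a warehouse query). The thresholds are copied from the requirements listed in the diff.

```javascript
// Hypothetical pre-flight check. None of these names belong to a Segment API.
// Thresholds mirror the "To have a trait compute successfully" list in the diff.
const MIN_EVENT_TYPES = 5;        // event types tracked in the Feature Window
const HISTORY_MULTIPLIER = 1.5;   // history must span 1.5x the Target Window
const MIN_USERS_PER_CLASS = 100;  // users who did / did not perform the Target Event

function checkPredictionReadiness(stats) {
  const problems = [];
  const requiredHistoryDays = stats.targetWindowDays * HISTORY_MULTIPLIER;

  if (stats.eventTypesInFeatureWindow < MIN_EVENT_TYPES) {
    problems.push(
      `need at least ${MIN_EVENT_TYPES} event types, found ${stats.eventTypesInFeatureWindow}`
    );
  }
  if (stats.historyDays < requiredHistoryDays) {
    problems.push(
      `need at least ${requiredHistoryDays} days of history, found ${stats.historyDays}`
    );
  }
  if (stats.usersWithTargetEvent < MIN_USERS_PER_CLASS) {
    problems.push(
      `need at least ${MIN_USERS_PER_CLASS} users performing the Target Event, found ${stats.usersWithTargetEvent}`
    );
  }
  if (stats.usersWithoutTargetEvent < MIN_USERS_PER_CLASS) {
    problems.push(
      `need at least ${MIN_USERS_PER_CLASS} users not performing the Target Event, found ${stats.usersWithoutTargetEvent}`
    );
  }
  // Only relevant when the prediction targets a smaller subset audience.
  if (
    stats.audienceNonAnonymousUsers !== undefined &&
    stats.audienceNonAnonymousUsers <= 1
  ) {
    problems.push("the audience must contain more than 1 non-anonymous user");
  }

  return { ok: problems.length === 0, requiredHistoryDays, problems };
}

// Example: a 60-day propensity-to-purchase trait with only 75 days of history.
const result = checkPredictionReadiness({
  targetWindowDays: 60,
  historyDays: 75,
  eventTypesInFeatureWindow: 8,
  usersWithTargetEvent: 450,
  usersWithoutTargetEvent: 12000,
});
console.log(result.ok, result.problems);
```

With these example numbers, the check reports a single problem: 75 days of history falls short of the 90 days that a 60-day Target Window requires. Passing this check does not guarantee that a trait computes, because Segment runs further checks during model training.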