Commit db61b19

Update README and diagrams
1 parent b0be0fb commit db61b19

5 files changed

+7
-7
lines changed

6-Human-Sampling/README.md

Lines changed: 4 additions & 4 deletions
@@ -3,16 +3,16 @@
 In this section, we explain the design principle and the processes behind the human sampling workflow.

 **Design principle**
-The principle for the human-in-the-loop workflow is intended for model improvement by capturing inference images with [Amazon Rekognition Custom Labels](https://aws.amazon.com/rekognition/custom-labels-features/) with low confidence detection result and add them to the training dataset for new training. The principle for human sampling workflow is intended for business improvement. In human sampling, for every `nth` detection, the detection is flagged for human review and the human-labeled result is compared against the detected one, regardless of the original detection confidence level. The sampled detections can then used for qualify control, audit, analytics, etc. Although both workflows use the same [Amazon A2I](https://aws.amazon.com/augmented-ai/) process, inference images flagged as `sampled only` are **not** add to the training dataset.
+The principle for human sampling is intended for business improvement. In human sampling, every `nth` detection is flagged for human review and the human-labeled result is compared against the detected one, regardless of the original detection confidence level. The sampled detections can then be used for qualify control, audit, analytics, **manual** model drift detection, etc. Inference images flagged as `sampled only` are **not** added to the training dataset.

 **Workflow**
 The human sampling workflow is configurable by three [Amazon Parameter Store](https://console.aws.amazon.com/systems-manager/parameters) parameters as explained in [Section 2-Parameter Store](../2-Parameter-Store/). When **Enable-Automatic-Human-Sampling** is enabled, an [Amazon EventBridge](https://aws.amazon.com/eventbridge/) schedule rule is triggered every `nth` minutes as set in **Automatic-Human-Sampling-Frequency** to invoke the processes orchestrated by [AWS Step Functions](https://aws.amazon.com/step-functions/). The processes include:

-1. Initialize the starting `query date` with the `Last modified date` of **Enable-Automatic-Human-Sampling**. The purpose is to sample forward instead retroactively finding historic samples.
+1. Initialize the starting `query date` with the `Last modified date` of **Enable-Automatic-Human-Sampling**. The purpose is to sample forward instead of retroactively finding historic samples.
 2. Query the minimum number (`limit`) of detection events as set in **Human-Sampling-Interval** forward from the `query date`
-3. If the query result count is less than the minimum number, then the process ends. Otherwise the last (`limit`) detection event is marked as sampled and the new `query date` is set to the detection event date.
+3. If the query result count is less than the minimum number, then the process ends. Otherwise, the last (`limit`) detection event is marked as sampled.
 4. Next an A2I human labeling task is created if none exists from another process and the originating source is marked as `Human Sampling`.
-5. Repeat Steps 2 through 4.
+5. Steps 2 through 4 are repeated with `query date` set to the current sampled detection date.

 Finally the labeling task is completed as explained in [Section 5-A2I Human Loop](../5-A2I-Human-Loop/).

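The five workflow steps in the updated README can be sketched as a small in-memory loop. This is only an illustration of the described control flow, not the repository's Lambda code: the `Detection` record, `query_forward`, and `run_human_sampling` are hypothetical names, the real inference log lives in DynamoDB rather than a Python list, and the sketch assumes event timestamps are unique and that the source is marked `Human Sampling` only when a new review task is created.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Detection:
    # Hypothetical record shape; the real inference log schema is not shown here.
    event_id: str
    timestamp: datetime
    sampled: bool = False
    has_review_task: bool = False
    source: str = ""


def query_forward(log: List[Detection], query_date: datetime, limit: int) -> List[Detection]:
    """Step 2: return up to `limit` events strictly after `query_date`, oldest first."""
    later = sorted((d for d in log if d.timestamp > query_date), key=lambda d: d.timestamp)
    return later[:limit]


def run_human_sampling(log: List[Detection], start_date: datetime, interval: int) -> List[Detection]:
    """Mark every `interval`-th detection after `start_date` as sampled."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    query_date = start_date  # Step 1: start from the parameter's last modified date
    sampled: List[Detection] = []
    while True:
        batch = query_forward(log, query_date, interval)
        if len(batch) < interval:
            break  # Step 3: not enough new detections, so the process ends
        target = batch[-1]  # Step 3: the last event of the batch is the sample
        target.sampled = True
        if not target.has_review_task:
            # Step 4: stand-in for creating an A2I human labeling task
            target.has_review_task = True
            target.source = "Human Sampling"
        sampled.append(target)
        query_date = target.timestamp  # Step 5: continue from the sampled event
    return sampled


if __name__ == "__main__":
    base = datetime(2021, 1, 1)
    demo_log = [Detection(f"e{i}", base + timedelta(minutes=i)) for i in range(1, 11)]
    for hit in run_human_sampling(demo_log, base, interval=3):
        print(hit.event_id, hit.source)
```

Moving `query date` to the sampled event's timestamp is what makes the loop sample forward: each pass starts counting from the previous sample, so leftover detections that do not fill a complete interval wait for the next scheduled run.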
README.md

Lines changed: 2 additions & 2 deletions
@@ -36,7 +36,7 @@ The state machine is event-driven and divided into four separate states:

 4. The A2I Human Loop Data state is invoked by an S3 PutObject event. When the Amazon A2I human review workflow is complete, the output is stored in Amazon S3 by Amazon A2I by default. An S3 PutObject event invokes a Lambda function to redirect the required action to the state machine. The state machine invokes another Lambda function to evaluate the human loop response and place a copy of the initial image into an S3 folder corresponding to the evaluated label. This is the process in which new human-labeled images are added to the training dataset.

-5. The Human Sampling state is invoked by an EventBridge schedule rule on a polling schedule set by the model operator as a parameter. The state machine invokes a Lambda function to first query the image inference log to find the last human sampled inference and then query the next qualified inference to be sampled based on the sampling interval as set the model operator. If found, that inference is marked as "sampled" and an human review workflow is created. The process repeats again until all qualified inferences are marked for sampling in the same invocation.
+5. The Human Sampling state is invoked by an EventBridge schedule rule on a polling schedule set by the model operator as a parameter. The state machine invokes a Lambda function to first query the image inference log to find the last human sampled inference and then search for the next qualified inference to be sampled based on the sampling interval as set by the model operator. If qualified, that inference is marked as "sampled" and a human review workflow is created. The process repeats again until all qualified inferences are marked for human sampling in the same invocation.

 ### Architecture overview
 This solution is built on AWS [serverless](https://aws.amazon.com/serverless/) architecture. The architecture is shown in the following diagram.
@@ -55,7 +55,7 @@ We use Parameter Store in two different ways. Firstly, we provide a set of seven

 We use two EventBridge rules to initiate Step Functions state machine runs. The first rule is based on a Systems Manager event pattern. The Systems Manager rule is triggered by changes to the Parameter Store and initiates the state machine to invoke a Lambda function to apply changes to the impacted resources. The second rule is a schedule rule. The schedule rule is triggered periodically to initiate the state machine to invoke a Lambda function to check for new model training.

-We use an [Amazon DynamoDB](https://aws.amazon.com/dynamodb/) NoSQL database to log all Rekognition and A2I events and results for performance analysis and model drift detection. Although we did NOT include an analytics feature in this example, you can quickly deploy [Amazon QuickSight](https://aws.amazon.com/quicksight/) with the DynamoDB table as your source to visualize the performance of your model over time.
+We use an [Amazon DynamoDB](https://aws.amazon.com/dynamodb/) NoSQL database to log all Rekognition and A2I events and results for performance analysis and model drift detection. Although we did NOT include an analytics feature in this example, you can use [AWS Glue](https://aws.amazon.com/glue) and [Amazon Athena](https://aws.amazon.com/athena) to run interactive ad hoc SQL queries against the inference logs. With [Amazon QuickSight](https://aws.amazon.com/quicksight/), you can create real-time analytics dashboard to visualize the inference logs.

 We use a Step Functions state machine to orchestrate the ML workflow. The state machine initiates different processes based on events received from EventBridge and responses from Lambda. In addition, the state machine uses an internal process such as Wait to wait for model training and deployment to complete and Choice to evaluate for next tasks.

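The schedule rules described in this README fire on a fixed period set by a parameter. EventBridge expresses such a period as a rate expression, where the unit is singular for a value of 1 and plural otherwise. The helper below is a minimal sketch of building that string from a minutes value; `rate_expression` is a hypothetical name, and how the solution itself derives its schedule expression is not shown in this commit.

```python
def rate_expression(minutes: int) -> str:
    """Build an EventBridge rate expression for a period given in whole minutes."""
    if minutes < 1:
        raise ValueError("minutes must be a positive integer")
    # EventBridge requires the singular unit when the value is exactly 1.
    unit = "minute" if minutes == 1 else "minutes"
    return f"rate({minutes} {unit})"


if __name__ == "__main__":
    print(rate_expression(1))
    print(rate_expression(15))
```

The resulting string is the kind of value passed as `ScheduleExpression` when a rule is created or updated, for example through the boto3 `events` client's `put_rule` call.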