More robust sample testing python #300

Open: wants to merge 5 commits into base: master
17 changes: 17 additions & 0 deletions .github/scripts/clean_up_stream_table.sh
@@ -0,0 +1,17 @@
#!/bin/bash

aws kinesis delete-stream --stream-name $STREAM_NAME || true

# Reset the values of checkpoint, leaseCounter, ownerSwitchesSinceCheckpoint, and leaseOwner in DynamoDB table
echo "Resetting DDB table"
aws dynamodb update-item \
--table-name $APP_NAME \
--key '{"leaseKey": {"S": "shardId-000000000000"}}' \
--update-expression "SET checkpoint = :checkpoint, leaseCounter = :counter, ownerSwitchesSinceCheckpoint = :switches, leaseOwner = :owner" \
--expression-attribute-values '{
":checkpoint": {"S": "TRIM_HORIZON"},
":counter": {"N": "0"},
":switches": {"N": "0"},
":owner": {"S": "AVAILABLE"}
}' \
--return-values NONE
12 changes: 12 additions & 0 deletions .github/scripts/create_stream.sh
@@ -0,0 +1,12 @@
#!/bin/bash
set -e

for i in {1..10}; do
if aws kinesis create-stream --stream-name $STREAM_NAME --shard-count 1; then
break
else
echo "Stream creation failed, attempt $i/10. Waiting $((i * 3)) seconds..."
Contributor:
Is there a reason the sleep time scales like this?

Contributor Author:
A simple retry wasn't thorough enough, but an exponential backoff was a bit overkill. I found that linear backoff was a good in-between point.

sleep $((i * 3))
fi
done
aws kinesis wait stream-exists --stream-name $STREAM_NAME
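The linear backoff discussed in the review thread above can be sketched as a reusable helper. This is a minimal sketch, not the PR's code: the function name `retry_linear` and its argument order are hypothetical. Unlike the loop in `create_stream.sh`, it returns a non-zero status once every attempt has failed, so a caller running under `set -e` would stop at that point.

```shell
#!/bin/bash

# retry_linear MAX_ATTEMPTS BASE_DELAY COMMAND...   (hypothetical helper)
# Runs COMMAND until it succeeds, sleeping BASE_DELAY * attempt seconds after
# each failure (linear backoff). Returns 0 on success and 1 once all attempts
# are used up.
retry_linear() {
  local max_attempts=$1 base_delay=$2
  shift 2
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "Attempt $attempt/$max_attempts failed." >&2
    # There is nothing left to wait for after the final attempt.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep $((attempt * base_delay))
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

Under these assumptions, `retry_linear 10 3 aws kinesis create-stream --stream-name "$STREAM_NAME" --shard-count 1` would wait 3, 6, 9, ... seconds between attempts, matching the `$((i * 3))` scaling in the script.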
32 changes: 32 additions & 0 deletions .github/scripts/manipulate_properties.sh
@@ -0,0 +1,32 @@
#!/bin/bash
set -e

# Manipulate sample.properties file that the KCL application pulls properties from (ex: streamName, applicationName)
Contributor:
# Manipulate sample.properties file that the KCL application pulls properties from (ex: streamName, applicationName)
sed -i "" "s/kclpysample/$STREAM_NAME/g" samples/sample.properties
sed -i "" "s/PythonKCLSample/$APP_NAME/g" samples/sample.properties
sed -i "" 's/us-east-5/us-east-1/g' samples/sample.properties
# Depending on the OS, different properties need to be changed
if [[ "$RUNNER_OS" == "macOS" ]]; then
  grep -v "idleTimeBetweenReadsInMillis" samples/sample.properties > samples/temp.properties
  echo "idleTimeBetweenReadsInMillis = 250" >> samples/temp.properties
  mv samples/temp.properties samples/sample.properties
elif [[ "$RUNNER_OS" == "Linux" || "$RUNNER_OS" == "Windows" ]]; then
  sed -i "/idleTimeBetweenReadsInMillis/c\idleTimeBetweenReadsInMillis = 250" samples/sample.properties
...

# Depending on the OS, different properties need to be changed
if [[ "$RUNNER_OS" == "macOS" ]]; then
  sed -i "" "s/kclpysample/$STREAM_NAME/g" samples/sample.properties
  sed -i "" "s/PythonKCLSample/$APP_NAME/g" samples/sample.properties
  sed -i "" 's/us-east-5/us-east-1/g' samples/sample.properties
  grep -v "idleTimeBetweenReadsInMillis" samples/sample.properties > samples/temp.properties
  echo "idleTimeBetweenReadsInMillis = 250" >> samples/temp.properties
  mv samples/temp.properties samples/sample.properties
elif [[ "$RUNNER_OS" == "Linux" ]]; then
  sed -i "s/kclpysample/$STREAM_NAME/g" samples/sample.properties
  sed -i "s/PythonKCLSample/$APP_NAME/g" samples/sample.properties
  sed -i 's/us-east-5/us-east-1/g' samples/sample.properties
  sed -i "/idleTimeBetweenReadsInMillis/c\idleTimeBetweenReadsInMillis = 250" samples/sample.properties
elif [[ "$RUNNER_OS" == "Windows" ]]; then
  sed -i "s/kclpysample/$STREAM_NAME/g" samples/sample.properties
  sed -i "s/PythonKCLSample/$APP_NAME/g" samples/sample.properties
  sed -i 's/us-east-5/us-east-1/g' samples/sample.properties
  sed -i "/idleTimeBetweenReadsInMillis/c\idleTimeBetweenReadsInMillis = 250" samples/sample.properties

  echo '@echo off' > samples/run_script.bat
  echo 'python %~dp0\sample_kclpy_app.py %*' >> samples/run_script.bat
  sed -i 's/executableName = sample_kclpy_app.py/executableName = samples\/run_script.bat/' samples/sample.properties
else
  echo "Unknown OS: $RUNNER_OS"
  exit 1
fi

cat samples/sample.properties
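The per-OS branches above exist largely because BSD `sed` (macOS) and GNU `sed` (Linux, Git Bash on Windows) take different `-i` arguments. One way to sidestep that difference, sketched here under assumptions, is to generalize the `grep -v` / `echo` / `mv` approach from the macOS branch. The helper name `set_property` is hypothetical, and it assumes keys are plain identifiers with no regular-expression metacharacters. It matches the key only at the start of a line, which is slightly stricter than the substring match in the script.

```shell
#!/bin/bash

# set_property FILE KEY VALUE   (hypothetical helper)
# Drops any existing "KEY = ..." line and appends "KEY = VALUE", using only
# grep and mv, so no sed -i flavor differences are involved.
# Note: the property ends up on the last line of the file.
set_property() {
  local file=$1 key=$2 value=$3
  local tmp="${file}.tmp"
  [ -f "$file" ] || return 1
  # grep -v exits 1 when it prints nothing (for example, when every line
  # matched), which is not an error here.
  grep -v "^[[:space:]]*${key}[[:space:]]*=" "$file" > "$tmp" || true
  echo "${key} = ${value}" >> "$tmp"
  mv "$tmp" "$file"
}
```

With such a helper, `set_property samples/sample.properties idleTimeBetweenReadsInMillis 250` would behave the same way on all three runners.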
15 changes: 15 additions & 0 deletions .github/scripts/put_words_to_stream.sh
@@ -0,0 +1,15 @@
#!/bin/bash
set -e

sample_kinesis_wordputter.py --stream $STREAM_NAME -w cat -w dog -w bird -w lobster -w octopus

# Get records from stream to verify they exist before continuing
SHARD_ITERATOR=$(aws kinesis get-shard-iterator --stream-name "$STREAM_NAME" --shard-id shardId-000000000000 --shard-iterator-type TRIM_HORIZON --query 'ShardIterator' --output text)
INITIAL_RECORDS=$(aws kinesis get-records --shard-iterator "$SHARD_ITERATOR")
RECORD_COUNT_BEFORE=$(echo "$INITIAL_RECORDS" | jq '.Records | length')

if [ "$RECORD_COUNT_BEFORE" -eq 0 ]; then
  echo "No records found in stream. Test cannot proceed."
  exit 1
fi
echo "Found $RECORD_COUNT_BEFORE records in stream before KCL start"
45 changes: 45 additions & 0 deletions .github/scripts/start_kcl.sh
@@ -0,0 +1,45 @@
#!/bin/bash
set -e
set -o pipefail

chmod +x samples/sample.properties
chmod +x samples/sample_kclpy_app.py

# Reset the values of checkpoint, leaseCounter, ownerSwitchesSinceCheckpoint, and leaseOwner in DynamoDB table
echo "Resetting checkpoint for shardId-000000000000"
aws dynamodb update-item \
--table-name $APP_NAME \
--key '{"leaseKey": {"S": "shardId-000000000000"}}' \
--update-expression "SET checkpoint = :checkpoint, leaseCounter = :counter, ownerSwitchesSinceCheckpoint = :switches, leaseOwner = :owner" \
--expression-attribute-values '{
":checkpoint": {"S": "TRIM_HORIZON"},
":counter": {"N": "0"},
":switches": {"N": "0"},
":owner": {"S": "AVAILABLE"}
}' \
--return-values NONE

# Get records from stream to verify they exist before continuing
SHARD_ITERATOR=$(aws kinesis get-shard-iterator --stream-name "$STREAM_NAME" --shard-id shardId-000000000000 --shard-iterator-type TRIM_HORIZON --query 'ShardIterator' --output text)
INITIAL_RECORDS=$(aws kinesis get-records --shard-iterator "$SHARD_ITERATOR")
RECORD_COUNT_BEFORE=$(echo "$INITIAL_RECORDS" | jq '.Records | length')

echo "Found $RECORD_COUNT_BEFORE records in stream before KCL start"

if [[ "$RUNNER_OS" == "macOS" ]]; then
  brew install coreutils
  KCL_COMMAND=$(amazon_kclpy_helper.py --print_command --java $(which java) --properties samples/sample.properties)
  gtimeout 240 $KCL_COMMAND 2>&1 | tee kcl_output.log || [ $? -eq 124 ]
elif [[ "$RUNNER_OS" == "Linux" ]]; then
  KCL_COMMAND=$(amazon_kclpy_helper.py --print_command --java $(which java) --properties samples/sample.properties)
  timeout 240 $KCL_COMMAND 2>&1 | tee kcl_output.log || [ $? -eq 124 ]
elif [[ "$RUNNER_OS" == "Windows" ]]; then
  KCL_COMMAND=$(amazon_kclpy_helper.py --print_command --java $(which java) --properties samples/sample.properties)
  timeout 300 $KCL_COMMAND 2>&1 | tee kcl_output.log || [ $? -eq 124 ]
else
  echo "Unknown OS: $RUNNER_OS"
  exit 1
fi

echo "---------ERROR LOGS HERE-------"
grep -i error kcl_output.log || echo "No errors found in logs"
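The `timeout ... | tee ... || [ $? -eq 124 ]` lines above depend on two behaviors: GNU `timeout` exits with status 124 when the time limit expires, and `set -o pipefail` makes the pipeline report that status instead of `tee`'s. A minimal sketch of the same pattern as a function follows. The names `timeout_bin` and `run_for` are hypothetical, and the sketch assumes GNU coreutils is installed (as `gtimeout` on macOS via Homebrew).

```shell
#!/bin/bash
set -o pipefail

# Pick the timeout binary; Homebrew's coreutils installs it as gtimeout.
# If neither exists, the later call fails with "command not found".
timeout_bin() {
  if command -v timeout >/dev/null 2>&1; then
    echo timeout
  else
    echo gtimeout
  fi
}

# run_for SECONDS LOGFILE COMMAND...   (hypothetical helper)
# Runs COMMAND for at most SECONDS, mirroring its output to LOGFILE.
# Exit status 124 (time limit reached) counts as success, because the
# process is expected to run until it is stopped; any other non-zero
# status is passed back to the caller.
run_for() {
  local seconds=$1 logfile=$2
  shift 2
  local status=0
  "$(timeout_bin)" "$seconds" "$@" 2>&1 | tee "$logfile" || status=$?
  if [ "$status" -eq 124 ]; then
    return 0
  fi
  return "$status"
}
```

Without `pipefail`, the pipeline's status would be `tee`'s, so a crash of the wrapped command would go unnoticed.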
20 changes: 20 additions & 0 deletions .github/scripts/verify_kcl.sh
@@ -0,0 +1,20 @@
#!/bin/bash
set -e

LEASE_EXISTS=$(aws dynamodb scan --table-name $APP_NAME --select "COUNT" --query "Count" --output text || echo "0")
Contributor:
nit: NUM_LEASES_FOUND

CHECKPOINT_EXISTS=$(aws dynamodb scan --table-name $APP_NAME --select "COUNT" --filter-expression "attribute_exists(checkpoint) AND checkpoint <> :trim_horizon" --expression-attribute-values '{":trim_horizon": {"S": "TRIM_HORIZON"}}' --query "Count" --output text || echo "0")
Contributor:
nit: NUM_CHECKPOINTS_FOUND


echo "Found $LEASE_EXISTS lease(s) and $CHECKPOINT_EXISTS non-TRIM_HORIZON checkpoint(s) in DynamoDB"

echo "Printing checkpoint values"
aws dynamodb scan --table-name $APP_NAME --projection-expression "leaseKey,checkpoint" --output json

if [ "$LEASE_EXISTS" -gt 0 ] && [ "$CHECKPOINT_EXISTS" -gt 0 ]; then
  echo "Test passed: Found both leases and non-TRIM_HORIZON checkpoints in DDB (KCL is fully functional)"
  exit 0
else
  echo "Test failed: KCL not fully functional"
  echo "Lease(s) found: $LEASE_EXISTS"
  echo "non-TRIM_HORIZON checkpoint(s) found: $CHECKPOINT_EXISTS"
  exit 1
fi
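The pass/fail decision above feeds the two counts straight into `-gt`, which prints an error if a value is not an integer (for example, an empty string, or the `None` that the AWS CLI prints for a null value in text output). A hedged sketch of the same check with validation first is shown below. The function names `is_count` and `verify_counts` are hypothetical and not part of the PR.

```shell
#!/bin/bash

# is_count VALUE: succeeds only if VALUE is a non-negative integer.
is_count() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# verify_counts LEASES CHECKPOINTS   (hypothetical helper)
# Returns 0 when both counts are valid integers greater than zero,
# and 1 otherwise.
verify_counts() {
  local leases=$1 checkpoints=$2
  if ! is_count "$leases" || ! is_count "$checkpoints"; then
    echo "Test failed: unexpected count value(s): '$leases' '$checkpoints'" >&2
    return 1
  fi
  if [ "$leases" -gt 0 ] && [ "$checkpoints" -gt 0 ]; then
    echo "Test passed: found $leases lease(s) and $checkpoints non-TRIM_HORIZON checkpoint(s)"
    return 0
  fi
  echo "Test failed: lease(s) found: $leases, non-TRIM_HORIZON checkpoint(s) found: $checkpoints"
  return 1
}
```

In the script's terms this would be called as `verify_counts "$LEASE_EXISTS" "$CHECKPOINT_EXISTS"`, with the function's return status used as the script's exit code.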
112 changes: 0 additions & 112 deletions .github/workflows/privileged-run.yml

This file was deleted.
