The PcReservePoller periodically checks for new patron reservations on PCs, merges this information with patron data from Sierra and Redshift, obfuscates identifying patron information, and writes the result to PcReserve Kinesis streams for ingest into the BIC.
- Add your
AWS_PROFILEto the config file for the environment you want to run- Alternatively, you can manually export it (e.g.
export AWS_PROFILE=<profile>)
- Alternatively, you can manually export it (e.g.
- Run
ENVIRONMENT=<env> python3 main.py<env>should be the config filename without the.yamlsuffix. Note that running the poller withproduction.yamlwill actually send records to the production Kinesis stream -- it is not meant to be used for development purposes.make runwill run the poller in using the development environment
- Alternatively, to build and run a Docker container, run:
docker image build -t pc-reserve-poller:local .
docker container run -e ENVIRONMENT=<env> -e AWS_ACCESS_KEY_ID=<> -e AWS_SECRET_ACCESS_KEY=<> pc-reserve-poller:local
This repo uses the Main-QA-Production git workflow.
main has the latest and greatest commits, qa has what's in our QA environment, and production has what's in our production environment.
- Cut a feature branch off of
main - Commit changes to your feature branch
- File a pull request against
mainand assign a reviewer- In order for the PR to be accepted, it must pass all unit tests, have no lint issues, and update the CHANGELOG (or contain the Skip-Changelog label in GitHub)
- After the PR is accepted, merge into
main - Merge
main>qa - Deploy app to QA and confirm it works
- Merge
qa>production - Deploy app to production and confirm it works
The poller is deployed as an AWS ECS service to qa and prod environments. To upload a new QA version of this service, create a new release in GitHub off of the qa branch and tag it qa-vX.X.X. The GitHub Actions deploy-qa workflow will then deploy the code to ECR and update the ECS service appropriately. To deploy to production, create the release from the production branch and tag it production-vX.X.X. To trigger the app to run immediately (rather than waiting for the next scheduled event), run:
# In production, use pc-reserve-poller-app-production:1 for the task definition
aws ecs run-task --cluster pc-reserve-poller-qa --task-definition pc-reserve-poller-app-qa:14 --count 1 --region us-east-1 --profile nypl-digital-devThe first 11 unencrypted variables (every variable through KINESIS_BATCH_SIZE) plus all of the encrypted variables in each environment file are required by the poller to run. There are then eight additional optional variables that can be used for development purposes -- devel.yaml sets each of these. Note that the qa.yaml and production.yaml files are actually read by the deployed service, so do not change these files unless you want to change how the service will behave in the wild -- these are not meant for local testing.
| Name | Notes |
|---|---|
AWS_REGION |
Always us-east-1. The AWS region used for the Redshift, S3, KMS, and Kinesis clients. |
ENVISIONWARE_DB_PORT |
Always 3306 |
ENVISIONWARE_DB_NAME |
Always lasttwodays |
ENVISIONWARE_DB_HOST |
Encrypted Envisionware host. There is no QA Envisionware, so this is always the same. |
ENVISIONWARE_DB_USER |
Encrypted Envisionware user |
ENVISIONWARE_DB_PASSWORD |
Encrypted Envisionware password for the user |
SIERRA_DB_PORT |
Always 1032 |
SIERRA_DB_NAME |
Always iii |
SIERRA_DB_HOST |
Encrypted Sierra host (either test, QA, or prod) |
SIERRA_DB_USER |
Encrypted Sierra user |
SIERRA_DB_PASSWORD |
Encrypted Sierra password for the user |
REDSHIFT_DB_NAME |
Which Redshift database to query (either dev, qa, or production) |
REDSHIFT_DB_HOST |
Encrypted Redshift cluster endpoint |
REDSHIFT_DB_USER |
Encrypted Redshift user |
REDSHIFT_DB_PASSWORD |
Encrypted Redshift password for the user |
KINESIS_STREAM_ARN |
Encrypted ARN for the Kinesis stream the poller sends the encoded data to |
BCRYPT_SALT |
Encrypted bcrypt salt |
S3_BUCKET |
S3 bucket for the cache. This differs between QA and prod and should be empty when not using the cache locally. |
S3_RESOURCE |
Name of the resource for the S3 cache. This differs between QA and prod and should be empty when not using the cache locally. |
PC_RESERVE_SCHEMA_URL |
Platform API endpoint from which to retrieve the PcReserve Avro schema |
KINESIS_BATCH_SIZE |
How many records should be sent to Kinesis at once. Kinesis supports up to 500 records per batch. |
LOG_LEVEL (optional) |
What level of logs should be output. Set to info by default. |
MAX_BATCHES (optional) |
The maximum number of times the poller should poll Envisionware per session. If this is not set, the poller will continue querying until all new records in Envisionware have been processed. |
IGNORE_CACHE (optional) |
Whether fetching and setting the state from S3 should not be done. If this is true, the PCR_DATE_TIME and PCR_KEY environment variables will be used for the initial state (or 2023-01-01 00:00:00 +0000 and 0 by default). |
IGNORE_KINESIS (optional) |
Whether sending the encoded records to Kinesis should not be done |
PCR_DATE_TIME (optional) |
If IGNORE_CACHE is true, the datetime to use in the Envisionware query. If IGNORE_CACHE is false, this field is not read. |
PCR_KEY (optional) |
If IGNORE_CACHE is true, the key to use in the Envisionware query. If IGNORE_CACHE is false, this field is not read. |