TBBETA-2084 Add statusGroup attribute to application#707
TBBETA-2084 Add statusGroup attribute to application#707
Conversation
Added statusGroup attribute
💰 Infracost reportThis pull request is aligned with your company's FinOps policies and the Well-Architected Framework. There are also 8 pre-existing issues in the |
There was a problem hiding this comment.
Pull request overview
Adds a new statusGroup attribute concept alongside application status updates, and refactors the applicant migration script + CI tooling to use uv/pyproject.toml.
Changes:
- Add
StatusGroup(TypeScript + Python) and update the migration to writestatusGrouponAPPLICATION#ROOT. - Refactor migration script APIs (
migrate_item,scan_applicant_table,remove_original_applicants) and expand/adjust unit + integration tests. - Replace
requirements.txt/pytest.iniwithpyproject.toml+uv.lock, and update CI to install/run viauv.
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| scripts/applicant_migration/uv.lock | Adds uv lockfile for reproducible Python deps. |
| scripts/applicant_migration/src/migration_script.py | Writes statusGroup, changes function signatures, and rewires Glue entrypoint. |
| scripts/applicant_migration/src/test_unit.py | Updates unit tests for new signatures and statusGroup behavior. |
| scripts/applicant_migration/src/test_integration.py | Updates DynamoDB-local integration tests for new flow and statusGroup. |
| scripts/applicant_migration/pyproject.toml | Introduces pyproject.toml (deps + pytest config) for uv sync. |
| scripts/applicant_migration/.python-version | Pins Python 3.9 for CI/tooling. |
| scripts/applicant_migration/requirements.txt | Removed in favor of pyproject.toml/uv. |
| scripts/applicant_migration/pytest.ini | Removed; pytest config moved to pyproject.toml. |
| scripts/applicant_migration/file_changed.sh | Removed; workflow now uses shared scripts/file_changed.sh. |
| pets-core-services/src/shared/types/enum.ts | Adds StatusGroup enum for shared TS types. |
| .github/workflows/ci.yaml | Switches to uv install/test steps and updates action versions. |
| .github/workflows/deploy.yaml | Updates checkout action version. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| print(f"GJ called with args: {sys.argv}") | ||
|
|
||
| args = getResolvedOptions(sys.argv, ["customer-executor-env-vars"]) | ||
| print(f"Received arguments: {args}") | ||
|
|
||
| try: | ||
| for pair in args["customer_executor_env_vars"].split(','): | ||
| k, v = pair.split('=', 1) | ||
| os.environ[k] = v | ||
| except Exception as e: | ||
| print(f"Error parsing environment variables from arguments: {e}") | ||
| raise | ||
|
|
||
| APPLICANT_TABLE_NAME = args["APPLICANT_TABLE"] | ||
| APPLICATION_TABLE_NAME = args["APPLICATION_TABLE"] | ||
| AWS_REGION = os.getenv("AWS_REGION") | ||
| DRY_RUN = args["DRY_RUN"] |
There was a problem hiding this comment.
The new main argument parsing will raise at runtime: getResolvedOptions is called with only ["customer-executor-env-vars"], but the code later reads args["APPLICANT_TABLE"], args["APPLICATION_TABLE"], and args["DRY_RUN"]. Also, the key used to read the env var blob is "customer_executor_env_vars" (underscores) which does not match the requested option name (hyphens). Update the getResolvedOptions option list and key names (or read these values from os.environ after parsing) so the script can run in Glue without KeyError.
| @@ -89,24 +68,37 @@ | |||
| is_issued = applicationTBRow.get("isIssued") | |||
|
|
|||
| if is_issued == "Yes": | |||
| new_application_status = ApplicationStatus.certificateAvailable | |||
| new_application_status = ApplicationStatus.certificateAvailable.value | |||
| # TODO: should expiryDate also be added to application in this case? | |||
| elif is_issued == "No": | |||
| new_application_status = ApplicationStatus.certificateNotIssued | |||
| new_application_status = ApplicationStatus.certificateNotIssued.value | |||
| else: | |||
| new_application_status = ApplicationStatus.inProgress | |||
| new_application_status = ApplicationStatus.inProgress.value | |||
|
|
|||
| if ( | |||
| status_group is None | |||
| or ( | |||
| status_group != StatusGroup.complete.value | |||
| and status_group != StatusGroup.not_complete.value | |||
| ) | |||
| ): | |||
| if ( | |||
| new_application_status == "Certificate Available" | |||
| or new_application_status == "Certificate Not Issued" | |||
| or new_application_status == "Cancelled" | |||
| ): | |||
| new_status_group = StatusGroup.complete.value | |||
There was a problem hiding this comment.
statusGroup derivation will overwrite existing valid values: new_status_group is initialized to "Not Complete" and only changed when the existing status_group is None/invalid, but update_item always sets statusGroup = :new_status_group. This means an existing "Complete" statusGroup will be reset to "Not Complete". Consider either deriving statusGroup purely from new_application_status, or defaulting new_status_group to the current status_group when it is already valid, and avoid hard-coded status strings by comparing against ApplicationStatus/StatusGroup values.
| print( | ||
| "[MIGRATION] Applicant table pk updated from " | ||
| "{applicationId} to {new_applicant_pk}" | ||
| ) |
There was a problem hiding this comment.
This log line is missing f-string interpolation: it will literally print "{applicationId} to {new_applicant_pk}" instead of the values. Make this an f-string (or use proper formatting) so logs are accurate during migration runs.
| except ClientError as e: | ||
| if e.response["Error"]["Code"] == "ConditionalCheckFailedException": | ||
| print(f"WARNING: No ROOT row in application_tab for pk={applicationId}") | ||
| print(f"Updating applicatoin with pk={applicationId} failed: {e}") |
There was a problem hiding this comment.
Typo in message: "applicatoin" -> "application".
| print(f"Updating applicatoin with pk={applicationId} failed: {e}") | |
| print(f"Updating application with pk={applicationId} failed: {e}") |
| except ClientError as e: | ||
| if e.response["Error"]["Code"] == "ConditionalCheckFailedException": | ||
| print(f"WARNING: No ROOT row in application_tab for pk={applicationId}") | ||
| print(f"Updating applicatoin with pk={applicationId} failed: {e}") | ||
| else: | ||
| raise |
There was a problem hiding this comment.
Now that migrate_item() returns early when the ROOT row is missing, a ConditionalCheckFailedException from update_item most likely means the ConditionExpression failed due to applicantId already existing (not a missing ROOT row). The warning message is misleading; adjust it to reflect the actual failure condition and consider whether the migration should skip writing/deleting when applicantId is already set.
| update_kwargs = self.application_table.update_item.call_args[1] | ||
| ev = update_kwargs["ExpressionAttributeValues"] | ||
| assert ev[":new_application_status"] == ApplicationStatus.certificateAvailable | ||
|
|
||
| def test_live_status_derived_from_tb_certificate_issued_no(self): | ||
| at, apt = _mock_tables() | ||
| """When TB certificate is not issued, status should be certificateNotIssued.""" | ||
| stats = _make_statistics() | ||
| apt.get_item.side_effect = [ | ||
| {}, | ||
| {"Item": {"isIssued": "No"}}, | ||
| dry_run = False | ||
| self.application_table.get_item.side_effect = [ | ||
| {"Item": {"applicationStatus": None}}, # ROOT row with no status | ||
| {"Item": {"isIssued": "No"}}, # TB certificate not issued | ||
| ] | ||
| self.mod.migrate_item((self._base_row(), at, apt, stats)) | ||
|
|
||
| update_kwargs = apt.update_item.call_args[1] | ||
| self.mod.migrate_item( | ||
| self._base_row(), | ||
| self.applicant_table, | ||
| self.application_table, | ||
| dry_run, | ||
| stats, | ||
| ) | ||
|
|
||
| update_kwargs = self.application_table.update_item.call_args[1] | ||
| ev = update_kwargs["ExpressionAttributeValues"] | ||
| assert ev[":new_application_status"] == ApplicationStatus.certificateNotIssued | ||
|
|
||
| def test_live_status_defaults_to_in_progress_when_no_tb_row(self): | ||
| at, apt = _mock_tables() | ||
| def test_live_status_defaults_to_in_progress_when_no_tb_data(self): | ||
| """When no TB row exists, status defaults to inProgress.""" | ||
| stats = _make_statistics() | ||
| apt.get_item.side_effect = [{}, {}] | ||
| self.mod.migrate_item((self._base_row(), at, apt, stats)) | ||
| dry_run = False | ||
| self.application_table.get_item.side_effect = [ | ||
| {"Item": {"applicationStatus": None}}, # ROOT row with no status | ||
| {}, # No TB row | ||
| ] | ||
|
|
||
| self.mod.migrate_item( | ||
| self._base_row(), | ||
| self.applicant_table, | ||
| self.application_table, | ||
| dry_run, | ||
| stats, | ||
| ) | ||
|
|
||
| update_kwargs = apt.update_item.call_args[1] | ||
| update_kwargs = self.application_table.update_item.call_args[1] | ||
| ev = update_kwargs["ExpressionAttributeValues"] | ||
| assert ev[":new_application_status"] == ApplicationStatus.inProgress |
There was a problem hiding this comment.
These assertions compare ExpressionAttributeValues (strings written by migration_script) against Enum members (and in this file the Enum values are currently tuples due to trailing commas). Update assertions to compare to the expected string values (e.g., ApplicationStatus..value) so the tests reflect what DynamoDB receives.
|
|
||
| root = _get_item(application_table, "APPLICATION#abc", "APPLICATION#ROOT") | ||
| # ROOT row didn't exist so update_item should have been a no-op | ||
| # (ConditionalCheckFailed) — verify the TB row is still intact | ||
| tb = _get_item(application_table, "APPLICATION#abc", "APPLICATION#TB#CERTIFICATE") | ||
| assert tb["isIssued"] == "Yes" | ||
|
|
||
| assert root["applicationStatus"] == ApplicationStatus.certificateAvailable | ||
|
|
There was a problem hiding this comment.
These integration assertions compare DynamoDB string fields (e.g., applicationStatus) against Enum members instead of their string values, so they will fail ("Certificate Available" != ApplicationStatus.certificateAvailable). Compare against ApplicationStatus..value to match what is stored in DynamoDB.
|
|
||
| def test_more_than_25_records_are_all_deleted(self, mod, tables, dynamodb_local): | ||
| """26+ records require multiple batch_writer calls — verify none are left behind.""" | ||
| """26+ records require multiple batch_writer calls' |
There was a problem hiding this comment.
Docstring has an unmatched apostrophe after "calls" ("calls'"), which looks accidental and reads oddly. Remove the stray quote.
| """26+ records require multiple batch_writer calls' | |
| """26+ records require multiple batch_writer calls |
| ] | ||
|
|
||
| [tool.setuptools] | ||
| packages = ["applicant_migration"] |
There was a problem hiding this comment.
pyproject.toml declares a setuptools package "applicant_migration", but there is no such package directory in this project (the code lives directly under src/ as migration_script.py). This will likely cause uv sync / build steps to fail when trying to install the project. Either remove the setuptools packaging config (if you only need dependency management), or configure it to package the actual module(s).
| packages = ["applicant_migration"] | |
| package-dir = {"" = "src"} | |
| py-modules = ["migration_script"] |
| // If this is modified, then also the enum (StatusGroup class) in this file should be modified: | ||
| // scripts/application_updates/src/update_application.py | ||
| export enum StatusGroup { | ||
| complete = "Complete", | ||
| notComplete = "Not Complete", | ||
| } |
There was a problem hiding this comment.
This comment references scripts/application_updates/src/update_application.py, but that path/file does not exist in this repo. Please update the reference to the correct location (or remove it) so future enum changes aren't directed to a dead file path.
|



1. What type of PR is this? (check all applicable)
2. What areas of the service does it affect? (check all applicable)
3. Briefly describe the main changes in this PR
4. Checklist