| 1 | +# Collection Status Workflows |
| 2 | + |
| 3 | +This document outlines the automated workflows triggered by status changes in Collections. |
| 4 | + |
| 5 | +## Workflow Status Transitions |
| 6 | + |
| 7 | +Collections progress through workflow statuses that trigger specific automated actions: |
| 8 | + |
| 9 | +### Initial Flow |
| 10 | +1. `RESEARCH_IN_PROGRESS` → `READY_FOR_ENGINEERING` |
| 11 | + - Triggers: Creation of initial scraper and indexer configs |
| 12 | + |
| 13 | +2. `READY_FOR_ENGINEERING` → `ENGINEERING_IN_PROGRESS` → `INDEXING_FINISHED_ON_DEV` |
| 14 | + - When indexing finishes, a developer changes the status to `INDEXING_FINISHED_ON_DEV` |
| 15 | + - This triggers a full-text fetch from LRM dev |
| 16 | + - If the fetch completes successfully, the status is automatically updated to `READY_FOR_CURATION` |
| 17 | + |
| 18 | +3. `READY_FOR_CURATION` |
| 19 | + - Triggers: Creation or update of the plugin config |
| 20 | + |
| 21 | +4. `READY_FOR_CURATION` → `CURATION_IN_PROGRESS` → `CURATED` |
| 22 | + - When curation finishes, the curator marks the collection as `CURATED` |
| 23 | + - This triggers the promotion of DeltaUrls to CuratedUrls |
| 24 | + |
| 25 | +5. Quality Check Flow: |
| 26 | + - During quality checks the curator can put the status as `QUALITY_CHECK_PERFECT/MINOR` |
| 27 | + - These passing quality statuses will trigger the addition of the collection to the public query |
| 28 | + - After the PR is merged and SDE Prod server is updated with the latest code, this collection will become visible |
| 29 | + |
| 30 | +### Reindexing Flow |
| 31 | + |
| 32 | +After the main workflow, collections can enter a reindexing cycle: |
| 33 | + |
| 34 | +1. `REINDEXING_NOT_NEEDED` → `REINDEXING_NEEDED_ON_DEV` |
| 35 | + - By default, collections do not need reindexing |
| 36 | + - They can be manually marked as `REINDEXING_NEEDED_ON_DEV` |
| 37 | + |
| 38 | +2. `REINDEXING_NEEDED_ON_DEV` → `REINDEXING_FINISHED_ON_DEV` |
| 39 | + - When re-indexing finishes, a developer changes the status to `REINDEXING_FINISHED_ON_DEV` |
| 40 | + - This triggers a full-text fetch from LRM dev |
| 41 | + - If the fetch completes successfully, the status is automatically updated to `REINDEXING_READY_FOR_CURATION` |
| 42 | + |
| 43 | +3. `REINDEXING_READY_FOR_CURATION` → `REINDEXING_CURATED` |
| 44 | + - When re-curation finishes, the curator marks the collection as `REINDEXING_CURATED` |
| 45 | + - This triggers the promotion of DeltaUrls to CuratedUrls |
| 46 | + |
| 47 | +4. `REINDEXING_CURATED` → `REINDEXING_INDEXED_ON_PROD` |
| 48 | + - After the collection has been indexed on Prod, a developer marks it as `REINDEXING_INDEXED_ON_PROD` |
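The reindexing cycle described above is a linear sequence of states. As a rough sketch (plain strings rather than the real `ReindexingStatusChoices` enum, with hypothetical helper names), the order can be captured in a successor table and walked from any starting point:

```python
# Successor of each reindexing status, in the order described in this document.
# Plain strings are used for illustration; the real code uses ReindexingStatusChoices.
REINDEXING_NEXT = {
    "REINDEXING_NOT_NEEDED": "REINDEXING_NEEDED_ON_DEV",            # marked manually
    "REINDEXING_NEEDED_ON_DEV": "REINDEXING_FINISHED_ON_DEV",       # set by a developer
    "REINDEXING_FINISHED_ON_DEV": "REINDEXING_READY_FOR_CURATION",  # set after a successful fetch
    "REINDEXING_READY_FOR_CURATION": "REINDEXING_CURATED",          # set by the curator
    "REINDEXING_CURATED": "REINDEXING_INDEXED_ON_PROD",             # set by a developer
}


def is_expected_transition(old, new):
    """Return True if `new` directly follows `old` in the documented cycle."""
    return REINDEXING_NEXT.get(old) == new


def remaining_path(start):
    """List the statuses from `start` through the end of the documented cycle."""
    path = [start]
    while path[-1] in REINDEXING_NEXT:
        path.append(REINDEXING_NEXT[path[-1]])
    return path
```

A check like `is_expected_transition` is only a documentation aid here; nothing in this document states that the application rejects out-of-order status changes.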
| 49 | + |
| 50 | +## Full Text Import Process |
| 51 | + |
| 52 | +The full text import process integrates with both workflows: |
| 53 | + |
| 54 | +1. Clears existing DumpUrls for the collection |
| 55 | +2. Fetches and processes new full text data in batches |
| 56 | +3. Creates new DumpUrls |
| 57 | +4. Migrates DumpUrls to DeltaUrls |
| 58 | +5. Updates collection status based on context: |
| 59 | + - In main workflow: Updates to `READY_FOR_CURATION` |
| 60 | + - In reindexing: Updates to `REINDEXING_READY_FOR_CURATION` |
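The five steps above can be sketched as an in-memory simulation. This is an illustration under stated assumptions rather than the logic in `tasks.py`: the collection is a plain dict, `fetched_records` stands in for the data returned by LRM dev, the migration is reduced to a simple copy, and the "context" is assumed to be detected from the status that triggered the fetch.

```python
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def import_full_text(collection, fetched_records, batch_size=2):
    """Simulate the documented import steps on a dict-based collection."""
    # 1. Clear existing DumpUrls for the collection.
    collection["dump_urls"] = []

    # 2-3. Process the fetched data in batches, creating new DumpUrls.
    for batch in batched(fetched_records, batch_size):
        collection["dump_urls"].extend(batch)

    # 4. Migrate DumpUrls to DeltaUrls (simplified to a copy; the real
    #    migration may do more work than this).
    collection["delta_urls"] = list(collection["dump_urls"])

    # 5. Update the status based on context (assumption: the context is
    #    inferred from which "finished on dev" status triggered the fetch).
    if collection["workflow_status"] == "INDEXING_FINISHED_ON_DEV":
        collection["workflow_status"] = "READY_FOR_CURATION"
    elif collection["reindexing_status"] == "REINDEXING_FINISHED_ON_DEV":
        collection["reindexing_status"] = "REINDEXING_READY_FOR_CURATION"
    return collection
```

Processing in batches keeps memory use bounded when a collection has many URLs, which is presumably why the real import does the same.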
| 61 | + |
| 62 | +## Key Models and Files |
| 63 | + |
| 64 | +- `Collection`: Main model handling status transitions |
| 65 | +- `WorkflowStatusChoices`: Enum defining main workflow states |
| 66 | +- `ReindexingStatusChoices`: Enum defining reindexing states |
| 67 | +- `tasks.py`: Contains full text import logic and status updates |
| 68 | +- Signal handler in the `Collection` model: Manages status change triggers |