| 1 | +# Codelists |
| 2 | + |
| 3 | +A codelist is a controlled list of valid values for a particular field. |
| 4 | + |
| 5 | +Each codelist definition file describes what the list is for, where the source data comes from, and the structure of the codes. |
| 6 | + |
| 7 | +This helps standardise terminology, improve validation, and make it easier to integrate systems. |
| 8 | + |
| 9 | +## Why |
| 10 | + |
| 11 | +We get several benefits from defining codelists, including: |
| 12 | + |
| 13 | +* **Consistency and reuse** |
| 14 | + Ensures the same values are used everywhere, avoiding subtle variations. |
| 15 | +* **Easier validation** |
| 16 | + Fields referencing a codelist can be checked automatically against a known set of codes. |
| 17 | +* **Easier to maintain** |
| 18 | + One place to update the list and its metadata, rather than chasing copies in multiple specs. |
| 19 | +* **Clear provenance** |
| 20 | + Source and licensing information are explicit, so consumers know where the data came from. |
| 21 | +* **Declarative, not procedural** |
| 22 | + Defined as structured data, so the list is format-neutral and can be processed by different tools and languages. |
| 23 | + |
| 24 | +## Decisions |
| 25 | + |
| 26 | +**Each codelist has a single canonical definition** |
| 27 | +The definition lives in this shared planning application specification repository. |
| 28 | + |
| 29 | +**The data can be defined in the repo or elsewhere** |
| 30 | +Some codelists are specific to this specification, so the CSV (or other format) containing the actual codes will be included in this repository. Other codelists have wider applicability, so their data will be hosted elsewhere and referenced from the definition. |
| 31 | + |
| 32 | +**Attributes of codelist definitions** |
| 33 | + |
| 34 | +* `codelist` — short, stable identifier (lowercase kebab-case) |
| 35 | +* `name` — singular display name |
| 36 | +* `plural` — plural display name |
| 37 | +* `description` — purpose and scope |
| 38 | +* `organisation` — identifier for the owning organisation |
| 39 | +* `licence` — licence for reuse (e.g. `ogl3`) |
| 40 | +* `source` — URL to the authoritative source data (CSV or API) |
| 41 | +* `fields` — list of column names in the codelist data file |
| 42 | +* `key-field` — column containing the unique identifier for codes |
| 43 | +* `entry-date` — when this codelist definition was first added |
| 44 | +* `end-date` — when this codelist definition was withdrawn (if applicable) |
| 45 | +* `notes` — any extra context or implementation guidance |
| 46 | +* `github-discussion` — link or ID for relevant discussion thread |
| 47 | + |
| 48 | +**The codelist definition is metadata only** |
| 49 | +It describes the list and its columns but does not include the rows themselves. |
| 50 | + |
| 51 | +**Fields in a codelist CSV should match the `fields` attribute** |
| 52 | +This allows automated validation to check that the source file has the expected structure. |
| 53 | + |
| 54 | +## Still to decide |
| 55 | + |
| 56 | +* Should codelist definitions include version information beyond `entry-date` and `end-date`? |
| 57 | +* Should we require `status` (e.g. active, deprecated, experimental)? |
| 58 | +* Do the fields need to be defined? |
| 59 | + |
| 60 | +## Example |
| 61 | + |
| 62 | +Codelist definition: |
| 63 | +```yaml |
| 64 | +--- |
| 65 | +codelist: development-phase |
| 66 | +name: Development phase |
| 67 | +plural: Development phases |
| 68 | +description: | |
| 69 | + The development phase codelist defines the stages that an oil and gas extraction project may progress through, such as exploratory and production. This helps standardise the terminology used to describe the status of projects. |
| 70 | +organisation: government-organisation:D1342 |
| 71 | +licence: ogl3 |
| 72 | +entry-date: 2025-08-13 |
| 73 | +end-date: |
| 74 | +fields: |
| 75 | + - field: reference |
| 76 | + - field: name |
| 77 | + - field: description |
| 78 | +key-field: reference |
| 79 | +source: |
| 80 | +notes: |
| 81 | +github-discussion: 194 |
| 82 | +--- |
| 83 | +``` |
| 84 | + |
| 85 | +### Validation rules for codelist definitions |
| 86 | + |
| 87 | +* `codelist`, `name`, `description`, `fields`, `source` and `key-field` must be present |
| 88 | +* every field in `fields` must appear as a column in the source data |
| 89 | +* values in the `key-field` column must be unique within the source data |
| 90 | +* if `end-date` is present, it must be on or after `entry-date` |
| 91 | +* `source` must be a valid URL |
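
The validation rules above could be checked with something like the following. This is a minimal sketch, not part of the specification. It assumes the definition has already been parsed from YAML into a plain `dict`, that `fields` is a list of `{"field": name}` entries as in the example, and that the source data is CSV text that has already been fetched. The names `validate_definition`, `example` and `sample_csv`, the `example.org` URL, and the sample rows are all hypothetical.

```python
import csv
import io
from datetime import date
from urllib.parse import urlparse

# Attributes the validation rules list as mandatory.
REQUIRED = ("codelist", "name", "description", "fields", "source", "key-field")


def _as_date(value):
    """Accept a datetime.date or an ISO 8601 string; return a date or None."""
    if value in (None, ""):
        return None
    if isinstance(value, date):
        return value
    return date.fromisoformat(str(value))


def validate_definition(definition, csv_text=None):
    """Return a list of problems found; an empty list means none were found."""
    errors = []

    # Rule: required attributes must be present (and non-empty).
    for attr in REQUIRED:
        if not definition.get(attr):
            errors.append(f"missing required attribute: {attr}")

    # Rule: source must be a valid URL (here: http(s) with a host).
    source = definition.get("source")
    if source:
        parsed = urlparse(str(source))
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            errors.append(f"source is not a valid URL: {source}")

    # Rule: end-date, if present, must be on or after entry-date.
    try:
        entry = _as_date(definition.get("entry-date"))
        end = _as_date(definition.get("end-date"))
        if entry and end and end < entry:
            errors.append("end-date is before entry-date")
    except ValueError as exc:
        errors.append(f"invalid date: {exc}")

    if csv_text is not None:
        reader = csv.DictReader(io.StringIO(csv_text))
        columns = reader.fieldnames or []

        # Rule: every declared field must be a column in the source data.
        for item in definition.get("fields") or []:
            if item["field"] not in columns:
                errors.append(f"field not found in source data: {item['field']}")

        # Rule: key-field values must be unique within the source data.
        key = definition.get("key-field")
        if key in columns:
            seen = set()
            for row in reader:
                if row[key] in seen:
                    errors.append(f"duplicate {key}: {row[key]}")
                seen.add(row[key])

    return errors


# Hypothetical data for illustration only; the URL is a placeholder.
example = {
    "codelist": "development-phase",
    "name": "Development phase",
    "description": "Stages of an oil and gas extraction project.",
    "entry-date": "2025-08-13",
    "end-date": None,
    "fields": [{"field": "reference"}, {"field": "name"}, {"field": "description"}],
    "key-field": "reference",
    "source": "https://example.org/development-phase.csv",
}
sample_csv = "reference,name,description\nexploratory,Exploratory,\nproduction,Production,\n"

print(validate_definition(example, sample_csv))
```

Returning a list of messages rather than raising on the first failure lets a single run report every problem in a definition. Note that the example definition in the diff leaves `source` empty, so under these rules it would be reported as missing a required attribute.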