Commit ebf7de7

Document base-only recovery and track overlay build scripts
1 parent 35e7ad4 commit ebf7de7

4 files changed: +515 -66 lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -145,6 +145,8 @@ bin/
 scripts/*
 !scripts/
 !scripts/fetch_latest_acs.py
+!scripts/fetch_overlays.py
+!scripts/build_nibrs_crime_overlay.py
 
 # Local overlay inputs (crime/project data)
 /overlays/*
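
The hunk above relies on gitignore's ordering rule: a later `!` pattern re-includes a path that an earlier pattern ignored. The following is a minimal sketch of that last-match-wins behavior for plain file paths only. It is an approximation built on Python's `fnmatch`, not git's actual matcher, and the helper name `is_ignored` and the sample file `scripts/scratch.py` are hypothetical.

```python
from fnmatch import fnmatchcase

# Rules copied from the .gitignore hunk above, in file order.
RULES = [
    "scripts/*",
    "!scripts/",
    "!scripts/fetch_latest_acs.py",
    "!scripts/fetch_overlays.py",
    "!scripts/build_nibrs_crime_overlay.py",
]


def is_ignored(path, rules=RULES):
    """Simplified last-match-wins check for plain file paths.

    Directory-only rules (trailing slash) are skipped because this toy
    only inspects file paths; real gitignore matching has more cases.
    """
    ignored = False
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if pattern.endswith("/"):
            continue  # directory-only rule, out of scope for this sketch
        if fnmatchcase(path, pattern):
            # The most recent matching rule decides the outcome.
            ignored = not negated
    return ignored


for name in ("scripts/fetch_overlays.py", "scripts/scratch.py"):
    print(name, "ignored" if is_ignored(name) else "not ignored")
```

Under these assumptions the two scripts added by this commit are no longer ignored, while any other file directly under `scripts/` still is.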

HANDOFF.md

Lines changed: 51 additions & 66 deletions
@@ -4,74 +4,59 @@
 - Project: `geocompare`
 - Handoff date: 2026-03-08
 - Branch: `master`
-- Version: `0.6.2`
-- Last completed commit at handoff prep: `9d11bd7` (`Bump version to 0.6.2`)
+- Version: `0.6.9`
 
-## Purpose
-GeoCompare builds and queries local demographic/geographic data products (Gazetteer + ACS-derived inputs) and supports ranking, distance, nearest geographies, and profile exports.
+## Project Scope
+GeoCompare builds and queries local demographic data products from ACS/Gazetteer
+inputs and optional overlays.
 
-## Environment
-- Python: 3.9+
-- Package/deps: managed via `pyproject.toml`
-- Common setup:
-- `python3 -m pip install -e ".[dev]"`
+Core/base overlays:
+- Crime overlay (`CRIME`)
+- Voter registration overlay (`VOTER REGISTRATION`)
 
-## Canonical CLI (Current)
-Use canonical commands and flags only.
+Optional/custom overlays:
+- User/private metrics, typically in `project_data.csv` (`PROJECT DATA` or
+manifest-defined section)
 
-- Build data products:
+## Current Data/Overlay Model
+- Build command:
 - `python3 -m geocompare.interfaces.cli build <data_path>`
-- Search:
-- `python3 -m geocompare.interfaces.cli query search "san francisco"`
-- Profile:
-- `python3 -m geocompare.interfaces.cli query profile "San Francisco city, California"`
-- Top/bottom with modern filter and scope:
-- `python3 -m geocompare.interfaces.cli query top median_year_structure_built --where 'population>=100000' --universe places --in-state ca`
-- `python3 -m geocompare.interfaces.cli query bottom median_household_income --where 'population>=50000' --scope places+ca`
-- Nearest:
-- `python3 -m geocompare.interfaces.cli query nearest "San Francisco city, California" --where 'population>=100000' --universe places --in-state ca -n 10`
-- Resolve:
-- `python3 -m geocompare.interfaces.cli resolve "San Francisco, CA" --state ca -n 5`
-- Export rows:
-- `python3 -m geocompare.interfaces.cli export rows ":population :income" --where 'population>=100000' --universe places --in-state ca`
-
-## Important Interface Decisions
-- Legacy CLI aliases were purged.
-- Removed command aliases like `hv`, `lv`, `cg`, `dist`, `dp`, etc.
-- Removed legacy flag names `--geofilter` and `--context`.
-- Canonical query flags are:
-- Filter: `--where` (short: `-w`)
-- Scope string: `--scope` (short: `-s`)
-- Explicit scope composition: `--universe`, plus one of `--in-state|--in-county|--in-zcta`
-- Legacy tool shims were removed.
-- Deleted: `CountyTools.py`, `StateTools.py`, `KeyTools.py`, `SummaryLevelTools.py`
-- Canonical modules: `county_lookup.py`, `state_lookup.py`, `county_key_index.py`, `summary_level_parser.py`
-
-## Shell Caveat
-- In `zsh`, quote or escape filter expressions that contain `>` or `<`.
-- Good: `--where 'population>=100000'`
-- Good: `--where population\>=100000`
-
-## Data Expectations
-- Build command expects source files under the provided `<data_path>`.
-- The project has support for recent ACS table-based inputs and latest Gazetteer-era ingestion logic integrated in prior updates.
-- If refreshing download logic, verify source year discovery against the files present in `<data_path>`.
-
-## Validation / Definition of Done
-Before ACP:
-1. `ruff check geocompare/interfaces/cli.py geocompare/engine.py geocompare/tools tests`
-2. `python3 -m pytest -q`
-3. Run at least one smoke query:
-- `python3 -m geocompare.interfaces.cli --version`
-- `python3 -m geocompare.interfaces.cli query search "san francisco" -n 3`
-- `python3 -m geocompare.interfaces.cli query top median_year_structure_built --where 'population>=100000' --universe places --in-state ca -n 5`
-
-## ACP Workflow
-- Stage only intended files.
-- Commit message should describe functional change precisely.
-- Push to `origin/master` unless directed otherwise.
-
-## Immediate Backlog Suggestions
-1. Add CLI integration tests that invoke argument parsing paths for `--where`, `--scope`, and explicit scope flags.
-2. Document canonical CLI examples in `README.md` and remove any remaining historical examples.
-3. Optionally add `--where` parser help examples for compound (`:c`/`:cc`) usage.
+- Base inputs are discovered from files under `<data_path>`.
+- Overlay inputs are discovered under `<data_path>/overlays`.
+- Optional manifest support:
+- `overlay_manifest.json` (or `manifest.json`) in overlays directory.
+- Supports per-metric metadata (`key`, `label`, `section`, `type`, `order`).
+- Overlay section placement:
+- Base profile sections first.
+- Overlay sections appended at bottom.
+- Overlay rows deterministically ordered.
+
+## Base-Only Recovery (No Custom Overlay)
+To restore a clean base state without private project overlay:
+
+1. Fetch core data:
+- `python3 scripts/fetch_latest_acs.py --out-dir <data_path> --archive-existing`
+2. Optionally build canonical base overlays:
+- `python3 scripts/fetch_overlays.py --out-dir <data_path> --crime-source <src> --voter-source <src>`
+3. Ensure custom overlay artifacts are absent:
+- remove/relocate `<data_path>/overlays/project_data.csv`
+- remove/relocate `<data_path>/overlays/overlay_manifest.json`
+4. Rebuild:
+- `python3 -m geocompare.interfaces.cli build <data_path>`
+
+## Tracked Scripts
+- `scripts/fetch_latest_acs.py`
+- `scripts/fetch_overlays.py`
+- `scripts/build_nibrs_crime_overlay.py`
+
+## Validation
+Recommended checks before ACP:
+
+1. `ruff check tests geocompare/identity geocompare/repository/sqlite_repository.py geocompare/interfaces/cli.py scripts/fetch_overlays.py`
+2. `black --check tests geocompare/identity geocompare/repository/sqlite_repository.py geocompare/interfaces/cli.py scripts/fetch_overlays.py`
+3. `mypy geocompare/identity geocompare/repository/sqlite_repository.py geocompare/interfaces/cli.py`
+4. `PYTHONPATH=. pytest -q`
+
+## License
+Repository license is MIT (`LICENSE`). This remains appropriate for the base
+project.
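
Steps 3 and 4 of the base-only recovery in HANDOFF.md could be scripted roughly as follows. This is a sketch under stated assumptions, not part of the commit: the helper names `relocate_custom_overlays` and `rebuild_command` and the backup directory argument are hypothetical, while the two overlay file names and the build invocation are taken from the notes above.

```python
import shutil
import sys
from pathlib import Path

# File names taken from step 3 of the recovery notes.
CUSTOM_OVERLAY_FILES = ("project_data.csv", "overlay_manifest.json")


def relocate_custom_overlays(data_path, backup_dir):
    """Move custom overlay files out of <data_path>/overlays.

    Returns the list of destination paths for files that were moved.
    Files that are not present are silently skipped.
    """
    overlays = Path(data_path) / "overlays"
    backup = Path(backup_dir)
    moved = []
    for name in CUSTOM_OVERLAY_FILES:
        src = overlays / name
        if src.is_file():
            # Create the backup directory only when there is something to move.
            backup.mkdir(parents=True, exist_ok=True)
            dest = backup / name
            shutil.move(str(src), str(dest))
            moved.append(dest)
    return moved


def rebuild_command(data_path):
    """Return the step 4 build command as an argv list (not executed here)."""
    return [sys.executable, "-m", "geocompare.interfaces.cli", "build", str(data_path)]
```

A caller would typically run `relocate_custom_overlays(data_path, some_backup_dir)` first and then pass the result of `rebuild_command(data_path)` to `subprocess.run(..., check=True)`. Relocating rather than deleting keeps the private overlay recoverable.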

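The "Overlay rows deterministically ordered" note in HANDOFF.md can be illustrated with a small sort over manifest entries. The manifest layout here is an assumption: only the five field names (`key`, `label`, `section`, `type`, `order`) come from the notes, while the top-level `metrics` list, the sample values, and the sort key are hypothetical and may differ from what geocompare actually does.

```python
import json

# Hypothetical manifest content; only the five field names come from the notes.
MANIFEST_TEXT = """
{
  "metrics": [
    {"key": "permits_issued", "label": "Permits issued",
     "section": "PROJECT DATA", "type": "int", "order": 2},
    {"key": "site_score", "label": "Site score",
     "section": "PROJECT DATA", "type": "float", "order": 1},
    {"key": "violent_crime_rate", "label": "Violent crime rate",
     "section": "CRIME", "type": "float", "order": 1}
  ]
}
"""


def ordered_overlay_rows(manifest_text):
    """Sort metrics by (section, order, key) so output order is stable."""
    metrics = json.loads(manifest_text)["metrics"]
    return sorted(metrics, key=lambda m: (m["section"], m["order"], m["key"]))


for metric in ordered_overlay_rows(MANIFEST_TEXT):
    print(metric["section"], metric["order"], metric["key"])
```

Including `key` as the final tiebreaker means two metrics with the same section and order still come out in the same sequence on every build, which is one simple way to get deterministic row placement.
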
README.md

Lines changed: 38 additions & 0 deletions
@@ -114,6 +114,44 @@ Recommended metadata file (`overlay_manifest.json`) in your overlay repo:
 `overlay_manifest.json` is optional today, but recommended for stable naming,
 labels, and section placement across overlay builds.
 
+## Base-Only Rebuild (No Custom Overlay)
+
+Use this path to return to a clean, shareable base geocompare state:
+
+1. Fetch ACS + Gazetteer files:
+
+```bash
+python3 scripts/fetch_latest_acs.py --out-dir /path/to/data --archive-existing
+```
+
+2. (Optional) refresh canonical built-in overlays:
+
+```bash
+python3 scripts/fetch_overlays.py \
+--out-dir /path/to/data \
+--crime-source /path/or/url/to/crime.csv \
+--voter-source /path/or/url/to/voter.csv
+```
+
+3. Ensure no private overlay file is present:
+
+- Remove or relocate `/path/to/data/overlays/project_data.csv`
+- Remove or relocate `/path/to/data/overlays/overlay_manifest.json`
+
+4. Build:
+
+```bash
+python3 -m geocompare.interfaces.cli build /path/to/data
+```
+
+## Repository Scripts
+
+Tracked operational scripts:
+
+- `scripts/fetch_latest_acs.py`: download/update ACS + Gazetteer inputs.
+- `scripts/fetch_overlays.py`: normalize built-in crime/voter overlays.
+- `scripts/build_nibrs_crime_overlay.py`: build base crime overlay from NIBRS inputs.
+
 Query workflows:
 
 ```bash
