
Commit 8ab8693

Merge pull request #1 from khnumdev/cursor/generalize-datastore-analysis-and-cleanup-tools-f98b
Generalize datastore analysis and cleanup tools
2 parents 7f2b8e1 + 8ca20fe commit 8ab8693

16 files changed: +1052 -1 lines changed


.github/workflows/build.yml

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+name: build
+
+on:
+  push:
+    branches: [ main ]
+    tags: [ "v*" ]
+  pull_request:
+
+jobs:
+  ci:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.9", "3.10", "3.11", "3.12"]
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install
+        run: |
+          python -m pip install -U pip
+          python -m pip install .
+          python -m pip install pytest
+      - name: Test
+        run: |
+          pytest -q
+
+  publish:
+    needs: ci
+    if: startsWith(github.ref, 'refs/tags/v')
+    runs-on: ubuntu-latest
+    permissions:
+      id-token: write
+      contents: read
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+      - name: Build
+        run: |
+          python -m pip install -U pip build
+          python -m build
+      - name: Publish to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          user: __token__
+          password: ${{ secrets.PYPI_API_TOKEN }}

.github/workflows/pr.yml

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+name: pr
+
+on:
+  pull_request:
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.9", "3.10", "3.11", "3.12"]
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+          cache: 'pip'
+      - name: Install
+        run: |
+          python -m pip install -U pip
+          python -m pip install .
+          python -m pip install pytest ruff black build pip-audit
+      - name: Lint
+        run: |
+          ruff check .
+          black --check .
+      - name: Test
+        run: pytest -q
+      - name: Build and verify
+        run: |
+          python -m build
+          twine check dist/* || true
+      - name: Security audit
+        run: |
+          pip-audit -r requirements.txt || true

.github/workflows/release.yml

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+name: release
+
+on:
+  push:
+    branches: [ main ]
+
+jobs:
+  release:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: write
+      id-token: write
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+          cache: 'pip'
+      - name: Install
+        run: |
+          python -m pip install -U pip
+          python -m pip install .
+          python -m pip install pytest ruff black build python-semantic-release pip-audit
+      - name: Lint
+        run: |
+          ruff check .
+          black --check .
+      - name: Test
+        run: pytest -q
+      - name: Build and verify
+        run: |
+          python -m build
+          twine check dist/* || true
+      - name: Security audit
+        run: |
+          pip-audit -r requirements.txt || true
+      - name: Semantic Release (version, tag, GitHub release, PyPI)
+        env:
+          PYPI_TOKEN: ${{ secrets.PYPI_API_TOKEN }}
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+        run: semantic-release publish

.gitignore

Lines changed: 7 additions & 0 deletions
@@ -205,3 +205,10 @@ cython_debug/
 marimo/_static/
 marimo/_lsp/
 __marimo__/
+
+# Local configuration
+config.yaml
+
+# Editor/OS
+.DS_Store
+Thumbs.db

CONTRIBUTING.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+# Contributing
+
+Thanks for your interest in contributing!
+
+- Open an issue to discuss substantial changes.
+- Fork and create feature branches from `main`.
+- Run formatting and tests before submitting a PR.
+
+## Dev setup
+
+```bash
+python -m venv .venv && source .venv/bin/activate
+pip install -U pip
+pip install -e .
+```
+
+## Testing
+
+```bash
+python -m pytest -q
+```

README.md

Lines changed: 106 additions & 1 deletion
@@ -1,2 +1,107 @@
 # local-storage-utils
-Set of scripts and tools for managing GCP Datastore data in local
+
+Utilities for analyzing and managing local Datastore/Firestore (Datastore mode) data. Works with both the Datastore Emulator and GCP using Application Default Credentials.
+
+## Install (PyPI)
+
+```bash
+pip install local-storage-utils
+```
+
+This installs the `lsu` CLI.
+
+## Install (from source)
+
+    git clone <this-repo-url>
+    cd local-storage-utils
+    python3 -m venv .venv
+    source .venv/bin/activate
+    python -m pip install -U pip
+    pip install -e .
+
+### Troubleshooting local installs
+- If you see "Command 'python' not found", use `python3 -m venv .venv` (above). Inside the venv, `python` will point to Python 3.
+- If you see "externally-managed-environment", you are attempting a system-wide install. Always install into a virtual environment:
+  - Create a venv: `python3 -m venv .venv && source .venv/bin/activate`
+  - Then use the venv pip: `python -m pip install -U pip && pip install -e .`
+```bash
+sudo apt-get update && sudo apt-get install -y python3-venv
+```
+
+## Configuration
+
+- Create a local `config.yaml` in your working directory. It is gitignored and not included in the repo.
+- Any CLI flag overrides values from `config.yaml`.
+- If neither config nor flags provide a value, the tool falls back to environment variables (for emulator detection) or sensible defaults.
+
+Example `config.yaml`:
+
+```yaml
+project_id: "my-project"  # If omitted, ADC/env will be used
+emulator_host: "localhost:8010"  # If set, uses Datastore Emulator
+
+# Explicit filters (empty means all)
+namespaces: [""]  # Empty -> iterate all namespaces (including default "")
+kinds: []  # Empty -> iterate all kinds per namespace
+
+# Optional defaults
+kind: "SourceCollectionStateEntity"  # Default for analyze-fields
+
+# Cleanup
+ttl_field: "expireAt"
+delete_missing_ttl: true
+batch_size: 500
+
+# Analysis
+group_by_field: null
+
+# Logging
+log_level: "INFO"
+```
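
The README's Configuration section describes a lookup order: CLI flag first, then `config.yaml`, then environment variables, then defaults. The sketch below illustrates that order only. It is not the code in this commit: `resolve`, `DEFAULTS`, and `ENV_VARS` are hypothetical names, and the default values are copied from the example `config.yaml` rather than from the tool itself. `DATASTORE_EMULATOR_HOST` is the variable the Datastore client libraries read for emulator detection.

```python
import os
from typing import Any, Mapping, Optional

# Hypothetical defaults, taken from the example config.yaml in the README.
DEFAULTS = {"batch_size": 500, "ttl_field": "expireAt", "log_level": "INFO"}

# Settings that may fall back to an environment variable (assumed mapping).
ENV_VARS = {"emulator_host": "DATASTORE_EMULATOR_HOST"}


def resolve(
    key: str,
    cli_flags: Mapping[str, Any],
    file_config: Mapping[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> Any:
    """Return the value for `key`: CLI flag > config.yaml > env var > default."""
    env = os.environ if env is None else env
    # 1. An explicitly passed CLI flag wins.
    if cli_flags.get(key) is not None:
        return cli_flags[key]
    # 2. Otherwise use the value loaded from config.yaml, if any.
    if file_config.get(key) is not None:
        return file_config[key]
    # 3. Otherwise consult the environment for keys that support it.
    env_name = ENV_VARS.get(key)
    if env_name and env.get(env_name):
        return env[env_name]
    # 4. Finally fall back to a built-in default (None when there is none).
    return DEFAULTS.get(key)
```

For example, `resolve("batch_size", {"batch_size": 100}, {"batch_size": 250})` would return the flag value, while an unset flag falls through to the file value.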
+
+## CLI usage
+
+```bash
+# Kind-level counts and size estimates
+lsu analyze-kinds --project my-project
+
+# Use all namespaces/kinds by default, or restrict explicitly
+lsu analyze-kinds --namespace "" --namespace tenant-a --kind SourceCollectionStateEntity
+
+# Field contribution analysis (falls back to config.kind/config.namespace if not provided)
+lsu analyze-fields --kind SourceCollectionStateEntity --namespace "" --group-by batchId
+
+# TTL cleanup across namespaces/kinds (dry-run)
+lsu cleanup --ttl-field expireAt --dry-run
+
+# TTL cleanup restricted to specific namespaces/kinds
+lsu cleanup --namespace "" --namespace tenant-a --kind pipeline-job
+```
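
The `cleanup` command and the `ttl_field`, `delete_missing_ttl`, and `batch_size` settings suggest a select-then-delete-in-batches loop. The following is a minimal sketch of that idea, not the implementation in this commit. Entities are modelled as plain dicts, and `is_expired`, `batched`, `cleanup`, and the `delete_batch` callback are hypothetical names. Treating an entity as expired when its TTL value is at or before the current time is also an assumption.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional

Entity = Dict[str, Any]


def is_expired(entity: Entity, ttl_field: str, delete_missing_ttl: bool, now: datetime) -> bool:
    """Decide whether an entity should be removed by a TTL cleanup."""
    expires_at = entity.get(ttl_field)
    if expires_at is None:
        # No TTL value: the delete_missing_ttl setting decides.
        return delete_missing_ttl
    return expires_at <= now


def batched(items: Iterable[Any], size: int) -> Iterator[List[Any]]:
    """Yield lists of at most `size` items, preserving order."""
    batch: List[Any] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def cleanup(
    entities: Iterable[Entity],
    ttl_field: str = "expireAt",
    delete_missing_ttl: bool = True,
    batch_size: int = 500,
    dry_run: bool = True,
    now: Optional[datetime] = None,
    delete_batch: Optional[Callable[[List[Entity]], None]] = None,
) -> int:
    """Return how many entities were (or, in dry-run mode, would be) deleted."""
    now = now or datetime.now(timezone.utc)
    expired = (e for e in entities if is_expired(e, ttl_field, delete_missing_ttl, now))
    total = 0
    for batch in batched(expired, batch_size):
        if not dry_run and delete_batch is not None:
            delete_batch(batch)  # e.g. a wrapper around a Datastore multi-delete
        total += len(batch)
    return total
```

In dry-run mode the function only counts candidates, which mirrors what `--dry-run` implies. The `batch_size: 500` value in the example config lines up with Datastore's limit of 500 entities per commit.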
+
+Use `--help` on any command for full options. Config can be provided via `config.yaml` or flags.
+
+## Development
+
+- Create a virtual environment and install in editable mode as shown above
+- Run tests:
+
+```bash
+python -m pip install pytest
+pytest -q
+```
+
+- Lint/format (optional if you use pre-commit/CI):
+```bash
+python -m pip install ruff black
+ruff check .
+black .
+```
+
+## Publishing
+
+- Automated: pushing to `main` triggers versioning, tagging, GitHub release, and PyPI publish via semantic-release.
+- Prerequisites:
+  - Add a PyPI token to repo secrets as `PYPI_API_TOKEN`.
+  - Use conventional commits for proper versioning.
+
+Main branch should be protected (require PRs, disallow direct pushes) in repository settings.
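
The Publishing notes ask for conventional commits so that semantic-release can choose the next version. As a rough illustration of why the commit format matters, the sketch below maps commit messages to a version bump using commonly used default rules: `feat` gives a minor bump, `fix` and `perf` give a patch bump, and a `!` marker or a `BREAKING CHANGE:` footer gives a major bump. These rules are an assumption about the parser configuration. `bump_for` and `next_bump` are hypothetical helpers, not part of python-semantic-release.

```python
import re
from typing import Iterable, Optional

# Simplified header: type, optional "(scope)", optional "!", then ": description".
HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<bang>!)?: .+")

# Rank bumps so the largest one across a set of commits wins.
ORDER = {None: 0, "patch": 1, "minor": 2, "major": 3}


def bump_for(message: str) -> Optional[str]:
    """Return 'major', 'minor', 'patch', or None for one commit message."""
    lines = message.splitlines()
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        return None  # not a conventional commit, so no release is triggered
    breaking_footer = any(
        line.startswith(("BREAKING CHANGE:", "BREAKING-CHANGE:")) for line in lines[1:]
    )
    if match.group("bang") or breaking_footer:
        return "major"
    if match.group("type") == "feat":
        return "minor"
    if match.group("type") in ("fix", "perf"):
        return "patch"
    return None


def next_bump(messages: Iterable[str]) -> Optional[str]:
    """Return the largest bump implied by a sequence of commit messages."""
    best: Optional[str] = None
    for message in messages:
        bump = bump_for(message)
        if ORDER[bump] > ORDER[best]:
            best = bump
    return best
```

Under these rules a merge commit titled like the one on this page would not trigger a release by itself, while a `feat: ...` commit in the merged branch would.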
