Merged
80 changes: 55 additions & 25 deletions README.md
@@ -14,8 +14,7 @@ pip install local-storage-utils

This installs the `lsu` CLI.

## Installing from TestPyPI (for dry-runs)

If you want to test publishing to TestPyPI and install the package from the test index, do so inside a virtual environment. This avoids the "externally-managed-environment" (PEP 668) error that pip raises when you attempt a system-wide install on Debian/Ubuntu.
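
To confirm that an install will land in a virtual environment rather than the system Python, you can check the interpreter from Python itself. The following is a minimal sketch and not part of this project; the helper name `in_virtualenv` is hypothetical.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"virtual environment active: {sys.prefix}")
    else:
        print("no virtual environment active; pip may refuse a system-wide install (PEP 668)")
```

Run it with the same `python` you intend to use for `pip install`; if it reports no active environment, create and activate one first.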

@@ -39,21 +38,48 @@ Notes:

## Install (from source)

```bash
git clone <this-repo-url>
cd local-storage-utils
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
pip install -e .
```

### Troubleshooting local installs

- If you see "Command 'python' not found", use `python3 -m venv .venv` (above). Inside the venv, `python` will point to Python 3.
- If you see "externally-managed-environment", you are attempting a system-wide install. Always install into a virtual environment:
  - Create a venv: `python3 -m venv .venv && source .venv/bin/activate`
  - Then use the venv pip: `python -m pip install -U pip && pip install -e .`
```bash
sudo apt-get update && sudo apt-get install -y python3-venv
```

**Owner:** Add instructions for Linux and macOS using brew.

**Contributor Author:** Added installation instructions for macOS (using Homebrew) and Linux (Debian/Ubuntu, Fedora/RHEL, and Homebrew). Instructions have been tested and verified to work. See commit 65cb11c.

#### Installing Python 3 (if not already installed)

**macOS:**
```bash
# Install Python 3 using Homebrew (replace 3.x with the version you want)
brew install python@3.x
```

**Linux (Debian/Ubuntu):**
```bash
# Install Python 3 and venv support
sudo apt-get update && sudo apt-get install -y python3 python3-venv python3-pip
```

**Linux (Fedora/RHEL):**
```bash
# Install Python 3 and venv support
sudo dnf install python3 python3-pip
```

**Linux (using Homebrew):**
```bash
# Install Homebrew first (if not already installed): https://brew.sh
# Then install Python 3 (replace 3.x with the version you want)
brew install python@3.x
```

## Configuration

@@ -92,16 +118,16 @@ log_level: "INFO" # (string) Logging level (DEBUG, INFO, WARNI

The keys above map directly to CLI flags (CLI flags override values in `config.yaml`). Omit any option to use sensible defaults.
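
The precedence described above (CLI flags over `config.yaml` over defaults) can be sketched as a simple dictionary merge. This is an illustration under stated assumptions, not the project's actual implementation: the function name `resolve_settings` and the default values in `DEFAULTS` are hypothetical, and only the key names come from this README.

```python
from typing import Any, Dict, Optional

# Hypothetical defaults, for illustration only.
DEFAULTS: Dict[str, Any] = {"batch_size": 500, "sample_size": 500, "log_level": "INFO"}


def resolve_settings(
    config: Dict[str, Any], cli_flags: Dict[str, Optional[Any]]
) -> Dict[str, Any]:
    """Merge settings with precedence: CLI flags > config file > defaults."""
    settings = dict(DEFAULTS)
    settings.update(config)
    # A flag left unset on the command line arrives as None and must not
    # clobber a value that came from the config file.
    settings.update({k: v for k, v in cli_flags.items() if v is not None})
    return settings


config_file = {"batch_size": 200, "log_level": "DEBUG"}
flags = {"batch_size": 50, "log_level": None}
print(resolve_settings(config_file, flags))
```

In this example `batch_size` comes from the flag, `log_level` from the config file, and `sample_size` from the defaults.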

## Quickstart

Lightweight utilities for analyzing and cleaning Datastore (Firestore in Datastore mode). They work against the Datastore emulator for local integration testing, or against GCP when Application Default Credentials are available.

### Quick overview

- CLI: run commands via `python3 cli.py <command>`, or install the package and use the `lsu` entry point.
- Makefile: convenience targets are provided to create a venv, install deps, and run tests locally.

### Quickstart (from source)
```bash
git clone <this-repo-url>
cd local-storage-utils
@@ -112,14 +138,15 @@ pip install -U pip
pip install -e .
```

### Makefile shortcuts

- `make venv` — create `.venv` and install package in editable mode
- `make unit` — run fast unit tests
- `make integration` — run integration tests (starts/seeds emulator when configured)

Use these targets to get a working dev environment quickly.

### Basic CLI examples
```bash
# list kinds (scans stats or samples)
python3 cli.py analyze-kinds --project my-project
@@ -131,44 +158,47 @@ python3 cli.py analyze-fields --kind MyKind --group-by batchId
python3 cli.py cleanup --ttl-field expireAt --dry-run
```
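
To illustrate what a TTL-based cleanup with `--dry-run` does conceptually, here is a minimal in-memory sketch. It is an assumption about the general technique, not the tool's actual code: entities are modeled as plain dictionaries, and the names `is_expired` and `cleanup` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Optional


def is_expired(entity: Dict[str, Any], ttl_field: str, now: datetime) -> bool:
    """An entity is expired when its TTL field is set and not in the future."""
    expires_at = entity.get(ttl_field)
    return expires_at is not None and expires_at <= now


def cleanup(
    entities: List[Dict[str, Any]],
    ttl_field: str,
    dry_run: bool = True,
    now: Optional[datetime] = None,
) -> int:
    """Return the number of expired entities; delete them unless dry_run is set."""
    now = now or datetime.now(timezone.utc)
    expired_count = sum(1 for e in entities if is_expired(e, ttl_field, now))
    if not dry_run:
        # Mutate the caller's list in place, keeping only live entities.
        entities[:] = [e for e in entities if not is_expired(e, ttl_field, now)]
    return expired_count


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
rows = [
    {"id": 1, "expireAt": now - timedelta(days=1)},  # already expired
    {"id": 2, "expireAt": now + timedelta(days=1)},  # still live
    {"id": 3},                                       # no TTL field, never expires
]
print(cleanup(rows, "expireAt", dry_run=True, now=now), len(rows))
print(cleanup(rows, "expireAt", dry_run=False, now=now), len(rows))
```

The dry run reports how many entities would be removed while leaving the data untouched, which is why running with `--dry-run` first is a sensible habit.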

### Configuration

- Local `config.yaml` is supported; CLI flags override config values.
- Example keys: `project_id`, `emulator_host`, `namespaces`, `kinds`, `kind`, `ttl_field`, `batch_size`, `sample_size`, `enable_parallel`.

### Emulator & integration testing

- Start & seed emulator locally:
  - `./scripts/run_emulator_local.sh` (prefers `.venv/bin/python` to run seeder)
  - `./scripts/run_emulator_local.sh --no-seed` to skip seeding
- The seeder accepts `SEED_COUNT` and `SEED_NS_COUNT` env vars to increase dataset size for perf tests.
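
As a rough sketch of how a seeding script might read these two variables, consider the following. The function name `read_seed_settings`, the default values, and the validation rules are assumptions for illustration; only the variable names `SEED_COUNT` and `SEED_NS_COUNT` come from this README.

```python
import os
from typing import Mapping, Tuple


def read_seed_settings(env: Mapping[str, str] = os.environ) -> Tuple[int, int]:
    """Return (seed_count, ns_count) parsed from environment variables."""
    # The defaults below are hypothetical, chosen only for this example.
    seed_count = int(env.get("SEED_COUNT", "100"))
    ns_count = int(env.get("SEED_NS_COUNT", "1"))
    if seed_count < 0 or ns_count < 1:
        raise ValueError("SEED_COUNT must be >= 0 and SEED_NS_COUNT must be >= 1")
    return seed_count, ns_count


# Passing an explicit mapping keeps the example independent of the real environment.
print(read_seed_settings({"SEED_COUNT": "5000", "SEED_NS_COUNT": "3"}))
```

Accepting a mapping argument instead of reading `os.environ` directly makes the parsing easy to unit test.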

Run integration tests:

```bash
# create venv and install deps (see Quickstart), then:
make integration
```

### Development & tests

- Run unit tests:
  - `make unit` (fast)
- Run full test suite locally:
  - `make integration`

## Publishing

This project uses the `release` workflow to publish releases to PyPI. Follow the packaging tutorial for a complete guide on packaging and publishing: https://packaging.python.org/en/latest/tutorials/packaging-projects/

We support publishing to either TestPyPI (for dry runs) or the real PyPI. The workflow can be triggered automatically on pushes to `main` or manually via the Actions UI (use the "Run workflow" button). When you run it manually, you can set the `publish_target` input to `testpypi` to publish to TestPyPI instead of PyPI.

### Secrets and tokens
- For production publishing to the real PyPI, set the repository secret named `PYPI_API_TOKEN` with a PyPI API token.
- For test publishing to TestPyPI, set the repository secret named `TEST_PYPI_API_TOKEN` with a TestPyPI API token.

The release workflow selects the appropriate token based on the `publish_target` input. Use TestPyPI first to validate packaging and metadata before publishing to the real index.
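
The token selection can be pictured as a small lookup from target to secret name. The real logic lives in the workflow file, which is not shown here, so treat this as a sketch: the function name `secret_name_for` is hypothetical, and `pypi` as the value for the production target is an assumption (only `testpypi` is named above).

```python
def secret_name_for(publish_target: str) -> str:
    """Map a publish_target value to the repository secret that holds its token."""
    secrets = {
        "pypi": "PYPI_API_TOKEN",           # assumed name of the production target
        "testpypi": "TEST_PYPI_API_TOKEN",  # target value named in this README
    }
    try:
        return secrets[publish_target]
    except KeyError:
        # Fail loudly rather than silently falling back to the production token.
        raise ValueError(f"unknown publish_target: {publish_target!r}") from None


print(secret_name_for("testpypi"))
```

Rejecting unknown targets avoids publishing to the real index because of a typo in the input.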

## Notes

- `sample_size` bounds per-kind/group analysis to avoid scanning entire datasets. Set to 0 or `null` in config to disable sampling.
- `enable_parallel` (default true) enables multi-threaded processing during analysis and deletion; set to false to force single-threaded behavior.
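
The two options above can be sketched together as follows. This is not the project's implementation: the names `bounded` and `process` are hypothetical, the sketch takes the first `sample_size` items rather than a random sample, and `max_workers=4` is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def bounded(items: Iterable[T], sample_size: Optional[int]) -> Iterable[T]:
    """Yield at most sample_size items; 0 or None disables the bound."""
    if not sample_size:
        return items
    return islice(items, sample_size)


def process(
    items: Iterable[T],
    worker: Callable[[T], R],
    sample_size: Optional[int] = None,
    enable_parallel: bool = True,
    max_workers: int = 4,
) -> List[R]:
    """Apply worker to the (optionally bounded) items, in threads or serially."""
    selected = bounded(items, sample_size)
    if enable_parallel:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # Executor.map returns results in input order.
            return list(pool.map(worker, selected))
    return [worker(item) for item in selected]


print(process(range(10), lambda x: x * 2, sample_size=3))
print(process(range(5), lambda x: x * 2, sample_size=0, enable_parallel=False))
```

Both code paths return results in the same order, so switching `enable_parallel` off is a safe way to rule out threading when debugging.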

If you'd like a short walkthrough or to change the default Makefile targets, tell me what you'd prefer and I can adjust the README or Makefile.
pip install ruff black