Commit 30f5901

feat: fully featured bigtable and hbase clients with tests
1 parent 4b82ded commit 30f5901

38 files changed, +5068 -622 lines changed

.bumpversion.cfg

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+[bumpversion]
+current_version = 0.1.0
+commit = True
+tag = True
+tag_name = v{new_version}
+
+[bumpversion:file:pyproject.toml]
+search = version = "{current_version}"
+replace = version = "{new_version}"
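
Reviewer note: the `search`/`replace` pair above is a template substitution applied to `pyproject.toml`. A minimal sketch of that substitution, assuming plain `str.format` templates and a literal string replace; the `apply_bump` helper and the sample `pyproject` text are hypothetical and not part of this commit:

```python
def apply_bump(text: str, current_version: str, new_version: str) -> str:
    """Mimic the search/replace templates from the [bumpversion:file:...] section."""
    # Fill the two templates exactly as they are written in .bumpversion.cfg.
    search = 'version = "{current_version}"'.format(current_version=current_version)
    replace = 'version = "{new_version}"'.format(new_version=new_version)
    if search not in text:
        # Fail loudly instead of silently leaving the file unchanged.
        raise ValueError(f"search string not found: {search!r}")
    return text.replace(search, replace)


# Hypothetical pyproject.toml content, for illustration only.
pyproject = '[project]\nname = "kvdbclient"\nversion = "0.1.0"\n'
bumped = apply_bump(pyproject, "0.1.0", "0.1.1")

# tag_name = v{new_version} is filled in the same way.
tag_name = "v{new_version}".format(new_version="0.1.1")
```

Anchoring the search on `version = "..."` rather than on the bare version number keeps the substitution from touching other occurrences of `0.1.0`, such as a dependency pin.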

.github/workflows/test.yml

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
+name: CI
+
+on:
+  push:
+    branches: [main]
+    tags: ["v*"]
+  pull_request:
+    branches: [main]
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.11", "3.12"]
+
+    env:
+      BIGTABLE_EMULATOR_HOST: localhost:8086
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Start BigTable emulator
+        run: |
+          docker run -d -p 8086:8086 --name bigtable-emulator \
+            gcr.io/google.com/cloudsdktool/google-cloud-cli:emulators \
+            gcloud beta emulators bigtable start --host-port=0.0.0.0:8086
+          timeout 30 bash -c 'until nc -z localhost 8086; do sleep 1; done'
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install pytest pytest-cov
+          pip install -e .
+
+      - name: Run tests with coverage
+        run: pytest --cov=kvdbclient --cov-report=xml tests/
+
+      - name: Upload coverage to Codecov
+        if: matrix.python-version == '3.12'
+        uses: codecov/codecov-action@v5
+        with:
+          files: ./coverage.xml
+          fail_ci_if_error: false
+        env:
+          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
+
+  publish:
+    if: startsWith(github.ref, 'refs/tags/v')
+    needs: test
+    runs-on: ubuntu-latest
+    environment: pypi
+    permissions:
+      id-token: write
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Install build tools
+        run: pip install build
+
+      - name: Build package
+        run: python -m build
+
+      - name: Publish to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
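
Reviewer note: the "Start BigTable emulator" step polls the port with `nc` until it accepts connections or 30 seconds pass. For local runs without `nc`, the same wait could be sketched in Python with only the standard library. The `wait_for_port` helper is hypothetical and not part of this commit:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Called as `wait_for_port("localhost", 8086)`, this mirrors the 30-second budget and one-second polling interval of the shell loop. Like `nc -z`, it only confirms that the port accepts connections, not that the emulator is ready to serve requests.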

Dockerfile

Lines changed: 16 additions & 4 deletions
@@ -1,6 +1,18 @@
-FROM python:3.7
+FROM python:3.12
+
+WORKDIR /app
+
+# Install Google Cloud SDK + BigTable emulator
+RUN apt-get update && \
+    apt-get install -y apt-transport-https ca-certificates gnupg curl && \
+    curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | \
+    gpg --dearmor -o /usr/share/keyrings/cloud.google.gpg && \
+    echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" \
+    > /etc/apt/sources.list.d/google-cloud-sdk.list && \
+    apt-get update && \
+    apt-get install -y google-cloud-cli google-cloud-cli-bigtable-emulator && \
+    rm -rf /var/lib/apt/lists/*
 
 COPY . /app
-RUN pip install pip==20.0.1 \
-    && pip install --no-cache-dir --upgrade -r requirements.txt \
-    && pip install --no-cache-dir --upgrade -r requirements-dev.txt
+RUN pip install --no-cache-dir --upgrade -r requirements-dev.txt
+RUN pip install -e .

MANIFEST.in

Lines changed: 2 additions & 1 deletion
@@ -1 +1,2 @@
-include requirements.txt
+include LICENSE
+include README.md

README.md

Lines changed: 55 additions & 1 deletion
@@ -1,2 +1,56 @@
 # KVDbClient
-generic key-value database client
+
+[![codecov](https://codecov.io/gh/seung-lab/KVDbClient/graph/badge.svg)](https://app.codecov.io/gh/seung-lab/KVDbClient)
+
+A Python client library providing a unified interface for key-value database backends. Currently supports Google Cloud BigTable and Apache HBase.
+
+Built for:
+
+- Node read/write operations with automatic serialization
+- Concurrency control via row-level locking
+- Atomic unique ID generation
+- Operation logging and auditing
+- Configurable column families with per-attribute serializers (NumPy arrays, JSON, Pickle, strings)
+
+## Installation
+
+```bash
+pip install kvdbclient
+```
+
+For development:
+
+```bash
+git clone https://github.com/seung-lab/KVDbClient.git
+cd KVDbClient
+pip install -e .
+```
+
+## Usage
+
+```python
+from kvdbclient import get_client_class, BigTableConfig
+
+config = BigTableConfig(PROJECT="my-project", INSTANCE="my-instance", ADMIN=True, READ_ONLY=False)
+client = get_client_class("bigtable")("my_table", config)
+```
+
+The backend is selected by passing `"bigtable"` or `"hbase"` to `get_client_class()`. Alternatively, `get_default_client_info()` reads configuration from environment variables automatically.
+
+## Backends
+
+**Google BigTable** — Uses the `google-cloud-bigtable` SDK. Configure with `BigTableConfig` or set `BIGTABLE_PROJECT` and `BIGTABLE_INSTANCE` environment variables.
+
+**Apache HBase** — Communicates via the HBase REST API using HTTP. Configure with `HBaseConfig` or set the `HBASE_REST_URL` environment variable.
+
+Set `PCG_BACKEND_TYPE` to `bigtable` or `hbase` to control which backend `get_default_client_info()` uses.
+
+## Testing
+
+```bash
+pytest
+```
+
+## License
+
+MIT

codecov.yml

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+coverage:
+  status:
+    project:
+      default:
+        target: auto
+        threshold: 1%
+    patch:
+      default:
+        target: 25%

kvdbclient/__init__.py

Lines changed: 47 additions & 8 deletions
@@ -1,28 +1,67 @@
+"""
+Sub packages/modules for backend storage clients.
+Supports Google BigTable and Apache HBase.
+
+A simple client needs to be able to create the table,
+store table meta and to write and read node information.
+Also needs locking support to prevent race conditions
+when modifying root/parent nodes.
+
+In addition, clients with more features like generating unique IDs
+and logging facilities can be implemented by inheriting respective base classes.
+
+These methods are in separate classes because they are logically related.
+This also makes it possible to have different backend storage solutions,
+making it possible to use any unique features these solutions may provide.
+
+Please see `base.py` for more details.
+"""
+
 from collections import namedtuple
+from os import environ
+from typing import Union
 
+from .base import ColumnFamilyConfig, DEFAULT_COLUMN_FAMILIES
+from .bigtable import BigTableConfig
+from .bigtable import get_client_info as get_bigtable_client_info
 from .bigtable.client import Client as BigTableClient
-from .bigtable.attributes import Attribute
-from .serializers import Serializer
-from .serializers import UInt64String
+from .hbase import HBaseConfig
+from .hbase import get_client_info as get_hbase_client_info
+from .hbase.client import Client as HBaseClient
+
+ClientType = Union[BigTableClient, HBaseClient]
 
 
 _backend_clientinfo_fields = ("TYPE", "CONFIG")
-_backend_clientinfo_defaults = (None, None)
+_backend_clientinfo_defaults = ("bigtable", None)
 BackendClientInfo = namedtuple(
     "BackendClientInfo",
     _backend_clientinfo_fields,
     defaults=_backend_clientinfo_defaults,
 )
 
 
+def get_client_class(backend_type: str = "bigtable"):
+    """Return the client class for the given backend type."""
+    backend_type = (backend_type or "bigtable").lower()
+    if backend_type == "bigtable":
+        return BigTableClient
+    elif backend_type == "hbase":
+        return HBaseClient
+    else:
+        raise ValueError(f"Unknown backend type: {backend_type}")
+
+
 def get_default_client_info():
     """
     Load client from env variables.
     """
-
-    # TODO make dynamic after multiple platform support is added
-    from .bigtable import get_client_config as get_bigtable_client_config
+    backend_type = environ.get("PCG_BACKEND_TYPE", "bigtable").lower()
+    if backend_type == "hbase":
+        return BackendClientInfo(
+            TYPE="hbase", CONFIG=get_hbase_client_info()
+        )
 
     return BackendClientInfo(
-        CONFIG=get_bigtable_client_config(admin=True, read_only=False)
+        TYPE="bigtable", CONFIG=get_bigtable_client_info(admin=True, read_only=False)
     )
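
Reviewer note: the backend selection in this file can be tried out in isolation. The sketch below reproduces the dispatch behaviour with stand-in classes so it runs without `kvdbclient` installed. It swaps the commit's `if`/`elif` chain for a dictionary lookup, which is an alternative and not what the commit does. `_CLIENTS` and `backend_from_env` are hypothetical names:

```python
import os
from collections import namedtuple


class BigTableClient:
    """Stand-in for kvdbclient.bigtable.client.Client (illustration only)."""


class HBaseClient:
    """Stand-in for kvdbclient.hbase.client.Client (illustration only)."""


# Same fields and defaults as the namedtuple in the diff above.
BackendClientInfo = namedtuple(
    "BackendClientInfo", ("TYPE", "CONFIG"), defaults=("bigtable", None)
)

# Hypothetical registry; the commit uses an if/elif chain instead.
_CLIENTS = {"bigtable": BigTableClient, "hbase": HBaseClient}


def get_client_class(backend_type="bigtable"):
    """Return the client class for a backend name, case-insensitively."""
    key = (backend_type or "bigtable").lower()
    try:
        return _CLIENTS[key]
    except KeyError:
        raise ValueError(f"Unknown backend type: {key}") from None


def backend_from_env(env=os.environ):
    """Read PCG_BACKEND_TYPE the way get_default_client_info() does."""
    return env.get("PCG_BACKEND_TYPE", "bigtable").lower()
```

With a registry, supporting a third backend would mean adding one dictionary entry, at the cost of being slightly less explicit than the chain in the commit. Note that an unset `PCG_BACKEND_TYPE` falls back to `"bigtable"` in both versions.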

kvdbclient/__version__.py

Lines changed: 0 additions & 1 deletion
This file was deleted.
