Extend the package with the falkordblite capabilities #163
Description
Design: Embedded FalkorDB Support in falkordb-py
Goal
Extend falkordb-py so users can optionally run an embedded FalkorDB instance (no external server required), installed via:
pip install falkordb # remote-only (default, no binaries)
pip install falkordb[lite] # includes embedded redis-server + falkordb.so
The embedded mode reuses the existing FalkorDB / Graph / AsyncGraph classes, so users get the same API surface regardless of connection mode.
Architecture Overview
falkordb-py (existing)
├── falkordb/
│ ├── __init__.py
│ ├── falkordb.py # FalkorDB class (remote connections)
│ ├── graph.py # Graph class
│ ├── asyncio/ # Async variants
│ └── ...
│
│ # ── NEW ──
│ ├── lite/ # Embedded server management (lazy-loaded)
│ │ ├── __init__.py
│ │ ├── server.py # EmbeddedServer: manages redis-server lifecycle
│ │ ├── config.py # Redis config generation
│ │ └── binaries.py # Binary resolution (finds redis-server + falkordb.so)
│ │
│ └── falkordb.py # Modified: add embedded= param to constructor
│
├── pyproject.toml # Modified: add [lite] optional dependency
└── ...
Key Principle: New falkordb-bin Package for Binaries
A new package falkordb-bin on PyPI ships only the precompiled redis-server and falkordb.so binaries (per-platform wheels). The falkordb[lite] optional extra declares falkordb-bin as a dependency. All orchestration logic (starting/stopping the server, config generation, etc.) lives inside falkordb-py in the falkordb.lite subpackage.
The existing falkordblite package is left as-is and eventually deprecated — existing users of falkordblite are unaffected and can migrate at their own pace.
This gives us clean separation:
- falkordb-bin on PyPI = new package, platform-specific binaries only (built via CI, one wheel per OS/arch)
- falkordb on PyPI = pure Python client + optional embedded orchestration code
- falkordb[lite] = both together
- falkordblite on PyPI = legacy standalone package (deprecated, eventually archived)
User-Facing API
Constructor Parameter (unified API)
Embedded mode is activated via the constructor, alongside all the standard connection kwargs. No separate factory method — one constructor, one class.
from falkordb import FalkorDB
# Remote (existing, unchanged)
db = FalkorDB(host='localhost', port=6379)
# Embedded — zero config (ephemeral)
db = FalkorDB(embedded=True)
# Embedded — with persistence
db = FalkorDB(embedded=True, db_path='/tmp/my_graph.db')
# Embedded — custom config + standard kwargs
db = FalkorDB(
    embedded=True,
    db_path='/tmp/my_graph.db',
    embedded_config={'maxmemory': '1gb'},
    max_connections=32,
    socket_timeout=30,
    encoding='utf-8',
)
# Usage is identical from here on
g = db.select_graph('social')
g.query('CREATE (n:Person {name: "Alice"}) RETURN n')
# Cleanup (stops the embedded server)
db.close()
When embedded=True:
- The constructor spins up a local redis-server + falkordb.so via Unix socket
- All standard kwargs (socket_timeout, encoding, encoding_errors, retry_on_error, etc.) are applied to the connection to the embedded server
- Remote-specific kwargs (host, port, ssl_*, etc.) are ignored
- A connection pool is created automatically for parallel query support
When embedded=False (default):
- Behavior is identical to today — no change whatsoever
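The kwarg handling described above could be sketched roughly as follows. The helper name `split_embedded_kwargs` and the exact set of remote-only keys are assumptions made for illustration; neither is part of falkordb-py.

```python
# Hypothetical helper, not part of falkordb-py: decides which constructor
# kwargs reach the embedded connection and which are dropped.
REMOTE_ONLY_KEYS = {"host", "port"}
REMOTE_ONLY_PREFIXES = ("ssl",)  # matches ssl, ssl_certfile, ssl_keyfile, ...


def split_embedded_kwargs(kwargs):
    """Split kwargs into (forwarded, ignored) for embedded mode."""
    forwarded, ignored = {}, {}
    for key, value in kwargs.items():
        remote_only = key in REMOTE_ONLY_KEYS or key.startswith(REMOTE_ONLY_PREFIXES)
        (ignored if remote_only else forwarded)[key] = value
    return forwarded, ignored


forwarded, ignored = split_embedded_kwargs(
    {
        "host": "db.example.com",
        "port": 6379,
        "ssl_certfile": "client.pem",
        "socket_timeout": 30,
        "encoding": "utf-8",
    }
)
print(sorted(forwarded))  # ['encoding', 'socket_timeout']
print(sorted(ignored))    # ['host', 'port', 'ssl_certfile']
```

Returning the ignored keys as well would let the constructor log or warn about them instead of dropping them silently, if that turns out to be preferable.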
Async Support
from falkordb.asyncio import FalkorDB as AsyncFalkorDB
# Async embedded
db = AsyncFalkorDB(embedded=True, db_path='/tmp/async_graph.db')
g = db.select_graph('social')
result = await g.query('MATCH (n) RETURN n')
await db.close()
Context Manager
from falkordb import FalkorDB
with FalkorDB(embedded=True, db_path='/tmp/my_graph.db') as db:
    g = db.select_graph('social')
    g.query('CREATE (n:Person {name: "Alice"}) RETURN n')
# Server automatically stopped on exit
Implementation Plan
Phase 1: New Binary Package (falkordb-bin on PyPI)
Create a new repo and PyPI package falkordb-bin that contains only the precompiled binaries and a minimal Python API to locate them. This is a completely separate package from the existing falkordblite.
New repo: FalkorDB/falkordb-bin
falkordb_bin/__init__.py:
import os
import sys
import platform


def get_bin_dir():
    """Return path to directory containing redis-server and falkordb.so/dylib"""
    return os.path.join(os.path.dirname(__file__), 'bin')


def get_redis_server():
    """Return path to redis-server binary"""
    name = 'redis-server.exe' if sys.platform == 'win32' else 'redis-server'
    return os.path.join(get_bin_dir(), name)


def get_falkordb_module():
    """Return path to falkordb module (falkordb.so on Linux, falkordb.dylib on macOS)"""
    if sys.platform == 'darwin':
        name = 'falkordb.dylib'
    else:
        name = 'falkordb.so'
    return os.path.join(get_bin_dir(), name)

The CI builds platform-specific wheels containing:
falkordb_bin/
├── __init__.py
└── bin/
├── redis-server
└── falkordb.so (or falkordb.dylib on macOS)
pyproject.toml for falkordb-bin:
[project]
name = "falkordb-bin"
version = "1.2.0" # Mirrors falkordb client version
description = "Precompiled redis-server and FalkorDB module binaries"
requires-python = ">=3.8"
[build-system]
requires = ["setuptools>=64", "wheel"]
build-backend = "setuptools.build_meta"
CI/CD: The build pipeline (GitHub Actions) would:
- Build redis-server from source (there are no official Redis binaries to download)
- Download pre-built falkordb.so / falkordb.dylib from FalkorDB GitHub releases
- Package into platform wheels: falkordb_bin-1.0.0-cp3-none-manylinux_2_17_x86_64.whl, ...-macosx_11_0_arm64.whl, etc.
- Publish to PyPI
The existing falkordblite build scripts (setup.py, build_scripts/) can be adapted for the Redis compilation step. The FalkorDB binary download replaces the current from-source build, simplifying the pipeline significantly.
Platform matrix (initial):
| Platform | Redis | FalkorDB |
|---|---|---|
| Linux x86_64 | Build from source | Download from GH releases |
| Linux aarch64 | Build from source | Download from GH releases |
| macOS x86_64 | Build from source | Download from GH releases |
| macOS arm64 | Build from source | Download from GH releases |
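As a rough sketch, the client could check the running interpreter against this matrix before trying to start an embedded server. The names `SUPPORTED_PLATFORMS`, `normalize_platform`, and `embedded_supported` are hypothetical and only illustrate the idea.

```python
import platform
import sys

# Mirrors the initial platform matrix above: (OS, CPU architecture).
SUPPORTED_PLATFORMS = {
    ("linux", "x86_64"),
    ("linux", "aarch64"),
    ("darwin", "x86_64"),
    ("darwin", "arm64"),
}


def normalize_platform(sys_platform, machine):
    """Map sys.platform / platform.machine() values onto the matrix keys."""
    os_name = "linux" if sys_platform.startswith("linux") else sys_platform
    arch = machine.lower()
    if arch == "amd64":  # some systems report AMD64 for x86_64
        arch = "x86_64"
    return os_name, arch


def embedded_supported(sys_platform=None, machine=None):
    """Return True if the given (or current) platform is in the matrix."""
    key = normalize_platform(
        sys_platform if sys_platform is not None else sys.platform,
        machine if machine is not None else platform.machine(),
    )
    return key in SUPPORTED_PLATFORMS


print(embedded_supported("linux", "x86_64"))  # True
print(embedded_supported("darwin", "arm64"))  # True
print(embedded_supported("win32", "AMD64"))   # False
```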
Windows (follow-up task):
Windows support will be added in a subsequent phase using:
- redis-windows for the Redis server binary
- falkordb-rs-next-gen for a Windows-compatible FalkorDB module
This requires separate work to validate compatibility and will be tracked as a separate task.
Phase 2: Orchestration in falkordb-py
2a. pyproject.toml Changes
[tool.poetry.extras]
lite = ["falkordb-bin"]
[tool.poetry.dependencies]
# ... existing deps ...
falkordb-bin = { version = ">=1.0.0,<2.0.0", optional = true }
Or, if using the standard [project] table:
[project.optional-dependencies]
lite = ["falkordb-bin>=1.0.0,<2.0.0"]
Note: falkordb-bin uses the same versioning as falkordb. For example, falkordb 1.2.0 and falkordb-bin 1.2.0 are released together and known-compatible. The pinned range ensures users don't accidentally mix incompatible versions.
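The pinned range is enforced at install time. A runtime guard along the following lines could additionally catch environments where the two packages have drifted apart. This is only a sketch: `same_major` and `installed_version` are hypothetical helpers, and the parser assumes plain MAJOR.MINOR.PATCH version strings.

```python
from importlib import metadata


def parse_version(text):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (no pre-release support)."""
    return tuple(int(part) for part in text.split("."))


def same_major(client_version, bin_version):
    """True when both versions share a major number, as the pinned range requires."""
    return parse_version(client_version)[0] == parse_version(bin_version)[0]


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print(same_major("1.2.0", "1.9.3"))  # True
print(same_major("1.2.0", "2.0.0"))  # False
```

In practice the two arguments would come from `installed_version("falkordb")` and `installed_version("falkordb-bin")`.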
2b. falkordb/lite/__init__.py
"""
Embedded FalkorDB support.
This module is only usable when the 'lite' extra is installed:
    pip install falkordb[lite]
"""
2c. falkordb/lite/binaries.py
"""Binary resolution — finds redis-server and falkordb.so from falkordb-bin package."""
from pathlib import Path


class BinaryNotFoundError(Exception):
    """Raised when embedded binaries are not installed."""


def _require_bin():
    """Check that falkordb-bin is installed, raise helpful error if not."""
    try:
        import falkordb_bin
    except ImportError as exc:
        # Chain the original error so the root cause stays visible in tracebacks
        raise BinaryNotFoundError(
            "Embedded FalkorDB requires the 'lite' extra. "
            "Install with: pip install falkordb[lite]"
        ) from exc
    return falkordb_bin


def get_redis_server_path() -> Path:
    """Resolve path to redis-server binary."""
    falkordb_bin = _require_bin()
    path = Path(falkordb_bin.get_redis_server())
    if not path.exists():
        raise BinaryNotFoundError(f"redis-server not found at {path}")
    return path


def get_falkordb_module_path() -> Path:
    """Resolve path to the FalkorDB module (falkordb.so or falkordb.dylib)."""
    falkordb_bin = _require_bin()
    path = Path(falkordb_bin.get_falkordb_module())
    if not path.exists():
        raise BinaryNotFoundError(f"FalkorDB module not found at {path}")
    return path

2d. falkordb/lite/config.py
"""Redis configuration generation for embedded mode."""
from __future__ import annotations  # allows `str | None` annotations on Python < 3.10

import os
from pathlib import Path

DEFAULT_CONFIG = {
    'bind': '127.0.0.1',
    'port': '0',  # 0 = do not listen on TCP (the Unix socket is used instead)
    'save': '',  # Disable RDB by default for ephemeral (rendered as: save "")
    'appendonly': 'no',
    'protected-mode': 'yes',
    'loglevel': 'warning',
    'databases': '16',
}
PERSISTENT_OVERRIDES = {
    'save': '900 1 300 10 60 10000',  # RDB snapshots
    'appendonly': 'yes',
    'appendfsync': 'everysec',
}


def generate_config(
    falkordb_module_path: Path,
    db_path: str | None = None,
    unix_socket_path: str | None = None,
    user_config: dict | None = None,
) -> str:
    """Generate redis.conf content for embedded mode."""
    config = dict(DEFAULT_CONFIG)
    # If persistence requested, set dir and enable AOF/RDB
    if db_path:
        db_dir = os.path.dirname(os.path.abspath(db_path))
        db_file = os.path.basename(db_path)
        os.makedirs(db_dir, exist_ok=True)
        config['dir'] = db_dir
        config['dbfilename'] = db_file
        config.update(PERSISTENT_OVERRIDES)
    # Unix socket for local communication (preferred over TCP)
    if unix_socket_path:
        config['unixsocket'] = unix_socket_path
        config['unixsocketperm'] = '700'
        config['port'] = '0'  # Disable TCP when using socket
    # Load FalkorDB module
    config['loadmodule'] = str(falkordb_module_path)
    # Apply user overrides
    if user_config:
        config.update(user_config)
    # Render. redis.conf expects an empty value to be written as "" (e.g. save "")
    lines = [f'{k} {v}' if v != '' else f'{k} ""' for k, v in config.items()]
    return '\n'.join(lines) + '\n'

2e. falkordb/lite/server.py
"""Embedded redis-server lifecycle management."""
from __future__ import annotations  # allows `str | None` annotations on Python < 3.10

import atexit
import os
import shutil
import subprocess
import tempfile
import time
from pathlib import Path

import redis

from .binaries import get_redis_server_path, get_falkordb_module_path
from .config import generate_config


class EmbeddedServerError(Exception):
    pass


class EmbeddedServer:
    """Manages a local redis-server + FalkorDB process."""

    def __init__(
        self,
        db_path: str | None = None,
        config: dict | None = None,
        startup_timeout: float = 10.0,
    ):
        self._process: subprocess.Popen | None = None
        self._tmpdir = tempfile.mkdtemp(prefix='falkordb_')
        self._socket_path = os.path.join(self._tmpdir, 'falkordb.sock')
        self._config_path = os.path.join(self._tmpdir, 'redis.conf')
        self._db_path = db_path
        self._startup_timeout = startup_timeout
        # Resolve binaries
        self._redis_server = get_redis_server_path()
        self._falkordb_module = get_falkordb_module_path()
        # Generate config
        config_content = generate_config(
            falkordb_module_path=self._falkordb_module,
            db_path=db_path,
            unix_socket_path=self._socket_path,
            user_config=config,
        )
        Path(self._config_path).write_text(config_content)
        # Start server
        self._start()
        # Ensure cleanup on exit
        atexit.register(self.stop)

    def _start(self):
        """Start the redis-server process and wait for it to be ready."""
        # redis-server logs to stdout by default, so capture both streams together.
        # Note: nothing drains this pipe while the server runs; a log file may be
        # safer for long-lived servers.
        self._process = subprocess.Popen(
            [str(self._redis_server), self._config_path],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
        )
        # Wait for socket to appear and server to respond
        deadline = time.monotonic() + self._startup_timeout
        while time.monotonic() < deadline:
            if self._process.poll() is not None:
                code = self._process.returncode
                output = self._process.stdout.read().decode(errors='replace')
                self.stop()  # also removes the temporary directory
                raise EmbeddedServerError(
                    f"redis-server exited with code {code}: {output}"
                )
            if os.path.exists(self._socket_path):
                try:
                    r = redis.Redis(unix_socket_path=self._socket_path)
                    r.ping()
                    r.close()
                    return
                except redis.ConnectionError:
                    pass
            time.sleep(0.05)
        self.stop()
        raise EmbeddedServerError(
            f"redis-server did not start within {self._startup_timeout}s"
        )

    @property
    def unix_socket_path(self) -> str:
        return self._socket_path

    def stop(self):
        """Gracefully shut down the embedded server and remove its temp dir."""
        if self._process and self._process.poll() is None:
            try:
                r = redis.Redis(unix_socket_path=self._socket_path)
                r.shutdown(nosave=not bool(self._db_path))
                r.close()
            except Exception:
                self._process.terminate()
            try:
                self._process.wait(timeout=5)
            except subprocess.TimeoutExpired:
                self._process.kill()
                self._process.wait()
        self._process = None
        # Socket and generated redis.conf live here; safe to call repeatedly
        shutil.rmtree(self._tmpdir, ignore_errors=True)

    def __del__(self):
        self.stop()

2f. Changes to falkordb/falkordb.py
Add embedded parameter to the constructor, with connection pooling for parallel queries:
class FalkorDB:
    """FalkorDB client: connects to remote or embedded server."""

    def __init__(
        self,
        host="localhost",
        port=6379,
        password=None,
        socket_timeout=None,
        socket_connect_timeout=None,
        socket_keepalive=None,
        socket_keepalive_options=None,
        connection_pool=None,
        unix_socket_path=None,
        encoding="utf-8",
        encoding_errors="strict",
        retry_on_error=None,
        ssl=False,
        # ... other existing params ...
        #
        # ── NEW embedded params ──
        embedded=False,
        db_path=None,
        embedded_config=None,
        max_connections=16,
        startup_timeout=10.0,
    ):
        self._embedded_server = None
        if embedded:
            # Lazy import: only loaded when embedded=True
            from .lite.server import EmbeddedServer
            server = EmbeddedServer(
                db_path=db_path,
                config=embedded_config,
                startup_timeout=startup_timeout,
            )
            self._embedded_server = server
            # Override connection to use Unix socket with pooling
            unix_socket_path = server.unix_socket_path
            connection_pool = redis.ConnectionPool(
                connection_class=redis.UnixDomainSocketConnection,
                path=unix_socket_path,
                max_connections=max_connections,
                decode_responses=True,
                socket_timeout=socket_timeout,
                socket_connect_timeout=socket_connect_timeout,
                encoding=encoding,
                encoding_errors=encoding_errors,
                retry_on_error=retry_on_error or [],
            )
        # ... existing __init__ logic using connection_pool or host/port ...

    def close(self):
        """Close the connection. If embedded, also stops the server."""
        if self._embedded_server:
            self._embedded_server.stop()
            self._embedded_server = None

    def __enter__(self):
        return self

    def __exit__(self, *args):
        self.close()

    def __del__(self):
        self.close()

Key points:
- When embedded=True, the constructor starts an EmbeddedServer, creates a Unix socket ConnectionPool, and passes it through to the existing Redis connection logic
- All standard kwargs (socket_timeout, encoding, etc.) are forwarded to the pool
- Remote-specific kwargs (host, port, ssl_*) are silently ignored when embedded
- Connection pool with configurable max_connections (default 16) enables parallel queries out of the box
2g. Async Variant in falkordb/asyncio/falkordb.py
Same constructor approach — embedded=True spins up the server and creates an async connection pool:
import redis.asyncio


class FalkorDB:
    # ... existing async FalkorDB ...
    def __init__(
        self,
        # ... existing params ...
        embedded=False,
        db_path=None,
        embedded_config=None,
        max_connections=16,
        startup_timeout=10.0,
    ):
        self._embedded_server = None
        if embedded:
            from ..lite.server import EmbeddedServer
            # Server start is synchronous (subprocess), but fast
            server = EmbeddedServer(
                db_path=db_path,
                config=embedded_config,
                startup_timeout=startup_timeout,
            )
            self._embedded_server = server
            # Create async connection pool over Unix socket
            connection_pool = redis.asyncio.BlockingConnectionPool(
                connection_class=redis.asyncio.UnixDomainSocketConnection,
                path=server.unix_socket_path,
                max_connections=max_connections,
                timeout=None,
                decode_responses=True,
            )
        # Pass pool to existing init logic...

    async def close(self):
        """Close async connection and stop embedded server if applicable."""
        # Close the connection pool
        if hasattr(self, 'connection') and self.connection:
            await self.connection.aclose()
        if self._embedded_server:
            self._embedded_server.stop()
            self._embedded_server = None

    async def __aenter__(self):
        return self

    async def __aexit__(self, *args):
        await self.close()

falkordblite Deprecation Strategy
The existing falkordblite package (full redislite fork) is left untouched and deprecated gradually. Existing users are unaffected.
Current State (stays as-is)
falkordblite (on PyPI) — full standalone package, no changes
├── redislite/ # Full redislite fork
│ ├── __init__.py # Redis class, server management
│ ├── falkordb_client.py # FalkorDB class (reimplements graph client)
│ ├── bin/
│ │ ├── redis-server
│ │ └── falkordb.so
│ └── ...
└── setup.py # Builds redis from source
New Package (separate repo)
falkordb-bin (on PyPI) — new binary-only package
├── falkordb_bin/
│ ├── __init__.py # get_redis_server(), get_falkordb_module()
│ └── bin/
│ ├── redis-server
│ └── falkordb.so
└── pyproject.toml # Platform-specific wheel builds
Deprecation Timeline
- Now: Create falkordb-bin as a new repo & PyPI package. Ship binaries only.
- Now: Add falkordb[lite] extra to falkordb-py, depending on falkordb-bin.
- Now: Implement falkordb.lite subpackage in falkordb-py.
- Next release of falkordblite: Add deprecation warning on import:
    # In falkordblite's __init__.py or redislite/__init__.py
    import warnings
    warnings.warn(
        "falkordblite is deprecated. Use 'pip install falkordb[lite]' instead. "
        "See https://github.com/FalkorDB/falkordb-py#embedded-mode for migration guide.",
        DeprecationWarning,
        stacklevel=2,
    )
- 6-12 months later: Archive the falkordblite repo. The PyPI package remains installable but unmaintained.
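One practical detail of the deprecation step: Python hides DeprecationWarning by default unless it is attributed to code running in `__main__`, so a test for the warning needs to enable it explicitly. The sketch below shows one way to do that with the standard library; `legacy_import_hook` is a hypothetical stand-in for the top of falkordblite's `__init__.py`.

```python
import warnings


def legacy_import_hook():
    """Stand-in for the warning emitted when falkordblite is imported."""
    warnings.warn(
        "falkordblite is deprecated. Use 'pip install falkordb[lite]' instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the importing code
    )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    legacy_import_hook()

print(len(caught))                 # 1
print(caught[0].category.__name__)  # DeprecationWarning
```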
Migration Guide for falkordblite Users
# BEFORE (falkordblite)
from redislite.falkordb_client import FalkorDB
db = FalkorDB('/tmp/falkordb.db')
g = db.select_graph('social')
g.query('CREATE (n:Person {name: "Alice"}) RETURN n')
# AFTER (falkordb[lite])
from falkordb import FalkorDB
db = FalkorDB(embedded=True, db_path='/tmp/falkordb.db')
g = db.select_graph('social')
g.query('CREATE (n:Person {name: "Alice"}) RETURN n')
The API is intentionally almost identical: users change the import and the constructor, and everything else stays the same.
Comparison: Before and After
| Scenario | Before | After |
|---|---|---|
| Remote connection | pip install falkordb → FalkorDB(host=...) | Same, unchanged |
| Embedded (current) | pip install falkordblite → from redislite.falkordb_client import FalkorDB | pip install falkordb[lite] → FalkorDB(embedded=True) |
| Embedded + remote in same app | Two different packages, two different APIs | One package, same FalkorDB class |
| Switching from embedded → remote | Rewrite imports + constructor | Remove embedded=True, add host=... |
| Existing falkordblite users | Works as-is | Still works (deprecation warning only); migrate when ready |
Key Design Decisions
1. Unix Socket vs TCP for Embedded
Decision: Unix domain socket (default)
- Faster than TCP loopback (~30% lower latency)
- No port conflicts
- File-permission-based security (only creating user can access)
- Matches what redislite/falkordblite already does
- Falls back to TCP 127.0.0.1 on Windows (no Unix sockets)
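A minimal sketch of how the transport choice and the Windows fallback could be expressed as redis.conf overrides. The function names are hypothetical. The free-port lookup binds to port 0 so the OS assigns a port, which leaves a small window in which another process could claim that port before redis-server starts.

```python
import socket
import sys


def pick_free_tcp_port(host="127.0.0.1"):
    """Ask the OS for a currently free port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def transport_config(socket_path, sys_platform=sys.platform):
    """Return redis.conf overrides for the transport used on this platform."""
    if sys_platform == "win32":
        # TCP loopback fallback: no Unix socket, explicit local port
        return {"bind": "127.0.0.1", "port": str(pick_free_tcp_port())}
    # Unix domain socket: TCP disabled, socket file restricted to the owner
    return {"unixsocket": socket_path, "unixsocketperm": "700", "port": "0"}


print(transport_config("/tmp/falkordb.sock", "linux"))
print(sorted(transport_config("/tmp/falkordb.sock", "win32")))  # ['bind', 'port']
```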
2. Ephemeral vs Persistent by Default
Decision: Ephemeral by default, opt-in persistence
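At the redis.conf level, this decision comes down to a handful of directives. The sketch below is illustrative only: `persistence_options` and `render_conf` are hypothetical names, and the directive values are copied from the config defaults proposed earlier in this design.

```python
import os

EPHEMERAL = {"save": "", "appendonly": "no"}
PERSISTENT = {
    "save": "900 1 300 10 60 10000",
    "appendonly": "yes",
    "appendfsync": "everysec",
}


def persistence_options(db_path=None):
    """Return persistence-related directives for the given db_path."""
    if db_path is None:
        return dict(EPHEMERAL)
    directory, filename = os.path.split(os.path.abspath(db_path))
    return {**PERSISTENT, "dir": directory, "dbfilename": filename}


def render_conf(options):
    """Render directives as redis.conf lines; empty values are written as ""."""
    lines = []
    for key, value in options.items():
        rendered = value if value else '""'
        lines.append(f"{key} {rendered}")
    return "\n".join(lines) + "\n"


print(render_conf(persistence_options()), end="")
# save ""
# appendonly no
print(persistence_options("/tmp/my.db")["dbfilename"])  # my.db
```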
FalkorDB(embedded=True) # ephemeral (no RDB/AOF)
FalkorDB(embedded=True, db_path='/tmp/my.db') # persistent
3. Connection Pooling in Embedded Mode
Decision: Always use a connection pool, configurable size
Even with a Unix socket, a connection pool is needed to support parallel queries (e.g. from multiple threads or concurrent graph operations). The embedded constructor creates a ConnectionPool / BlockingConnectionPool with max_connections=16 by default, configurable via kwarg.
# Default pool (16 connections)
db = FalkorDB(embedded=True)
# Custom pool size
db = FalkorDB(embedded=True, max_connections=32)
4. Lazy Import of falkordb-bin
Decision: Import only when embedded=True is passed
The falkordb_bin package is never imported at module load time. This means:
- import falkordb works without falkordb-bin installed
- The import error with a helpful message only happens when you construct with embedded=True
- Zero overhead for remote-only users
5. Server Lifecycle
- The embedded server is tied to the FalkorDB instance
- close() / context manager / garbage collection all stop the server
- atexit handler ensures cleanup on interpreter exit
- Multiple embedded instances are supported (each gets its own server + socket)
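The lifecycle rules above depend on stop() being idempotent, since close(), the context manager, and the atexit handler may all call it. The following sketch demonstrates that property with a sleeping Python child process standing in for redis-server; `ManagedProcess` is a hypothetical stand-in for EmbeddedServer, not the proposed class itself.

```python
import subprocess
import sys


class ManagedProcess:
    """Owns one child process; a stand-in for the embedded server."""

    def __init__(self):
        # A sleeping Python child plays the role of redis-server here.
        self._process = subprocess.Popen(
            [sys.executable, "-c", "import time; time.sleep(60)"]
        )

    @property
    def running(self):
        return self._process is not None and self._process.poll() is None

    def stop(self):
        """Idempotent: safe to call from close(), __exit__, and atexit."""
        if self._process is None:
            return
        if self._process.poll() is None:
            self._process.terminate()
            try:
                self._process.wait(timeout=5)
            except subprocess.TimeoutExpired:
                self._process.kill()
                self._process.wait()
        self._process = None

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.stop()


# Two independent instances, each with its own child process
with ManagedProcess() as first, ManagedProcess() as second:
    both_running = first.running and second.running

both_stopped = not first.running and not second.running
first.stop()  # a second call is a no-op
print(both_running, both_stopped)  # True True
```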
Testing Strategy
tests/
├── test_embedded.py # Requires falkordb-bin installed
│ ├── test_ephemeral_basic
│ ├── test_persistent_roundtrip
│ ├── test_context_manager
│ ├── test_close_stops_server
│ ├── test_multiple_instances
│ └── test_custom_config
├── test_embedded_async.py # Async embedded tests
├── test_missing_extra.py # Verifies helpful error when falkordb-bin not installed
└── ...existing tests...
CI matrix:
- All platforms: Run existing remote tests (no change)
- Linux/macOS: Additionally run embedded tests with pip install .[lite]
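For test_missing_extra.py, the absence of falkordb-bin could be simulated even in an environment where the package is installed. Placing None in sys.modules makes the import system raise ImportError for that name. The sketch below is a simplified, self-contained variant: `module_hidden` and `require_module` are hypothetical helpers, and the exception class is redefined locally so the example runs on its own.

```python
import contextlib
import importlib
import sys


class BinaryNotFoundError(Exception):
    """Local stand-in for the exception defined in falkordb/lite/binaries.py."""


def require_module(name):
    """Import a module by name, or raise an error with an install hint."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise BinaryNotFoundError(
            "Embedded FalkorDB requires the 'lite' extra. "
            "Install with: pip install falkordb[lite]"
        ) from exc


@contextlib.contextmanager
def module_hidden(name):
    """Make importing `name` fail, even if the package is installed."""
    missing = object()
    previous = sys.modules.get(name, missing)
    sys.modules[name] = None  # None in sys.modules makes the import raise
    try:
        yield
    finally:
        if previous is missing:
            del sys.modules[name]
        else:
            sys.modules[name] = previous


message = None
with module_hidden("falkordb_bin"):
    try:
        require_module("falkordb_bin")
    except BinaryNotFoundError as exc:
        message = str(exc)

print(message)
```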
File Changes Summary
falkordb-py repo
| File | Change |
|---|---|
| pyproject.toml | Add [project.optional-dependencies] lite = ["falkordb-bin>=1.0,<2.0"] |
| falkordb/falkordb.py | Add embedded=, db_path=, embedded_config=, max_connections= params, close(), context manager |
| falkordb/asyncio/falkordb.py | Add same embedded params, async close(), __aenter__/__aexit__ |
| falkordb/lite/__init__.py | New: subpackage init |
| falkordb/lite/server.py | New: EmbeddedServer class |
| falkordb/lite/config.py | New: Redis config generation |
| falkordb/lite/binaries.py | New: Binary resolution from falkordb-bin |
| tests/test_embedded.py | New: Embedded mode tests |
| README.md | Add embedded usage examples |
falkordb-bin repo (NEW)
| File | Description |
|---|---|
| falkordb_bin/__init__.py | get_redis_server(), get_falkordb_module() API |
| falkordb_bin/bin/redis-server | Precompiled binary (per-platform) |
| falkordb_bin/bin/falkordb.so | Precompiled binary (per-platform) |
| pyproject.toml | Package metadata, platform wheel config |
| .github/workflows/build.yml | CI: compile binaries, build wheels, publish to PyPI |
falkordblite repo (existing, minimal changes)
| File | Change |
|---|---|
| redislite/__init__.py | Add DeprecationWarning pointing to falkordb[lite] |
| README.md | Add deprecation notice and migration guide |
Resolved Decisions
| # | Question | Decision |
|---|---|---|
| 1 | Windows support | Required, but done as a follow-up task. Will use redis-windows for Redis and falkordb-rs-next-gen for FalkorDB on Windows. |
| 2 | Binary versioning | Pinned to major version range (e.g. >=1.0.0,<2.0.0). Both packages bump major together on breaking protocol changes. |
| 3 | Connection pooling | Yes, always. Embedded mode creates a ConnectionPool with max_connections=16 over Unix socket to support parallel queries. |
| 4 | API style | Constructor parameter (embedded=True). All standard kwargs (socket_timeout, encoding, etc.) are accepted alongside embedded-specific ones. No separate factory method. |
| 5 | Build pipeline | Hybrid. Download pre-built FalkorDB binaries from GitHub releases. Build Redis from source (no official binaries available). |
| 6 | Package name | falkordb-bin. Clear, concise, communicates "just the binaries". |
| 7 | max_connections | Configurable via kwarg, default 16. Constructor accepts max_connections= passed through to the pool. |
| 8 | Version alignment | falkordb-bin mirrors falkordb versioning. e.g. falkordb 1.2.0 and falkordb-bin 1.2.0 are released together and known-compatible. Simplifies the compatibility story — users don't need a matrix. |