Commit 17e9110

feat: implement InMemoryCatalog as a subclass of SqlCatalog (#1140)
closes: #1110
This PR implements a new catalog, `InMemoryCatalog`, as a subclass of `SqlCatalog` backed by an in-memory SQLite database.
1 parent d9b1c03 commit 17e9110

5 files changed, +157 -275 lines changed


mkdocs/docs/configuration.md

Lines changed: 17 additions & 0 deletions
@@ -388,6 +388,23 @@ catalog:
 | echo | true | false | SQLAlchemy engine [echo param](https://docs.sqlalchemy.org/en/20/core/engines.html#sqlalchemy.create_engine.params.echo) to log all statements to the default log handler |
 | pool_pre_ping | true | false | SQLAlchemy engine [pool_pre_ping param](https://docs.sqlalchemy.org/en/20/core/engines.html#sqlalchemy.create_engine.params.pool_pre_ping) to test connections for liveness upon each checkout |
 
+### In Memory Catalog
+
+The in-memory catalog is built on top of `SqlCatalog` and uses an in-memory SQLite database as its backend.
+
+It is useful for testing, demos, and playgrounds, but not for production, as it does not support concurrent access.
+
+```yaml
+catalog:
+  default:
+    type: in-memory
+    warehouse: /tmp/pyiceberg/warehouse
+```
+
+| Key | Example | Default | Description |
+| --------- |--------------------------|-------------------------------|----------------------------------------------------------------------|
+| warehouse | /tmp/pyiceberg/warehouse | file:///tmp/iceberg/warehouse | The directory where the in-memory catalog will store its data files. |
+
 ### Hive Catalog
 
 ```yaml
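
One plausible reason for the concurrency caveat in the documentation above is how SQLite treats `:memory:` databases: each connection gets its own private database, so state created through one connection is invisible to another. The sketch below illustrates this with the standard-library `sqlite3` module only. It does not use PyIceberg or SQLAlchemy, and the table name `demo_tables` is hypothetical.

```python
import sqlite3


def table_names(conn: sqlite3.Connection) -> list:
    """Return the names of all tables visible through this connection."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


# Two independent connections to ":memory:" (each one is a separate database).
conn_a = sqlite3.connect(":memory:")
conn_b = sqlite3.connect(":memory:")

# Create a table through the first connection only.
# "demo_tables" is a made-up name, not PyIceberg's real schema.
conn_a.execute("CREATE TABLE demo_tables (namespace TEXT, name TEXT)")
conn_a.execute("INSERT INTO demo_tables VALUES ('default', 'events')")
conn_a.commit()

tables_seen_by_a = table_names(conn_a)
tables_seen_by_b = table_names(conn_b)

print(tables_seen_by_a)  # ['demo_tables']
print(tables_seen_by_b)  # []

conn_a.close()
conn_b.close()
```

The second connection never sees the table, and everything is lost once the connections close, which is consistent with the recommendation to keep this catalog to tests and demos.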

pyiceberg/catalog/__init__.py

Lines changed: 11 additions & 0 deletions
@@ -116,6 +116,7 @@ class CatalogType(Enum):
     GLUE = "glue"
     DYNAMODB = "dynamodb"
     SQL = "sql"
+    IN_MEMORY = "in-memory"
 
 
 def load_rest(name: str, conf: Properties) -> Catalog:
@@ -162,12 +163,22 @@ def load_sql(name: str, conf: Properties) -> Catalog:
         ) from exc
 
 
+def load_in_memory(name: str, conf: Properties) -> Catalog:
+    try:
+        from pyiceberg.catalog.memory import InMemoryCatalog
+
+        return InMemoryCatalog(name, **conf)
+    except ImportError as exc:
+        raise NotInstalledError("SQLAlchemy support not installed: pip install 'pyiceberg[sql-sqlite]'") from exc
+
+
 AVAILABLE_CATALOGS: dict[CatalogType, Callable[[str, Properties], Catalog]] = {
     CatalogType.REST: load_rest,
     CatalogType.HIVE: load_hive,
     CatalogType.GLUE: load_glue,
     CatalogType.DYNAMODB: load_dynamodb,
     CatalogType.SQL: load_sql,
+    CatalogType.IN_MEMORY: load_in_memory,
 }
 
 
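
The change above follows a registry pattern: an enum value maps to a loader function, and the loader imports its backend lazily so that a missing optional dependency surfaces as a clear error. The following is a minimal, standard-library-only sketch of that pattern. All `Demo*` and `load_demo_*` names, the `hypothetical_missing_backend` module, and the dispatch function are assumptions made for illustration and are not PyIceberg's actual API.

```python
import importlib
from enum import Enum
from typing import Callable, Dict

Properties = Dict[str, str]


class NotInstalledError(Exception):
    """Stand-in for an error raised when an optional dependency is missing."""


class DemoCatalogType(Enum):
    IN_MEMORY = "in-memory"
    MISSING = "missing"


def _load_with_optional_dependency(module_name: str, name: str, conf: Properties) -> dict:
    # Import lazily so the dependency is only required when this loader is used.
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        raise NotInstalledError(f"{module_name} support not installed") from exc
    # A plain dict stands in for a real catalog object.
    return {"name": name, "module": module_name, **conf}


def load_demo_in_memory(name: str, conf: Properties) -> dict:
    return _load_with_optional_dependency("sqlite3", name, conf)


def load_demo_missing(name: str, conf: Properties) -> dict:
    # This module name is deliberately nonexistent to show the failure path.
    return _load_with_optional_dependency("hypothetical_missing_backend", name, conf)


DEMO_CATALOGS: Dict[DemoCatalogType, Callable[[str, Properties], dict]] = {
    DemoCatalogType.IN_MEMORY: load_demo_in_memory,
    DemoCatalogType.MISSING: load_demo_missing,
}


def load_demo_catalog(name: str, conf: Properties) -> dict:
    """Resolve the 'type' property to an enum member and call its loader."""
    catalog_type = DemoCatalogType(conf["type"])
    return DEMO_CATALOGS[catalog_type](name, conf)


catalog = load_demo_catalog(
    "default", {"type": "in-memory", "warehouse": "/tmp/pyiceberg/warehouse"}
)
print(catalog["module"])  # sqlite3

try:
    load_demo_catalog("default", {"type": "missing"})
except NotInstalledError as err:
    print(f"loader failed cleanly: {err}")
```

Keeping the import inside the loader means that registering a new catalog type costs nothing for users who never select it.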

pyiceberg/catalog/memory.py

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from pyiceberg.catalog.sql import SqlCatalog
+
+
+class InMemoryCatalog(SqlCatalog):
+    """
+    An in-memory catalog implementation that uses SqlCatalog with an in-memory SQLite database.
+
+    This is useful for testing, demos, and playgrounds, but not for production, as it does not support concurrent access.
+    """
+
+    def __init__(self, name: str, warehouse: str = "file:///tmp/iceberg/warehouse", **kwargs: str) -> None:
+        self._warehouse_location = warehouse
+        if "uri" not in kwargs:
+            kwargs["uri"] = "sqlite:///:memory:"
+        super().__init__(name=name, warehouse=warehouse, **kwargs)
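
The constructor above only fills in defaults before delegating to the parent class: `warehouse` has a default value, and `uri` is set to the in-memory SQLite URL only when the caller did not supply one. Here is a minimal sketch of that behaviour, using a hypothetical `DemoSqlCatalog` stand-in that merely records its properties (it is not the real `SqlCatalog` and opens no database). The `sqlite:////tmp/demo.db` URI is an illustrative value.

```python
class DemoSqlCatalog:
    """Hypothetical stand-in for SqlCatalog: stores the properties it receives."""

    def __init__(self, name: str, **properties: str) -> None:
        self.name = name
        self.properties = properties


class DemoInMemoryCatalog(DemoSqlCatalog):
    """Mirrors the defaulting logic of the constructor shown in the diff."""

    def __init__(self, name: str, warehouse: str = "file:///tmp/iceberg/warehouse", **kwargs: str) -> None:
        # Only inject the in-memory URI when the caller did not pass one.
        if "uri" not in kwargs:
            kwargs["uri"] = "sqlite:///:memory:"
        super().__init__(name=name, warehouse=warehouse, **kwargs)


# No arguments beyond the name: both defaults apply.
default_catalog = DemoInMemoryCatalog("default")
print(default_catalog.properties)

# Explicit values win over the defaults.
custom_catalog = DemoInMemoryCatalog(
    "default",
    warehouse="/tmp/pyiceberg/warehouse",
    uri="sqlite:////tmp/demo.db",
)
print(custom_catalog.properties)
```

One consequence of this design is that a caller who passes an explicit `uri` overrides the in-memory default, so the class name describes the default configuration rather than a hard guarantee.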
