105 changes: 101 additions & 4 deletions mkdocs/docs/api.md
@@ -1527,17 +1527,72 @@ def cleanup_old_snapshots(table_name: str, snapshot_ids: list[int]):
cleanup_old_snapshots("analytics.user_events", [12345, 67890, 11111])
```

## Views
## Create a view

PyIceberg supports view operations.
If the REST server does not advertise support for the view endpoints, you can enable them on the client by setting the catalog property `"view-endpoints-supported": "true"`:

```python
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
"docs",
**{
"uri": "http://127.0.0.1:8181",
"s3.endpoint": "http://127.0.0.1:9000",
"py-io-impl": "pyiceberg.io.pyarrow.PyArrowFileIO",
"s3.access-key-id": "admin",
"s3.secret-access-key": "password",
"view-endpoints-supported": "true",
}
)
```
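
The example above passes the flag as the string `"true"`, not the Python boolean `True`. A minimal sketch of assembling the properties in one place follows; `rest_catalog_properties` is a hypothetical helper (not part of PyIceberg), and the URLs are the ones from the example above.

```python
def rest_catalog_properties(uri: str, enable_views: bool = True, **extra: str) -> dict:
    """Build the keyword properties for load_catalog.

    Hypothetical helper for illustration; it is not a PyIceberg API.
    """
    props = {"uri": uri, **extra}
    if enable_views:
        # Same key and string value as in the example above.
        props["view-endpoints-supported"] = "true"
    return props


props = rest_catalog_properties(
    "http://127.0.0.1:8181",
    **{"s3.endpoint": "http://127.0.0.1:9000"},
)
print(props)
# The result can then be unpacked into the call: load_catalog("docs", **props)
```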

To create a view from the catalog:

```python
import time
from pyiceberg.catalog import load_catalog
from pyiceberg.schema import Schema
from pyiceberg.types import IntegerType, NestedField
from pyiceberg.view import SQLViewRepresentation, ViewVersion

catalog = load_catalog("default")
catalog.view_exists("default.bar")

schema = Schema(NestedField(field_id=1, name="some_col", field_type=IntegerType(), required=False))
view_version = ViewVersion(
version_id=1,
schema_id=1,
timestamp_ms=int(time.time() * 1000),
summary={"spark-version": "4.1"},
representations=[
SQLViewRepresentation(
type="sql",
sql="SELECT 1 as some_col",
dialect="spark",
)
],
default_namespace=["default"],
)

catalog.create_view(
identifier="default.some_view",
schema=schema,
view_version=view_version,
)
```

`catalog.create_view` also accepts a PyArrow schema, so the following is equivalent:

```python
import pyarrow as pa

schema = pa.schema([pa.field("some_col", pa.int32())])

catalog.create_view(
identifier="default.some_view",
schema=schema,
view_version=view_version,
)
```
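
Both snippets above target `default.some_view`, so a script that runs them back to back calls `create_view` twice for the same identifier. A minimal sketch of guarding the call with `view_exists` follows. `create_view_if_absent` is a hypothetical helper, and `catalog` is assumed to be any object exposing the `view_exists` and `create_view` methods used in this section.

```python
from typing import Any


def create_view_if_absent(catalog: Any, identifier: str, schema: Any, view_version: Any) -> bool:
    """Create the view only when the catalog does not already report it.

    Hypothetical helper, not a PyIceberg API.
    Returns True if a view was created, False if it already existed.
    """
    if catalog.view_exists(identifier):
        return False
    # Not atomic: another writer could create the view between the two calls.
    catalog.create_view(identifier=identifier, schema=schema, view_version=view_version)
    return True
```

With the objects defined earlier, `create_view_if_absent(catalog, "default.some_view", schema, view_version)` could replace the direct `catalog.create_view(...)` call.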

## Register a view
@@ -1551,6 +1606,48 @@ catalog.register_view(
)
```

## Load a view

To load the `some_view` view:

```python
view = catalog.load_view("default.some_view")
# Equivalent to:
view = catalog.load_view(("default", "some_view"))
# The tuple syntax can be used if the namespace or view contains a dot.
```

This returns a `View` that represents an Iceberg view. You can access the SQL representation for a specific dialect:

```python
sql_representation = view.sql_for("spark")
print(sql_representation.sql)
```
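
The string and tuple identifier forms accepted by `load_view` differ only when a name itself contains a dot. The sketch below uses plain string splitting to illustrate why. It assumes a dotted string identifier is split on every dot, and `to_tuple` is a hypothetical helper, not a PyIceberg function.

```python
from typing import Tuple, Union


def to_tuple(identifier: Union[str, Tuple[str, ...]]) -> Tuple[str, ...]:
    """Illustrative only: treat every dot in a string identifier as a separator."""
    if isinstance(identifier, tuple):
        return identifier
    return tuple(identifier.split("."))


print(to_tuple("default.some_view"))          # ('default', 'some_view')
print(to_tuple("default.daily.report"))       # ('default', 'daily', 'report')
print(to_tuple(("default", "daily.report")))  # ('default', 'daily.report')
```

Under that assumption, only the tuple form keeps a view named `daily.report` intact inside the `default` namespace.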

## Check if a view exists

To check whether the `some_view` view exists:

```python
catalog.view_exists("default.some_view")
```

## List views

To list views in the `default` namespace:

```python
catalog.list_views("default")
```
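
A short sketch of displaying the result follows, assuming `list_views` returns a list of identifier tuples such as `("default", "some_view")`. `format_identifiers` is a hypothetical helper, and the `views` list stands in for a real catalog response.

```python
from typing import List, Tuple


def format_identifiers(identifiers: List[Tuple[str, ...]]) -> List[str]:
    """Join each identifier tuple into a dotted name, sorted for stable output."""
    return sorted(".".join(parts) for parts in identifiers)


# Stand-in for the value returned by catalog.list_views("default").
views = [("default", "some_view"), ("default", "another_view")]
for name in format_identifiers(views):
    print(name)
```

The dotted form is meant for display only; a name that contains a dot cannot be recovered unambiguously from the joined string.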

## Drop a view

To drop a view:

```python
catalog.drop_view("default.some_view")
```
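
Cleanup scripts often need to tolerate a view that is already gone. A minimal sketch combining `view_exists` and `drop_view` follows. `drop_view_if_exists` is a hypothetical helper, and `catalog` is assumed to be any object exposing those two methods.

```python
from typing import Any


def drop_view_if_exists(catalog: Any, identifier: str) -> bool:
    """Drop the view when the catalog reports it.

    Hypothetical helper, not a PyIceberg API.
    Returns True if a drop was issued, False if the view was not found.
    """
    if not catalog.view_exists(identifier):
        return False
    # Not atomic: the view could be dropped elsewhere between the two calls.
    catalog.drop_view(identifier)
    return True
```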

## Table Statistics Management

Manage table statistics through the `Table` API: