This repository was archived by the owner on Oct 10, 2025. It is now read-only.
Merged
35 commits
56d451c
Delta Lake docs (#313)
acquamarin Dec 19, 2024
7d075a7
Add Iceberg Extension Documentation (#314)
SterlingT3485 Dec 19, 2024
7a0dff1
Fix file extension
prrao87 Dec 19, 2024
2ea4ce4
Minor fixes
prrao87 Dec 20, 2024
227f91b
Create wasm.mdx
mewim Jan 13, 2025
9f4d9d0
remove progress_bar_time from docs (#337)
WWW0030 Jan 20, 2025
604954a
Fix ignore errors in DataFrame section (#338)
prrao87 Jan 21, 2025
51ef56a
Add doc for `show_indexes`, `show_official_extensions` (#339)
acquamarin Jan 21, 2025
ca49c2f
FTS index (#332)
acquamarin Jan 22, 2025
789dd4c
Document the behaviour of import/export database with indexes (#340)
acquamarin Jan 23, 2025
b347ae5
Add doc for file-format option (#342) (#343)
prrao87 Jan 24, 2025
9db6bb9
Fix typos and improve formatting
prrao87 Jan 24, 2025
123c28d
Add doc for yield clause (#347)
acquamarin Jan 25, 2025
6847d34
skip/limit doc (#341)
acquamarin Jan 29, 2025
bf980bc
Add documentation on special behaviour for query result getNext() (#351)
royi-luo Jan 31, 2025
3d331c0
Update rdbms.mdx (#352)
acquamarin Feb 3, 2025
448d55d
Add doc for duckdb/sqlite/postgres's type conversion (#348)
acquamarin Feb 3, 2025
683f370
Rename output of fts (#354)
acquamarin Feb 3, 2025
d652db1
Add doc for rel_table_group (#349)
acquamarin Feb 3, 2025
ac992d6
Fix formatting
prrao87 Feb 3, 2025
7b98ad1
Update src/content/docs/migrate/index.md
acquamarin Feb 3, 2025
f23e0ba
Fix export-db with index doc
acquamarin Feb 3, 2025
b402f4f
Merge pull request #355 from kuzudb/fix-export-db-index
ray6080 Feb 4, 2025
9c55602
Add reference to the bm25 match algo (#357)
acquamarin Feb 4, 2025
075f018
edits
semihsalihoglu-uw Feb 5, 2025
8dccb54
Update wasm.mdx
mewim Feb 5, 2025
8913542
Update wasm.mdx
mewim Feb 5, 2025
5f5eae2
Update full-text-search.md (#358)
acquamarin Feb 5, 2025
f2ff38c
Fts minor fix (#362)
acquamarin Feb 5, 2025
b61f8d3
Update index.mdx
mewim Feb 5, 2025
dd44f26
Update installation.mdx
mewim Feb 5, 2025
425d3cd
Update installation.mdx
mewim Feb 5, 2025
cf3dc11
Update installation.mdx
mewim Feb 5, 2025
000a7c5
Improve load/scan docs
prrao87 Feb 5, 2025
aa0b315
Update config
prrao87 Feb 5, 2025
7 changes: 5 additions & 2 deletions astro.config.mjs
Original file line number Diff line number Diff line change
@@ -65,6 +65,7 @@ export default defineConfig({
{ label: 'Create your first graph', link: '/get-started' },
{ label: 'Query & visualize your graph', link: '/get-started/cypher-intro' },
{ label: 'Run prepared Cypher statements', link: '/get-started/prepared-statements' },
{ label: 'Scan data from various sources', link: '/get-started/scan'},
{ label: 'Run graph algorithms', link: '/get-started/graph-algorithms' },
]
},
@@ -148,6 +149,7 @@ export default defineConfig({
{ label: 'Go', link: '/client-apis/go' },
{ label: 'C++', link: '/client-apis/cpp' },
{ label: 'C', link: '/client-apis/c' },
{ label: 'WebAssembly', link: '/client-apis/wasm' },
{ label: '.NET', link: '/client-apis/net', badge: { text: 'Community', variant: 'caution'}},
{ label: 'Elixir', link: '/client-apis/elixir', badge: { text: 'Community', variant: 'caution'}}
],
@@ -197,8 +199,9 @@ export default defineConfig({
]
},
{ label: 'JSON', link: '/extensions/json' },
{ label: 'Iceberg', link: '/extensions/iceberg', badge: { text: 'New' }},
{ label: 'Delta Lake', link: '/extensions/delta', badge: { text: 'New' }},
{ label: 'Iceberg', link: '/extensions/iceberg' },
{ label: 'Delta Lake', link: '/extensions/delta' },
{ label: 'Full-text search', link: '/extensions/full-text-search', badge: { text: 'New' }},
],
autogenerate: { directory: 'reference' },
},
60 changes: 59 additions & 1 deletion src/content/docs/client-apis/c.mdx
@@ -73,4 +73,62 @@ And then link against `<install-dest>/libkuzu.so` (or `libkuzu.dylib`/`libkuzu.l


The static library is more complicated (as noted above, it's recommended that you use CMake to handle the details) and is not installed by default, but all static libraries will be available in the build directory.
You need to define `KUZU_STATIC_DEFINE`, and link against the static kuzu library in `build/src`, as well as `antlr4_cypher`, `antlr4_runtime`, `brotlidec`, `brotlicommon`, `utf8proc`, `re2`, `serd`, `fastpfor`, `miniparquet`, `zstd`, `miniz`, `mbedtls`, `lz4` (all of which can be found in the third_party subdirectory of the CMake build directory. E.g. `build/third_party/zstd/libzstd.a`) and whichever standard library you're using.
You need to define `KUZU_STATIC_DEFINE`, and link against the static Kùzu library in `build/src`, as well as `antlr4_cypher`, `antlr4_runtime`, `brotlidec`, `brotlicommon`, `utf8proc`, `re2`, `serd`, `fastpfor`, `miniparquet`, `zstd`, `miniz`, `mbedtls`, `lz4` (all of which can be found in the `third_party` subdirectory of the CMake build directory, e.g. `build/third_party/zstd/libzstd.a`), and whichever standard library you're using.

## Handling Kùzu output using `kuzu_query_result_get_next()`

For the examples in this section we will be using the following schema:
```cypher
CREATE NODE TABLE person(id INT64 PRIMARY KEY);
```

The `kuzu_query_result_get_next()` function returns a reference to the resulting flat tuple. To reduce resource allocation, all calls to `kuzu_query_result_get_next()` reuse the same
flat tuple object, so for a given query result each call to `kuzu_query_result_get_next()` overwrites the flat tuple returned by the previous call.

Thus, we recommend processing each tuple immediately, before making the next call to `kuzu_query_result_get_next()`:

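```c
kuzu_query_result result;
kuzu_connection_query(conn, "MATCH (p:person) RETURN p.*", &result);
while (kuzu_query_result_has_next(&result)) {
    kuzu_flat_tuple tuple;
    kuzu_query_result_get_next(&result, &tuple);
    // Use the tuple here, before the next call overwrites it
    do_something(tuple);
    kuzu_flat_tuple_destroy(&tuple);
}
kuzu_query_result_destroy(&result);
```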

If you wish to process the tuples later, you must explicitly make a copy of each tuple:
```c
#include <stdlib.h> // malloc, free

// Copies every value of the tuple into a newly allocated array owned by the caller
static kuzu_value* copy_flat_tuple(kuzu_flat_tuple* tuple, uint32_t tupleLen) {
    kuzu_value* ret = malloc(sizeof(kuzu_value) * tupleLen);
    for (uint32_t i = 0; i < tupleLen; i++) {
        kuzu_flat_tuple_get_value(tuple, i, &ret[i]);
    }
    return ret;
}

void mainFunction() {
    kuzu_query_result result;
    kuzu_connection_query(conn, "MATCH (p:person) RETURN p.*", &result);

    uint64_t num_tuples = kuzu_query_result_get_num_tuples(&result);
    kuzu_value** tuples = (kuzu_value**)malloc(sizeof(kuzu_value*) * num_tuples);
    for (uint64_t i = 0; i < num_tuples; ++i) {
        kuzu_flat_tuple tuple;
        kuzu_query_result_get_next(&result, &tuple);
        tuples[i] = copy_flat_tuple(&tuple, kuzu_query_result_get_num_columns(&result));
        kuzu_flat_tuple_destroy(&tuple);
    }

    for (uint64_t i = 0; i < num_tuples; ++i) {
        for (uint64_t j = 0; j < kuzu_query_result_get_num_columns(&result); ++j) {
            doSomething(tuples[i][j]);
            kuzu_value_destroy(&tuples[i][j]);
        }
        free(tuples[i]);
    }

    free((void*)tuples);
    kuzu_query_result_destroy(&result);
}
```
67 changes: 64 additions & 3 deletions src/content/docs/client-apis/cpp.mdx
@@ -11,10 +11,71 @@ See the following link for the full documentation of the C++ API.
href="https://kuzudb.com/api-docs/cpp/annotated.html"
/>

## Handling Kùzu output using `getNext()`

For the examples in this section we will be using the following schema:
```cypher
CREATE NODE TABLE person(id INT64 PRIMARY KEY);
```

The `getNext()` function in a `QueryResult` returns a reference to the resulting `FlatTuple`. To reduce resource allocation, all calls to `getNext()` reuse the same
`FlatTuple` object, so for a given `QueryResult` each call to `getNext()` overwrites the `FlatTuple` returned by the previous call.

Thus, we don't recommend using `QueryResult` like this:

```cpp
std::unique_ptr<kuzu::main::QueryResult> result = conn->query("MATCH (p:person) RETURN p.*");
std::vector<std::shared_ptr<kuzu::processor::FlatTuple>> tuples;
while (result->hasNext()) {
    // Each call to getNext() actually returns a pointer to the same tuple object
    tuples.emplace_back(result->getNext());
}

// This is wrong!
// The vector stores a bunch of pointers to the same underlying tuple object
for (const auto& resultTuple : tuples) {
    doSomething(resultTuple);
}
```

Instead, we recommend processing each tuple immediately, before making the next call to `getNext()`:
```cpp
std::unique_ptr<kuzu::main::QueryResult> result = conn->query("MATCH (p:person) RETURN p.*");
while (result->hasNext()) {
    auto tuple = result->getNext();
    doSomething(tuple);
}
```

If you wish to process the tuples later, you must explicitly make a copy of each tuple:
```cpp
// Copies every value of the tuple into a vector that owns its contents
static decltype(auto) copyFlatTuple(kuzu::processor::FlatTuple* tuple) {
    std::vector<std::unique_ptr<kuzu::common::Value>> ret;
    for (uint32_t i = 0; i < tuple->len(); i++) {
        ret.emplace_back(tuple->getValue(i)->copy());
    }
    return ret;
}

void mainFunction() {
    std::unique_ptr<kuzu::main::QueryResult> result = conn->query("MATCH (p:person) RETURN p.*");
    std::vector<std::vector<std::unique_ptr<kuzu::common::Value>>> tuples;
    while (result->hasNext()) {
        auto tuple = result->getNext();
        tuples.emplace_back(copyFlatTuple(tuple.get()));
    }
    for (const auto& tuple : tuples) {
        doSomething(tuple);
    }
}
```
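
The difference between holding on to the returned pointers and copying values out can be reproduced without a database. The following is a minimal sketch, not Kùzu code: `MockTuple`, `MockResult`, `collectByPointer`, and `collectByCopy` are hypothetical names introduced only for illustration, and `MockResult` merely assumes the reuse behaviour described above (one tuple object overwritten on every `getNext()` call).

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a result row (not part of the Kùzu API).
struct MockTuple {
    int64_t id = 0;
};

// Hypothetical stand-in for a query result. It mimics the documented
// behaviour: getNext() overwrites and returns the same tuple object.
class MockResult {
public:
    explicit MockResult(std::vector<int64_t> ids)
        : ids_(std::move(ids)), tuple_(std::make_shared<MockTuple>()) {}

    bool hasNext() const { return pos_ < ids_.size(); }

    // Writes the next row into the shared tuple and returns a pointer to it.
    std::shared_ptr<MockTuple> getNext() {
        tuple_->id = ids_[pos_++];
        return tuple_;
    }

private:
    std::vector<int64_t> ids_;
    std::size_t pos_ = 0;
    std::shared_ptr<MockTuple> tuple_;
};

// Pitfall: stores the pointers, so every element aliases the last row.
std::vector<int64_t> collectByPointer(MockResult result) {
    std::vector<std::shared_ptr<MockTuple>> tuples;
    while (result.hasNext()) {
        tuples.push_back(result.getNext());
    }
    std::vector<int64_t> seen;
    for (const auto& t : tuples) {
        seen.push_back(t->id);
    }
    return seen;
}

// Fix: copies the value out before the next getNext() call.
std::vector<int64_t> collectByCopy(MockResult result) {
    std::vector<int64_t> seen;
    while (result.hasNext()) {
        seen.push_back(result.getNext()->id);
    }
    return seen;
}
```

With the rows `{1, 2, 3}`, `collectByPointer` yields `3, 3, 3` because all stored pointers refer to the same object, while `collectByCopy` yields `1, 2, 3`.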

## UDF API

In addition to interfacing with the database, the C++ API offers users the ability to define custom
functions via User Defined Functions (UDFs), described below.

## UDF API
Kùzu provides two interfaces that enable you to define your own custom scalar and vectorized functions.

### Scalar functions
@@ -211,7 +272,7 @@ conn->createVectorizedFunction<int64_t, int64_t>("addFour", &addFour);
conn->query("MATCH (p:person) return addFour(p.age)");
```

#### Option 2. Vectorized function with input and return type in Cypher
#### Option 2. Vectorized function with input and return type in Cypher

Create a vectorized function with input and return type in Cypher.
```cpp
Expand Down Expand Up @@ -263,4 +324,4 @@ conn->query("MATCH (p:person) return addDate(p.birthdate, p.age)");

## Linking

See the [C API Documentation](/client-apis/c#linking) for details as linking to the C++ API is more or less identical.
See the [C API Documentation](/client-apis/c#linking) for details as linking to the C++ API is more or less identical.
60 changes: 60 additions & 0 deletions src/content/docs/client-apis/java.mdx
@@ -10,3 +10,63 @@ See the following link for the full documentation of the Java API.
title="Java API documentation"
href="https://kuzudb.com/api-docs/java"
/>

## Handling Kùzu output using `getNext()`

For the examples in this section we will be using the following schema:
```cypher
CREATE NODE TABLE person(id INT64 PRIMARY KEY);
```

The `getNext()` function in a `QueryResult` returns a reference to the resulting `FlatTuple`. To reduce resource allocation, all calls to `getNext()` reuse the same
`FlatTuple` object, so for a given `QueryResult` each call to `getNext()` overwrites the `FlatTuple` returned by the previous call.

Thus, we don't recommend using `QueryResult` like this:

```java
QueryResult result = conn.query("MATCH (p:person) RETURN p.*");
List<FlatTuple> tuples = new ArrayList<FlatTuple>();
while (result.hasNext()) {
    // Each call to getNext() actually returns a reference to the same tuple object
    tuples.add(result.getNext());
}

// This is wrong!
// The list stores a bunch of references to the same underlying tuple object
for (FlatTuple resultTuple : tuples) {
    doSomething(resultTuple);
}
```

Instead, we recommend processing each tuple immediately, before making the next call to `getNext()`:
```java
QueryResult result = conn.query("MATCH (p:person) RETURN p.*");
while (result.hasNext()) {
    FlatTuple tuple = result.getNext();
    doSomething(tuple);
}
```

If you wish to process the tuples later, you must explicitly make a copy of each tuple:
```java
// Clones every value of the tuple into a list that is independent of the reused tuple
List<Value> copyFlatTuple(FlatTuple tuple, long tupleLen) throws ObjectRefDestroyedException {
    List<Value> ret = new ArrayList<Value>();
    for (int i = 0; i < tupleLen; i++) {
        ret.add(tuple.getValue(i).clone());
    }
    return ret;
}

void mainFunction() throws ObjectRefDestroyedException {
    QueryResult result = conn.query("MATCH (p:person) RETURN p.*");
    List<List<Value>> tuples = new ArrayList<List<Value>>();
    while (result.hasNext()) {
        FlatTuple tuple = result.getNext();
        tuples.add(copyFlatTuple(tuple, result.getNumColumns()));
    }

    for (List<Value> tuple : tuples) {
        doSomething(tuple);
    }
}
```
84 changes: 84 additions & 0 deletions src/content/docs/client-apis/wasm.mdx
@@ -0,0 +1,84 @@
---
title: WebAssembly (Wasm)
---

[WebAssembly](https://webassembly.org/), a.k.a. _Wasm_, is a standard that defines a low-level binary instruction format
serving as a portable compilation target for programming languages, enabling deployment of software within web browsers on a variety
of devices. This page describes Kùzu's Wasm API, which enables Kùzu databases to run inside Wasm-capable
browsers.

## Benefits of Wasm

Kùzu-Wasm offers the following benefits:

- Fast, in-browser graph analysis without ever sending data to a server.
- Strong data privacy guarantees, as the data never leaves the browser.
- Real-time interactive in-browser graph analytics and visualization.

## Installation

```bash
npm i kuzu-wasm
```

## Example usage

We provide a simple example to demonstrate how to use Kùzu-Wasm. In this example, we will create a simple graph and run a few simple queries.

We provide three versions of this example:
- `browser_in_memory`: This example demonstrates how to use Kùzu-Wasm in a web browser with an in-memory filesystem.
- `browser_persistent`: This example demonstrates how to use Kùzu-Wasm in a web browser with a persistent IDBFS filesystem.
- `nodejs`: This example demonstrates how to use Kùzu-Wasm in Node.js.

The example can be found in [the examples directory](https://github.com/kuzudb/kuzu/tree/master/tools/wasm/examples).

## Understanding the package

In this package, three different variants of WebAssembly modules are provided:
- **Default**: This is the default build of the WebAssembly module. It does not support multi-threading and uses Emscripten's default filesystem. This build has the smallest size and works in both Node.js and browser environments. It has the best compatibility and does not require cross-origin isolation. However, the performance may be limited due to the lack of multithreading support. This build is located at the root level of the package.
- **Multi-threaded**: This build supports multi-threading and uses Emscripten's default filesystem. It is larger than the default build and requires [cross-origin isolation](https://web.dev/articles/cross-origin-isolation-guide) when running in a browser. This build is located in the `multithreaded` directory.
- **Node.js**: This build is optimized for Node.js and uses Node.js's filesystem instead of Emscripten's default filesystem (`NODEFS` flag is enabled). This build also supports multi-threading. It is distributed as a CommonJS module rather than an ES module to maximize compatibility. This build is located in the `nodejs` directory. Note that this build only works in Node.js and does not work in the browser environment.

In each variant, there are two different versions of the WebAssembly module:
- **Async**: This version of the module is the default version and each function call returns a Promise. This version dispatches all the function calls to the WebAssembly module to a Web Worker or Node.js worker thread to prevent blocking the main thread. However, this version may have a slight overhead due to the serialization and deserialization of the data required by the worker threads. This version is located at the root level of each variant (e.g., `kuzu-wasm`, `kuzu-wasm/multithreaded`, `kuzu-wasm/nodejs`).
- **Sync**: This version of the module is synchronous and does not require any callbacks (other than the module initialization). This version is good for scripting / CLI / prototyping purposes but is not recommended to be used in GUI applications or web servers because it may block the main thread and cause unexpected freezes. This alternative version is located in the `sync` directory of each variant (e.g., `kuzu-wasm/sync`, `kuzu-wasm/multithreaded/sync`, `kuzu-wasm/nodejs/sync`).

Note that you cannot mix and match the variants and versions. For example, a `Database` object created with the default variant cannot be passed to a function in the multithreaded variant. Similarly, a `Database` object created with the async version cannot be passed to a function in the sync version.

### Loading the Worker script (for async versions)
In each variant, the main module is bundled as one script file. However, the worker script is located in a separate file. The worker script is required to run the WebAssembly module in a Web Worker or Node.js worker thread. If you are using a build tool like Webpack, the worker script needs to be copied to the output directory. For example, in Webpack, you can use the `copy-webpack-plugin` to copy the worker script to the output directory.

By default, the worker script is resolved under the same directory / URL prefix as the main module. If you want to change the location of the worker script, you can pass the worker path to the `setWorkerPath` function. For example:
```javascript
import kuzu from "kuzu-wasm";
kuzu.setWorkerPath('path/to/worker.js');
```

Note that this function must be called before any other function calls to the WebAssembly module. Once initialization has started, the worker script path cannot be changed, and an error is raised if the worker script cannot be found.

For the Node.js variant, the worker script can be resolved automatically and you do not need to set the worker path.

## API documentation
The API documentation can be found here:

**Synchronous** version: [API documentation](https://kuzudb.com/api-docs/wasm/sync/)

**Asynchronous** version: [API documentation](https://kuzudb.com/api-docs/wasm/async/)

## Local development

This section is relevant if you are interested in contributing to Kùzu's Wasm API.

First, build the WebAssembly module:

```bash
npm run build
```

This will build the WebAssembly module in the `release` directory and create a tarball ready for publishing under the current directory.

You can run the tests as follows:

```bash
npm test
```
1 change: 0 additions & 1 deletion src/content/docs/cypher/configuration.md
@@ -17,7 +17,6 @@ configuration **cannot** be used with other query clauses, such as `RETURN`.
| `HOME_DIRECTORY`| system home directory | user home directory |
| `FILE_SEARCH_PATH`| file search path | N/A |
| `PROGRESS_BAR` | enable progress bar in CLI | false |
| `PROGRESS_BAR_TIME` | show progress bar after time in ms | 1000 |
| `CHECKPOINT_THRESHOLD` | the WAL size threshold in bytes at which to automatically trigger a checkpoint | 16777216 (16MB) |
| `WARNING_LIMIT` | maximum number of [warnings](/import#warnings-table-inspect-skipped-rows) that can be stored in a single connection. | 8192 |
| `SPILL_TO_DISK` | spill data to disk if there is not enough memory when running `COPY FROM` (cannot be set to TRUE under in-memory or read-only mode) | true |