Commit 59201cb

Add a simple usage demo (#6)
1 parent e49d610 commit 59201cb

4 files changed: 29 additions & 27 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -94,3 +94,4 @@ standalone-build/
 _rust.h
 uv.lock
 tests/temp_models/
+*.cast

Makefile

Lines changed: 4 additions & 4 deletions
@@ -107,7 +107,7 @@ debug: rust-build-debug ## Build the extension in debug mode (DuckDB + extension
 # Development Targets
 # ==============================================================================
 .PHONY: install-deps
-install-deps: ## Set up development environment
+install-deps: ## Set up development environment (for Debian-based systems)
 	@echo "Setting up development environment..."
 	@sudo apt-get install -y cmake clang-format snap python3-pip
 	@sudo snap install rustup --classic
@@ -127,16 +127,16 @@ setup-hooks: ## Install Git hooks (pre-commit and pre-push)
 	@pre-commit install-hooks

 .PHONY: test-hooks
-test-hooks: ## Test Git hooks on all files
-	@echo "Testing Git hooks..."
+test-hooks: ## Run Git hooks on all files manually
+	@echo "Running Git hooks..."
 	@pre-commit run --all-files

 .PHONY: clean-all
 clean-all: clean rust-clean ## Clean everything
 	@echo "All clean!"

 .PHONY: check
-check: rust-lint rust-test ## Run all checks
+check: rust-lint rust-test ## Run all checks (linting and tests)
 	@echo "All checks passed!"

 .PHONY: examples
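
The targets touched in this Makefile all carry a trailing `## ...` description. A minimal Python sketch of how such annotations could be collected into a help listing follows. It assumes the `##` comments are meant as self-documenting help text (the Makefile's own help target, if any, is not part of this diff), and `list_targets` and `SAMPLE` are hypothetical names used only for illustration.

```python
import re

# Matches lines such as "check: rust-lint rust-test ## Run all checks".
# Group 1 is the target name, group 2 is the description after "##".
TARGET_RE = re.compile(r"^([A-Za-z0-9_.-]+):[^#\n]*##\s*(.+)$")


def list_targets(makefile_text: str) -> dict[str, str]:
    """Return {target: description} for every target documented with '##'."""
    targets = {}
    for line in makefile_text.splitlines():
        match = TARGET_RE.match(line)
        if match:
            targets[match.group(1)] = match.group(2).strip()
    return targets


# Hypothetical excerpt built from the post-commit lines shown in the diff.
SAMPLE = (
    ".PHONY: test-hooks\n"
    "test-hooks: ## Run Git hooks on all files manually\n"
    "\t@pre-commit run --all-files\n"
    "\n"
    ".PHONY: check\n"
    "check: rust-lint rust-test ## Run all checks (linting and tests)\n"
    "\t@echo \"All checks passed!\"\n"
)

if __name__ == "__main__":
    for name, description in list_targets(SAMPLE).items():
        print(f"{name:<12} {description}")
```

Lines without a `##` comment, such as `.PHONY` declarations and recipe lines, are skipped, so only documented targets appear in the listing.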

README.md

Lines changed: 13 additions & 11 deletions
@@ -18,29 +18,29 @@ In-Database Machine Learning for DuckDB

 ---

-Infera is DuckDB extension that allows you use machine learning (ML) models directly in SQL queries to perform inference
-on data stored in DuckDB tables.
+Infera is a DuckDB extension that allows you to use machine learning (ML) models directly in SQL queries to perform
+inference on data stored in DuckDB tables.
 It is developed in Rust and uses [Tract](https://github.com/snipsco/tract) as the backend inference engine.
 Infera supports loading and running models in [ONNX](https://onnx.ai/) format.
-Check out the [ONNX Model Zoo](https://huggingface.co/onnxmodelzoo) repositors on Hugging Face for a large
+Check out the [ONNX Model Zoo](https://huggingface.co/onnxmodelzoo) repository on Hugging Face for a large
 collection of ready-to-use models that can be used with Infera.

 ### Motivation

-In a conventional data science workflow, when data is stored in a database, it is not normally possible to use ML models
-directly on the data.
-Users need to move the data out of the database first (for example, export it to a CSV file), load the data into a
+In a conventional data science workflow, when data is stored in a database, it is not typically possible to use ML
+models directly on the data.
+Users need to move the data out of the database first (for example, export it to a CSV file) and load the data into a
 Python or R environment, run the model there, and then import the results back into the database.
 This process is time-consuming and inefficient.
-Infera aims to solve this problem by letting users to run ML models directly in SQL queries inside the database.
+Infera aims to solve this problem by letting users run ML models directly in SQL queries inside the database.
 It simplifies the workflow and speeds up the process for users, and eliminates the need for moving data around.

 ### Features

-- Adds ML inference as first-class citizens in SQL queries.
+- Adds ML inference as a first-class citizen in SQL queries.
 - Supports loading and using local as well as remote models.
 - Supports using ML models in ONNX format with a simple and flexible API.
-- Supports performing inference on table columns or raw BLOB (tensor) data.
+- Supports performing inference on table columns or raw tensor data.
 - Supports both single-value and multi-value model outputs.
 - Supports autoloading all models from a specified directory.
 - Thread-safe, fast, and memory-efficient.
@@ -55,7 +55,7 @@ See the [ROADMAP.md](ROADMAP.md) for the list of implemented and planned feature

 ### Quickstart

-1. Clone the repository and build Infera extension from source:
+1. Clone the repository and build the Infera extension from source:

 ```bash
 git clone --recursive https://github.com/CogitatorTech/infera.git
@@ -94,9 +94,11 @@ select infera_unload_model('linear_model');
 select infera_get_version();
 ````

+[![Simple Demo 1](https://asciinema.org/a/745806.svg)](https://asciinema.org/a/745806)
+
 > [!NOTE]
 > After building from source, the Infera binary will be `build/release/extension/infera/infera.duckdb_extension`.
-> You can load it using the `load 'build/release/extension/infera/infera.duckdb_extension';` in DuckDB shell.
+> You can load it using the `load 'build/release/extension/infera/infera.duckdb_extension';` in the DuckDB shell.
 > Note that the extension binary will only work with the DuckDB version that it was built against.
 > At the moment, Infera is not available as
 > a [DuckDB community extension](https://duckdb.org/community_extensions/list_of_extensions).

docs/README.md

Lines changed: 11 additions & 12 deletions
@@ -1,6 +1,6 @@
 ### API Reference

-Table below includes the information about all SQL functions exposed by the Infera.
+The table below includes the information about all SQL functions exposed by Infera.

 | # | Function | Return Type | Description |
 |---|:--------------------------------------------------------|:-----------------|:--------------------------------------------------------------------------------------------------------------------------------------------|
@@ -53,7 +53,7 @@ select id,
 from features_table;

 -- Get multiple outputs as a JSON array.
--- This is useful models that return multiple outputs per prediction (like a non-binary classifier)
+-- This is useful for models that return multiple outputs per prediction (like a non-binary classifier)
 select infera_predict_multi('multi_output_model', 1.0, 2.0);
 -- Output: [0.85, 0.12, 0.03]

@@ -64,7 +64,7 @@ from my_table;
 ```

 > [!IMPORTANT]
-> When you use a model model for inference, in essence it will be executed on your machine.
+> When you use a model, in essence, it will be executed on your machine.
 > So make sure you download and use models from trusted sources only.

 #### Utility Functions
@@ -114,13 +114,13 @@ You also need to have Rust (nightly) and Cargo installed via `rustup`.

 2. **Install dependencies:**

-The project includes a `Makefile` target to help set up the development environment. For Debian-based systems, you
-can run:
+The project includes a [`Makefile`](../Makefile) target to help set up the development environment. For Debian-based
+systems, you can run:
 ```bash
 make install-deps
 ```
 This will install necessary system packages, Rust tools, and Python dependencies. For other operating systems, please
-refer to the `Makefile` to see the list of dependencies and install them manually.
+check the `Makefile` to see the list of dependencies and install them manually.

 3. **Build the extension:**

@@ -140,11 +140,10 @@ You also need to have Rust (nightly) and Cargo installed via `rustup`.
 without needing to run the `LOAD` command.

 > [!NOTE]
-> After a successful build, you can run the following binaries:
-> - `./build/release/duckdb`: this is the newest stable version of duckdb with Infera statically linked to it.
-> - `./build/release/test/unittest`: this is the test runner of duckdb (for `.test` files).
-> - `./build/release/extension/infera/infera.duckdb_extension`: this is the loadable binary that is a `.so`,
-`.dylib`, or `.dll` file based on your platform.
+> After a successful build, you will find the following files in the `build/release/` directory:
+> - `./build/release/duckdb`: this is a DuckDB binary with the Infera extension already statically linked to it.
+> - `./build/release/test/unittest`: this is the test runner for running the SQL tests in the `test/sql/` directory.
+> - `./build/release/extension/infera/infera.duckdb_extension`: this is the loadable extension file for Infera.

 ---

@@ -163,5 +162,5 @@ Infera is made up of two main components:
 responsibilities include:
 * Defining the custom SQL functions (like `infera_load_model` and `infera_predict`).
 * Translating data from DuckDB's internal vector-based format into the raw data pointers expected by the Rust FFI.
-* Calling the Rust functions and handling the returned results.
+* Calling the Rust functions and handling the returned results and errors.
 * Integrating with DuckDB's extension loading mechanism.
0 commit comments

Comments
 (0)