Commit fc0bdb6

docs: further update docs for new CLI
1 parent f952f7e commit fc0bdb6

11 files changed

+173
-148
lines changed

docs/docs/core/basics.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -101,4 +101,4 @@ As an indexing flow is long-lived, it needs to store intermediate data to keep t
101101
CocoIndex uses internal storage for this purpose.
102102

103103
Currently, CocoIndex uses Postgres database as the internal storage.
104-
See [Initialization](initialization) for configuring its location, and `cocoindex setup` CLI command (see [CocoIndex CLI](cli)) creates tables for the internal storage.
104+
See [Settings](settings#databaseconnectionspec) for how to configure its location. The `cocoindex setup` CLI command (see [CocoIndex CLI](cli)) creates the tables for the internal storage.

docs/docs/core/cli.mdx

Lines changed: 23 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -10,11 +10,11 @@ import TabItem from '@theme/TabItem';
1010

1111
CocoIndex CLI is a standalone tool for easily managing and inspecting your flows and indexes.
1212

13-
## Invoking the CLI
13+
## Invoke the CLI
1414

1515
Once CocoIndex is installed, you can invoke the CLI directly using the `cocoindex` command. Most commands require an `APP_TARGET` argument, which tells the CLI where your flow definitions are located.
1616

17-
**APP_TARGET Format:**
17+
### APP_TARGET Format
1818

1919
The `APP_TARGET` can be:
2020
1. A **path to a Python file** defining your flows (e.g., `main.py`, `path/to/my_flows.py`).
@@ -23,14 +23,34 @@ The `APP_TARGET` can be:
2323
* `path/to/my_flows.py:MyFlow`
2424
* `my_package.flows:MyFlow`
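The target forms listed above can be told apart with a small parser. The following is a minimal sketch, not the CLI's actual implementation: `AppTarget` and `parse_app_target` are hypothetical names, and the rule "a location ending in `.py` is a file, anything else is a module name" is an assumption drawn from the examples (`my_package.flows` is read as a module name).

```python
from typing import NamedTuple, Optional


class AppTarget(NamedTuple):
    """Hypothetical parsed form of an APP_TARGET string (not a CocoIndex type)."""
    location: str             # file path or module name
    is_file: bool             # True when `location` looks like a Python file path
    flow_name: Optional[str]  # set only when a `:FlowName` suffix is present


def parse_app_target(target: str) -> AppTarget:
    """Split `location[:FlowName]`; treat locations ending in `.py` as files."""
    location, sep, flow_name = target.rpartition(":")
    if not sep:
        # No colon at all: the whole string is the location.
        location, flow_name = target, ""
    return AppTarget(location, location.endswith(".py"), flow_name or None)


print(parse_app_target("main.py"))
print(parse_app_target("path/to/my_flows.py:MyFlow"))
print(parse_app_target("my_package.flows:MyFlow"))
```

This sketch splits on the last colon, so it would misread Windows paths with a drive letter such as `C:\flows\main.py`; a real parser would need to handle that case.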
2525

26-
**Global Options:**
26+
### Environment Variables
27+
28+
The CocoIndex library reads its settings from environment variables, as described in [CocoIndex Settings](settings#list-of-environment-variables).
29+
30+
You can set environment variables in an environment file.
31+
32+
* By default, the `cocoindex` CLI searches upward from the current directory for a `.env` file.
33+
* You can use `--env-file <path>` to specify one explicitly:
34+
35+
```sh
36+
cocoindex --env-file path/to/custom.env <COMMAND> ...
37+
```
38+
39+
Loaded variables do *NOT* override existing system ones.
40+
If no file is found, only existing system environment variables are used.
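The lookup rules above (search upward for `.env`, never override variables that are already set) can be sketched with the standard library alone. This is an illustration of the described behavior, not the CLI's code: `find_env_file` and `load_env_file` are hypothetical helpers, and the parser only handles plain `KEY=VALUE` lines (no quoting or variable expansion).

```python
import os
from pathlib import Path
from typing import Optional


def find_env_file(start: Path) -> Optional[Path]:
    """Return the nearest `.env` in `start` or any of its parent directories."""
    for directory in [start, *start.parents]:
        candidate = directory / ".env"
        if candidate.is_file():
            return candidate
    return None


def load_env_file(path: Path) -> None:
    """Load simple KEY=VALUE lines without overriding variables already set."""
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # setdefault keeps any value that already exists in the environment.
        os.environ.setdefault(key.strip(), value.strip())


env_file = find_env_file(Path.cwd())
if env_file is not None:
    load_env_file(env_file)
```

If no `.env` file is found, `find_env_file` returns `None` and the environment is left untouched, which matches the fallback described above.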
41+
42+
### Global Options
43+
44+
CocoIndex CLI supports the following global options:
2745

2846
* `--env-file <path>`: Load environment variables from a specified `.env` file. If not provided, `.env` in the current directory is loaded if it exists.
2947
* `--version`: Show the CocoIndex version and exit.
3048
* `--help`: Show the main help message and exit.
3149

3250
:::caution Deprecated Usage
51+
3352
The old method of invoking the CLI using `python main.py cocoindex ...` via the `@cocoindex.main_fn()` decorator is now deprecated. Please remove `@cocoindex.main_fn()` from your scripts and use the standalone `cocoindex` command as described.
53+
3454
:::
3555

3656
## Subcommands

docs/docs/core/flow_def.mdx

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -313,7 +313,7 @@ Following metrics are supported:
313313

314314
### Getting App Namespace
315315

316-
You can use the [`app_namespace` setting](initialization#app-namespace) or `COCOINDEX_APP_NAMESPACE` environment variable to specify the app namespace,
316+
You can use the [`app_namespace` setting](settings#app-namespace) or `COCOINDEX_APP_NAMESPACE` environment variable to specify the app namespace,
317317
to organize flows across different environments (e.g., dev, staging, production), team members, etc.
318318

319319
In the code, you can call `flow.get_app_namespace()` to get the app namespace, and use it to name certain backends. It takes the following arguments:

docs/docs/core/flow_methods.mdx

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,13 +1,13 @@
11
---
2-
title: Flow Running
2+
title: Run a Flow
33
toc_max_heading_level: 4
44
description: Run a CocoIndex Flow, including build / update data in the target storage and evaluate the flow without changing the target storage.
55
---
66

77
import Tabs from '@theme/Tabs';
88
import TabItem from '@theme/TabItem';
99

10-
# Running a CocoIndex Flow
10+
# Run a CocoIndex Flow
1111

1212
After a flow is defined as discussed in [Flow Definition](/docs/core/flow_def), you can start to transform data with it.
1313

docs/docs/core/initialization.mdx

Lines changed: 0 additions & 138 deletions
This file was deleted.

docs/docs/core/settings.mdx

Lines changed: 116 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,116 @@
1+
---
2+
title: CocoIndex Settings
3+
description: Provide settings for CocoIndex, e.g. database connection, app namespace, etc.
4+
---
5+
6+
import Tabs from '@theme/Tabs';
7+
import TabItem from '@theme/TabItem';
8+
9+
# CocoIndex Settings
10+
11+
Certain settings need to be provided for CocoIndex to work, e.g. database connections, app namespace, etc.
12+
13+
## Launch CocoIndex
14+
15+
You have two ways to launch CocoIndex:
16+
17+
* Use the [CocoIndex CLI](cli). It's handy for most routine index building and management tasks.
18+
It loads settings from environment variables, either already set in your environment or specified in a `.env` file.
19+
See [CLI](cli#environment-variables) for more details.
20+
21+
* Call CocoIndex functionality from your own Python application or library.
22+
This is needed when you want to use CocoIndex's query support, or have custom logic to trigger indexing, etc.
23+
24+
<Tabs>
25+
<TabItem value="python" label="Python" default>
26+
27+
You need to explicitly call `cocoindex.init()` before doing anything with CocoIndex, and settings will be loaded during the call.
28+
29+
* If it's called without any argument, it will load settings from environment variables.
30+
Only existing environment variables already set in your environment will be used.
31+
If you want to load environment variables from a specific `.env` file, consider calling `load_dotenv()` provided by the [`python-dotenv`](https://github.com/theskumar/python-dotenv) package.
32+
33+
```py
34+
from dotenv import load_dotenv
35+
import cocoindex
36+
37+
load_dotenv()
38+
cocoindex.init()
39+
```
40+
41+
* It takes an optional `cocoindex.Settings` dataclass object as an argument, so you can also construct the settings explicitly and pass them in:
42+
43+
```py
44+
import cocoindex
45+
46+
cocoindex.init(
47+
    cocoindex.Settings(
48+
        database=cocoindex.DatabaseConnectionSpec(
49+
            url="postgres://cocoindex:cocoindex@localhost/cocoindex"
50+
        )
51+
    )
52+
)
53+
```
54+
</TabItem>
55+
</Tabs>
56+
57+
## List of Settings
58+
59+
`cocoindex.Settings` is a dataclass that contains the following fields:
60+
61+
* `app_namespace` (type: `str`, optional): The namespace of the application.
62+
* `database` (type: `DatabaseConnectionSpec`, required): The connection to the Postgres database.
63+
64+
### App Namespace
65+
66+
The `app_namespace` field helps organize flows across different environments (e.g., dev, staging, production), team members, etc. When set, it prefixes flow names with the namespace.
67+
68+
For example, if the namespace is `Staging`, for a flow with name specified as `Flow1` in code, the full name of the flow will be `Staging.Flow1`.
69+
You can also get the current app namespace by calling `cocoindex.get_app_namespace()` (see [Getting App Namespace](flow_def#getting-app-namespace) for more details).
70+
71+
If not set, all flows are in a default unnamed namespace.
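The prefixing rule can be written out as a one-line function. This is a minimal sketch of the naming described above, not CocoIndex's internal code: `full_flow_name` is a hypothetical helper, and treating the unnamed default namespace as "no prefix at all" is an assumption.

```python
def full_flow_name(flow_name: str, app_namespace: str = "") -> str:
    """Prefix the flow name with the namespace; an empty namespace adds no prefix."""
    return f"{app_namespace}.{flow_name}" if app_namespace else flow_name


print(full_flow_name("Flow1", "Staging"))  # Staging.Flow1
print(full_flow_name("Flow1"))             # Flow1
```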
72+
73+
*Environment variable*: `COCOINDEX_APP_NAMESPACE`
74+
75+
### DatabaseConnectionSpec
76+
77+
`DatabaseConnectionSpec` configures the connection to a database. Only Postgres is supported for now. It has the following fields:
78+
79+
* `url` (type: `str`, required): The URL of the Postgres database to use as the internal storage, e.g. `postgres://cocoindex:cocoindex@localhost/cocoindex`.
80+
81+
*Environment variable* for `Settings.database.url`: `COCOINDEX_DATABASE_URL`
82+
83+
* `user` (type: `str`, optional): The username for the Postgres database. If not provided, username will come from `url`.
84+
85+
*Environment variable* for `Settings.database.user`: `COCOINDEX_DATABASE_USER`
86+
87+
* `password` (type: `str`, optional): The password for the Postgres database. If not provided, password will come from `url`.
88+
89+
*Environment variable* for `Settings.database.password`: `COCOINDEX_DATABASE_PASSWORD`
90+
91+
:::tip
92+
93+
Note that all values in `url` need to be URL-encoded if they contain special characters.
94+
For this reason, prefer the separate `user` and `password` fields for the username and password.
95+
96+
:::
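If the credentials do have to be embedded in the URL, they can be percent-encoded with Python's standard library. The sketch below uses a made-up password for illustration; only `urllib.parse.quote` is a real API, and the host and database name are taken from the example URL above.

```python
from urllib.parse import quote

user = "cocoindex"
password = "p@ss:w/rd"  # hypothetical password with URL-special characters

# safe="" makes quote() escape every reserved character, including "/".
url = f"postgres://{quote(user, safe='')}:{quote(password, safe='')}@localhost/cocoindex"
print(url)  # postgres://cocoindex:p%40ss%3Aw%2Frd@localhost/cocoindex
```

Without the encoding, the `@`, `:` and `/` in the password would be parsed as URL delimiters, which is why the separate `user` and `password` fields are the simpler choice.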
97+
98+
:::info
99+
100+
If you use the Postgres database hosted by [Supabase](https://supabase.com/), please click **Connect** on your project dashboard and find the following URL:
101+
102+
* If you're on an IPv6 network, use the URL under **Direct connection**. You can visit [IPv6 test](https://test-ipv6.com/) to check whether you have an IPv6 Internet connection.
103+
* Otherwise, use the URL under **Session pooler**.
104+
105+
:::
106+
107+
## List of Environment Variables
108+
109+
This is the list of environment variables, each of which has a corresponding field in `Settings`:
110+
111+
| environment variable | corresponding field in `Settings` | required? |
112+
|---------------------|-------------------|----------|
113+
| `COCOINDEX_DATABASE_URL` | `database.url` | Yes |
114+
| `COCOINDEX_DATABASE_USER` | `database.user` | No |
115+
| `COCOINDEX_DATABASE_PASSWORD` | `database.password` | No |
116+
| `COCOINDEX_APP_NAMESPACE` | `app_namespace` | No |
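The mapping in the table can be illustrated with a stand-in dataclass. This is a sketch under stated assumptions, not the library's loader: `SketchSettings`, `SketchDatabaseSpec` and `settings_from_env` are hypothetical names that only mirror the field layout documented above, and raising `ValueError` for a missing required variable is an assumed behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class SketchDatabaseSpec:
    """Stand-in for the documented `DatabaseConnectionSpec` fields."""
    url: str
    user: Optional[str] = None
    password: Optional[str] = None


@dataclass
class SketchSettings:
    """Stand-in for the documented `Settings` fields."""
    database: SketchDatabaseSpec
    app_namespace: str = ""


def settings_from_env(env: Mapping[str, str] = os.environ) -> SketchSettings:
    """Build settings from the four variables listed in the table."""
    url = env.get("COCOINDEX_DATABASE_URL")
    if not url:
        # The table marks this variable as the only required one.
        raise ValueError("COCOINDEX_DATABASE_URL is required")
    return SketchSettings(
        database=SketchDatabaseSpec(
            url=url,
            user=env.get("COCOINDEX_DATABASE_USER"),
            password=env.get("COCOINDEX_DATABASE_PASSWORD"),
        ),
        app_namespace=env.get("COCOINDEX_APP_NAMESPACE", ""),
    )
```

Passing a plain dictionary instead of `os.environ` makes the mapping easy to try out without touching the real environment.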

docs/docs/ops/storages.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -37,7 +37,7 @@ It should be a unique table, meaning that no other export target should export t
3737
The spec takes the following fields:
3838

3939
* `database` (type: [auth reference](../core/flow_def#auth-registry) to `DatabaseConnectionSpec`, optional): The connection to the Postgres database.
40-
See [DatabaseConnectionSpec](../core/initialization#databaseconnectionspec) for its specific fields.
40+
See [DatabaseConnectionSpec](../core/settings#databaseconnectionspec) for its specific fields.
4141
If not provided, will use the same database as the [internal storage](/docs/core/basics#internal-storage).
4242

4343
* `table_name` (type: `str`, optional): The name of the table to store to. If unspecified, will use the table name `[${AppNamespace}__]${FlowName}__${TargetName}`, e.g. `DemoFlow__doc_embeddings` or `Staging__DemoFlow__doc_embeddings`.
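The default table name pattern `[${AppNamespace}__]${FlowName}__${TargetName}` can be expressed as a short function. This is a sketch of the documented pattern only: `default_table_name` is a hypothetical helper, not a CocoIndex function.

```python
def default_table_name(flow_name: str, target_name: str, app_namespace: str = "") -> str:
    """Join the parts with `__`, adding the namespace prefix only when it is set."""
    prefix = f"{app_namespace}__" if app_namespace else ""
    return f"{prefix}{flow_name}__{target_name}"


print(default_table_name("DemoFlow", "doc_embeddings"))             # DemoFlow__doc_embeddings
print(default_table_name("DemoFlow", "doc_embeddings", "Staging"))  # Staging__DemoFlow__doc_embeddings
```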
@@ -419,7 +419,7 @@ The `Neo4j` storage exports each row as a relationship to Neo4j Knowledge Graph.
419419
Neo4j also provides a declaration spec `Neo4jDeclaration`, to configure indexing options for nodes only referenced by relationships. It has the following fields:
420420

421421
* `connection` (type: auth reference to `Neo4jConnectionSpec`)
422-
* Fields for [nodes to declare](#nodes-to-declare), including
422+
* Fields for [nodes to declare](#declare-extra-node-labels), including
423423
* `nodes_label` (required)
424424
* `primary_key_fields` (required)
425425
* `vector_indexes` (optional)

docs/docusaurus.config.ts

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -49,6 +49,17 @@ const config: Config = {
4949
],
5050
}),
5151
}),
52+
[
53+
'@docusaurus/plugin-client-redirects',
54+
{
55+
redirects: [
56+
{
57+
from: '/core/initialization',
58+
to: '/core/settings',
59+
},
60+
],
61+
},
62+
],
5263
],
5364

5465
presets: [

docs/package.json

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@
1616
},
1717
"dependencies": {
1818
"@docusaurus/core": "3.7.0",
19+
"@docusaurus/plugin-client-redirects": "^3.7.0",
1920
"@docusaurus/preset-classic": "3.7.0",
2021
"@docusaurus/theme-mermaid": "^3.7.0",
2122
"@mdx-js/react": "^3.0.0",

docs/sidebars.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -19,8 +19,8 @@ const sidebars: SidebarsConfig = {
1919
items: [
2020
'core/basics',
2121
'core/data_types',
22-
'core/initialization',
2322
'core/flow_def',
23+
'core/settings',
2424
'core/flow_methods',
2525
'core/cli',
2626
'core/custom_function',
