Commit eae7222

feat: API documentation site using Github pages (#303)
Signed-off-by: Sreekanth <[email protected]>
1 parent 2ebe47f commit eae7222

44 files changed, +2439 -723 lines changed

packages/pynumaflow/Makefile

Lines changed: 39 additions & 0 deletions
@@ -34,3 +34,42 @@ proto:
 	poetry run python3 -m grpc_tools.protoc -Ipynumaflow/proto/sideinput=pynumaflow/proto/sideinput -Ipynumaflow/proto/common=pynumaflow/proto/common --pyi_out=. --python_out=. --grpc_python_out=. pynumaflow/proto/sideinput/*.proto
 	poetry run python3 -m grpc_tools.protoc -Ipynumaflow/proto/sourcer=pynumaflow/proto/sourcer -Ipynumaflow/proto/common=pynumaflow/proto/common --pyi_out=. --python_out=. --grpc_python_out=. pynumaflow/proto/sourcer/*.proto
 	poetry run python3 -m grpc_tools.protoc -Ipynumaflow/proto/accumulator=pynumaflow/proto/accumulator -Ipynumaflow/proto/common=pynumaflow/proto/common --pyi_out=. --python_out=. --grpc_python_out=. pynumaflow/proto/accumulator/*.proto
+
+
+# ============================================================================
+# Documentation targets
+# ============================================================================
+
+.PHONY: docs docs-serve docs-build docs-deploy-dev docs-deploy-version docs-list docs-set-default docs-delete
+
+docs: docs-serve ## Alias for docs-serve
+
+docs-serve: ## Serve documentation locally with hot-reload (http://localhost:8000)
+	poetry run mkdocs serve
+
+docs-build: ## Build documentation locally
+	poetry run mkdocs build
+
+docs-deploy-dev: ## Deploy dev docs to docs-site branch
+	poetry run mike deploy -b docs-site dev --push
+
+docs-deploy-version: ## Deploy versioned docs (usage: make docs-deploy-version VERSION=0.11)
+ifndef VERSION
+	$(error VERSION is required. Usage: make docs-deploy-version VERSION=0.11)
+endif
+	poetry run mike deploy -b docs-site $(VERSION) latest --update-aliases --push
+
+docs-list: ## List all deployed documentation versions
+	poetry run mike list -b docs-site
+
+docs-set-default: ## Set the default documentation version to 'latest'
+	poetry run mike set-default -b docs-site latest --push
+
+docs-delete: ## Delete a documentation version (usage: make docs-delete VERSION=0.10)
+ifndef VERSION
+	$(error VERSION is required. Usage: make docs-delete VERSION=0.10)
+endif
+	poetry run mike delete -b docs-site $(VERSION) --push
+
+docs-setup: ## Install documentation dependencies
+	poetry install --with docs --no-root

packages/pynumaflow/README.md

Lines changed: 9 additions & 9 deletions
@@ -23,7 +23,7 @@ To build the package locally, run the following command from the root of the pro

 ```bash
 make setup
-````
+```

 To run unit tests:
 ```bash
@@ -57,22 +57,21 @@ There are different types of gRPC server mechanisms which can be used to serve t
 These have different functionalities and are used for different use cases.

 Currently we support the following server types:
+
 - Sync Server
 - Asynchronous Server
 - MultiProcessing Server

 Not all of the above are supported for all UDFs, UDSource and UDSinks.

 For each of the UDFs, UDSource and UDSinks, there are separate classes for each of the server types.
-This helps in keeping the interface simple and easy to use, and the user can start the specific server type based
-on the use case.
+This helps in keeping the interface simple and easy to use, and the user can start the specific server type based on the use case.


 #### SyncServer

 Synchronous Server is the simplest server type. It is a multithreaded server which can be used for simple UDFs and UDSinks.
-Here the server will invoke the handler function for each message. The messaging is synchronous and the server will wait
-for the handler to return before processing the next message.
+Here the server will invoke the handler function for each message. The messaging is synchronous and the server will wait for the handler to return before processing the next message.

 ```
 grpc_server = MapServer(handler)
@@ -83,13 +82,13 @@ grpc_server = MapServer(handler)
 Asynchronous Server is a multi threaded server which can be used for UDFs which are asynchronous. Here we utilize the asynchronous capabilities of Python to process multiple messages in parallel. The server will invoke the handler function for each message. The messaging is asynchronous and the server will not wait for the handler to return before processing the next message. Thus this server type is useful for UDFs which are asynchronous.
 The handler function for such a server should be an async function.

-```
+```py
 grpc_server = MapAsyncServer(handler)
 ```

 #### MultiProcessServer

-MultiProcess Server is a multi process server which can be used for UDFs which are CPU intensive. Here we utilize the multi process capabilities of Python to process multiple messages in parallel by forking multiple servers in different processes.
+MultiProcess Server is a multi process server which can be used for UDFs which are CPU intensive. Here we utilize the multi process capabilities of Python to process multiple messages in parallel by forking multiple servers in different processes.
 The server will invoke the handler function for each message. Individually at the server level the messaging is synchronous and the server will wait for the handler to return before processing the next message. But since we have multiple servers running in parallel, the overall messaging also executes in parallel.

 This could be an alternative to creating multiple replicas of the same UDF container as here we are using the multi processing capabilities of the system to process multiple messages in parallel but within the same container.
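
The contrast between the synchronous and asynchronous models described in this README can be illustrated with the standard library alone. The sketch below is not pynumaflow code: `sync_handler`, `async_handler`, `serve_sync`, and `serve_async` are hypothetical names, and the sleeps merely stand in for real work.

```python
import asyncio
import time


def sync_handler(msg: str) -> str:
    """Blocking handler: the caller waits until it returns."""
    time.sleep(0.01)  # stand-in for blocking work
    return msg.upper()


async def async_handler(msg: str) -> str:
    """Coroutine handler: yields control to the event loop while waiting."""
    await asyncio.sleep(0.01)  # stand-in for non-blocking I/O
    return msg.upper()


def serve_sync(messages: list[str]) -> list[str]:
    # One message at a time: the next call starts only after the previous returns.
    return [sync_handler(m) for m in messages]


async def serve_async(messages: list[str]) -> list[str]:
    # All handlers are scheduled together; gather keeps results in input order.
    return await asyncio.gather(*(async_handler(m) for m in messages))


messages = ["a", "b", "c"]
sync_results = serve_sync(messages)
async_results = asyncio.run(serve_async(messages))
```

Both paths produce the same results; the difference is that the asynchronous version overlaps the waiting time of the three handlers instead of paying for each in turn.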
@@ -140,7 +139,8 @@ should follow the same signature.

 For using the class based handlers the user can inherit from the base handler class for each of the functionalities and implement the handler function.
 The base handler class for each of the functionalities has the same signature as the handler function for the respective server type.
-The list of base handler classes for each of the functionalities is given below -
+The list of base handler classes for each of the functionalities is given below:
+
 - UDFs
   - Map
     - Mapper
@@ -159,5 +159,5 @@ The list of base handler classes for each of the functionalities is given below
 - SideInput
   - SideInput

-More details about the signature of the handler function for each of the server types is given in the
+More details about the signature of the handler function for each of the server types is given in the
 documentation of the respective server type.
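
The README's point that class-based and function-based handlers share one signature can be sketched as follows. Everything here is an assumption made for illustration: `BaseHandler`, `Prefixer`, `plain_handler`, `run`, and the `(keys, value)` signature are hypothetical and are not taken from pynumaflow.

```python
from abc import ABC, abstractmethod


class BaseHandler(ABC):
    """Hypothetical base class; not part of pynumaflow."""

    def __call__(self, keys: list[str], value: bytes) -> bytes:
        # Instances become callable with the same signature as a plain function.
        return self.handler(keys, value)

    @abstractmethod
    def handler(self, keys: list[str], value: bytes) -> bytes:
        """Subclasses implement the per-message logic here."""


class Prefixer(BaseHandler):
    def __init__(self, prefix: bytes) -> None:
        self.prefix = prefix  # configuration carried by the instance

    def handler(self, keys: list[str], value: bytes) -> bytes:
        return self.prefix + value


def plain_handler(keys: list[str], value: bytes) -> bytes:
    return b"fn:" + value


def run(handler, keys: list[str], value: bytes) -> bytes:
    # A stand-in server only needs a callable; both styles look the same to it.
    return handler(keys, value)


class_result = run(Prefixer(b"cls:"), ["k"], b"x")
function_result = run(plain_handler, ["k"], b"x")
```

The class form is mainly useful when the handler needs configuration or state that a bare function would have to keep in globals.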
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+# Accumulator
+
+This module offers tools for accumulating and processing data while managing state. With it, you can:
+
+- Accumulate data over time
+- Maintain state across messages
+- Process accumulated data
+
+## Classes
+
+::: pynumaflow.accumulator
+    options:
+      show_root_heading: false
+      show_root_full_path: false
+      members_order: source
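
The idea of maintaining state across messages, as this page describes it, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the pynumaflow accumulator API: `RunningTotal` and its methods are hypothetical.

```python
from collections import defaultdict


class RunningTotal:
    """Hypothetical stateful processor; not the pynumaflow Accumulator API."""

    def __init__(self) -> None:
        # State that survives from one message to the next.
        self._totals: dict[str, int] = defaultdict(int)

    def add(self, key: str, amount: int) -> int:
        """Fold one message into the state and return the updated total."""
        self._totals[key] += amount
        return self._totals[key]

    def snapshot(self) -> dict[str, int]:
        """Return a copy of the accumulated state."""
        return dict(self._totals)


acc = RunningTotal()
running = [acc.add(key, amount) for key, amount in [("a", 1), ("b", 5), ("a", 2)]]
final = acc.snapshot()
```

Each call sees the effect of every earlier message for the same key, which is what distinguishes an accumulator from a stateless map.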
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+# Batch Mapper
+
+The Batch Mapper module offers tools for building BatchMap UDFs, allowing you to process multiple messages simultaneously. This enables more efficient handling of workloads such as bulk API requests or batch database operations by grouping messages and processing them together in a single operation.
+
+## Classes
+
+::: pynumaflow.batchmapper
+    options:
+      show_root_heading: false
+      show_root_full_path: false
+      members_order: source
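
Grouping messages so that one bulk operation serves several of them can be sketched as below. The names `batched` and `bulk_lookup` are hypothetical, the batch size of 2 is arbitrary, and none of this is the pynumaflow BatchMap API.

```python
from typing import Iterable, Iterator

batch_sizes: list[int] = []  # records how many messages each bulk call handled


def batched(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive groups of at most `size` items."""
    batch: list[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly shorter, group
        yield batch


def bulk_lookup(batch: list[str]) -> list[str]:
    """Stand-in for one bulk API or database call covering a whole batch."""
    batch_sizes.append(len(batch))
    return [item.upper() for item in batch]


results = [out for batch in batched(["a", "b", "c", "d", "e"], 2)
           for out in bulk_lookup(batch)]
```

Five messages are served by three bulk calls instead of five individual ones, which is the saving the page refers to.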
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+# API Reference
+
+This section provides detailed API documentation for all pynumaflow modules.
+
+## Modules
+
+| Module | Description |
+|--------|-------------|
+| [Sourcer](sourcer.md) | User Defined Source for custom data sources |
+| [Source Transformer](sourcetransformer.md) | Transform data at ingestion |
+| [Mapper](mapper.md) | Map UDF for transforming messages one at a time |
+| [Map Streamer](mapstreamer.md) | MapStream UDF for streaming results as they're produced |
+| [Batch Mapper](batchmapper.md) | BatchMap UDF for processing messages in batches |
+| [Sinker](sinker.md) | User Defined Sink for custom data destinations |
+| [Reducer](reducer.md) | Reduce UDF for aggregating messages by key and time window |
+| [Reduce Streamer](reducestreamer.md) | Stream reduce results incrementally |
+| [Accumulator](accumulator.md) | Accumulate and process data with state |
+| [Side Input](sideinput.md) | Inject external data into UDFs |
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# Mapper
+
+The Mapper module provides classes and functions for implementing Map UDFs that transform messages one at a time.
+Map is the most common UDF type. It receives one message at a time and can return:
+
+- One message (1:1 transformation)
+- Multiple messages (fan-out)
+- No messages (filter/drop)
+
+## Classes
+
+::: pynumaflow.mapper
+    options:
+      show_root_heading: false
+      show_root_full_path: false
+      members_order: source
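
The three return shapes listed on this page (one message, several, or none) can be shown with a single function. This is a sketch with a hypothetical `map_handler` operating on plain strings; it does not use pynumaflow's message types.

```python
def map_handler(value: str) -> list[str]:
    """Hypothetical map function returning zero, one, or many outputs."""
    if not value.strip():
        return []  # filter/drop: blank input produces no messages
    if "," in value:
        # fan-out: one input becomes several outputs
        return [part.strip().upper() for part in value.split(",")]
    return [value.upper()]  # 1:1 transformation


outputs = {value: map_handler(value) for value in ["hello", "a, b, c", "   "]}
```

Returning a list in every case keeps the caller's logic uniform: it simply forwards whatever the list contains.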
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+# Map Streamer
+
+The Map Streamer module provides classes and functions for implementing MapStream UDFs that stream results as they're produced.
+Unlike regular Map which returns all messages at once, Map Stream yields messages one at a time as they're ready, reducing latency for downstream consumers.
+
+## Classes
+
+::: pynumaflow.mapstreamer
+    options:
+      show_root_heading: false
+      show_root_full_path: false
+      members_order: source
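
The latency difference between returning everything at once and yielding results as they are ready can be made visible by logging the order of events. The sketch below uses a plain generator; `map_all`, `map_stream`, and the `events` log are hypothetical and unrelated to the pynumaflow classes.

```python
from typing import Iterator

events: list[str] = []  # shared log of what happened, in order


def map_all(line: str) -> list[str]:
    """Regular map: builds the full result before returning anything."""
    out = []
    for word in line.split():
        events.append(f"produce {word}")
        out.append(word.upper())
    return out


def map_stream(line: str) -> Iterator[str]:
    """Streaming map: hands over each result as soon as it exists."""
    for word in line.split():
        events.append(f"produce {word}")
        yield word.upper()


for result in map_stream("a b"):
    events.append(f"consume {result}")
stream_events = list(events)

events.clear()
for result in map_all("a b"):
    events.append(f"consume {result}")
batch_events = list(events)
```

In the streaming log, production and consumption interleave; in the regular log, the consumer sees nothing until all production has finished.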
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+# Reducer
+
+The Reducer module provides classes and functions for implementing Reduce UDFs that aggregate messages by key within time windows.
+It's used for operations like counting, summing, or computing statistics over groups of messages.
+
+## Classes
+
+::: pynumaflow.reducer
+    options:
+      show_root_heading: false
+      show_root_full_path: false
+      members_order: source
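
Aggregating by key within time windows can be sketched as a count per `(key, window)` pair. The fixed 60-second window, the integer timestamps, and the `reduce_counts` helper are assumptions made for this example; in Numaflow the windowing is configured on the pipeline rather than computed by the UDF.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # assumed fixed window length for this sketch


def reduce_counts(events: list[tuple[str, int]]) -> dict[tuple[str, int], int]:
    """Count events per (key, window start), given (key, timestamp) pairs."""
    counts: dict[tuple[str, int], int] = defaultdict(int)
    for key, timestamp in events:
        # Align the timestamp to the start of its window.
        window_start = timestamp - timestamp % WINDOW_SECONDS
        counts[(key, window_start)] += 1
    return dict(counts)


events = [("a", 5), ("a", 42), ("b", 59), ("a", 61)]
counts = reduce_counts(events)
```

The event for key `a` at second 61 lands in a different window from the two earlier `a` events, so it is counted separately.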
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+# Reduce Streamer
+
+The Reduce Streamer module provides classes and functions for implementing ReduceStream UDFs that emit results incrementally during reduction.
+Unlike regular Reduce which outputs only when the window closes, Reduce Stream emits results as they're computed. This is useful for early alerts or real-time dashboards.
+
+## Classes
+
+::: pynumaflow.reducestreamer
+    options:
+      show_root_heading: false
+      show_root_full_path: false
+      members_order: source
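
Emitting a result before the window closes, as in the early-alert use case mentioned on this page, can be sketched with a generator that yields once when a running sum crosses a threshold and again at the end. `reduce_stream`, the `"early"`/`"final"` labels, and the threshold are hypothetical and not part of pynumaflow.

```python
from typing import Iterable, Iterator


def reduce_stream(values: Iterable[int], threshold: int) -> Iterator[tuple[str, int]]:
    """Yield an early alert when the sum first reaches `threshold`, then the final sum."""
    total = 0
    alerted = False
    for value in values:
        total += value
        if not alerted and total >= threshold:
            alerted = True
            yield ("early", total)  # emitted mid-window, before all input is seen
    yield ("final", total)  # emitted once the input (the window) is exhausted


emitted = list(reduce_stream([3, 4, 5, 1], threshold=6))
```

A regular reduce would only produce the final value; the streaming form lets a consumer react as soon as the threshold is crossed.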
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+# Side Input
+
+Side Input allows you to inject external data into your UDFs. This is useful for configuration, lookup tables, or any data that UDFs need but isn't part of the main data stream.
+
+## Classes
+
+::: pynumaflow.sideinput
+    options:
+      show_root_heading: false
+      show_root_full_path: false
+      members_order: source
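
The lookup-table use case can be sketched as a handler that consults data refreshed outside the main stream. `SideInputStore` and `enrich` are hypothetical names chosen for this illustration; how pynumaflow actually delivers side-input updates is not shown here.

```python
class SideInputStore:
    """Hypothetical holder for externally supplied data; not pynumaflow API."""

    def __init__(self) -> None:
        self._table: dict[str, str] = {}

    def update(self, table: dict[str, str]) -> None:
        # Replace the whole table, as a periodic refresh might.
        self._table = dict(table)

    def get(self, key: str, default: str) -> str:
        return self._table.get(key, default)


store = SideInputStore()


def enrich(user_id: str) -> str:
    """Handler for the main stream; reads the side input but never writes it."""
    return f"{user_id}:{store.get(user_id, 'unknown')}"


store.update({"u1": "gold"})
before = [enrich("u1"), enrich("u2")]

store.update({"u1": "gold", "u2": "silver"})  # refresh arrives out of band
after = [enrich("u1"), enrich("u2")]
```

The handler's code does not change between the two runs; only the side data does, which is the separation this page describes.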
