Commit ab9fcbb

kadolor and jiayuasu authored
[DOCS] Fit and finish fixes (#110)
Co-authored-by: Jia Yu <[email protected]>
1 parent 5d52cd7 commit ab9fcbb

11 files changed: +463 -175 lines changed

README.md

Lines changed: 5 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -27,7 +27,11 @@ SedonaDB only runs on a single machine, so it’s perfect for processing smaller
2727

2828
## Install
2929

30-
You can install Python SedonaDB with `pip install apache-sedona[db]`.
30+
You can install Python SedonaDB from PyPI:
31+
32+
```sh
33+
pip install "apache-sedona[db]"
34+
```
3135

3236
## Overture buildings example
3337

docs/development.md renamed to docs/contributors-guide.md

Lines changed: 112 additions & 26 deletions
@@ -17,14 +17,66 @@
1717
under the License.
1818
-->
1919

20-
# Development
20+
# Contributors Guide
21+
22+
This guide details how to set up your development environment as a SedonaDB contributor.
23+
24+
## Fork and clone the repository
25+
26+
Your first step is to create a personal copy of the repository and connect it to the main project.
27+
28+
1. Fork the repository
29+
30+
* Navigate to the official [Apache SedonaDB GitHub repository](https://github.com/apache/sedona-db).
31+
* Click the **Fork** button in the top-right corner. This creates a complete copy of the project in your own GitHub account.
32+
33+
1. Clone your fork
34+
35+
* Next, clone your newly created fork to your local machine. This command downloads the repository into a new folder named `sedona-db`.
36+
* Replace `YourUsername` with your actual GitHub username.
37+
38+
```shell
39+
git clone https://github.com/YourUsername/sedona-db.git
40+
cd sedona-db
41+
```
42+
43+
1. Configure the remotes
44+
45+
* Your local repository needs to know where the original project is so you can pull in updates. You'll add a remote link, traditionally named **`upstream`**, to the main Apache SedonaDB repository.
46+
* Your fork is automatically configured as the **`origin`** remote.
47+
48+
```shell
49+
# Add the main repository as the "upstream" remote
50+
git remote add upstream https://github.com/apache/sedona-db.git
51+
```
52+
53+
1. Verify the configuration
54+
55+
* Run the following command to verify that you have two remotes configured correctly: `origin` (your fork) and `upstream` (the main repository).
56+
57+
```shell
58+
git remote -v
59+
```
60+
61+
* The output should look like this:
62+
63+
```shell
64+
origin https://github.com/YourUsername/sedona-db.git (fetch)
65+
origin https://github.com/YourUsername/sedona-db.git (push)
66+
upstream https://github.com/apache/sedona-db.git (fetch)
67+
upstream https://github.com/apache/sedona-db.git (push)
68+
```
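
With both remotes in place, a fork can be kept current with the main repository. The following is a minimal sketch, not part of the original guide: the helper name `sync_with_upstream` is hypothetical, and it assumes the default branch is called `main` (pass a different name if it is not).

```shell
# Hypothetical helper: fast-forward a local branch to match "upstream".
# Assumes the branch is named "main" unless another name is given.
sync_with_upstream() {
    branch="${1:-main}"
    # Download the latest commits from the main repository
    git fetch upstream &&
        # Switch to the local copy of the branch
        git checkout "$branch" &&
        # Move it forward only if no local commits would be lost
        git merge --ff-only "upstream/$branch"
}
```

Run `sync_with_upstream` from inside the `sedona-db` folder, then `git push origin main` to update the fork on GitHub. Using `--ff-only` makes the merge stop with an error, rather than create a merge commit, if the local branch has diverged.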
2169
2270
## Rust
2371
24-
SedonaDB is written and Rust and is a standard `cargo` workspace. You can
25-
install a recent version of the Rust compiler and cargo from
26-
[rustup.rs](https://rustup.rs/) and run tests using `cargo test`. A local
27-
development version of the CLI can be run with `cargo run --bin sedona-cli`.
72+
SedonaDB is written in Rust and is a standard `cargo` workspace.
73+
74+
You can install a recent version of the Rust compiler and cargo from
75+
[rustup.rs](https://rustup.rs/) and run tests using `cargo test`.
76+
77+
A local development version of the CLI can be run with `cargo run --bin sedona-cli`.
78+
79+
### Test data setup
2880
2981
Some tests require submodules that contain test data or pinned versions of
3082
external dependencies. These submodules can be initialized with:
@@ -40,16 +92,26 @@ Additionally, some of the data required in the tests can be downloaded by runnin
4092
python submodules/download-assets.py
4193
```
4294
95+
### System dependencies
96+
4397
Some crates wrap external native libraries and require system dependencies
44-
to build. At this time the only crate that requires this is the sedona-s2geography
45-
crate, which requires [CMake](https://cmake.org),
46-
[Abseil](https://github.com/abseil/abseil-cpp) and OpenSSL. These can be installed
47-
on MacOS with [Homebrew](https://brew.sh):
98+
to build.
99+
100+
!!!note "`sedona-s2geography`"
101+
At this time, the only crate that requires this is the `sedona-s2geography`
102+
crate, which requires [CMake](https://cmake.org),
103+
[Abseil](https://github.com/abseil/abseil-cpp) and OpenSSL.
104+
105+
#### macOS
106+
107+
These can be installed on macOS with [Homebrew](https://brew.sh):
48108
49109
```shell
50110
brew install abseil openssl cmake geos
51111
```
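
After installing, it can help to confirm that the command-line tools are visible on `PATH` before starting a build. This is a small sketch under stated assumptions: `require_tools` is a hypothetical helper, and it only detects executables, so libraries such as Abseil or OpenSSL are not covered by it.

```shell
# Hypothetical helper: report any of the named tools missing from PATH.
# Returns 0 when every tool is found and 1 otherwise.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `require_tools cargo cmake` prints one line per missing tool and returns a non-zero status if any is absent.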
52112
113+
#### Linux and Windows
114+
53115
On Linux and Windows, it is recommended to use [vcpkg](https://github.com/microsoft/vcpkg)
54116
to provide external dependencies. This can be done by setting the `CMAKE_TOOLCHAIN_FILE`
55117
environment variable:
@@ -58,7 +120,9 @@ environment variable:
58120
export CMAKE_TOOLCHAIN_FILE=/path/to/vcpkg/scripts/buildsystems/vcpkg.cmake
59121
```
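
A mistyped path in `CMAKE_TOOLCHAIN_FILE` may only surface as a confusing build failure later on. The sketch below, which assumes a POSIX-style shell and uses a hypothetical helper name, checks that the variable is set and points at an existing file.

```shell
# Hypothetical helper: verify CMAKE_TOOLCHAIN_FILE before building.
check_toolchain_file() {
    if [ -z "${CMAKE_TOOLCHAIN_FILE:-}" ]; then
        echo "CMAKE_TOOLCHAIN_FILE is not set" >&2
        return 1
    fi
    if [ ! -f "$CMAKE_TOOLCHAIN_FILE" ]; then
        echo "CMAKE_TOOLCHAIN_FILE does not exist: $CMAKE_TOOLCHAIN_FILE" >&2
        return 1
    fi
    echo "Using toolchain file: $CMAKE_TOOLCHAIN_FILE"
}
```

Calling `check_toolchain_file && cargo build` runs the build only when the check passes.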
60122
61-
When using VSCode, it may be necessary to set this environment variable in settings.json
123+
#### Visual Studio Code (VSCode) configuration
124+
125+
When using VSCode, it may be necessary to set this environment variable in `settings.json`
62126
such that it can be found by rust-analyzer when running build/run tasks:
63127
64128
```json
@@ -75,8 +139,9 @@ such that it can be found by rust-analyzer when running build/run tasks:
75139
## Python
76140
77141
Python bindings to SedonaDB are built with the [Maturin](https://www.maturin.rs) build
78-
backend. Installing a development version of the main Python bindings the first time
79-
can be done with:
142+
backend.
143+
144+
To install a development version of the main Python bindings for the first time, run the following commands:
80145
81146
```shell
82147
cd python/sedonadb
@@ -92,12 +157,16 @@ maturin develop
92157
93158
## Debugging
94159
160+
### Rust
161+
95162
Debugging Rust code is most easily done by writing or finding a test that triggers
96163
the desired behavior and running it using the *Debug* selection in
97164
[VSCode](https://code.visualstudio.com/) with the
98165
[rust-analyzer](https://marketplace.visualstudio.com/items?itemName=rust-lang.rust-analyzer)
99-
extension. Rust code can also debugged using the CLI by finding the `main()` function in
100-
sedona-cli and choosing the *Debug* run option.
166+
extension. Rust code can also be debugged using the CLI by finding the `main()` function in
167+
`sedona-cli` and choosing the *Debug* run option.
168+
169+
### Python, C, and C++
101170
102171
Installation of Python bindings with `maturin develop` ensures a debug-friendly build for
103172
debugging Rust, Python, or C/C++ code. Python code can be debugged using breakpoints in
@@ -114,7 +183,9 @@ In general, there is at least one benchmark for every implementation of a functi
114183
and a few other benchmarks for low-level iteration where work was done to optimize
115184
specific cases.
116185
117-
Briefly, benchmarks for a specific crate can be run with `cargo bench`:
186+
### Running benchmarks
187+
188+
Benchmarks for a specific crate can be run with `cargo bench`:
118189
119190
```shell
120191
cd rust/sedona-geo
@@ -129,17 +200,22 @@ to read for a specific crate).
129200
cargo bench -- st_area
130201
```
131202
203+
### Managing results
204+
132205
By default, criterion saves the last run and will report the difference between the
133206
current benchmark and the last time it was run (although there are options to
134-
save and load various baselines). A report containing the last run for any
135-
benchmark that was ever run can be opened with:
207+
save and load various baselines).
136208
137-
```shell
138-
# MacOS
139-
open target/criterion/report/index.html
140-
# Ubuntu
141-
xdg-open target/criterion/report/index.html
142-
```
209+
A report of the latest results for all benchmarks can be opened with the following command:
210+
211+
=== "macOS"
212+
```shell
213+
open target/criterion/report/index.html
214+
```
215+
=== "Ubuntu"
216+
```shell
217+
xdg-open target/criterion/report/index.html
218+
```
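
The baseline options mentioned above can be sketched with criterion's `--save-baseline` and `--baseline` flags. The wrapper names and the baseline name `before` are hypothetical, and the `CARGO` override exists only so the commands can be dry-run. Depending on how a crate's bench targets are configured, it may be necessary to select one with `cargo bench --bench <name>` so that the flags reach criterion.

```shell
# Hypothetical wrappers around criterion's baseline flags.
# Set CARGO=echo to print the commands instead of running them.

# Record the current results under a named baseline
save_baseline() {
    "${CARGO:-cargo}" bench -- --save-baseline "$1"
}

# Re-run the benchmarks and compare against a saved baseline
compare_to_baseline() {
    "${CARGO:-cargo}" bench -- --baseline "$1"
}
```

A typical sequence would be `save_baseline before`, then making a change, then `compare_to_baseline before` to see the difference against the saved run.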
143219
144220
All previous saved benchmark runs can be cleared with:
145221
@@ -149,6 +225,16 @@ rm -rf target/criterion
149225
150226
## Documentation
151227
152-
* `mkdocs serve` - Start the live-reloading docs server.
153-
* `mkdocs build` - Build the documentation site.
154-
* `mkdocs -h` - Print help message and exit.
228+
To contribute to the SedonaDB documentation:
229+
230+
1. Fork the repository and clone your fork.
231+
1. Install the documentation dependencies:
232+
```sh
233+
pip install -r docs/requirements.txt
234+
```
235+
1. Make your changes to the documentation files.
236+
1. Preview your changes locally using these commands:
237+
* `mkdocs serve` - Start the live-reloading docs server.
238+
* `mkdocs build` - Build the documentation site.
239+
* `mkdocs -h` - Print help message and exit.
240+
1. Push your changes and open a pull request.

docs/index.md

Lines changed: 23 additions & 24 deletions
@@ -1,6 +1,5 @@
11
---
22
hide:
3-
- navigation
43

54
title: Introducing SedonaDB
65
---
@@ -24,30 +23,45 @@ title: Introducing SedonaDB
2423
under the License.
2524
-->
2625

27-
SedonaDB is a high-performance, dependency-free geospatial compute engine designed for single-node processing, making it ideal for smaller datasets on local machines or cloud instances.
26+
SedonaDB is a single-node analytical database engine that treats geospatial data as a first-class citizen.
27+
28+
Fast and dependency-free, SedonaDB is ideal for working with smaller datasets located on local machines or cloud instances.
2829

2930
The initial `0.1` release supports a core set of vector operations, with comprehensive vector and raster computation capabilities planned for the near future.
3031

32+
For distributed workloads, you can still leverage the power of SedonaSpark, SedonaFlink, or SedonaSnow.
33+
3134
## Key features
3235

3336
SedonaDB has several advantages:
3437

3538
* **Exceptional Performance:** Built in Rust to process massive geospatial datasets with exceptional speed.
3639
* **Unified Geospatial Toolkit:** Access a comprehensive suite of functions for both vector and raster data in a single, powerful library.
37-
* **Seamless Ecosystem Integration:** Built on Apache Arrow for smooth interoperability with popular data science libraries like GeoPandas, DuckDB, and Polars.
40+
* **Extensive Ecosystem Integration:** Built on Apache Arrow for smooth interoperability with popular data science libraries like GeoPandas, DuckDB, and Polars.
3841
* **Flexible APIs:** Effortlessly switch between Python and SQL interfaces to match your preferred workflow and skill set.
3942
* **Guaranteed CRS Propagation:** Automatically manages coordinate reference systems (CRS) to ensure spatial accuracy and prevent common errors.
4043
* **Broad File Format Support:** Work with a wide range of both modern and legacy geospatial file formats like GeoParquet.
4144
* **Highly Extensible:** Easily customize and extend the library's functionality to meet your project's unique requirements.
4245

43-
## Run a query in SQL, Python, or Rust
46+
## Install SedonaDB
47+
48+
Here's how to install SedonaDB for Python or R:
49+
50+
=== "pip"
51+
52+
```bash
53+
pip install "apache-sedona[db]"
54+
```
55+
56+
=== "R"
4457

45-
SedonaDB offers a flexible query interface in SQL, Python, or Rust.
58+
```r
59+
install.packages("sedonadb", repos = "https://community.r-multiverse.org")
60+
```
4661

47-
Engineered for speed, SedonaDB provides performant geospatial processing on a single machine. This makes it perfect for the rapid analysis of smaller datasets, whether you're working locally or on a cloud server. While the initial release focuses on core vector operations, a full suite of vector and raster computations is on the roadmap.
62+
## Run a query in SQL, Python, Rust, or R
4863

49-
For massive, distributed workloads, you can leverage the power of SedonaSpark,
50-
SedonaFlink, or SedonaSnow.
64+
SedonaDB offers a flexible query interface.
5165

5266
=== "SQL"
5367

@@ -58,7 +72,7 @@ SedonaFlink, or SedonaSnow.
5872
=== "Python"
5973

6074
```python
61-
import seonda.db
75+
import sedona.db
6276

6377
sd = sedona.db.connect()
6478
sd.sql("SELECT ST_Point(0, 1) as geom")
@@ -86,21 +100,6 @@ SedonaFlink, or SedonaSnow.
86100
sd_sql("SELECT ST_Point(0, 1) as geom")
87101
```
88102

89-
## Install SedonaDB
90-
91-
Here's how to install SedonaDB with various build tools:
92-
93-
=== "pip"
94-
95-
```bash
96-
pip install "apache-sedona[db]"
97-
```
98-
99-
=== "R"
100-
101-
```bash
102-
install.packages("sedonadb", repos = "https://community.r-multiverse.org")
103-
```
104103

105104
## Have questions?
106105
