Commit 48a3975

docs(minor-fix): fix a bunch of minor language issues in docs (#771)
1 parent ed9d6bc commit 48a3975

File tree

7 files changed (+22, -23 lines)


docs/docs/about/community.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ Welcome with a huge coconut hug 🥥⋆。˚🤗.
 
 We are super excited for community contributions of all kinds - whether it's code improvements, documentation updates, issue reports, feature requests on [GitHub](https://github.com/cocoindex-io/cocoindex), and discussions in our [Discord](https://discord.com/invite/zpA9S2DR7s).
 
-We would love to fostering an inclusive, welcoming, and supportive environment. Contributing to CocoIndex should feel collaborative, friendly and enjoyable for everyone. Together, we can build better AI applications through robust data infrastructure.
+We would love to foster an inclusive, welcoming, and supportive environment. Contributing to CocoIndex should feel collaborative, friendly and enjoyable for everyone. Together, we can build better AI applications through robust data infrastructure.
 
 :::tip Start hacking CocoIndex
 Check out our [Contributing guide](./contributing) to get started!

docs/docs/about/contributing.md

Lines changed: 2 additions & 2 deletions
@@ -18,10 +18,10 @@ We tag issues with the ["good first issue"](https://github.com/cocoindex-io/coco
 ## How to Contribute
 - If you decide to work on an issue, unless the PR can be sent immediately (e.g. just a few lines of code), we recommend you to leave a comment on the issue like **`I'm working on it`** or **`Can I work on this issue?`** to avoid duplicating work.
 - For larger features, we recommend you to discuss with us first in our [Discord server](https://discord.com/invite/zpA9S2DR7s) to coordinate the design and work.
-- Our [Discord server](https://discord.com/invite/zpA9S2DR7s) are constantly open. If you are unsure about anything, it is a good place to discuss! We'd love to collaborate and will always be friendly.
+- Our [Discord server](https://discord.com/invite/zpA9S2DR7s) is constantly open. If you are unsure about anything, it is a good place to discuss! We'd love to collaborate and will always be friendly.
 
 ## Start hacking! Setting Up Development Environment
-Following the steps below to get cocoindex build on latest codebase locally - if you are making changes to cocoindex funcionality and want to test it out.
+Follow the steps below to get cocoindex built on the latest codebase locally - if you are making changes to cocoindex functionality and want to test it out.
 
 - 🦀 [Install Rust](https://rust-lang.org/tools/install)
 
docs/docs/core/flow_def.mdx

Lines changed: 2 additions & 2 deletions
@@ -468,7 +468,7 @@ Then reference it when building a spec that takes an auth entry:
 Note that CocoIndex backends use the key of an auth entry to identify the backend.
 
 * Keep the key stable.
-  If the key doesn't change, it's considered to be the same backend (even if the underlying way to connect/authenticate change).
+  If the key doesn't change, it's considered to be the same backend (even if the underlying way to connect/authenticate changes).
 
 * If a key is no longer referenced in any operation spec, keep it until the next flow setup / drop action,
-  so that cocoindex will be able to clean up the backends.
+  so that CocoIndex will be able to clean up the backends.
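To make the auth-entry pattern this hunk documents concrete, here is a minimal sketch, assuming the `cocoindex.add_auth_entry` helper described earlier on that page; the `Neo4jConnection` spec type and all values are illustrative, not quoted from the diff:

```python
import cocoindex

# Register an auth entry under a stable key. Backends are identified by
# this key, so keep it unchanged even when credentials rotate; renaming
# it makes CocoIndex treat the target as a brand-new backend.
conn = cocoindex.add_auth_entry(
    "my_graph_db_conn",  # stable key (illustrative name)
    cocoindex.storages.Neo4jConnection(  # illustrative spec type
        uri="bolt://localhost:7687",
        user="neo4j",
        password="cocoindex",
    ),
)
```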

docs/docs/core/flow_methods.mdx

Lines changed: 4 additions & 4 deletions
@@ -44,9 +44,9 @@ For a flow, its persistent backends need to be ready before it can run, includin
 The desired state of the backends for a flow is derived based on the flow definition itself.
 CocoIndex supports two types of actions to manage the persistent backends automatically:
 
-* *Setup* a flow, which will change the backends owned by the flow to a state to the desired state, e.g. create new tables for new flow, drop an existing table if the corresponding target is gone, add new column to a target table if a new field is collected, etc. It's no-op if the backend states are already in the desired state.
+* *Setup* a flow, which will change the backends owned by the flow to the desired state, e.g. create new tables for new flow, drop an existing table if the corresponding target is gone, add new column to a target table if a new field is collected, etc. It's no-op if the backend states are already in the desired state.
 
-* *Drop* a flow, which will drop all backends owned by the flow. It's no-op if there's no existing backends owned by the flow (e.g. never setup or already dropped).
+* *Drop* a flow, which will drop all backends owned by the flow. It's no-op if there are no existing backends owned by the flow (e.g. never setup or already dropped).
 
 ### CLI
 
@@ -138,7 +138,7 @@ This is to achieve best efficiency.
 
 The `cocoindex update` subcommand creates/updates data in the target.
 
-Once it's done, the target data is fresh up to the moment when the function is called.
+Once it's done, the target data is fresh up to the moment when the command is called.
 
 ```sh
 cocoindex update main.py
@@ -203,7 +203,7 @@ To perform live update, run the `cocoindex update` subcommand with `-L` option:
 cocoindex update main.py -L
 ```
 
-If there's at least one data source with change capture mechanism enabled, it will keep running until the aborted (e.g. by `Ctrl-C`).
+If there's at least one data source with change capture mechanism enabled, it will keep running until aborted (e.g. by `Ctrl-C`).
 Otherwise, it falls back to the same behavior as one time update, and will finish after a one-time update is done.
 
 With a `--setup` option, it will also setup the flow first if needed.
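Alongside the CLI, the same page documents library APIs for these actions. A rough sketch of the equivalent Python calls, assuming `setup`/`update`/`drop` methods on the flow object (method and helper names may differ by CocoIndex version):

```python
import cocoindex

@cocoindex.flow_def(name="DemoFlow")
def demo_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope):
    ...  # flow definition omitted

cocoindex.init()    # load settings (e.g. database URL) from the environment
demo_flow.setup()   # bring backends to the desired state; no-op if already there
demo_flow.update()  # one-time update: target data is fresh as of this call
demo_flow.drop()    # drop all backends owned by the flow; no-op if never set up
```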

docs/docs/getting_started/quickstart.md

Lines changed: 6 additions & 6 deletions
@@ -7,10 +7,10 @@ import ReactPlayer from 'react-player'
 
 # Build your first CocoIndex project
 
-This guide will help you get up and running with CocoIndex in just a few minutes, that does:
+This guide will help you get up and running with CocoIndex in just a few minutes. We'll build a project that does:
 * Read files from a directory
 * Perform basic chunking and embedding
-* loads the data into a vector store (PG Vector)
+* Load the data into a vector store (PG Vector)
 
 <ReactPlayer controls url='https://www.youtube.com/watch?v=gv5R8nOXsWU' />
 
@@ -107,11 +107,11 @@ Notes:
 3. A *data source* extracts data from an external source.
    In this example, the `LocalFile` data source imports local files as a KTable (table with a key field, see [KTable](../core/data_types#ktable) for details), each row has `"filename"` and `"content"` fields.
 
-4. After defining the KTable, we extended a new field `"chunks"` to each row by *transforming* the `"content"` field using `SplitRecursively`. The output of the `SplitRecursively` is also a KTable representing each chunk of the document, with `"location"` and `"text"` fields.
+4. After defining the KTable, we extend a new field `"chunks"` to each row by *transforming* the `"content"` field using `SplitRecursively`. The output of the `SplitRecursively` is also a KTable representing each chunk of the document, with `"location"` and `"text"` fields.
 
-5. After defining the KTable, we extended a new field `"embedding"` to each row by *transforming* the `"text"` field using `SentenceTransformerEmbed`.
+5. After defining the KTable, we extend a new field `"embedding"` to each row by *transforming* the `"text"` field using `SentenceTransformerEmbed`.
 
-6. In CocoIndex, a *collector* collects multiple entries of data together. In this example, the `doc_embeddings` collector collects data from all `chunk`s across all `doc`s, and using the collected data to build a vector index `"doc_embeddings"`, using `Postgres`.
+6. In CocoIndex, a *collector* collects multiple entries of data together. In this example, the `doc_embeddings` collector collects data from all `chunk`s across all `doc`s, and uses the collected data to build a vector index `"doc_embeddings"`, using `Postgres`.
 
 ## Step 3: Run the indexing pipeline and queries
 
@@ -271,7 +271,7 @@ Now we can run the same Python file, which will run the new added main logic:
 python quickstart.py
 ```
 
-It will ask you to enter a query and it will return the top 10 results.
+It will ask you to enter a query and it will return the top 5 results.
 
 ## Next Steps
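For orientation, here is a condensed sketch of the flow those numbered notes describe, adapted from the quickstart this commit edits; treat parameter values (path, chunk sizes, model name) as illustrative and expect names to shift across releases:

```python
import cocoindex

@cocoindex.flow_def(name="TextEmbedding")
def text_embedding_flow(flow_builder: cocoindex.FlowBuilder,
                        data_scope: cocoindex.DataScope):
    # Data source: import local files as a KTable with "filename"/"content".
    data_scope["documents"] = flow_builder.add_source(
        cocoindex.sources.LocalFile(path="markdown_files"))

    # Collector gathering chunk embeddings across all documents.
    doc_embeddings = data_scope.add_collector()

    with data_scope["documents"].row() as doc:
        # Extend each row with "chunks" by transforming "content".
        doc["chunks"] = doc["content"].transform(
            cocoindex.functions.SplitRecursively(),
            language="markdown", chunk_size=300, chunk_overlap=100)
        with doc["chunks"].row() as chunk:
            # Extend each chunk with "embedding" by transforming "text".
            chunk["embedding"] = chunk["text"].transform(
                cocoindex.functions.SentenceTransformerEmbed(
                    model="sentence-transformers/all-MiniLM-L6-v2"))
            doc_embeddings.collect(
                filename=doc["filename"], location=chunk["location"],
                text=chunk["text"], embedding=chunk["embedding"])

    # Export collected rows to Postgres (PG Vector) with a vector index.
    doc_embeddings.export(
        "doc_embeddings", cocoindex.storages.Postgres(),
        primary_key_fields=["filename", "location"],
        vector_indexes=[cocoindex.VectorIndexDef(
            field_name="embedding",
            metric=cocoindex.VectorSimilarityMetric.COSINE_SIMILARITY)])
```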

docs/docs/ops/sources.md

Lines changed: 6 additions & 7 deletions
@@ -111,10 +111,9 @@ This is how to setup:
 
 * In the [Amazon S3 Console](https://s3.console.aws.amazon.com/s3/home), open your S3 bucket. Under *Properties* tab, click *Create event notification*.
 * Fill in an arbitrary event name, e.g. `S3ChangeNotifications`.
-* If you want your AmazonS3 data source expose a subset of files sharing a prefix, set the same prefix here. Otherwise, leave it empty.
+* If you want your AmazonS3 data source to expose a subset of files sharing a prefix, set the same prefix here. Otherwise, leave it empty.
 * Select the following event types: *All object create events*, *All object removal events*.
 * Select *SQS queue* as the destination, and specify the SQS queue you created above.
-  and enable *Change Event Notifications* for your bucket, and specify the SQS queue as the destination.
 
 AWS's [Guide of Configuring a Bucket for Notifications](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ways-to-add-notification-config-to-bucket.html#step1-create-sqs-queue-for-notification) provides more details.
 
@@ -141,7 +140,7 @@ The spec takes the following fields:
 :::info
 
 We will delete messages from the queue after they're processed.
-If there're unrelated messages in the queue (e.g. test messages that SQS will send automatically on queue creation, messages for a different bucket, for non-included files, etc.), we will delete the message upon receiving it, to avoid keeping receiving irrelevant messages again and again after they're redelivered.
+If there are unrelated messages in the queue (e.g. test messages that SQS will send automatically on queue creation, messages for a different bucket, for non-included files, etc.), we will delete the message upon receiving it, to avoid repeatedly receiving irrelevant messages after they're redelivered.
 
 :::
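As a rough illustration of where the queue plugs in, a hedged sketch of an `AmazonS3` source spec; the field names (`bucket_name`, `prefix`, `sqs_queue_url`) are assumptions based on this page's description, not quoted from the released spec:

```python
import cocoindex

# Fragment from inside a flow definition (field names are assumptions):
data_scope["documents"] = flow_builder.add_source(
    cocoindex.sources.AmazonS3(
        bucket_name="my-bucket",  # illustrative
        prefix="docs/",           # same prefix as in the event notification
        # Assumed field: the SQS queue receiving the change notifications.
        sqs_queue_url="https://sqs.us-east-1.amazonaws.com/123456789012/S3ChangeNotifications",
    ),
)
```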

@@ -253,12 +252,12 @@ The spec takes the following fields:
 it's typically cheaper than a full refresh by setting the [refresh interval](../core/flow_def#refresh-interval) especially when the folder contains a large number of files.
 So you can usually set it with a smaller value compared to the `refresh_interval`.
 
-On the other hand, this only detects changes for files still exists.
-If the file is deleted (or the current account no longer has access to), this change will not be detected by this change stream.
+On the other hand, this only detects changes for files that still exist.
+If the file is deleted (or the current account no longer has access to it), this change will not be detected by this change stream.
 
-So when a `GoogleDrive` source enabled `recent_changes_poll_interval`, it's still recommended to set a `refresh_interval`, with a larger value.
+So when a `GoogleDrive` source has `recent_changes_poll_interval` enabled, it's still recommended to set a `refresh_interval`, with a larger value.
 So that most changes can be covered by polling recent changes (with low latency, like 10 seconds), and remaining changes (files no longer exist or accessible) will still be covered (with a higher latency, like 5 minutes, and should be larger if you have a huge number of files like 1M).
-In reality, configure them based on your requirement: how freshness do you need to target index to be?
+In reality, configure them based on your requirement: how fresh do you need the target index to be?
 
 :::
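A hedged sketch of how the two intervals from the guidance above might be combined on a `GoogleDrive` source; field names follow the docs being edited here, while the credential path, folder id, and interval values are illustrative:

```python
import datetime
import cocoindex

# Fragment from inside a flow definition: poll recent changes frequently,
# and keep a slower full refresh to catch deletions and lost access.
data_scope["documents"] = flow_builder.add_source(
    cocoindex.sources.GoogleDrive(
        service_account_credential_path="credential.json",  # illustrative
        root_folder_ids=["<folder-id>"],                     # illustrative
        recent_changes_poll_interval=datetime.timedelta(seconds=10),
    ),
    refresh_interval=datetime.timedelta(minutes=5),
)
```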

docs/docs/ops/targets.md

Lines changed: 1 addition & 1 deletion
@@ -413,7 +413,7 @@ If you don't have a Neo4j database, you can start a Neo4j database using our doc
 docker compose -f <(curl -L https://raw.githubusercontent.com/cocoindex-io/cocoindex/refs/heads/main/dev/neo4j.yaml) up -d
 ```
 
-If will bring up a Neo4j instance, which can be accessed by username `neo4j` and password `cocoindex`.
+This will bring up a Neo4j instance, which can be accessed by username `neo4j` and password `cocoindex`.
 You can access the Neo4j browser at [http://localhost:7474](http://localhost:7474).
 
 :::warning
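Not part of the diff, but a quick way to confirm the instance above is reachable with those credentials, using the official `neo4j` Python driver (7687 is Neo4j's default bolt port):

```python
from neo4j import GraphDatabase  # pip install neo4j

# Connect with the username/password from the docker compose setup above.
driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "cocoindex"))
driver.verify_connectivity()  # raises an exception if the instance isn't up
driver.close()
```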
