|
1 |
| -# DEVGUIDE.md |
| 1 | +# DEVGUIDE.md (WIP) |
| 2 | + |
| 3 | +> **Note: this DEVGUIDE is under construction and is not complete yet; see `scripts/docs` for documentation on each of the available scripts.** |
2 | 4 |
|
3 | 5 | ## Contents
|
4 |
| -1. [Running the tests](#running-the-tests) |
5 |
| - 1. [Prerequisites](#prerequisites) |
6 |
| - 2. [I can't be bothered to read all of this](#i-cant-be-bothered-to-read-all-of-this) |
7 |
| - 3. [The custom test script](#the-custom-test-script) |
8 |
| - 4. [Test tags](#test-tags) |
9 |
| - 5. [Running vectorize tests](#running-vectorize-tests) |
10 |
| - 6. [Running the tests on local Stargate](#running-the-tests-on-local-stargate) |
11 |
| - 7. [The custom Mocha wrapper](#the-custom-mocha-wrapper) |
12 |
| -2. [Typechecking & Linting](#typechecking--linting) |
13 |
| -3. [Building the library](#building-the-library) |
14 |
| -4. [Publishing](#publishing) |
15 |
| -5. [Miscellaneous](#miscellaneous) |
| 6 | +1. [I can't be bothered to read all of this](#i-cant-be-bothered-to-read-all-of-this) |
| 7 | +2. [Building the library](#building-the-library) |
| 8 | +3. [Publishing](#publishing) |
| 9 | +4. [Miscellaneous](#miscellaneous) |
16 | 10 | 1. [nix-shell + direnv support](#nix-shell--direnv-support)
|
17 | 11 |
|
18 |
| -## Running the tests |
19 |
| - |
20 |
| -### Prerequisites |
21 |
| - |
22 |
| -- `npm`/`npx` |
23 |
| -- A running Data API instance |
24 |
| -- A `.env` with the credentials filled out |
25 |
| - |
26 |
| -<sub>*DISCLAIMER: The test suite will create any necessary namespaces/collections, and any existing collections in |
27 |
| -the database will be deleted.*</sub> |
28 |
| - |
29 |
| -<sub>*Also, if you for some reason already have an existing namespace called 'slania', it too will be deleted. Not |
30 |
| -sure why you'd have a namespace named that, but if you do, you have a good taste in music.*</sub> |
31 |
| - |
32 |
| -### I can't be bothered to read all of this |
33 |
| - |
34 |
| -1. Just make sure `CLIENT_DB_URL` and `CLIENT_DB_TOKEN` are set in your `.env` file |
35 |
| -2. If you're running the full test suite, copy `vectorize_test_spec.example.json`, fill out the providers you want |
36 |
| - to test, and delete the rest |
37 |
| -3. Run one of the following commands: |
38 |
| - |
39 |
| -```sh |
40 |
| -# Add '-e dse' or '-e hcd' to the command if running on either of those |
41 |
| - |
42 |
| -# Runs the full test suite (~10m) |
43 |
| -sh scripts/test.sh -all # -e dse|hcd |
44 |
| - |
45 |
| -# Runs a version of the test suite that omits all longer-running tests (~2m) |
46 |
| -sh scripts/test.sh -light # -e dse|hcd |
47 |
| -``` |
48 |
| - |
49 |
| -### The custom test script |
50 |
| - |
51 |
| -The `astra-db-ts` test suite uses a custom wrapper around [ts-mocha](https://www.npmjs.com/package/ts-mocha), including |
52 |
| -its own custom test script. |
53 |
| - |
54 |
| -While this undeniably adds in extra complexity and getting-started overhead, you can read the complete rationale as to |
55 |
| -why [here](https://github.com/datastax/astra-db-ts/pull/66#issue-2430902926), but TL;DR: |
56 |
| -- We sped up the complete test suite by 500% |
57 |
| -- We improved the test filtering capabilities |
58 |
| -- We made it easier to write and work with `astra-db-ts`-esque tests |
59 |
| - |
60 |
| -The API for the test script is as follows: |
61 |
| - |
62 |
| -```sh |
63 |
| -1. scripts/test.sh |
64 |
| -2. [-all | -light | -coverage] |
65 |
| -3. [-fand | -for] [-f/F <filter>]+ [-g/G <regex>]+ |
66 |
| -4. [-w/W <vectorize_whitelist>] |
67 |
| -5. [-b | -bail] |
68 |
| -6. [-R | -no-report] |
69 |
| -7. [-c <http_client>] |
70 |
| -8. [-e <environment>] |
71 |
| -``` |
72 |
| - |
73 |
| -#### 1. The test file (`scripts/test.sh`) |
74 |
| - |
75 |
| -While you can use `npm run test` or `bun run test` if you so desire, attempting to use the test script's flags with it |
76 |
| -may be a bit iffy, as the inputs are first "de-quoted" (evaluated) when you use the shell command, but they're |
77 |
| -"de-quoted" again when the package manager runs the actual shell command. |
78 |
| - |
79 |
| -Just use `scripts/test.sh` (or `sh scripts/test.sh`) directly if you're using command-line flags and want to |
80 |
| -avoid a headache. |
81 |
| - |
82 |
| -#### 2. The test types (`[-all | -light | -coverage]`) |
83 |
| - |
84 |
| -There are three main test types: |
85 |
| -- `-all`: This is a shorthand for enabling the `(LONG)`, `(ADMIN)`, and `(VECTORIZE)` tests (alongside all the normal tests that always run) |
86 |
| -- `-light`: This is a shorthand for disabling the aforementioned tests. This runs only the normal tests, which are much quicker to run in comparison |
87 |
| -- `-coverage`: This runs all tests, but uses `nyc` to collect coverage statistics. Enables the `-b` (bail) flag, as there is no point continuing if a test fails |
88 |
| - |
89 |
| -By default, just running `scripts/test.sh` will be like using `-light`, but you can set the default config for which tests |
90 |
| -to enable in your `.env` file, through the `CLIENT_RUN_*_TESTS` env vars. |
91 |
| - |
92 |
| -#### 3. The test filters (`[-fand | -for] [-f/F <filter>]+ [-g/G <regex>]+`) |
93 |
| - |
94 |
| -The `astra-db-ts` test suite implements fully custom test filtering, inspired by Mocha's, but improved upon. |
95 |
| - |
96 |
| -You can add a basic filter using `-f <filter>` which acts like Mocha's own `-f` flag. Like Mocha, we also support `-g`, |
97 |
| -which is like `-f`, but for regex. Each only needs to match a part of the test name (or its parent describes' names) to |
98 |
| -succeed, so use `^$` as necessary. |
99 |
| - |
100 |
| -Unlike Mocha, there is no `-i` flag—instead, you can invert a filter by using `-F <filter>` or `-G <regex>`, so that the |
101 |
| -test needs to NOT match that string/regex to run. |
102 |
| - |
103 |
| -You can also use multiple filters by simply using multiple of `-f`, `-g`, `-F`, and `-G` as you please. By default, |
104 |
| -it'll only run a test if it satisfies all the filters (`-fand`), but you can use the `-for` flag to run a test if |
105 |
| -it satisfies any one of the filters. |
106 |
| - |
107 |
| -In case filters overlap, an inverted filter always wins over a regular filter, and the conflicted test won't run. |
108 |
| - |
109 |
| -#### 4. The vectorize whitelist (`[-w/W <vectorize_whitelist>]`) |
110 |
| - |
111 |
| -There's a special filtering system just for vectorize tests, called the "vectorize whitelist", of which there are two |
112 |
| -different types: either a piece of regex, or a special filter operator. |
113 |
| - |
114 |
| -##### Regex filtering |
115 |
| - |
116 |
| -Every vectorize test is given a test name representing every branch it took to become that specific test. It is |
117 |
| -of the following format: |
118 |
| - |
119 |
| -```sh |
120 |
| -# providerName@modelName@authType@dimension |
121 |
| -# where dimension := 'specified' | 'default' | <some_number> |
122 |
| -# where authType := 'header' | 'providerKey' | 'none' |
123 |
| -``` |
124 |
| - |
125 |
| -Again, the regex only needs to match part of each test's name to succeed, so use `^$` as necessary. |
126 |
| - |
127 |
| -##### Filter operators |
128 |
| - |
129 |
| -The vectorize test suite also defines some custom "filter operators" to provide filtering that can't be done through |
130 |
| -basic regex. They come in the format `-w $<operator>:<colon_separated_args>` |
131 |
| - |
132 |
| -1. `$limit:<number>` - This is a limit over the total number of vectorize tests, only running up to the specified amount |
133 |
| -2. `$provider-limit:<number>` - This limits the amount of vectorize tests that can be run per provider |
134 |
| -3. `$model-limit:<number>` - Akin to the above, but limits per model. |
135 |
| - |
136 |
| -The default whitelist is `$limit-per-model:1`. |
137 |
| - |
138 |
| -#### 5. Bailing (`[-b | -bail]`) |
139 |
| - |
140 |
| -Simply sets the bail flag, as it does in Mocha. Forces the test script to exit after a single test failure. |
141 |
| - |
142 |
| -#### 6. Disabling error reporting (`[-R | -no-report]`) |
143 |
| - |
144 |
| -By default, the test suite logs the complete error objects of any that may've been thrown during your tests to the |
145 |
| -`./etc/test-reports` directory for greatest debuggability. However, this can be disabled for a test run using the |
146 |
| -`-R`/`-no-report` flag. |
147 |
| - |
148 |
| -#### 7. The HTTP client (`[-c <http_client>]`) |
149 |
| - |
150 |
| -By default, `astra-db-ts` will run its tests on `fetch-h2` using `HTTP/2`, but you can specify a specific client, which |
151 |
| -is one of `default:http1`, `default:http2`, or `fetch`. |
152 |
| - |
153 |
| -#### 8. The Data API environment (`[-e <environment>]`) |
154 |
| - |
155 |
| -By default, `astra-db-ts` assumes you're running on Astra, but you can specify the Data API environment through this |
156 |
| -flag. It should be one of `dse`, `hcd`, `cassandra`, or `other`. You can also provide `astra`, but it wouldn't really |
157 |
| -do anything. But I'm not the boss of you; you can make your own big-boy/girl/other decisions. |
158 |
| - |
159 |
| -### Test tags |
160 |
| - |
161 |
| -The `astra-db-ts` test suite uses the concept of "test tags" to further advance test filtering. These are tags in |
162 |
| -the names of test blocks, such as `(LONG) createCollection tests` or `(ADMIN) (ASTRA) AstraAdmin tests`. |
163 |
| - |
164 |
| -These tags are automatically parsed and filtered through the custom wrapper our test suite uses, though |
165 |
| -you can still interact with them through test filters as well. For example, I commonly use `-f VECTORIZE` to |
166 |
| -only run the vectorize tests. |
167 |
| - |
168 |
| -Current tags include: |
169 |
| - - `VECTORIZE` - Enabled if `CLIENT_RUN_VECTORIZE_TESTS` is set (or `-all` is set) |
170 |
| - - `LONG` - Enabled if `CLIENT_RUN_LONG_TESTS` is set (or `-all` is set) |
171 |
| - - `ADMIN` - Enabled if `CLIENT_RUN_ADMIN_TESTS` is set (or `-all` is set) |
172 |
| - - `DEV` - Automatically enabled if running on Astra-dev |
173 |
| - - `NOT-DEV` - Automatically enabled if not running on Astra-dev |
174 |
| - - `ASTRA` - Automatically enabled if running on Astra |
175 |
| - |
176 |
| -Attempting to set any other test tag will throw an error. (All test tags must contain only uppercase letters & |
177 |
| -hyphens—any tag not matching `\([A-Z-]+?\)` will not be counted.) |
178 |
| - |
179 |
| -### Running vectorize tests |
180 |
| - |
181 |
| -To run vectorize tests, you need to have a vectorize-enabled kube running, with the correct tags enabled. |
182 |
| - |
183 |
| -Ensure `CLIENT_RUN_VECTORIZE_TESTS` and `CLIENT_RUN_LONG_TESTS` are enabled as well (or just pass the `-all` flag to |
184 |
| -the test script). |
185 |
| - |
186 |
| -Lastly, you must create a file, `vectorize_tests.json`, in the root folder, with the following format: |
187 |
| - |
188 |
| -```ts |
189 |
| -type VectorizeTestSpec = { |
190 |
| - [providerName: string]: { |
191 |
| - headers?: { |
192 |
| - [header: `x-${string}`]: string, |
193 |
| - }, |
194 |
| - sharedSecret?: { |
195 |
| - providerKey?: string, |
196 |
| - }, |
197 |
| - dimension?: { |
198 |
| - [modelNameRegex: string]: number, |
199 |
| - }, |
200 |
| - parameters?: { |
201 |
| - [modelNameRegex: string]: Record<string, string>, |
202 |
| - }, |
203 |
| - warmupErr?: string, |
204 |
| - }, |
205 |
| -} |
206 |
| -``` |
207 |
| -
|
208 |
| -where: |
209 |
| -- `providerName` is the name of the provider (e.g. `nvidia`, `openai`, etc.) as found in `findEmbeddingProviders`. |
210 |
| -- `headers` sets the embedding headers to be used for header auth. |
211 |
| - - resolves to an `EmbeddingHeadersProvider` under the hood—throws error if no corresponding one found. |
212 |
| - - optional if no header auth test wanted. |
213 |
| -- `sharedSecret` is the block for KMS auth (isomorphic to `providerKey`, but it's an object for future-compatibility). |
214 |
| - - `providerKey` is the provider key for the provider (which will be passed in @ collection creation). |
215 |
| - - optional if no KMS auth test wanted. |
216 |
| -- `parameters` is a mapping of model names to their corresponding parameters. The model name can be some regex that partially matches the full model name. |
217 |
| - - `"text-embedding-3-small"`, `"3-small"`, and `".*"` will all match `"text-embedding-3-small"`. |
218 |
| - - optional if not required. `azureOpenAI`, for example, will need this. |
219 |
| -- `dimension` is also a mapping of model name regex to their corresponding dimensions, like the `parameters` field. |
220 |
| - - optional if not required. `huggingfaceDedicated`, for example, will need this. |
221 |
| -- `warmupErr` may be set if the provider errors on a cold start |
222 |
| - - if set, the provider will be called in a `while (true)` loop until it stops throwing an error matching this message |
223 |
| -
|
224 |
| -This file is .gitignore-d by default and will not be checked into VCS. |
225 |
| -
|
226 |
| -See `vectorize_test_spec.example.json` for, guess what, an example. |
227 |
| -
|
228 |
| -This spec is cross-referenced with `findEmbeddingProviders` to create a suite of tests branching off each possible |
229 |
| -parameter, with tests names of the format `providerName@modelName@authType@dimension`, where each section is another |
230 |
| -potential branch. |
231 |
| -
|
232 |
| -To run *only* the vectorize tests, a common pattern I use is `scripts/test.sh -all -f VECTORIZE [-w <vectorize_whitelist>]`. |
233 |
| -
|
234 |
| -### Running the tests on local Stargate |
235 |
| -In another terminal tab, you can do `sh scripts/start-stargate-4-tests.sh` to spin up an ephemeral Data API on DSE |
236 |
| -instance which will destroy itself on script exit. The test suite will set up any keyspaces/collections as necessary. |
237 |
| -
|
238 |
| -Then, be sure to set the following vars in `.env` exactly. |
239 |
| -```dotenv |
240 |
| -CLIENT_DB_URL=http://localhost:8181 |
241 |
| -CLIENT_DB_TOKEN=Cassandra:Y2Fzc2FuZHJh:Y2Fzc2FuZHJh |
242 |
| -CLIENT_DB_ENVIRONMENT=dse |
243 |
| -``` |
244 |
| -
|
245 |
| -Once the local Data API instance is fully started and ready for requests, you can run the tests. |
246 |
| -
|
247 |
| -### The custom Mocha wrapper |
248 |
| -
|
249 |
| -The `astra-db-ts` test suite is massively IO-bound, and desires a more advanced test filtering system than |
250 |
| -Mocha provides by default. As such, we have written a (relatively) light custom wrapper around Mocha, extending |
251 |
| -it to allow us to squeeze all possible performance out of our tests, and make it easier to write, scale, and work |
252 |
| -with tests in both the present, and the future. |
253 |
| -
|
254 |
| -#### The custom test functions |
255 |
| -
|
256 |
| -The most prominent changes are the introduction of 5 new Mocha-API-esque functions (two of which are overhauls) |
257 |
| -- [`describe`](https://github.com/datastax/astra-db-ts/blob/60fa445192b6a648b7a139a45986af8525a37ffb/tests/testlib/describe.ts) - An overhaul to the existing `dynamic` block |
258 |
| - - Provides fresh instances of the "common fixtures" in its callback |
259 |
| - - Performs "tag filtering" on the suite names |
260 |
| - - Some suite options to reduce boilerplate |
261 |
| - - `truncateColls: 'default'` - Does `deleteMany({})` on the default collection in the default namespace after each test case |
262 |
| - - `truncateColls: 'both'` - Does `deleteMany({})` on the default collection in both test namespaces after each test case |
263 |
| - - `dropEphemeral: 'after'` - Drops all non-default collections in both test namespaces after all the test cases in the suite |
264 |
| - - `dropEphemeral: 'afterEach'` - Drops all non-default collections in both test namespaces after each test case |
265 |
| -- [`it`](https://github.com/datastax/astra-db-ts/blob/60fa445192b6a648b7a139a45986af8525a37ffb/tests/testlib/it.ts) - An overhaul to the existing `it` block |
266 |
| - - Performs "tag filtering" on the test names |
267 |
| - - Provides unique string keys for every test case |
268 |
| -- [`parallel`](https://github.com/datastax/astra-db-ts/blob/60fa445192b6a648b7a139a45986af8525a37ffb/tests/testlib/parallel.ts) - A wrapper around `describe` which runs all of its test cases in parallel |
269 |
| - - Only allows `it`, `before`, `after`, and a single layer of `describe` functions |
270 |
| - - Will run all tests simultaneously in a `before` hook, capture any exceptions, and rethrow them in reconstructed `it`/`describe` blocks for the most native-like behavior |
271 |
| - - Performs tag and test filtering as normal |
272 |
| - - Nearly all integration tests have been made parallel |
273 |
| -- [`background`](https://github.com/datastax/astra-db-ts/blob/60fa445192b6a648b7a139a45986af8525a37ffb/tests/testlib/background.ts) - A version of `describe` which runs in the background while all the other test cases run |
274 |
| - - Only allows `it` blocks |
275 |
| - - Will run the test at the very start of the test script, capture any exceptions, and rethrow them in reconstructed `it`/`describe` blocks for the most native-like behavior at the end of the test script |
276 |
| - - Performs tag and test filtering as normal |
277 |
| - - Meant for independent tests that take a very long time to execute (such as the `integration.devops.db-admin` lifecycle test) |
278 |
| -
|
279 |
| -These are not globals like Mocha's—rather, they are imported, like so: |
280 |
| -```ts |
281 |
| -import { background, describe, it, parallel } from '@/tests/testlib'; |
282 |
| -``` |
283 |
| - |
284 |
| -#### Examples |
285 |
| - |
286 |
| -You can find examples of usages of each in most, if not all, test files, such as: |
287 |
| -- [`/tests/integration/misc/timeouts.test.ts`](https://github.com/datastax/astra-db-ts/blob/60fa445192b6a648b7a139a45986af8525a37ffb/tests/integration/misc/timeouts.test.ts) (`describe`, `parallel`, `it`) |
288 |
| -- [`/tests/integration/devops/lifecycle.test.ts`](https://github.com/datastax/astra-db-ts/blob/60fa445192b6a648b7a139a45986af8525a37ffb/tests/integration/devops/lifecycle.test.ts) (`background`) |
289 |
| - |
290 |
| -## Typechecking & Linting |
291 |
| - |
292 |
| -The test script also provides typechecking and linting through the following commands: |
293 |
| - |
294 |
| -```sh |
295 |
| -# Full typechecking |
296 |
| -scripts/test.sh -tc |
297 |
| - |
298 |
| -# Linting |
299 |
| -scripts/test.sh -lint |
| 12 | +## I can't be bothered to read all of this |
300 | 13 |
|
301 |
| -# Or even both |
302 |
| -scripts/test.sh -lint -tc |
303 |
| -``` |
| 14 | +yeah, fair enough. |
304 | 15 |
|
305 | 16 | ## Building the library
|
306 | 17 |
|
|