Commit 9ce7503

docs(scoresql): add to README and add descriptions to examples
[skip ci]
1 parent 6d57023 commit 9ce7503

2 files changed, 6 additions and 1 deletion

README.md

Lines changed: 1 addition & 0 deletions
@@ -76,6 +76,7 @@
 | [safenames](docs/help/safenames.md)<br>![CKAN](docs/images/ckan.png) | <a name="safenames_deeplink"></a>Modify headers of a CSV to only have ["safe" names](/src/cmd/safenames.rs#L5-L14) - guaranteed "database-ready"/"CKAN-ready" names. |
 | [sample](docs/help/sample.md)<br>📇🌐🏎️ | Randomly draw rows (with optional seed) from a CSV using seven different sampling methods - [reservoir](https://en.wikipedia.org/wiki/Reservoir_sampling) (default), [indexed](https://en.wikipedia.org/wiki/Random_access), [bernoulli](https://en.wikipedia.org/wiki/Bernoulli_sampling), [systematic](https://en.wikipedia.org/wiki/Systematic_sampling), [stratified](https://en.wikipedia.org/wiki/Stratified_sampling), [weighted](https://doi.org/10.1016/j.ipl.2005.11.003) & [cluster sampling](https://en.wikipedia.org/wiki/Cluster_sampling). Supports sampling from CSVs on remote URLs. |
 | [schema](docs/help/schema.md)<br>📇😣🏎️👆🪄🐻‍❄️ | <a name="schema_deeplink"></a>Infer either a [JSON Schema Validation Draft 2020-12](https://json-schema.org/draft/2020-12/json-schema-validation) ([Example](https://github.com/dathere/qsv/blob/master/resources/test/311_Service_Requests_from_2010_to_Present-2022-03-04.csv.schema.json)) or [Polars Schema](https://docs.pola.rs/user-guide/lazy/schemas/) ([Example](https://github.com/dathere/qsv/blob/master/resources/test/NYC_311_SR_2010-2020-sample-1M.pschema.json)) from CSV data.<br>In JSON Schema Validation mode, it produces a `.schema.json` file replete with inferred data type & domain/range validation rules derived from [`stats`](#stats_deeplink). Uses multithreading to go faster if an index is present. See [`validate`](#validate_deeplink) command to use the generated JSON Schema to validate if similar CSVs comply with the schema.<br>With the `--polars` option, it produces a `.pschema.json` file that all polars commands (`sqlp`, `joinp` & `pivotp`) use to determine the data type of each column & to optimize performance.<br>Both schemas are editable and can be fine-tuned. For JSON Schema, to refine the inferred validation rules. For Polars Schema, to change the inferred Polars data types. |
+| [scoresql](docs/help/scoresql.md)<br>🐻‍❄️🪄 | Analyze a SQL query against CSV file caches (stats, moarstats, frequency) to produce a performance score with actionable optimization suggestions BEFORE running the query. Supports Polars (default) and DuckDB modes. |
 | [search](docs/help/search.md)<br>📇🏎️👆 | Run a regex over a CSV. Applies the regex to selected fields & shows only matching rows. |
 | [searchset](docs/help/searchset.md)<br>📇🏎️👆 | _Run multiple regexes over a CSV in a single pass._ Applies the regexes to each field individually & shows only matching rows. |
 | [select](docs/help/select.md)<br>👆 | Select, re-order, reverse, duplicate or drop columns. |

src/cmd/scoresql.rs

Lines changed: 5 additions & 1 deletion
@@ -19,15 +19,19 @@ Caches are auto-generated when missing:
 
 Examples:
 
+# Score a simple filter query against a single CSV file
 $ qsv scoresql data.csv "SELECT * FROM data WHERE col1 > 10"
 
+# Output the score report as JSON instead of the default human-readable format
 $ qsv scoresql --json data.csv "SELECT col1, col2 FROM data ORDER BY col1"
 
+# Score a join query across two CSV files
 $ qsv scoresql data.csv data2.csv "SELECT * FROM data JOIN data2 ON data.id = data2.id"
 
+# Use DuckDB for query plan analysis instead of Polars
 $ qsv scoresql --duckdb data.csv "SELECT * FROM data WHERE status = 'active'"
 
-# use _t_N aliases just like sqlp (see sqlp documentation)
+# Use _t_N aliases just like sqlp (see sqlp documentation)
 $ qsv scoresql data.csv data2.csv "SELECT * FROM _t_1 JOIN _t_2 ON _t_1.id = _t_2.id"
 
 For more examples, see https://github.com/dathere/qsv/blob/master/tests/test_scoresql.rs.
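
To try the documented examples locally, the two input files can be stubbed out roughly as follows. This is only a sketch: the column names `id`, `col1`, `col2` and `status` are taken from the example queries, while the `label` column and all row values are made up for illustration and are not part of the commit.

```shell
#!/bin/sh
# Illustrative fixtures only: column names come from the example queries,
# the `label` column and every row value are invented.
set -eu

# data.csv: columns referenced by the filter, ORDER BY and status examples
printf '%s\n' \
  'id,col1,col2,status' \
  '1,5,alpha,active' \
  '2,12,beta,inactive' \
  '3,20,gamma,active' > data.csv

# data2.csv: shares the id column used by the JOIN examples
printf '%s\n' \
  'id,label' \
  '1,first' \
  '3,third' > data2.csv

# With qsv installed, the documented examples can then be run as written, e.g.:
#   qsv scoresql data.csv data2.csv "SELECT * FROM data JOIN data2 ON data.id = data2.id"
```

Because the file stems are `data` and `data2`, the table names in the example queries should line up with these files without further changes.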
