docs(scoresql): add to README and add descriptions to examples

jqnatividad · jqnatividad · commit 9ce7503e674a · 2026-03-13T23:03:08.000-04:00
[skip ci]
diff --git a/README.md b/README.md
@@ -76,6 +76,7 @@
 | [safenames](docs/help/safenames.md)<br>![CKAN](docs/images/ckan.png) | <a name="safenames_deeplink"></a>Modify headers of a CSV to only have ["safe" names](/src/cmd/safenames.rs#L5-L14) - guaranteed "database-ready"/"CKAN-ready" names.  |
 | [sample](docs/help/sample.md)<br>📇🌐🏎️ | Randomly draw rows (with optional seed) from a CSV using seven different sampling methods - [reservoir](https://en.wikipedia.org/wiki/Reservoir_sampling) (default), [indexed](https://en.wikipedia.org/wiki/Random_access), [bernoulli](https://en.wikipedia.org/wiki/Bernoulli_sampling), [systematic](https://en.wikipedia.org/wiki/Systematic_sampling), [stratified](https://en.wikipedia.org/wiki/Stratified_sampling), [weighted](https://doi.org/10.1016/j.ipl.2005.11.003) & [cluster sampling](https://en.wikipedia.org/wiki/Cluster_sampling). Supports sampling from CSVs on remote URLs. |
 | [schema](docs/help/schema.md)<br>📇😣🏎️👆🪄🐻‍❄️ | <a name="schema_deeplink"></a>Infer either a [JSON Schema Validation Draft 2020-12](https://json-schema.org/draft/2020-12/json-schema-validation) ([Example](https://github.com/dathere/qsv/blob/master/resources/test/311_Service_Requests_from_2010_to_Present-2022-03-04.csv.schema.json)) or [Polars Schema](https://docs.pola.rs/user-guide/lazy/schemas/) ([Example](https://github.com/dathere/qsv/blob/master/resources/test/NYC_311_SR_2010-2020-sample-1M.pschema.json)) from CSV data.<br>In JSON Schema Validation mode, it produces a `.schema.json` file replete with inferred data type & domain/range validation rules derived from [`stats`](#stats_deeplink). Uses multithreading to go faster if an index is present. See [`validate`](#validate_deeplink) command to use the generated JSON Schema to validate if similar CSVs comply with the schema.<br>With the `--polars` option, it produces a `.pschema.json` file that all polars commands (`sqlp`, `joinp` & `pivotp`) use to determine the data type of each column & to optimize performance.<br>Both schemas are editable and can be fine-tuned. For JSON Schema, to refine the inferred validation rules. For Polars Schema, to change the inferred Polars data types. |
+| [scoresql](docs/help/scoresql.md)<br>🐻‍❄️🪄 | Analyze a SQL query against CSV file caches (stats, moarstats, frequency) to produce a performance score with actionable optimization suggestions BEFORE running the query. Supports Polars (default) and DuckDB modes. |
 | [search](docs/help/search.md)<br>📇🏎️👆 | Run a regex over a CSV. Applies the regex to selected fields & shows only matching rows.  |
 | [searchset](docs/help/searchset.md)<br>📇🏎️👆 | _Run multiple regexes over a CSV in a single pass._ Applies the regexes to each field individually & shows only matching rows.  |
 | [select](docs/help/select.md)<br>👆 | Select, re-order, reverse, duplicate or drop columns.  |
diff --git a/src/cmd/scoresql.rs b/src/cmd/scoresql.rs
@@ -19,15 +19,19 @@ Caches are auto-generated when missing:
 
 Examples:
 
+  # Score a simple filter query against a single CSV file
   $ qsv scoresql data.csv "SELECT * FROM data WHERE col1 > 10"
 
+  # Output the score report as JSON instead of the default human-readable format
   $ qsv scoresql --json data.csv "SELECT col1, col2 FROM data ORDER BY col1"
 
+  # Score a join query across two CSV files
   $ qsv scoresql data.csv data2.csv "SELECT * FROM data JOIN data2 ON data.id = data2.id"
 
+  # Use DuckDB for query plan analysis instead of Polars
   $ qsv scoresql --duckdb data.csv "SELECT * FROM data WHERE status = 'active'"
 
-  # use _t_N aliases just like sqlp (see sqlp documentation)
+  # Use _t_N aliases just like sqlp (see sqlp documentation)
   $ qsv scoresql data.csv data2.csv "SELECT * FROM _t_1 JOIN _t_2 ON _t_1.id = _t_2.id"
 
 For more examples, see https://github.com/dathere/qsv/blob/master/tests/test_scoresql.rs.