# scoresql

> Analyze a SQL query against CSV file caches (stats, moarstats, frequency) to produce a performance score with actionable optimization suggestions BEFORE running the query. Supports Polars (default) and DuckDB modes.

**[Table of Contents](TableOfContents.md)** | **Source: [src/cmd/scoresql.rs](https://github.com/dathere/qsv/blob/master/src/cmd/scoresql.rs)** | [🐻❄️](TableOfContents.md#legend "command powered/accelerated by a vectorized query engine.")[🪄](TableOfContents.md#legend "\"automagical\" commands that use stats and/or frequency tables to work \"smarter\" & \"faster\".")

<a name="nav"></a>
[Description](#description) | [Examples](#examples) | [Usage](#usage) | [Scoresql Options](#scoresql-options) | [Common Options](#common-options)

<a name="description"></a>

## Description [↩](#nav)

Analyze a SQL query against CSV file caches (stats, moarstats, frequency) to produce a
performance score with actionable optimization suggestions BEFORE running the query.

Accepts the same input/SQL arguments as sqlp. Outputs a human-readable performance report
(default) or JSON (--json). Supports Polars mode (default) and DuckDB mode (--duckdb).

Scoring factors include:
* Query plan analysis (EXPLAIN output from Polars or DuckDB)
* Type optimization (column types vs. usage in the query)
* Join key cardinality and data distribution
* Filter selectivity from the frequency cache
* Query anti-pattern detection (SELECT *, missing LIMIT, Cartesian joins, etc.)
* Infrastructure checks (index files, cache freshness)

Caches are auto-generated when missing:
* stats cache via `qsv stats --everything --stats-jsonl`
* frequency cache via `qsv frequency --frequency-jsonl`

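Because these are the same commands scoresql runs when a cache is missing, you can pre-generate both caches yourself so scoring starts immediately. A sketch, assuming a hypothetical `data.csv`:

```console
# build the stats cache (with --everything statistics)
qsv stats --everything --stats-jsonl data.csv

# build the frequency cache
qsv frequency --frequency-jsonl data.csv

# scoresql now finds both caches and skips regeneration
qsv scoresql data.csv "SELECT * FROM data WHERE col1 > 10"
```
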
<a name="examples"></a>

## Examples [↩](#nav)

> Score a simple filter query against a single CSV file

```console
qsv scoresql data.csv "SELECT * FROM data WHERE col1 > 10"
```

> Output the score report as JSON instead of the default human-readable format

```console
qsv scoresql --json data.csv "SELECT col1, col2 FROM data ORDER BY col1"
```

> Score a join query across two CSV files

```console
qsv scoresql data.csv data2.csv "SELECT * FROM data JOIN data2 ON data.id = data2.id"
```

> Use DuckDB for query plan analysis instead of Polars

```console
qsv scoresql --duckdb data.csv "SELECT * FROM data WHERE status = 'active'"
```

> Use _t_N aliases just like sqlp (see sqlp documentation)

```console
qsv scoresql data.csv data2.csv "SELECT * FROM _t_1 JOIN _t_2 ON _t_1.id = _t_2.id"
```

For more examples, see [tests](https://github.com/dathere/qsv/blob/master/tests/test_scoresql.rs).

<a name="usage"></a>

## Usage [↩](#nav)

```console
qsv scoresql [options] <input>... <sql>
qsv scoresql --help
```

<a name="scoresql-options"></a>

## Scoresql Options [↩](#nav)

| Option | Type | Description | Default |
|--------|------|-------------|---------|
| `--json` | flag | Output results as JSON instead of the human-readable report. | |
| `--duckdb` | flag | Use DuckDB for query plan analysis instead of Polars. Requires the QSV_DESCRIBEGPT_DB_ENGINE environment variable to be set to the path of the DuckDB binary. | |
| `--try-parsedates` | flag | Automatically try to parse dates/datetimes and times. | |
| `--infer-len` | string | Number of rows to scan when inferring the schema. | `10000` |
| `--ignore-errors` | flag | Ignore errors when parsing CSVs. | |
| `--truncate-ragged-lines` | flag | Truncate lines with more fields than the header. | |

<a name="common-options"></a>

## Common Options [↩](#nav)

| Option | Type | Description | Default |
|--------|------|-------------|---------|
| `-h,`<br>`--help` | flag | Display this message. | |
| `-o,`<br>`--output` | string | Write output to <file> instead of stdout. | |
| `-d,`<br>`--delimiter` | string | The field delimiter for reading CSV data. Must be a single character. | `,` |
| `-q,`<br>`--quiet` | flag | Do not print informational messages to stderr. | |
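
The common options above compose with any of the earlier examples. For instance, a sketch (again assuming a hypothetical `data.csv`) that combines `--json` with `-o`/`--output` to write the score report to a file instead of stdout:

```console
qsv scoresql --json --output report.json data.csv "SELECT col1, col2 FROM data ORDER BY col1"
```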

---
**Source:** [`src/cmd/scoresql.rs`](https://github.com/dathere/qsv/blob/master/src/cmd/scoresql.rs)
| **[Table of Contents](TableOfContents.md)** | **[README](../../README.md)** |