| 1 | +# Subgraph:SQL Service |
| 2 | + |
| 3 | +The Subgraph:SQL Service, developed by Semiotic Labs in collaboration with The Guild |
| 4 | +and Edge & Node, offers a secure SQL interface for querying a subgraph's entities. |
| 5 | +To deploy this with minimal changes to the existing indexer stack, consumers (or the |
| 6 | +Studio they use) can wrap an SQL query in a GraphQL query. |
| 7 | + |
| 8 | +## Querying with Subgraph:SQL Service |
| 9 | + |
| 10 | +### Running Queries |
| 11 | + |
| 12 | +Say we have the following SQL query: |
| 13 | + |
| 14 | +```sql |
| 15 | +SELECT * FROM users WHERE age > 18 |
| 16 | +``` |
| 17 | + |
| 18 | +Consumers can send the same SQL through GraphQL using the `sql` field. Its `input` |
| 19 | +object carries the SQL text in a `query` field, along with the desired output |
| 20 | +`format`: |
| 21 | + |
| 22 | +```graphql |
| 23 | +query { |
| 24 | + sql(input: { |
| 25 | + query: "SELECT * FROM users WHERE age > 18", |
| 26 | + format: JSON |
| 27 | + }) { |
| 28 | + ... on SqlJSONOutput { |
| 29 | + columns |
| 30 | + rowCount |
| 31 | + rows |
| 32 | + } |
| 33 | + } |
| 34 | +} |
| 35 | +``` |
| 36 | + |
| 37 | +The `sql` field takes an input object holding the SQL query, optional parameters, |
| 38 | +and the output format. Here the SQL query selects all columns from the `users` |
| 39 | +table where `age` is greater than 18, and the `SqlJSONOutput` fragment returns the |
| 40 | +`columns`, `rowCount`, and `rows` of the result as JSON. |
| 41 | + |
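The wrapping step can be sketched in Python as below. This is a minimal sketch, not the project's client code: the helper name `build_sql_request` is hypothetical, and it only builds the JSON body that a GraphQL-over-HTTP client would POST to the subgraph's query endpoint (the endpoint itself is not shown here). It relies on `json.dumps` to quote and escape the SQL text, which produces a valid GraphQL string literal for typical queries.

```python
import json


def build_sql_request(sql):
    """Wrap a SQL string in the GraphQL `sql` query shown above.

    Returns the JSON body for an HTTP POST. Hypothetical helper for illustration.
    """
    # json.dumps adds the surrounding double quotes and escapes any quotes
    # or backslashes inside the SQL text.
    sql_literal = json.dumps(sql)
    graphql = (
        "query {\n"
        f"  sql(input: {{ query: {sql_literal}, format: JSON }}) {{\n"
        "    ... on SqlJSONOutput {\n"
        "      columns\n"
        "      rowCount\n"
        "      rows\n"
        "    }\n"
        "  }\n"
        "}"
    )
    # GraphQL over HTTP expects the query text under the "query" key.
    return json.dumps({"query": graphql})


body = build_sql_request("SELECT * FROM users WHERE age > 18")
print(body)
```

Building the query text in one place keeps the SQL escaping consistent, which matters once queries contain string literals of their own.
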
| 42 | +### SQL Parameters and Bind Parameters |
| 43 | + |
| 44 | +#### SQL Query Parameters |
| 45 | + |
| 46 | +You can pass optional parameters to the SQL query as positional parameters. Each |
| 47 | +parameter is converted to a SQL type based on its GraphQL type. |
| 48 | +In the GraphQL schema, parameters are passed as an array of `SqlVariable` objects |
| 49 | +within the `parameters` field of the `SqlInput` input object. See the GraphQL schema |
| 50 | +types in `graph/src/schema/sql.graphql`. |
| 51 | + |
| 52 | +#### Bind Parameters |
| 53 | + |
| 54 | +We currently do not support bind parameters, but plan to support this feature in a future |
| 55 | +version of Graph Node. |
| 56 | + |
| 57 | +## Configuration |
| 58 | + |
| 59 | +The Subgraph:SQL Service can be enabled or disabled using the `GRAPH_GRAPHQL_ENABLE_SQL_SERVICE` |
| 60 | +environment variable. |
| 61 | + |
| 62 | +- **Environment Variable:** `GRAPH_GRAPHQL_ENABLE_SQL_SERVICE` |
| 63 | +- **Default State:** Off (Disabled) |
| 64 | +- **Purpose:** Enables queries on the `sql()` field of the root query. |
| 65 | +- **Impact on Schema:** Adds a global `SqlInput` type to the GraphQL schema. The `sql` |
| 66 | +field accepts values of this type. |
| 67 | + |
| 68 | +To enable the Subgraph:SQL Service, set the `GRAPH_GRAPHQL_ENABLE_SQL_SERVICE` environment |
| 69 | +variable to `true` or `1`. This allows clients to execute SQL queries using the |
| 70 | +`sql()` field in GraphQL queries. |
| 71 | + |
| 72 | +```bash |
| 73 | +export GRAPH_GRAPHQL_ENABLE_SQL_SERVICE=true |
| 74 | +``` |
| 75 | + |
| 76 | +Alternatively, configure the environment variable in your deployment scripts or |
| 77 | +environment setup as needed. |
| 78 | + |
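Deployment tooling that issues `sql` queries may want to check the flag first. The sketch below is an assumption-laden example, not Graph Node's own parsing logic: the helper name `sql_service_enabled` is hypothetical, and it accepts only the two values named above (`true` and `1`).

```python
import os

ENV_VAR = "GRAPH_GRAPHQL_ENABLE_SQL_SERVICE"


def sql_service_enabled(environ=os.environ):
    """Return True when the flag is set to one of the documented values.

    Hypothetical preflight helper; unset or any other value counts as disabled,
    matching the documented default of Off.
    """
    return environ.get(ENV_VAR, "") in ("true", "1")


print(sql_service_enabled({ENV_VAR: "true"}))
```
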
| 79 | +### SQL Coverage |
| 80 | + |
| 81 | +The Subgraph:SQL Service executes `SELECT` queries against a subgraph's entity |
| 82 | +tables. It supports basic querying, positional query parameters, and result |
| 83 | +formatting into JSON or CSV. |
| 84 | + |
| 85 | +#### Whitelisted and Blacklisted SQL Functions |
| 86 | + |
| 87 | +Two constants in `store/postgres/src/sql/constants.rs` control which SQL functions a |
| 88 | +query may call. `POSTGRES_WHITELISTED_FUNCTIONS` lists the functions permitted in |
| 89 | +queries executed by the Subgraph:SQL Service. `POSTGRES_BLACKLISTED_FUNCTIONS` is a |
| 90 | +safety mechanism that lists PostgreSQL functions deemed inappropriate or potentially |
| 91 | +harmful to the system's integrity or performance; queries may not use them. |
| 92 | + |
| 93 | +### SQL Query Validation |
| 94 | + |
| 95 | +Graph Node's SQL query validation ensures that SQL queries adhere to predefined criteria: |
| 96 | + |
| 97 | +- **Function Name Validation**: Validates function names used within SQL queries, distinguishing |
| 98 | +between unknown, whitelisted, and blacklisted functions. |
| 99 | +- **Statement Validation**: Validates SQL statements, ensuring that only `SELECT` queries are |
| 100 | +supported and that multi-statement queries are not allowed. |
| 101 | +- **Table Name Validation**: Validates table names referenced in SQL queries, identifying |
| 102 | +unknown tables and ensuring compatibility with the schema. |
| 103 | +- **Common Table Expression (CTE) Handling**: Handles common table expressions, adding them |
| 104 | +to the set of known tables during validation. |
| 105 | + |
| 106 | +See the test suite in `store/postgres/src/sql/validation.rs` for examples of various scenarios |
| 107 | +and edge cases encountered during SQL query validation, including function whitelisting and |
| 108 | +blacklisting, multi-statement queries, unknown table references, and more. |
| 109 | + |
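The four checks listed above can be illustrated with a deliberately simplified Python sketch. It is a toy, not the validator in `store/postgres/src/sql/validation.rs`: it uses regular expressions instead of a SQL parser (so a semicolon inside a string literal would confuse it), the function lists are small hypothetical stand-ins for the real constants, and rejecting unknown functions is an assumption.

```python
import re

# Hypothetical stand-ins for POSTGRES_WHITELISTED_FUNCTIONS and
# POSTGRES_BLACKLISTED_FUNCTIONS; the real lists are much longer.
WHITELISTED_FUNCTIONS = {"count", "sum", "lower"}
BLACKLISTED_FUNCTIONS = {"pg_sleep"}
# Keywords that can precede "(" without being function calls.
NON_FUNCTION_KEYWORDS = {"as", "in", "from", "where", "and", "or",
                         "on", "not", "exists", "select", "join"}


def classify_function(name):
    """Classify a function name as blacklisted, whitelisted, or unknown."""
    lowered = name.lower()
    if lowered in BLACKLISTED_FUNCTIONS:
        return "blacklisted"
    if lowered in WHITELISTED_FUNCTIONS:
        return "whitelisted"
    return "unknown"


def validate(sql, known_tables):
    """Return a list of validation errors; an empty list means the query passed."""
    # Statement validation: exactly one statement, and it must be a SELECT
    # (optionally introduced by WITH for common table expressions).
    statements = [part.strip() for part in sql.split(";") if part.strip()]
    if len(statements) != 1:
        return ["expected exactly one statement"]
    statement = statements[0]
    errors = []
    if not re.match(r"(select|with)\b", statement, re.IGNORECASE):
        errors.append("only SELECT queries are supported")

    # CTE handling: names introduced by "<name> AS (" count as known tables.
    cte_names = {name.lower() for name in
                 re.findall(r"\b(\w+)\s+as\s*\(", statement, re.IGNORECASE)}
    tables = {table.lower() for table in known_tables} | cte_names

    # Table name validation: every FROM/JOIN target must be known.
    for table in re.findall(r"\b(?:from|join)\s+(\w+)", statement, re.IGNORECASE):
        if table.lower() not in tables:
            errors.append(f"unknown table: {table}")

    # Function name validation: only whitelisted functions pass (assumption).
    for name in re.findall(r"\b(\w+)\s*\(", statement):
        if name.lower() in NON_FUNCTION_KEYWORDS:
            continue
        kind = classify_function(name)
        if kind != "whitelisted":
            errors.append(f"{kind} function: {name}")
    return errors


print(validate("SELECT pg_sleep(10) FROM users", {"users"}))
```

A production validator would walk a parsed syntax tree rather than match text, which is what makes the distinction between keywords, table references, and function calls reliable.
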
| 110 | +### Relating GraphQL Schema to Tables |
| 111 | + |
| 112 | +The GraphQL schema provided by the Subgraph:SQL Service reflects the structure of the SQL queries |
| 113 | +it can execute. It does not directly represent tables in a database. Users need to |
| 114 | +construct SQL queries compatible with their database schema. |
| 115 | + |
| 116 | +### Queryable Attributes/Columns |
| 117 | + |
| 118 | +The columns that can be queried depend on the SQL query provided. In the example GraphQL |
| 119 | +query above, the columns returned would be all columns from the `users` table. |