refactor: Rewrite execute functions into one function
- Enhanced logging in examples to provide feedback on the number of affected rows and results of DDL operations.
Signed-off-by: Edmund Miller <[email protected]>
Co-authored-by: Paolo Di Tommaso <[email protected]>
NOTE: THIS IS A PREVIEW TECHNOLOGY, FEATURES AND CONFIGURATION SETTINGS CAN CHANGE IN FUTURE RELEASES.
@@ -24,7 +24,6 @@ plugins {
 }
 ```
 
-
 ## Configuration
 
 You can configure any number of databases under the `sql.db` configuration scope. For example:
@@ -79,7 +78,7 @@ The following options are available:
 
 `batchSize`
 : Query the data in batches of the given size. This option is recommended for queries that may return a large result set, so that the entire result set is not loaded into memory at once.
-: *NOTE:* this feature requires that the underlying SQL database supports `LIMIT` and `OFFSET`.
+: _NOTE:_ this feature requires that the underlying SQL database supports `LIMIT` and `OFFSET`.
 
 `emitColumns`
 : When `true`, the column names in the `SELECT` statement are emitted as the first tuple in the resulting channel.
 INSERT INTO SAMPLE (NAME, LEN) VALUES ('WORLD!', 6);
 ```
 
-*NOTE:* the target table (e.g. `SAMPLE` in the above example) must be created beforehand.
+_NOTE:_ the target table (e.g. `SAMPLE` in the above example) must be created beforehand.
 
 The following options are available:
 
@@ -125,21 +124,23 @@ The following options are available:
 
 `setup`
 : A SQL statement that is executed before inserting the data, e.g. to create the target table.
-: *NOTE:* the underlying database should support the *create table if not exist* idiom, as the plugin will execute this statement every time the script is run.
+: _NOTE:_ the underlying database should support the _create table if not exist_ idiom, as the plugin will execute this statement every time the script is run.
 
 ## SQL Execution Functions
 
-This plugin provides the following functions for executing SQL statements that don't return data, such as DDL (Data Definition Language) and DML (Data Manipulation Language) operations.
+This plugin provides the following function for executing SQL statements that don't return data, such as DDL (Data Definition Language) and DML (Data Manipulation Language) operations.
 
 ### sqlExecute
 
-The `sqlExecute` function executes a SQL statement that doesn't return a result set, such as `CREATE`, `ALTER`, `DROP`, `INSERT`, `UPDATE`, or `DELETE` statements. For example:
+The `sqlExecute` function executes a SQL statement that doesn't return a result set, such as `CREATE`, `ALTER`, `DROP`, `INSERT`, `UPDATE`, or `DELETE` statements. For DML statements (`INSERT`, `UPDATE`, `DELETE`), it returns the number of rows affected. For DDL statements (`CREATE`, `ALTER`, `DROP`), it returns `null`.
+
+For example:
 
 ```nextflow
 include { sqlExecute } from 'plugin/nf-sqldb'
 
-// Create a table
-sqlExecute(
+// Create a table (returns null for DDL operations)
+def createResult = sqlExecute(
     db: 'foo',
     statement: '''
         CREATE TABLE IF NOT EXISTS sample_table (
@@ -149,51 +150,24 @@ sqlExecute(
         )
     '''
 )
+println "Create result: $createResult" // null
 
-// Insert data
-sqlExecute(
+// Insert data (returns 1 for number of rows affected)
statement: "DELETE FROM sample_table WHERE id = 1"
-)
-```
-
-The following options are available:
-
-`db`
-: The database handle. It must be defined under `sql.db` in the Nextflow configuration.
-
-`statement`
-: The SQL statement to execute. This can be any DDL or DML statement that doesn't return a result set.
-
-### executeUpdate
-
-The `executeUpdate` function is similar to `sqlExecute`, but it returns the number of rows affected by the SQL statement. This is particularly useful for DML operations like `INSERT`, `UPDATE`, and `DELETE` where you need to know how many rows were affected. For example:
-
-```nextflow
-include { executeUpdate } from 'plugin/nf-sqldb'
-
-// Insert data and get the number of rows inserted
-    statement: "UPDATE sample_table SET value = 30.5 WHERE name = 'beta'"
+    statement: "UPDATE sample_table SET value = 30.5 WHERE name = 'alpha'"
 )
 println "Updated $updatedRows row(s)"
 
-// Delete data and get the number of rows deleted
-def deletedRows = executeUpdate(
+// Delete data (returns number of rows deleted)
+def deletedRows = sqlExecute(
     db: 'foo',
     statement: "DELETE FROM sample_table WHERE value > 25"
 )
@@ -206,25 +180,27 @@ The following options are available:
 : The database handle. It must be defined under `sql.db` in the Nextflow configuration.
 
 `statement`
-: The SQL statement to execute. This should be a DML statement that can return a count of affected rows.
+: The SQL statement to execute. This can be any DDL or DML statement that doesn't return a result set.
 
-## Differences Between Dataflow Operators and Execution Functions
+## Differences Between Dataflow Operators and Execution Function
 
 The plugin provides two different ways to interact with databases:
 
 1. **Dataflow Operators** (`fromQuery` and `sqlInsert`): These are designed to integrate with Nextflow's dataflow programming model, operating on channels.
+
    - `fromQuery`: Queries data from a database and returns a channel that emits the results.
    - `sqlInsert`: Takes data from a channel and inserts it into a database.
 
-2. **Execution Functions** (`sqlExecute` and `executeUpdate`): These are designed for direct SQL statement execution that doesn't require channel integration.
-   - `sqlExecute`: Executes a SQL statement without returning any data.
-   - `executeUpdate`: Executes a SQL statement and returns the count of affected rows.
+2. **Execution Function** (`sqlExecute`): This is designed for direct SQL statement execution that doesn't require channel integration.
+   - `sqlExecute`: Executes a SQL statement. For DML operations, it returns the count of affected rows. For DDL operations, it returns null.
 
 Use **Dataflow Operators** when you need to:
+
 - Query data that will flow into your pipeline processing
 - Insert data from your pipeline processing into a database
- Perform one-off operations (deleting all records, truncating a table, etc.)
@@ -262,8 +238,7 @@ The `CSVREAD` function provided by the H2 database engine allows you to query an
 
 Like all dataflow operators in Nextflow, the operators provided by this plugin are executed asynchronously.
 
-In particular, data inserted using the `sqlInsert` operator is *not* guaranteed to be available to any subsequent queries using the `fromQuery` operator, as it is not possible to make a channel factory operation dependent on some upstream operation.
-
+In particular, data inserted using the `sqlInsert` operator is _not_ guaranteed to be available to any subsequent queries using the `fromQuery` operator, as it is not possible to make a channel factory operation dependent on some upstream operation.
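
Note on the merged return value: with this change a single `sqlExecute` call returns either a row count (DML) or `null` (DDL), so callers that previously chose between `sqlExecute` and `executeUpdate` may want to branch on the result instead. The following is a minimal sketch, not part of the commit. It assumes a database handle named `foo` is defined under `sql.db` and that `sample_table` already exists, as in the README example. `describeResult` is a hypothetical helper introduced only for illustration.

```nextflow
include { sqlExecute } from 'plugin/nf-sqldb'

// Hypothetical helper (not provided by the plugin): turn the return value
// of sqlExecute into a log message.
def describeResult(result) {
    // Per the README text in this commit: DDL statements return null,
    // DML statements return the number of affected rows.
    return result == null ? 'DDL statement executed' : "${result} row(s) affected"
}

// Assumes the 'foo' handle and 'sample_table' from the README example.
def deleted = sqlExecute(
    db: 'foo',
    statement: "DELETE FROM sample_table WHERE value > 25"
)
println describeResult(deleted)
```

Checking for `null` explicitly, rather than relying on Groovy truthiness, keeps a DML statement that affects zero rows distinct from a DDL statement.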