Merged
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Changed

- Change SQLite driver operations over to use bulk inserts where possible now that sqlc has better support for `json_each`. [PR #1276](https://github.com/riverqueue/river/pull/1276)

### Fixed

- Fix `JobCancel` having no effect on running jobs when using a poll-only driver (e.g. `riverdatabasesql`). The `controlActionCancel` event was silently dropped in `fetchAndRunLoop`'s `queueControlCh` handler instead of being forwarded to `maybeCancelJob`. Note: this fix only works within a single process; cross-process cancels in poll-only setups must wait for the next poll cycle. [PR #1245](https://github.com/riverqueue/river/pull/1245).
148 changes: 138 additions & 10 deletions riverdriver/riversqlite/internal/dbsqlc/river_job.sql
@@ -189,15 +189,6 @@ WHERE state = 'running'
ORDER BY id
LIMIT @max;

-- Insert a job.
--
-- This is supposed to be a batch insert, but various limitations of the
-- combined SQLite + sqlc has left me unable to find a way of injecting many
-- arguments en masse (like how we slightly abuse arrays to pull it off for the
-- Postgres drivers), so we loop over many insert operations instead, with the
-- expectation that this may be fixable in the future. Because SQLite targets
-- will often be local and therefore with a very minimal round trip compared to
-- a network, looping over operations is probably okay performance-wise.
-- name: JobInsertFast :one
INSERT INTO /* TEMPLATE: schema */river_job(
id,
@@ -246,6 +237,56 @@ ON CONFLICT (unique_key)
DO UPDATE SET kind = EXCLUDED.kind
RETURNING *;

-- name: JobInsertFastMany :many
INSERT INTO /* TEMPLATE: schema */river_job(
id,
args,
created_at,
kind,
max_attempts,
metadata,
priority,
queue,
scheduled_at,
state,
tags,
unique_key,
unique_states
)
SELECT
cast(json_extract(value, '$.id') AS integer),
json(cast(json_extract(value, '$.args') AS blob)),
coalesce(cast(json_extract(value, '$.created_at') AS text), datetime('now', 'subsec')),
cast(json_extract(value, '$.kind') AS text),
cast(json_extract(value, '$.max_attempts') AS integer),
json(cast(json_extract(value, '$.metadata') AS blob)),
cast(json_extract(value, '$.priority') AS integer),
cast(json_extract(value, '$.queue') AS text),
coalesce(cast(json_extract(value, '$.scheduled_at') AS text), datetime('now', 'subsec')),
cast(json_extract(value, '$.state') AS text),
json(cast(json_extract(value, '$.tags') AS blob)),
CASE WHEN length(cast(json_extract(value, '$.unique_key') AS text)) = 0 THEN NULL ELSE unhex(cast(json_extract(value, '$.unique_key') AS text)) END,

P2: Avoid unhex in SQLite bulk inserts

When the SQLite driver is used with libSQL or older SQLite builds, the new bulk insert paths fail before inserting any job with a non-empty unique key, because they now depend on the SQL `unhex()` scalar function to reconstruct the blob. The package still advertises libSQL compatibility, and `unhex()` is not available in all supported SQLite-compatible engines, so `JobInsertFastMany`, `JobInsertFastManyNoReturning`, and `JobInsertFullMany` can start returning `no such function: unhex` for otherwise valid unique jobs. Keep passing the key as a blob or add a compatibility-safe decoder.

Contributor Author

Hmm, I think Codex got a little confused on this one. The wording is kinda odd: it seems to suggest that unhex isn't supported on libSQL, but then instead of committing to that (because it's not true), it says it's not available "in all supported SQLite-compatible engines".

I looked into this a bit and unhex has been around since SQLite 3.41.0, released February 21, 2023, which gives us a solid three-year buffer. It's been around in libSQL about as long.

We could probably get rid of unhex, but not sure this is a good enough reason to. Just gonna leave it.
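
If a guard against older engines were ever wanted, one option is to compare the engine's reported version against 3.41.0 before taking the bulk path. The following is a minimal Go sketch under that assumption. `supportsUnhex` is a hypothetical helper that is not part of River, and the version string is assumed to come from something like `SELECT sqlite_version()`. Whether libSQL reports a comparable version string is not checked here.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// supportsUnhex reports whether a SQLite version string such as "3.45.1" is
// at least 3.41.0, the release cited in the thread as adding unhex().
// Hypothetical helper for illustration only; not part of River.
func supportsUnhex(version string) (bool, error) {
	parts := strings.Split(version, ".")
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected SQLite version %q", version)
	}

	// Parse up to three numeric components; a missing patch level counts as 0.
	nums := make([]int, 3)
	for i := 0; i < len(parts) && i < 3; i++ {
		n, err := strconv.Atoi(parts[i])
		if err != nil {
			return false, fmt.Errorf("unexpected SQLite version %q: %w", version, err)
		}
		nums[i] = n
	}

	// Compare component by component against 3.41.0.
	minimum := []int{3, 41, 0}
	for i := range minimum {
		if nums[i] != minimum[i] {
			return nums[i] > minimum[i], nil
		}
	}
	return true, nil
}

func main() {
	for _, v := range []string{"3.40.1", "3.41.0", "3.45.1"} {
		ok, err := supportsUnhex(v)
		fmt.Println(v, ok, err)
	}
}
```

A caller could use the result to fall back to looping over single-row inserts, though the thread above concludes that the extra path is probably not worth it.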

nullif(cast(json_extract(value, '$.unique_states') AS integer), 0)
FROM json_each(cast(@jobs AS blob))
WHERE true
ON CONFLICT (unique_key)
WHERE unique_key IS NOT NULL
AND unique_states IS NOT NULL
AND CASE state
WHEN 'available' THEN unique_states & (1 << 0)
WHEN 'cancelled' THEN unique_states & (1 << 1)
WHEN 'completed' THEN unique_states & (1 << 2)
WHEN 'discarded' THEN unique_states & (1 << 3)
WHEN 'pending' THEN unique_states & (1 << 4)
WHEN 'retryable' THEN unique_states & (1 << 5)
WHEN 'running' THEN unique_states & (1 << 6)
WHEN 'scheduled' THEN unique_states & (1 << 7)
ELSE 0
END >= 1
-- Something needs to be updated for a row to be returned on a conflict.
DO UPDATE SET kind = EXCLUDED.kind
RETURNING *;
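
The `CASE state` expression in the conflict target treats `unique_states` as a bitmask with one bit per job state. A minimal Go sketch of that mapping follows. The bit positions are copied from the SQL above, while `uniqueStateBits`, `uniqueStatesMask`, and `conflictApplies` are hypothetical names for illustration, not River's API.

```go
package main

import "fmt"

// uniqueStateBits mirrors the bit positions used by the CASE expression in
// the ON CONFLICT clause above.
var uniqueStateBits = map[string]uint8{
	"available": 1 << 0,
	"cancelled": 1 << 1,
	"completed": 1 << 2,
	"discarded": 1 << 3,
	"pending":   1 << 4,
	"retryable": 1 << 5,
	"running":   1 << 6,
	"scheduled": 1 << 7,
}

// uniqueStatesMask ORs together the bits for the given states.
// Hypothetical helper; returns an error for a state not listed in the SQL.
func uniqueStatesMask(states ...string) (uint8, error) {
	var mask uint8
	for _, s := range states {
		bit, ok := uniqueStateBits[s]
		if !ok {
			return 0, fmt.Errorf("unknown job state %q", s)
		}
		mask |= bit
	}
	return mask, nil
}

// conflictApplies reproduces the "CASE state ... ELSE 0 END >= 1" check for a
// row in the given state. An unknown state yields 0, matching the ELSE branch.
func conflictApplies(mask uint8, state string) bool {
	return mask&uniqueStateBits[state] >= 1
}

func main() {
	mask, err := uniqueStatesMask("available", "pending", "running", "scheduled")
	if err != nil {
		panic(err)
	}
	fmt.Printf("mask=%d (%08b)\n", mask, mask)
	fmt.Println("running:", conflictApplies(mask, "running"))
	fmt.Println("completed:", conflictApplies(mask, "completed"))
}
```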

-- name: JobInsertFastNoReturning :execrows
INSERT INTO /* TEMPLATE: schema */river_job(
args,
@@ -290,6 +331,52 @@ ON CONFLICT (unique_key)
END >= 1
DO NOTHING;

-- name: JobInsertFastManyNoReturning :execrows
INSERT INTO /* TEMPLATE: schema */river_job(
args,
created_at,
kind,
max_attempts,
metadata,
priority,
queue,
scheduled_at,
state,
tags,
unique_key,
unique_states
)
SELECT
json(cast(json_extract(value, '$.args') AS blob)),
coalesce(cast(json_extract(value, '$.created_at') AS text), datetime('now', 'subsec')),
cast(json_extract(value, '$.kind') AS text),
cast(json_extract(value, '$.max_attempts') AS integer),
json(cast(json_extract(value, '$.metadata') AS blob)),
cast(json_extract(value, '$.priority') AS integer),
cast(json_extract(value, '$.queue') AS text),
coalesce(cast(json_extract(value, '$.scheduled_at') AS text), datetime('now', 'subsec')),
cast(json_extract(value, '$.state') AS text),
json(cast(json_extract(value, '$.tags') AS blob)),
CASE WHEN length(cast(json_extract(value, '$.unique_key') AS text)) = 0 THEN NULL ELSE unhex(cast(json_extract(value, '$.unique_key') AS text)) END,
nullif(cast(json_extract(value, '$.unique_states') AS integer), 0)
FROM json_each(cast(@jobs AS blob))
WHERE true
ON CONFLICT (unique_key)
WHERE unique_key IS NOT NULL
AND unique_states IS NOT NULL
AND CASE state
WHEN 'available' THEN unique_states & (1 << 0)
WHEN 'cancelled' THEN unique_states & (1 << 1)
WHEN 'completed' THEN unique_states & (1 << 2)
WHEN 'discarded' THEN unique_states & (1 << 3)
WHEN 'pending' THEN unique_states & (1 << 4)
WHEN 'retryable' THEN unique_states & (1 << 5)
WHEN 'running' THEN unique_states & (1 << 6)
WHEN 'scheduled' THEN unique_states & (1 << 7)
ELSE 0
END >= 1
DO NOTHING;

-- name: JobInsertFull :one
INSERT INTO /* TEMPLATE: schema */river_job(
args,
@@ -329,6 +416,47 @@ INSERT INTO /* TEMPLATE: schema */river_job(
@unique_states
) RETURNING *;

-- name: JobInsertFullMany :many
INSERT INTO /* TEMPLATE: schema */river_job(
args,
attempt,
attempted_at,
attempted_by,
created_at,
errors,
finalized_at,
kind,
max_attempts,
metadata,
priority,
queue,
scheduled_at,
state,
tags,
unique_key,
unique_states
)
SELECT
json(cast(json_extract(value, '$.args') AS blob)),
cast(json_extract(value, '$.attempt') AS integer),
cast(json_extract(value, '$.attempted_at') AS text),
CASE WHEN json_type(value, '$.attempted_by') IS NULL THEN NULL ELSE json(cast(json_extract(value, '$.attempted_by') AS blob)) END,
coalesce(cast(json_extract(value, '$.created_at') AS text), datetime('now', 'subsec')),
CASE WHEN json_type(value, '$.errors') IS NULL THEN NULL ELSE json(cast(json_extract(value, '$.errors') AS blob)) END,
cast(json_extract(value, '$.finalized_at') AS text),
cast(json_extract(value, '$.kind') AS text),
cast(json_extract(value, '$.max_attempts') AS integer),
json(cast(json_extract(value, '$.metadata') AS blob)),
cast(json_extract(value, '$.priority') AS integer),
cast(json_extract(value, '$.queue') AS text),
coalesce(cast(json_extract(value, '$.scheduled_at') AS text), datetime('now', 'subsec')),
cast(json_extract(value, '$.state') AS text),
json(cast(json_extract(value, '$.tags') AS blob)),
CASE WHEN length(cast(json_extract(value, '$.unique_key') AS text)) = 0 THEN NULL ELSE unhex(cast(json_extract(value, '$.unique_key') AS text)) END,
nullif(cast(json_extract(value, '$.unique_states') AS integer), 0)
FROM json_each(cast(@jobs AS blob))
RETURNING *;

-- name: JobKindList :many
SELECT DISTINCT kind
FROM /* TEMPLATE: schema */river_job
@@ -513,4 +641,4 @@ SET
metadata = CASE WHEN cast(@metadata_do_update AS boolean) THEN json(cast(@metadata AS blob)) ELSE metadata END,
state = CASE WHEN cast(@state_do_update AS boolean) THEN @state ELSE state END
WHERE id = @id
RETURNING *;
RETURNING *;