joriszwart

Incremental Blob I/O

This PR adds support for Incremental Blob I/O.

Notes

  • Modeled after backup using a driver connection.
  • The interfaces io.Reader and io.Closer have been implemented.
  • Related to issue Blob I/O #239

If this PR makes sense, I'll enhance it with additional io interfaces.

@joriszwart joriszwart marked this pull request as ready for review September 1, 2022 14:39
@mattn
Owner

mattn commented Sep 1, 2022

interesting.

@joriszwart joriszwart marked this pull request as draft September 1, 2022 15:10
@joriszwart
Author

joriszwart commented Sep 1, 2022

@joriszwart joriszwart marked this pull request as ready for review September 1, 2022 15:32
@joriszwart
Author

I have implemented write and seek support. As far as I'm concerned, this pull request is ready for further review. Thanks so far.

@joriszwart
Author

joriszwart commented Sep 3, 2022

I was tempted to implement the io.ReaderAt and io.WriterAt interfaces as well, but clients can do that themselves by combining io.Reader (or io.Writer) and io.Seeker:

type Foo struct {
	io.ReadSeeker
}

func (f *Foo) ReadAt(p []byte, off int64) (int, error) {
	_, err := f.Seek(off, io.SeekStart)
	if err != nil {
		return 0, err
	}

	n, err := f.Read(p)
	return n, err
}
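
A minimal sketch of a variant that stays closer to the io.ReaderAt contract, which requires a non-nil error whenever fewer than len(p) bytes are returned. The seekReaderAt type and the readAtString helper are hypothetical names used only for illustration; they are not part of this PR. A strings.Reader stands in for the blob.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// seekReaderAt adapts any io.ReadSeeker to io.ReaderAt.
// Hypothetical helper for illustration; not part of the PR.
type seekReaderAt struct {
	rs io.ReadSeeker
}

// ReadAt seeks to off and then fills p completely, or reports why it could not.
func (s *seekReaderAt) ReadAt(p []byte, off int64) (int, error) {
	if _, err := s.rs.Seek(off, io.SeekStart); err != nil {
		return 0, err
	}
	// io.ReadFull keeps reading until p is full, so a short Read
	// is never returned with a nil error.
	n, err := io.ReadFull(s.rs, p)
	if err == io.ErrUnexpectedEOF {
		// Hitting the end of the data partway through is reported as io.EOF.
		err = io.EOF
	}
	return n, err
}

// readAtString reads up to size bytes of src starting at off through the adapter.
func readAtString(src string, off int64, size int) (string, error) {
	r := &seekReaderAt{rs: strings.NewReader(src)}
	buf := make([]byte, size)
	n, err := r.ReadAt(buf, off)
	return string(buf[:n]), err
}

func main() {
	s, err := readAtString("hello, blob", 7, 4)
	fmt.Println(s, err) // blob <nil>

	s, err = readAtString("hello, blob", 7, 10)
	fmt.Println(s, err) // blob EOF
}
```

Note that this adapter still moves the underlying seek offset and is not safe for concurrent ReadAt calls, both of which io.ReaderAt expects, so it only suits single-goroutine use.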

@joriszwart joriszwart requested a review from rittneje September 3, 2022 18:50
@joriszwart
Author

The Linux builds with libsqlite3 fail with:

=== RUN   TestBlobRead
    blob_io_test.go:46: no such vfs: memdb
--- FAIL: TestBlobRead (0.00s)

I don't know how to fix this.

@graf0

graf0 commented Nov 28, 2022

Hello!

The Linux builds with libsqlite3 fail with:

=== RUN   TestBlobRead
    blob_io_test.go:46: no such vfs: memdb
--- FAIL: TestBlobRead (0.00s)

I don't know how to fix this.

The problem is this:

  • GitHub Actions uses ubuntu-latest, which at the time of writing is the focal release, version 20.04 (stable release)
  • the default version of sqlite3 in this Ubuntu release is 3.31.0
  • it is compiled without SQLITE_ENABLE_DESERIALIZE, which is required to enable the memdb VFS in SQLite versions < 3.36.0

So vfs=memdb will not work on Ubuntu focal. You would need to rebuild the libsqlite3 deb package with the SQLITE_ENABLE_DESERIALIZE option added to make. Without it, memdb is simply not there. And the libsqlite3-dev package does not contain the C source of sqlite3; it ships precompiled .a and .so files.

This behaviour changed in SQLite 3.36.0: SQLITE_ENABLE_DESERIALIZE is defined by default, and you need to explicitly disable it by defining the SQLITE_OMIT_DESERIALIZE option when compiling SQLite.

So, imho, there are two solutions to this problem:

  • use the connection string "file:foobar?mode=memory&cache=shared" - maybe only if the version of sqlite3 is < 3.36.0
  • or use the connection string ":memory:" and set db.SetMaxOpenConns(1) (maybe followed by db.Ping() to create the conn?) to limit all access to the same memory database on all versions of sqlite, including sqlite3

This problem will disappear once Ubuntu releases its next LTS version and GitHub Actions switches ubuntu-latest to that version - it should ship at least sqlite3 3.37.0.
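
A rough sketch of how a test helper might choose between the two connection strings discussed in this comment based on the SQLite library version. The memoryDSN function and the memdbDefaultSince constant are hypothetical names. The version number is assumed to use the SQLITE_VERSION_NUMBER encoding (3036000 for 3.36.0), which is what go-sqlite3's Version() is understood to return as its second value.

```go
package main

import "fmt"

// memdbDefaultSince is SQLITE_VERSION_NUMBER for 3.36.0, the release from
// which the memdb VFS is available unless SQLITE_OMIT_DESERIALIZE is defined.
const memdbDefaultSince = 3036000

// memoryDSN is a hypothetical helper that picks an in-memory connection
// string for the given library version number.
func memoryDSN(name string, libVersionNumber int) string {
	if libVersionNumber >= memdbDefaultSince {
		// Newer libraries: use the memdb VFS.
		return "file:/" + name + "?vfs=memdb"
	}
	// Older libraries may lack memdb: fall back to a shared-cache
	// in-memory database.
	return "file:" + name + "?mode=memory&cache=shared"
}

func main() {
	fmt.Println(memoryDSN("foobar", 3031000)) // e.g. a 3.31.0 system library
	fmt.Println(memoryDSN("foobar", 3037000)) // e.g. a 3.37.0 system library
}
```

A version check is only a heuristic: an older build may have SQLITE_ENABLE_DESERIALIZE turned on, and a newer one may define SQLITE_OMIT_DESERIALIZE.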

@graf0

graf0 commented Nov 28, 2022

suggested fix (will work with any version of sqlite):

diff --git a/blob_io_test.go b/blob_io_test.go
index 3e6fb91..b98df8d 100644
--- a/blob_io_test.go
+++ b/blob_io_test.go
@@ -25,12 +25,14 @@ var _ io.Closer = &SQLiteBlob{}
 type driverConnCallback func(*testing.T, *SQLiteConn)
 
 func blobTestData(t *testing.T, dbname string, rowid int64, blob []byte, c driverConnCallback) {
-       db, err := sql.Open("sqlite3", "file:/"+dbname+"?vfs=memdb")
+       db, err := sql.Open("sqlite3", ":memory:")
        if err != nil {
                t.Fatal(err)
        }
        defer db.Close()
 
+       db.SetMaxOpenConns(1)
+
        // Test data
        query := `
                CREATE TABLE data (

@joriszwart
Author

@rittneje do you agree with the proposed fix?

@rittneje
Collaborator

@joriszwart In this particular case :memory: should work. However, the need to throttle the connection pool to one can cause some issues in general. If you make that change, you should also add comments explaining it is specifically to support older SQLite versions and developers should always use file:/<name>?vfs=memdb instead whenever possible. (Developers may reference these test cases to see how to use the blob i/o feature.)

Really I think we need a better approach for dealing with testing libsqlite3 in general. But that is an issue beyond the scope of your changes.

@jlelse

jlelse commented Feb 3, 2023

Any updates on this? It looks very interesting and enables "streaming" to and from the database! 👍

@joriszwart
Author

@rittneje Is there anything I can do to get this approved?

n = len(b)
}

if n != len(b) {
Collaborator

@rittneje rittneje Mar 27, 2023

Trying to remember - is there a reason not to do this check after the call to sqlite3_blob_write and only write what we can instead of nothing? I guess the current implementation is consistent with what sqlite3_blob_write internally does.

Collaborator

@joriszwart following up on this

Author

I don't know what to do. Sorry.

Author

@rittneje can you help me out?

@gabriel-samfira gabriel-samfira Sep 27, 2025

This function implements io.Writer{}, which states:

https://pkg.go.dev/io#Writer

Write writes len(p) bytes from p to the underlying data stream. It returns the number of bytes written from p (0 <= n <= len(p)) and any error encountered that caused the write to stop early. Write must return a non-nil error if it returns n < len(p). Write must not modify the slice data, even temporarily.

The interesting bit of this is:

Write must return a non-nil error if it returns n < len(p).

So if b is only partially written, Write should return the number of bytes written as n, along with a non-nil error. But to be honest, the way it's written now should be fine. I mean, partial writes are worse than no writes. At least the caller knows it erred and can retry using the same byte slice.

At least that's what I understand from the io.Writer docs.
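
To make the two behaviours under discussion concrete, here is a small sketch using a hypothetical fixedBlob type as an in-memory stand-in for a fixed-size blob. It does not call sqlite3_blob_write; it only illustrates what each strategy returns to the caller, using io.ErrShortWrite as a placeholder error.

```go
package main

import (
	"fmt"
	"io"
)

// fixedBlob is a hypothetical in-memory stand-in for a fixed-size SQLite blob.
type fixedBlob struct {
	data []byte // capacity is fixed, like a blob's size
	off  int    // current write offset
}

// writeAllOrNothing refuses the whole write when b does not fit,
// which is the behaviour the thread attributes to the current code.
func (f *fixedBlob) writeAllOrNothing(b []byte) (int, error) {
	if len(b) > len(f.data)-f.off {
		return 0, io.ErrShortWrite
	}
	n := copy(f.data[f.off:], b)
	f.off += n
	return n, nil
}

// writePartial writes as many bytes as fit and reports a short count
// together with a non-nil error, as io.Writer requires when n < len(b).
func (f *fixedBlob) writePartial(b []byte) (int, error) {
	n := copy(f.data[f.off:], b)
	f.off += n
	if n < len(b) {
		return n, io.ErrShortWrite
	}
	return n, nil
}

func main() {
	a := &fixedBlob{data: make([]byte, 4)}
	n, err := a.writeAllOrNothing([]byte("abcdef"))
	fmt.Println(n, err, a.off) // 0 short write 0

	p := &fixedBlob{data: make([]byte, 4)}
	n, err = p.writePartial([]byte("abcdef"))
	fmt.Println(n, err, string(p.data)) // 4 short write abcd
}
```

Both variants satisfy the rule that n < len(b) comes with a non-nil error; they differ only in whether any bytes land in the blob before the error is returned.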

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing that would be nice to have is maybe an error that denotes not enough space. Something like ENOSPC on linux and ERROR_HANDLE_DISK_FULL on Windows (or if sqlite already has an error code for not enough space, to use that). But that's just a nit. It would help the caller when checking with errors.Is(err, NotEnoughSpacePlatformSpecificError).

I think sqlite has SQLITE_FULL and SQLITE_IOERR_DISKFULL.
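
A sketch of what a checkable "not enough space" error could look like on the Go side, assuming a package-level sentinel. ErrBlobFull, checkSpace and isBlobFull are hypothetical names; the PR does not define them, and mapping to SQLITE_FULL or a platform error is left open here.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrBlobFull is a hypothetical sentinel for writes that exceed the blob size.
var ErrBlobFull = errors.New("blob has insufficient space")

// checkSpace returns an error wrapping ErrBlobFull when want bytes
// do not fit into a blob of the given size at the given offset.
func checkSpace(size, offset, want int) error {
	if want > size-offset {
		return fmt.Errorf("write of %d bytes at offset %d exceeds blob size %d: %w",
			want, offset, size, ErrBlobFull)
	}
	return nil
}

// isBlobFull reports whether err is, or wraps, ErrBlobFull.
func isBlobFull(err error) bool {
	return errors.Is(err, ErrBlobFull)
}

func main() {
	err := checkSpace(8, 6, 4)
	fmt.Println(err)
	fmt.Println(isBlobFull(err)) // true
}
```

Wrapping with %w keeps the detailed message while still letting callers branch on the sentinel, which also tells them that retrying with the same slice cannot succeed.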

Collaborator

partial writes are worse than no writes

This is a subjective statement.

At least the caller knows it erred and can retry using the same byte slice.

If the error is because the blob is full, retrying is never going to work.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right on both counts.

@rittneje
Collaborator

@joriszwart I am away from my dev machine or I'd just add those two len checks myself. They should be pretty simple to add.

@joriszwart
Author

@joriszwart I am away from my dev machine or I'd just add those two len checks myself. They should be pretty simple to add.

Can you add them?

@joriszwart
Author

joriszwart commented Aug 21, 2023

Anyone else? @graf0 @pokstad @jlelse @lezhnev74 @mitar?

@joriszwart joriszwart requested a review from rittneje October 8, 2023 18:26
if err != nil {
t.Fatal(err)
}
defer driverConn.Close()
Collaborator

@rittneje rittneje Oct 22, 2023

Remove. This is superfluous with the call to conn.Close() above. (And also you aren't supposed to do anything with the conn outside the Raw callback.)

Author

Remove those 4 lines? Or only the last?

Author

Remove only the last line?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think just line 72. The driverConn is just conn cast as the underlying type that implements the interface (conn). Calling Close() on conn will close driverConn as well.
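
A sketch of the pattern the reviewer is asking for: obtain a *sql.Conn, do all driver-level work inside the Raw callback, and close only the *sql.Conn. The stubDriver and stubConn types are hypothetical stand-ins so that the example runs without cgo; in the PR's tests the value passed to the callback would be the SQLite driver connection instead.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
)

// stubConn is a hypothetical driver connection used only for this sketch.
type stubConn struct{ uses int }

func (c *stubConn) Prepare(string) (driver.Stmt, error) {
	return nil, errors.New("stub: not implemented")
}
func (c *stubConn) Close() error { return nil }
func (c *stubConn) Begin() (driver.Tx, error) {
	return nil, errors.New("stub: not implemented")
}

// stubDriver hands out stubConn values.
type stubDriver struct{}

func (stubDriver) Open(string) (driver.Conn, error) { return &stubConn{}, nil }

func init() { sql.Register("stub-raw-example", stubDriver{}) }

// useDriverConn touches the driver connection only inside Raw and
// returns how many times the callback used it.
func useDriverConn() (int, error) {
	db, err := sql.Open("stub-raw-example", "")
	if err != nil {
		return 0, err
	}
	defer db.Close()

	conn, err := db.Conn(context.Background())
	if err != nil {
		return 0, err
	}
	// The only Close the caller issues for this connection;
	// there is no separate Close on the driver connection.
	defer conn.Close()

	uses := 0
	err = conn.Raw(func(dc interface{}) error {
		sc, ok := dc.(*stubConn)
		if !ok {
			return fmt.Errorf("unexpected driver connection type %T", dc)
		}
		sc.uses++ // all work on the driver connection stays in the callback
		uses = sc.uses
		return nil
	})
	return uses, err
}

func main() {
	n, err := useDriverConn()
	fmt.Println(n, err) // 1 <nil>
}
```

Keeping the driver connection confined to the callback means database/sql remains responsible for its lifetime.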

@joriszwart
Author

@rittneje can I leave the suggested changes to you? That would make this process more efficient.

@joriszwart joriszwart requested a review from rittneje November 13, 2023 09:01
jrossi added a commit to jrossi/go-sqlite3 that referenced this pull request Mar 27, 2024
Pulling in this RP for Blob access
@joriszwart joriszwart closed this May 24, 2024
@gabriel-samfira

A shame this was abandoned. Thanks for trying @joriszwart !

@joriszwart
Author

@mattn @rittneje any interest in this?

@joriszwart joriszwart reopened this Sep 26, 2025
@gabriel-samfira

@mattn @rittneje any interest in this?

for what it's worth, I would love to see this happen.


// Write implements the io.Writer interface.
func (s *SQLiteBlob) Write(b []byte) (n int, err error) {
if len(b) == 0 {

I recommend you create a copy of b and use that throughout. Slices may be mutated after being passed to Write(). You might end up writing the first few bytes from the initial data and the rest from the new data if the caller overwrites b. In Go, only the slice header is passed by value. The underlying array is shared with the caller.

Something like:

tmp := make([]byte, len(b))
copy(tmp, b)

Then use tmp instead of b.

Author

Doesn't making a copy defeat the purpose of streaming blobs?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, as you'd only be copying the buffer (which is small - usually around 1024 bytes). But generally speaking you'd only have to copy if you need to guard against caller reuse of the buffer. This is usually done in logging writers that may want to return control back to the caller while the write operation against the logging backend is still ongoing.

I'm not sure that would apply here though (given that we want to wait for the write to happen and only then return), so I just added a comment as a "recommendation". Feel free to ignore me.

I haven't mentioned this, but thanks for reopening this PR!

Collaborator

There is no need to make a copy of the slice, as sqlite3_blob_write is going to copy the data. And the caller is not allowed to modify the slice during the call to Write. (Even if they did, copy itself would be subject to a race condition anyway.)

7 participants