Add Blob I/O #1083
Conversation
interesting.

I have implemented write and seek support. As far as I'm concerned, this pull request is ready for further review. Thanks so far.
I was tempted to implement the following adapter:

```go
type Foo struct {
	io.ReadSeeker
}

func (f *Foo) ReadAt(p []byte, off int64) (int, error) {
	_, err := f.Seek(off, io.SeekStart)
	if err != nil {
		return 0, err
	}
	n, err := f.Read(p)
	return n, err
}
```
The Linux builds with libsqlite3 fail with:
I don't know how to fix this.
Hello!

The problem is like this:

So, vfs=memdb will not work on Ubuntu Focal. You would need to recompile the libsqlite3 deb package with the SQLITE_ENABLE_DESERIALIZE option added to make; without it, memdb is simply not there. And the libsqlite3-dev package does not contain the C source of sqlite3, only precompiled .a and .so files. This behaviour changed in sqlite 3.36.0: SQLITE_ENABLE_DESERIALIZE is now defined by default, and you have to explicitly disable it by defining SQLITE_OMIT_DESERIALIZE during sqlite compilation. So, imho there are two solutions to this problem:

This problem will disappear once Ubuntu releases its next LTS version and GitHub Actions moves ubuntu-latest to it; there should be at least sqlite 3.37.0 there.
Suggested fix (will work with any version of sqlite):

```diff
diff --git a/blob_io_test.go b/blob_io_test.go
index 3e6fb91..b98df8d 100644
--- a/blob_io_test.go
+++ b/blob_io_test.go
@@ -25,12 +25,14 @@ var _ io.Closer = &SQLiteBlob{}
 type driverConnCallback func(*testing.T, *SQLiteConn)
 func blobTestData(t *testing.T, dbname string, rowid int64, blob []byte, c driverConnCallback) {
-	db, err := sql.Open("sqlite3", "file:/"+dbname+"?vfs=memdb")
+	db, err := sql.Open("sqlite3", ":memory:")
 	if err != nil {
 		t.Fatal(err)
 	}
 	defer db.Close()
+	db.SetMaxOpenConns(1)
+
 	// Test data
 	query := `
 	CREATE TABLE data (
```
@rittneje do you agree with the proposed fix?
@joriszwart In this particular case, really, I think we need a better approach for dealing with testing.
Any updates on this? It looks very interesting and enables "streaming" to and from the database! 👍

@rittneje Is there anything I can do to get this approved?
```go
	n = len(b)
}

if n != len(b) {
```
Trying to remember: is there a reason not to do this check after the call to sqlite3_blob_write, and only write what we can instead of nothing? I guess the current implementation is consistent with what sqlite3_blob_write does internally.
@joriszwart following up on this
I don't know what to do. Sorry.
@rittneje can you help me out?
This function implements io.Writer, which states:

> Write writes len(p) bytes from p to the underlying data stream. It returns the number of bytes written from p (0 <= n <= len(p)) and any error encountered that caused the write to stop early. Write must return a non-nil error if it returns n < len(p). Write must not modify the slice data, even temporarily.

The interesting bit of this is:

> Write must return a non-nil error if it returns n < len(p).

So if b is partially written, then n should be the number of bytes actually written, returned together with a non-nil error. But to be honest, the way it's written now should be fine. I mean, partial writes are worse than no writes. At least the caller knows it erred and can retry using the same byte slice.

At least that's what I understand from the io.Writer docs.
The only thing that would be nice to have is maybe an error that denotes not enough space. Something like ENOSPC on Linux and ERROR_HANDLE_DISK_FULL on Windows (or, if sqlite already has an error code for not enough space, use that). But that's just a nit. It would help the caller when checking with errors.Is(err, NotEnoughSpacePlatformSpecificError).

I think sqlite has SQLITE_FULL and SQLITE_IOERR_DISKFULL.
> partial writes are worse than no writes

This is a subjective statement.

> At least the caller knows it erred and can retry using the same byte slice.

If the error is because the blob is full, retrying is never going to work.
You're right on both counts.

@joriszwart I am away from my dev machine or I'd just add those two.

Can you add them?

Anyone else? @graf0 @pokstad @jlelse @lezhnev74 @mitar?
```go
if err != nil {
	t.Fatal(err)
}
defer driverConn.Close()
```
Remove. This is superfluous with the call to conn.Close() above. (And also, you aren't supposed to do anything with the conn outside the Raw callback.)
Remove those 4 lines? Or only the last?
Remove only the last line?
I think just line 72. The driverConn is just conn cast as the underlying type that implements the interface. Calling Close() on conn will close driverConn as well.
@rittneje can I leave the suggested changes to you? That would make this process more efficient.

Pulling in this PR for Blob access.

A shame this was abandoned. Thanks for trying, @joriszwart!
```go
// Write implements the io.Writer interface.
func (s *SQLiteBlob) Write(b []byte) (n int, err error) {
	if len(b) == 0 {
```
I recommend you create a copy of b and use that throughout. Slices may be mutated after being passed to Write(): you might end up writing the first few bytes from the initial data and the rest from the new set when b is overwritten by the caller. In Go only the slice header is passed by value; the underlying array is shared with the caller.

Something like:

```go
tmp := make([]byte, len(b))
copy(tmp, b)
```

Then use tmp instead of b.
Doesn't making a copy defeat the purpose of streaming blobs?
Not really, as you'd only be copying the buffer (which is small - usually around 1024 bytes). But generally speaking you'd only have to copy if you need to guard against caller reuse of the buffer. This is usually done in logging writers that may want to return control back to the caller while the write operation against the logging backend is still ongoing.
I'm not sure that would apply here though (given that we want to wait for the write to happen and only then return), so I just added a comment as a "recommendation". Feel free to ignore me.
I haven't mentioned this, but thanks for reopening this PR!
There is no need to make a copy of the slice, as sqlite3_blob_write is going to copy the data. And the caller is not allowed to modify the slice during the call to Write. (Even if they did, copy itself would be subject to a race condition anyway.)
Incremental Blob I/O
This PR adds support for Incremental Blob I/O.
Notes
If this PR makes sense, I'll enhance it with additional io interfaces.