
Segmentation fault on try_clone misuse #312

@max-celus

Description

After spending some time on a multithreaded program using duckdb, I ran into a segmentation fault. I can work around it by being more careful in my Rust code, but I would like to fix it in the library itself. To do so I need some guidance on how the underlying API works. The failing test can be viewed here. Its code:

    #[test]
    fn test_clone_after_close() {
        // Additional querying test to make sure our connections are still
        // usable. The original crash would happen without doing any queries.
        fn assert_can_query(conn: &Connection) {
            conn.execute("INSERT INTO test (c1) VALUES (1)", []).expect("insert");
        }

        // 1. Open owned connection
        let owned = checked_memory_handle();
        owned
            .execute_batch("create table test (c1 bigint)")
            .expect("create table");
        assert_can_query(&owned);

        // 2. Create a first clone from owned
        let clone1 = owned.try_clone().expect("first clone");
        assert_can_query(&owned);

        // 3. Close owned connection
        drop(owned);
        assert_can_query(&clone1);

        // 4. Create a second clone from the first clone. Crashes on the inner
        //    `duckdb_connect` with a segmentation fault.
        let clone2 = clone1.try_clone().expect("second clone");
        assert_can_query(&clone1);
        assert_can_query(&clone2);

        // 5. Small additional test
        drop(clone1);
        assert_can_query(&clone2);
    }

The problem is that InnerConnection::new(...) gets called while the ffi::duckdb_database has been set to a null pointer by the earlier close; the inner ffi::duckdb_connect then segfaults on that null handle.
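To make the failure mode concrete, here is a minimal sketch of the kind of guard I mean. The function name and error type are mine, and I am assuming ffi refers to the libduckdb-sys bindings used inside the crate; the real code lives in InnerConnection::new.

    use libduckdb_sys as ffi;

    /// Hypothetical guard around the connect step (not the actual duckdb-rs code).
    unsafe fn connect_guarded(db: ffi::duckdb_database) -> Result<ffi::duckdb_connection, String> {
        if db.is_null() {
            // After ffi::duckdb_close the handle is a null pointer; passing it
            // on to ffi::duckdb_connect is what produces the segmentation fault.
            return Err("database handle has already been closed".to_string());
        }
        let mut con: ffi::duckdb_connection = std::ptr::null_mut();
        // ... call ffi::duckdb_connect(db, &mut con) and check the returned
        //     duckdb_state here, as InnerConnection::new does today ...
        Ok(con)
    }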

I would like to fix it, but there is some overhead involved, and I need some guidance on the internals of the C interface to duckdb. We could put the ffi::duckdb_database behind reference counting (an Arc), and only call ffi::duckdb_close(...) when the last reference to that database object is dropped. This would introduce an extra pointer indirection through the Arc and the overhead of atomic operations.
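Roughly what I have in mind, as a sketch only (the struct names and error type are made up here and are not the current duckdb-rs types; ffi is assumed to be the libduckdb-sys bindings):

    use std::sync::Arc;
    use libduckdb_sys as ffi;

    /// The raw database handle lives in exactly one place, and duckdb_close
    /// runs exactly once, when the last Arc pointing at it is dropped.
    struct DatabaseHandle(ffi::duckdb_database);

    impl Drop for DatabaseHandle {
        fn drop(&mut self) {
            unsafe { ffi::duckdb_close(&mut self.0) };
        }
    }

    struct InnerConnection {
        db: Arc<DatabaseHandle>, // the extra indirection + atomic refcount mentioned above
        con: ffi::duckdb_connection,
    }

    impl InnerConnection {
        fn try_clone(&self) -> Result<InnerConnection, String> {
            // Cloning only bumps the refcount, so the database stays open for
            // as long as any connection derived from it is alive.
            let db = Arc::clone(&self.db);
            let con = std::ptr::null_mut();
            // ... open a fresh connection via ffi::duckdb_connect against db.0 here ...
            Ok(InnerConnection { db, con })
        }
    }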

But to make the entire implementation a bit more sound, we would also need to change a couple of calls:

  • Connection::open_from_raw, as it would now have to accept an Arc-wrapped object.
  • InnerConnection::close(...) must be changed to consume self. That causes a problem in Connection::close(...), because Connection::db is a RefCell<InnerConnection> and we would have to consume that inner connection as well. One way to solve this (and to reflect the fact that one cannot share a Connection across threads anyway, only move it between them) is to make the methods on Connection take a &mut self parameter; see the sketch right after this list.
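For illustration, purely as stubs and not as a patch against the real duckdb-rs types, the shape I have in mind is something like:

    struct InnerConnection;

    // With &mut self on the public methods, the RefCell around InnerConnection
    // could go away entirely (stubbed here, not the real field layout).
    struct Connection {
        db: InnerConnection,
    }

    impl InnerConnection {
        fn close(self) -> Result<(), String> {
            // Consuming self guarantees a closed inner connection can never
            // be touched again.
            Ok(())
        }
    }

    impl Connection {
        fn close(self) -> Result<(), String> {
            // close has to take the Connection by value so the inner
            // connection can be consumed as well.
            self.db.close()
        }

        fn execute_batch(&mut self, _sql: &str) -> Result<(), String> {
            // Queries go through &mut self instead of RefCell::borrow_mut.
            Ok(())
        }
    }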

This is where I stopped digging into the code, as I would first like to hear whether you would even want such far-reaching changes to the API.

Perhaps there is another (simpler) solution. Some options, with varying degrees of ugliness:

  • Have an OwnedConnection that actually owns the InnerConnection; only that OwnedConnection supports try_clone. This could be implemented with typestate as Connection<Owned>.
  • Just return an Err from try_clone when we detect that the "owned" InnerConnection has been closed. This would require some thread-safe sharing of a flag, with the implied overhead; a sketch follows right after this list.
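A sketch of that last option, with all names invented for illustration (this is not the current duckdb-rs layout):

    use std::sync::Arc;
    use std::sync::atomic::{AtomicBool, Ordering};

    struct SharedState {
        // Set once the "owned" connection has closed the database.
        closed: AtomicBool,
    }

    struct Connection {
        state: Arc<SharedState>,
        owns_db: bool,
    }

    impl Connection {
        fn try_clone(&self) -> Result<Connection, String> {
            if self.state.closed.load(Ordering::Acquire) {
                // Instead of segfaulting in duckdb_connect, refuse to clone
                // from a connection whose database has already been closed.
                return Err("cannot clone: the owning connection was closed".to_string());
            }
            Ok(Connection { state: Arc::clone(&self.state), owns_db: false })
        }
    }

    impl Drop for Connection {
        fn drop(&mut self) {
            if self.owns_db {
                // This is the implied overhead: an Arc per connection plus an
                // atomic store when the owned connection goes away.
                self.state.closed.store(true, Ordering::Release);
            }
        }
    }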

I'd love to hear from you. For now I'll just keep the "owned" InnerConnection in a very special place in my code :).
