Commit 64dcbd3

Add online documentation about using custom table providers
1 parent 15458c7 commit 64dcbd3

2 files changed (+57, -0 lines)


docs/source/user-guide/io/index.rst

Lines changed: 1 addition & 0 deletions

@@ -26,3 +26,4 @@ IO
    csv
    json
    parquet
+   table_provider

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
.. Licensed to the Apache Software Foundation (ASF) under one
.. or more contributor license agreements. See the NOTICE file
.. distributed with this work for additional information
.. regarding copyright ownership. The ASF licenses this file
.. to you under the Apache License, Version 2.0 (the
.. "License"); you may not use this file except in compliance
.. with the License. You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
.. software distributed under the License is distributed on an
.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
.. KIND, either express or implied. See the License for the
.. specific language governing permissions and limitations
.. under the License.

Custom Table Provider
=====================

To integrate a custom data source with DataFusion, implement the
`TableProvider <https://datafusion.apache.org/library-user-guide/custom-table-providers.html>`_
trait in Rust and expose it to Python. This requires DataFusion 43.0.0 or later.
The Python-facing class must expose an `FFI_TableProvider <https://crates.io/crates/datafusion-ffi>`_
wrapped in a `PyCapsule <https://pyo3.rs/main/doc/pyo3/types/struct.pycapsule>`_.

A complete example can be found in the `examples folder <https://github.com/apache/datafusion-python/tree/main/examples>`_.

.. code-block:: rust

    use std::ffi::CString;
    use std::sync::Arc;

    use datafusion_ffi::table_provider::FFI_TableProvider;
    use pyo3::prelude::*;
    use pyo3::types::PyCapsule;

    #[pymethods]
    impl MyTableProvider {
        // Called from Python to obtain the provider wrapped in a PyCapsule.
        fn __datafusion_table_provider__<'py>(
            &self,
            py: Python<'py>,
        ) -> PyResult<Bound<'py, PyCapsule>> {
            // The capsule name identifies the payload to the consumer.
            let name = CString::new("datafusion_table_provider").unwrap();

            // `MyTableProvider` must implement `TableProvider` and `Clone`.
            // Passing `false` declares that filter pushdown is not supported.
            let provider = FFI_TableProvider::new(Arc::new(self.clone()), false);

            PyCapsule::new_bound(py, provider, Some(name))
        }
    }

Once this library is built and importable from Python, register the table provider
with a ``SessionContext``.

.. code-block:: python

    from datafusion import SessionContext

    # MyTableProvider comes from your compiled extension module.
    ctx = SessionContext()
    provider = MyTableProvider()
    ctx.register_table_provider("my_table", provider)

    ctx.table("my_table").show()
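
As a rough illustration of the handshake behind registration, the sketch below mimics it in plain Python using only the standard library. It looks for the ``__datafusion_table_provider__`` method and checks that the returned capsule carries the name ``datafusion_table_provider``. ``FakeTableProvider`` and ``looks_like_table_provider`` are hypothetical names invented for this example, the capsule payload is a dummy integer rather than a real ``FFI_TableProvider``, and the check itself is an assumption about the protocol, not DataFusion's actual implementation.

```python
import ctypes

# Capsule name taken from the Rust snippet in this document.
CAPSULE_NAME = b"datafusion_table_provider"

# CPython C-API functions for creating and validating capsules.
_capsule_new = ctypes.pythonapi.PyCapsule_New
_capsule_new.restype = ctypes.py_object
_capsule_new.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]

_capsule_is_valid = ctypes.pythonapi.PyCapsule_IsValid
_capsule_is_valid.restype = ctypes.c_int
_capsule_is_valid.argtypes = [ctypes.py_object, ctypes.c_char_p]


class FakeTableProvider:
    """Hypothetical stand-in for a Rust-backed provider class."""

    def __init__(self):
        # Dummy payload; a real provider would hold an FFI_TableProvider.
        # It is stored on the instance so the pointer stays valid.
        self._payload = ctypes.c_int(42)

    def __datafusion_table_provider__(self):
        # Wrap the payload address in a named capsule (no destructor).
        return _capsule_new(ctypes.addressof(self._payload), CAPSULE_NAME, None)


def looks_like_table_provider(obj) -> bool:
    """Assumed check: the method exists and yields a correctly named capsule."""
    export = getattr(obj, "__datafusion_table_provider__", None)
    if export is None:
        return False
    capsule = export()
    # PyCapsule_IsValid returns 0 for non-capsules and for mismatched names.
    return bool(_capsule_is_valid(capsule, CAPSULE_NAME))
```

The capsule name acts as a simple type tag: a consumer can refuse any capsule whose name differs before it touches the pointer inside.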
