README.md: 85 additions & 0 deletions
@@ -6,6 +6,91 @@ Based on [`calamine`](https://github.com/tafia/calamine) and [Apache Arrow](http
Docs available [here](https://fastexcel.toucantoco.dev/).
## Installation

```bash
# Lightweight installation (no pyarrow dependency)
pip install fastexcel

# With Polars support only (no pyarrow needed)
pip install "fastexcel[polars]"

# With pandas support (includes pyarrow)
pip install "fastexcel[pandas]"

# With pyarrow support
pip install "fastexcel[pyarrow]"

# With all integrations
pip install "fastexcel[pandas,polars]"
```
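Since each install option pulls in a different set of optional packages, it can help to check which ones are importable in the current environment. The following is a minimal sketch using only the standard library; the helper name `available_integrations` is hypothetical and not part of fastexcel.

```python
from importlib.util import find_spec

# Optional integrations named in the installation options above.
OPTIONAL_PACKAGES = ("polars", "pandas", "pyarrow")


def available_integrations(packages=OPTIONAL_PACKAGES):
    """Map each package name to True if it can be imported in this environment.

    Hypothetical helper for illustration; it only looks the package up and
    does not import it.
    """
    return {name: find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    for name, installed in available_integrations().items():
        print(f"{name}: {'installed' if installed else 'missing'}")
```

If `pyarrow` is reported as missing, the Polars route described in the next section should still be usable, since it does not depend on pyarrow.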
## Quick Start

### Modern usage (recommended)

FastExcel supports the [Arrow PyCapsule Interface](https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html) for zero-copy data exchange with Arrow-compatible libraries such as Polars, so pyarrow is not required as a dependency.
```python
import fastexcel
import polars as pl

# Load an Excel file
reader = fastexcel.read_excel("data.xlsx")
sheet = reader.load_sheet(0)  # Load the first sheet

# Use with Polars (zero-copy, no pyarrow needed)
df = pl.DataFrame(sheet)  # Direct PyCapsule interface
print(df)

# Or use the to_polars() method (also via PyCapsule)
df = sheet.to_polars()
print(df)

# Or access the raw Arrow data via the PyCapsule interface
schema = sheet.__arrow_c_schema__()
array_data = sheet.__arrow_c_array__()
```
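The PyCapsule interface is a duck-typed protocol: a consumer looks for specific dunder methods on the object it receives rather than for a pyarrow type. As a rough sketch of that idea, the helper below reports which of the protocol's methods an object exposes. The names `arrow_capsule_methods` and `FakeSheet` are hypothetical and exist only for illustration; `FakeSheet` stands in for a loaded sheet and produces no real capsules.

```python
# Dunder methods defined by the Arrow PyCapsule Interface specification.
ARROW_CAPSULE_METHODS = (
    "__arrow_c_schema__",
    "__arrow_c_array__",
    "__arrow_c_stream__",
)


def arrow_capsule_methods(obj):
    """Return the PyCapsule protocol methods that `obj` exposes as callables."""
    return [
        name
        for name in ARROW_CAPSULE_METHODS
        if callable(getattr(obj, name, None))
    ]


class FakeSheet:
    """Hypothetical stand-in for a loaded sheet (illustration only)."""

    def __arrow_c_schema__(self):
        raise NotImplementedError("illustration only")

    def __arrow_c_array__(self, requested_schema=None):
        raise NotImplementedError("illustration only")


print(arrow_capsule_methods(FakeSheet()))
```

Calling the same helper on a real `sheet` object would show which of these methods it provides in the installed version.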
### Traditional usage (with pandas/pyarrow)

```python
import fastexcel

reader = fastexcel.read_excel("data.xlsx")
sheet = reader.load_sheet(0)

# Convert to pandas (requires `pandas` extra)
df = sheet.to_pandas()

# Or get pyarrow RecordBatch directly
record_batch = sheet.to_arrow()
```
### Working with tables

```python
import fastexcel
import polars as pl

reader = fastexcel.read_excel("data.xlsx")

# List available tables
tables = reader.table_names()
print(f"Available tables: {tables}")

# Load a specific table
table = reader.load_table("MyTable")
df = pl.DataFrame(table)  # Zero-copy via PyCapsule, no pyarrow needed
```
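When the table name is not known in advance, one option is to check `table_names()` before calling `load_table()`. The sketch below assumes only those two methods, as used above; `load_table_if_present` and `StubReader` are hypothetical names, and the stub exists so the example runs without a workbook.

```python
def load_table_if_present(reader, name):
    """Load table `name` from `reader`, or return None if it is not listed."""
    if name not in reader.table_names():
        return None
    return reader.load_table(name)


class StubReader:
    """Hypothetical stand-in for the reader above (illustration only)."""

    def __init__(self, tables):
        self._tables = dict(tables)

    def table_names(self):
        return list(self._tables)

    def load_table(self, name):
        return self._tables[name]


stub = StubReader({"MyTable": "table-data"})
print(load_table_if_present(stub, "MyTable"))
print(load_table_if_present(stub, "Missing"))
```

With a real reader, the returned table could then be passed to `pl.DataFrame(...)` as in the example above.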
## Key Features

- **Zero-copy data exchange** via [Arrow PyCapsule Interface](https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html)
- **Flexible dependencies** - use with Polars (no pyarrow needed) or pandas (includes pyarrow)
- **Seamless Polars integration** - `pl.DataFrame(sheet)` and `sheet.to_polars()` work without pyarrow via the PyCapsule interface
- **High performance** - written in Rust with [calamine](https://github.com/tafia/calamine) and [Apache Arrow](https://arrow.apache.org/)
- **Memory efficient** - lazy loading and optional eager evaluation
- **Type safety** - automatic type inference with manual override options