Development on this project is mostly driven by volunteer contributors. We welcome new contributors, including not only those who develop new features, but also those who are able to help with documentation and provide detailed bug reports.
Please take note of our code of conduct.
If you want to start contributing, first look at our good first issues: https://github.com/delta-io/delta-rs/contribute
If you want to contribute something more substantial, see our "Projects seeking contributors" section on our roadmap: delta-io#1128
We recognise that AI coding assistants are now a regular part of many developers' workflows and can improve productivity. Thoughtful use of these tools can be beneficial, but AI-generated PRs can sometimes lead to undesirable additional maintainer burden. PRs that appear to be fully generated by AI with little to no engagement from the author may be closed without further review.
Human-generated mistakes tend to be easier to spot and reason about, and code review is intended to be a collaborative learning experience that benefits both submitter and reviewer. When a PR appears to have been generated without much engagement from the submitter, reviewers with access to AI tools could more efficiently generate the code directly, and since the submitter is not likely to learn from the review process, their time is more productively spent researching and reporting on the issue.
We are not opposed to the use of AI tools in generating PRs, but recommend the following:
- Only submit a PR if you are able to debug and own the changes yourself. Review all generated code to understand every detail. Apache DataFusion has a useful explanation of why fully AI-generated PRs without understanding are not helpful.
- Match the style and conventions used in the rest of the codebase, including PR titles and descriptions
- Be upfront about AI usage and summarise what was AI-generated
- If there are parts you don't fully understand, leave comments on your own PR explaining what steps you took to verify correctness
- Watch for AI's tendency to generate overly verbose comments, unnecessary test cases, and incorrect fixes
- Break down large PRs into smaller ones to make review easier
PR authors are also responsible for disclosing any copyrighted materials in submitted contributions. See the Apache Software Foundation's guidance on AI-generated code for further information on licensing considerations.
If you want to claim an issue to work on, you can write the word take as a comment in it and you will be automatically assigned.
- Install Rust.
- Install the uv Python package manager.
- Build the project for development. This will install deltalake into the Python virtual environment managed by uv.

      cd python
      make develop

- Run some Python code, e.g. to run a specific test:

      uv run pytest tests/test_writer.py -s -k "test_with_deltalake_schema"

- Run some Rust code, e.g. run an example:

      cd ../crates/deltalake
      cargo run --example basic_operations --features="datafusion"
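Before working through the setup steps above, it can help to confirm that the required tools are on your PATH. The following is a minimal sketch and not part of the project: the helper name `missing_tools` and the tool list are assumptions made for illustration (`cargo` ships with Rust, `uv` is the uv executable, and `make` is needed for `make develop`).

```python
import shutil

# Assumed executable names for the tools used in the setup steps above.
REQUIRED_TOOLS = ("cargo", "uv", "make")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

An empty result only means the executables exist; it does not check their versions.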
Preview your doc and docstring changes in a web browser:
- Install Rust.
- Install the uv Python package manager.
- Build the project for development. This will install deltalake into the Python virtual environment managed by uv.

      cd python
      make develop

- From the root directory, activate the uv environment and install the Python docs requirements.

      cd ..
      source python/.venv/bin/activate
      pip install -r docs/requirements.txt

- Run mkdocs serve to preview your doc changes at http://127.0.0.1:8000/delta-io/delta-rs/.
Make sure all of the following steps pass locally before submitting a PR:
cargo fmt -- --check
cd python
make check-rust
make check-python
make develop
make unit-test
make build-docs
Pull request titles should be in lower case and conform to conventional commits.
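The title rule above can be checked locally before opening a PR. The sketch below is illustrative only: `check_pr_title` is a hypothetical helper, and the list of accepted types is an assumption (the commonly used Angular-style set), so the project's own CI check may accept a different set.

```python
import re

# Assumed type list; the project's CI may accept a different set.
ASSUMED_TYPES = (
    "build", "chore", "ci", "docs", "feat", "fix",
    "perf", "refactor", "revert", "style", "test",
)

# <type>(<optional scope>)<optional !>: <description>
TITLE_RE = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()\s]+)\))?(?P<breaking>!)?: (?P<description>\S.*)$"
)


def check_pr_title(title):
    """Return a list of problems with a PR title; an empty list means none were found."""
    problems = []
    if title != title.lower():
        problems.append("title is not lower case")
    # Match against the lower-cased title so a casing problem is reported only once.
    match = TITLE_RE.match(title.lower())
    if match is None:
        problems.append("title does not match '<type>(<scope>): <description>'")
    elif match.group("type") not in ASSUMED_TYPES:
        problems.append("unknown type '" + match.group("type") + "'")
    return problems


print(check_pr_title("fix(python): handle empty partition filters"))  # prints []
```

Returning a list of problems rather than a boolean makes it easy to report every issue with a title at once.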
The sign-off is a simple line at the end of the explanation for the patch. Your signature certifies that you wrote the patch or otherwise have the right to pass it on as an open-source patch. The rules are pretty simple: if you can certify the below (from developercertificate.org):
Developer Certificate of Origin
Version 1.1
Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.
Developer's Certificate of Origin 1.1
By making a contribution to this project, I certify that:
(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
Then you just add a line to every git commit message:
Signed-off-by: Jane Smith <jane.smith@email.com>
Use your real name (sorry, no pseudonyms or anonymous contributions.)
If you set your user.name and user.email git configs, you can sign your commit automatically with git commit -s.
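A missing sign-off is easy to overlook. As a rough illustration, the check below tests whether a commit message contains a Signed-off-by line in the format shown above. The name `has_sign_off` and the regular expression are assumptions of this sketch; it is not the check the project's CI actually runs.

```python
import re

# Matches a line such as: Signed-off-by: Jane Smith <jane.smith@email.com>
SIGN_OFF_RE = re.compile(
    r"^Signed-off-by: (?P<name>[^<>]+) <(?P<email>[^<>\s]+@[^<>\s]+)>$"
)


def has_sign_off(commit_message):
    """Return True if any line of the message is a well-formed Signed-off-by line."""
    return any(
        SIGN_OFF_RE.match(line.strip()) is not None
        for line in commit_message.splitlines()
    )


message = "fix: handle empty filters\n\nSigned-off-by: Jane Smith <jane.smith@email.com>"
print(has_sign_off(message))  # prints True
```

The pattern only checks the shape of the line; it cannot tell whether the name is a real one.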
These are just some basic steps and components to get you started; there are many other useful extensions for VSCode.
- For a better Rust development experience, install the Rust extension.
- For debugging Rust code, install CodeLLDB. If you allow it, the extension should create debug launch configurations for the project, which is an easy way to get started. Then set a breakpoint and run the relevant configuration.
- For debugging from Python into Rust, follow this procedure:
  - Add this to .vscode/launch.json:

        {
            "type": "lldb",
            "request": "attach",
            "name": "LLDB Attach to Python",
            "program": "${command:python.interpreterPath}",
            "pid": "${command:pickMyProcess}",
            "args": [],
            "stopOnEntry": false,
            "environment": [],
            "externalConsole": true,
            "MIMode": "lldb",
            "cwd": "${workspaceFolder}"
        }

  - Add a breakpoint() statement somewhere in your Python code (in the main function or at any point you know will be executed when you run it).
  - Add a breakpoint in the Rust code in the VSCode editor where you want to drop into the debugger.
  - Run the relevant Python code in your terminal. Execution should drop into the Python debugger and show the PDB prompt.
  - Run the following in that prompt to get the Python process ID:

        import os; os.getpid()

  - Run LLDB Attach to Python from the Run and Debug panel of VSCode. This will prompt you for a process ID to attach to. Enter the Python process ID obtained earlier (it will also be in the dropdown, but that dropdown lists many process IDs).
  - LLDB may take a couple of seconds to attach to the process.
  - When the debugger is attached to the process (you will notice the debugger panels fill with extra info), enter c + Enter in the PDB prompt in your terminal. Execution should continue until the breakpoint in the Rust code is hit. From this point on, it's a standard debugging process.
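To avoid typing the os.getpid() call at the PDB prompt each time, a small helper can print the process ID before pausing. This is only a sketch: `pause_for_lldb` is a hypothetical name and is not part of deltalake.

```python
import os


def pause_for_lldb(pause=breakpoint):
    """Print this process's ID, then pause so that LLDB can be attached.

    `pause` defaults to the built-in breakpoint(), which opens the PDB prompt.
    """
    pid = os.getpid()
    print("Python process ID for 'LLDB Attach to Python':", pid, flush=True)
    pause()  # once LLDB is attached, enter `c` at the PDB prompt to continue
    return pid
```

Call pause_for_lldb() where you would otherwise place breakpoint(), just before the Python code that calls into Rust. The `pause` parameter exists so the helper can be exercised without stopping, for example pause_for_lldb(pause=lambda: None).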