|
| 1 | +<!--- |
| 2 | + Licensed to the Apache Software Foundation (ASF) under one |
| 3 | + or more contributor license agreements. See the NOTICE file |
| 4 | + distributed with this work for additional information |
| 5 | + regarding copyright ownership. The ASF licenses this file |
| 6 | + to you under the Apache License, Version 2.0 (the |
| 7 | + "License"); you may not use this file except in compliance |
| 8 | + with the License. You may obtain a copy of the License at |
| 9 | +
|
| 10 | + http://www.apache.org/licenses/LICENSE-2.0 |
| 11 | +
|
| 12 | + Unless required by applicable law or agreed to in writing, |
| 13 | + software distributed under the License is distributed on an |
| 14 | + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| 15 | + KIND, either express or implied. See the License for the |
| 16 | + specific language governing permissions and limitations |
| 17 | + under the License. |
| 18 | +--> |
| 19 | + |
| 20 | +# DataFusion Ray Release Process |
| 21 | + |
| 22 | +Development happens on the `main` branch, and most of the time, we depend on DataFusion using GitHub dependencies |
| 23 | +rather than using an official release from crates.io. This allows us to pick up new features and bug fixes frequently |
| 24 | +by creating PRs to move to a later revision of the code. It also means we can incrementally make updates that are |
| 25 | +required due to changes in DataFusion rather than having a large amount of work to do when the next official release |
| 26 | +is available. |
| 27 | + |
| 28 | +When there is a new official release of DataFusion, we update the `main` branch to point to that, update the version |
| 29 | +number, and create a new release branch, such as `branch-0.2`. Once this branch is created, we switch the `main` branch |
| 30 | +back to using GitHub dependencies. The release activity (such as generating the changelog) can then happen on the |
| 31 | +release branch without blocking ongoing development in the `main` branch. |
| 32 | + |
| 33 | +We can cherry-pick commits from the `main` branch into `branch-0.2` as needed and then create new patch releases |
| 34 | +from that branch. |
| 35 | + |
| 36 | +## Detailed Guide |
| 37 | + |
| 38 | +### Pre-requisites |
| 39 | + |
| 40 | +Releases can currently only be created by PMC members due to the permissions needed. |
| 41 | + |
| 42 | +You will need a GitHub Personal Access Token. Follow |
| 43 | +[these instructions](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token) |
| 44 | +to generate one if you do not already have one. |
| 45 | + |
| 46 | +You will need a PyPI API token. Create one at https://test.pypi.org/manage/account/#api-tokens, setting the “Scope” to |
| 47 | +“Entire account”. |
| 48 | + |
| 49 | +You will also need access to the [datafusion-ray](https://test.pypi.org/project/datafusion-ray/) project on testpypi. |
| 50 | + |
| 51 | +### Preparing the `main` Branch |
| 52 | + |
| 53 | +Before creating a new release: |
| 54 | + |
| 55 | +- The `main` branch must not have any GitHub dependencies. |
| 56 | +- A PR should be created and merged to update the major version number of the project. |
| 57 | +- A new release branch should be created, such as `branch-0.2`. |
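The first item in the list above can be checked mechanically. The following is a minimal sketch, assuming dependencies are declared in a Cargo manifest and that GitHub dependencies use the usual `git = "..."` key; the function name `has_git_deps` is hypothetical and not part of the repository.

```bash
# Succeeds (exit status 0) if the given Cargo manifest declares a git-based dependency.
# Assumption: git dependencies are written with a `git = "..."` key.
has_git_deps() {
  # Ignore comment lines, then look for a `git = "` key that is not part of a longer word.
  grep -v '^[[:space:]]*#' "$1" |
    grep -Eq '(^|[{,[:space:]])git[[:space:]]*=[[:space:]]*"'
}
```

For example, `has_git_deps Cargo.toml && echo "git dependencies still present"` could be run on `main` before creating the release branch.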
| 58 | + |
| 59 | +### Change Log |
| 60 | + |
| 61 | +We maintain a `CHANGELOG.md` so our users know what has been changed between releases. |
| 62 | + |
| 63 | +The changelog is generated using a Python script: |
| 64 | + |
| 65 | +```bash |
| 66 | +$ GITHUB_TOKEN=<TOKEN> ./dev/release/generate-changelog.py 0.1.0 HEAD 0.2.0 > dev/changelog/0.2.0.md |
| 67 | +``` |
| 68 | + |
| 69 | +This script creates a changelog from GitHub PRs based on the labels associated with them, and also looks for |
| 70 | +titles starting with `feat:`, `fix:`, or `docs:`. The script will produce output similar to: |
| 71 | + |
| 72 | +``` |
| 73 | +Fetching list of commits between 0.1.0 and HEAD |
| 74 | +Fetching pull requests |
| 75 | +Categorizing pull requests |
| 76 | +Generating changelog content |
| 77 | +``` |
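To illustrate the title-prefix rule described above, here is a small sketch of how PR titles could be mapped to changelog sections. It is not the code of `generate-changelog.py`; the function name and the section names (`feature`, `fix`, `docs`, `other`) are assumptions made for illustration, and label-based categorization is left out.

```bash
# Map a PR title to a changelog section based on its conventional prefix.
# Section names are illustrative, not taken from the real script.
categorize_title() {
  case "$1" in
    feat:*) echo "feature" ;;
    fix:*)  echo "fix" ;;
    docs:*) echo "docs" ;;
    *)      echo "other" ;;
  esac
}
```

A title without one of these prefixes falls through to `other` in this sketch, which is a reminder to use the prefixes (or labels) consistently when merging PRs.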
| 78 | + |
| 79 | +### Preparing a Release Candidate |
| 80 | + |
| 81 | +### Tag the Repository |
| 82 | + |
| 83 | +```bash |
| 84 | +git tag 0.2.0-rc1 |
| 85 | +git push apache 0.2.0-rc1 |
| 86 | +``` |
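The tag name combines the version and the release candidate number, the same two values that are later passed to `create-tarball.sh`. A minimal sketch of a helper that builds the tag name, assuming the `<version>-rc<N>` convention shown above (the name `rc_tag` is hypothetical):

```bash
# Build a release-candidate tag name from a version and an RC number.
rc_tag() {
  printf '%s-rc%s\n' "$1" "$2"
}
```

With this helper, the tag could be created as `git tag "$(rc_tag 0.2.0 1)"`.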
| 87 | + |
| 88 | +### Create a source release |
| 89 | + |
| 90 | +```bash |
| 91 | +./dev/release/create-tarball.sh 0.2.0 1 |
| 92 | +``` |
| 93 | + |
| 94 | +This will also create the email template to send to the mailing list. |
| 95 | + |
| 96 | +Create a draft email using this content, but do not send until after completing the next step. |
| 97 | + |
| 98 | +### Publish Python Artifacts to testpypi |
| 99 | + |
| 100 | +This section assumes some familiarity with publishing Python packages to PyPI. For more information, refer to |
| 101 | +[this tutorial](https://packaging.python.org/en/latest/tutorials/packaging-projects/#uploading-the-distribution-archives). |
| 102 | + |
| 103 | +#### Publish Python Wheels to testpypi |
| 104 | + |
| 105 | +Pushing an `rc` tag to the release branch will cause a GitHub Workflow to run that will build the Python wheels. |
| 106 | + |
| 107 | +Go to https://github.com/apache/datafusion-ray/actions and look for an action named "Python Release Build" |
| 108 | +that has run against the pushed tag. |
| 109 | + |
| 110 | +Click on the action and scroll down to the "Artifacts" section at the bottom of the page. Download `dist.zip`. It should |
| 111 | +contain files such as: |
| 112 | + |
| 113 | +```text |
| 114 | +datafusion-ray-0.2.0-cp37-abi3-macosx_10_7_x86_64.whl |
| 115 | +datafusion-ray-0.2.0-cp37-abi3-macosx_11_0_arm64.whl |
| 116 | +datafusion-ray-0.2.0-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl |
| 117 | +datafusion-ray-0.2.0-cp37-abi3-win_amd64.whl |
| 118 | +``` |
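Before uploading, it can be worth confirming that a wheel exists for every platform in the listing above. The following sketch reads file names on standard input and prints any expected platform tag that has no matching wheel. The function name `missing_platforms` is hypothetical, and the platform list simply mirrors the example listing, so it would need adjusting if the build matrix changes.

```bash
# Read wheel file names on stdin and print each expected platform tag
# for which no wheel is present. Prints nothing when all are found.
missing_platforms() {
  files=$(cat)
  for platform in macosx_10_7_x86_64 macosx_11_0_arm64 manylinux2014_x86_64 win_amd64; do
    # -F matches the platform tag literally, including the dots.
    printf '%s\n' "$files" | grep -qF -- "${platform}.whl" || echo "$platform"
  done
}
```

One possible use is `unzip -Z1 dist.zip | missing_platforms`, which lists the archive contents and reports any gaps.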
| 119 | + |
| 120 | +Upload the wheels to testpypi. |
| 121 | + |
| 122 | +```bash |
| 123 | +unzip dist.zip |
| 124 | +python3 -m pip install --upgrade setuptools twine build |
| 125 | +python3 -m twine upload --repository testpypi datafusion-ray-0.2.0-cp37-abi3-*.whl |
| 126 | +``` |
| 127 | + |
| 128 | +When prompted for a username, enter `__token__`. When prompted for a password, enter your testpypi API token. |
| 129 | + |
| 130 | +#### Publish Python Source Distribution to testpypi |
| 131 | + |
| 132 | +Download the source tarball created in the previous step, untar it, and run: |
| 133 | + |
| 134 | +```bash |
| 135 | +maturin sdist |
| 136 | +``` |
| 137 | + |
| 138 | +This will create a file named `dist/datafusion-ray-0.2.0.tar.gz`. Upload this to testpypi: |
| 139 | + |
| 140 | +```bash |
| 141 | +python3 -m twine upload --repository testpypi dist/datafusion-ray-0.2.0.tar.gz |
| 142 | +``` |
| 143 | + |
| 144 | +### Send the Email |
| 145 | + |
| 146 | +Send the email to start the vote. |
| 147 | + |
| 148 | +## Verifying a Release |
| 149 | + |
| 150 | +Running the unit tests against a testpypi release candidate: |
| 151 | + |
| 152 | +```bash |
| 153 | +# clone a fresh repo |
| 154 | +git clone https://github.com/apache/datafusion-ray.git |
| 155 | +cd datafusion-ray |
| 156 | + |
| 157 | +# checkout the release commit |
| 158 | +git fetch --tags |
| 159 | +git checkout 0.2.0-rc1 |
| 160 | + |
| 161 | +# create the env |
| 162 | +python3 -m venv venv |
| 163 | +source venv/bin/activate |
| 164 | + |
| 165 | +# install release candidate |
| 166 | +pip install --extra-index-url https://test.pypi.org/simple/ datafusion-ray==0.2.0 |
| 167 | + |
| 168 | +# only dep needed to run tests is pytest |
| 169 | +pip install pytest |
| 170 | + |
| 171 | +# run the tests |
| 172 | +pytest --import-mode=importlib python/tests |
| 173 | +``` |
| 174 | + |
| 175 | +Try running one of the examples from the top-level README, or write some custom Python code to query some available |
| 176 | +data files. |
| 177 | + |
| 178 | +## Publishing a Release |
| 179 | + |
| 180 | +### Publishing Apache Source Release |
| 181 | + |
| 182 | +Once the vote passes, we can publish the release. |
| 183 | + |
| 184 | +Create the source release tarball: |
| 185 | + |
| 186 | +```bash |
| 187 | +./dev/release/release-tarball.sh 0.2.0 1 |
| 188 | +``` |
| 189 | + |
| 190 | +### Publishing Python Artifacts to PyPI |
| 191 | + |
| 192 | +Go to the Test PyPI page of DataFusion Ray and download |
| 193 | +[all published artifacts](https://test.pypi.org/project/datafusion-ray/#files) into a `dist-release/` directory. Then |
| 194 | +upload them using `twine`: |
| 195 | + |
| 196 | +```bash |
| 197 | +twine upload --repository pypi dist-release/* |
| 198 | +``` |
| 199 | + |
| 200 | +### Push the Release Tag |
| 201 | + |
| 202 | +```bash |
| 203 | +git checkout 0.2.0-rc1 |
| 204 | +git tag 0.2.0 |
| 205 | +git push apache 0.2.0 |
| 206 | +``` |
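The final tag is the release candidate tag with its `-rc<N>` suffix removed, placed on the same commit. A minimal sketch of deriving it, assuming the tag naming used in this guide (the name `release_tag_from_rc` is hypothetical):

```bash
# Strip the trailing "-rc<N>" suffix from a release-candidate tag.
release_tag_from_rc() {
  printf '%s\n' "${1%-rc*}"
}
```

For example, `git tag "$(release_tag_from_rc 0.2.0-rc1)"` after checking out the approved release candidate would create the `0.2.0` tag.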
| 207 | + |
| 208 | +### Add the release to Apache Reporter |
| 209 | + |
| 210 | +Add the release to https://reporter.apache.org/addrelease.html?datafusion with a version name prefixed with `DATAFUSION-RAY`, |
| 211 | +for example `DATAFUSION-RAY-0.2.0`. |
| 212 | + |
| 213 | +The release information is used to generate a template for a board report (see example from Apache Arrow |
| 214 | +[here](https://github.com/apache/arrow/pull/14357)). |
| 215 | + |
| 216 | +### Delete old RCs and Releases |
| 217 | + |
| 218 | +See the ASF documentation on [when to archive](https://www.apache.org/legal/release-policy.html#when-to-archive) |
| 219 | +for more information. |
| 220 | + |
| 221 | +#### Deleting old release candidates from `dev` svn |
| 222 | + |
| 223 | +Release candidates should be deleted once the release is published. |
| 224 | + |
| 225 | +Get a list of DataFusion release candidates: |
| 226 | + |
| 227 | +```bash |
| 228 | +svn ls https://dist.apache.org/repos/dist/dev/datafusion | grep datafusion-ray |
| 229 | +``` |
| 230 | + |
| 231 | +Delete a release candidate: |
| 232 | + |
| 233 | +```bash |
| 234 | +svn delete -m "delete old DataFusion RC" https://dist.apache.org/repos/dist/dev/datafusion/apache-datafusion-ray-0.1.0-rc1/ |
| 235 | +``` |
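When several release candidates have accumulated, the listing can be filtered to show only those that belong to other versions. This is a sketch under the assumption that directories follow the `apache-datafusion-ray-<version>-rc<N>/` naming shown above; the function name `old_release_candidates` is hypothetical.

```bash
# Read `svn ls` output on stdin and print the datafusion-ray RC directories
# that do not belong to the version given as the first argument.
old_release_candidates() {
  grep -- 'datafusion-ray-' | grep -vF -- "datafusion-ray-$1-rc"
}
```

It could be used as `svn ls https://dist.apache.org/repos/dist/dev/datafusion | old_release_candidates 0.2.0`, and each printed directory can then be reviewed before running `svn delete`.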
| 236 | + |
| 237 | +#### Deleting old releases from `release` svn |
| 238 | + |
| 239 | +Only the latest release should be available. Delete old releases after publishing the new release. |
| 240 | + |
| 241 | +Get a list of DataFusion releases: |
| 242 | + |
| 243 | +```bash |
| 244 | +svn ls https://dist.apache.org/repos/dist/release/datafusion | grep datafusion-ray |
| 245 | +``` |
| 246 | + |
| 247 | +Delete a release: |
| 248 | + |
| 249 | +```bash |
| 250 | +svn delete -m "delete old DataFusion release" https://dist.apache.org/repos/dist/release/datafusion/datafusion-ray-0.1.0 |
| 251 | +``` |