| 1 | +.. _ai_tool_policy: |
| 2 | + |
| 3 | +================================================================================ |
| 4 | +AI/LLM tool policy |
| 5 | +================================================================================ |
| 6 | + |
| 7 | +Policy |
| 8 | +------ |
| 9 | + |
| 10 | +GDAL's policy is that contributors can use whatever tools they would like to |
| 11 | +craft their contributions, but there must be a **human in the loop**. |
| 12 | +Contributors must read and review all AI (Artificial Intelligence) / Large Language Model (LLM)-generated code |
| 13 | +or text before they ask other project members to review it. |
| 14 | +The contributor is always the author and is fully accountable for their contributions. |
| 15 | +Contributors should be sufficiently confident that the contribution is of high enough |
| 16 | +quality that asking for a review is a good use of scarce maintainer time, and they |
| 17 | +should be **able to answer questions about their work** during review. |
| 18 | + |
| 19 | +We expect that new contributors will be less confident in their contributions, |
| 20 | +and our guidance to them is to **start with small contributions** that they can |
| 21 | +fully understand to build confidence. We aspire to be a welcoming community |
| 22 | +that helps new contributors grow their expertise, but learning involves taking |
| 23 | +small steps, getting feedback, and iterating. Passing maintainer feedback to an |
| 24 | +LLM doesn't help anyone grow, and does not sustain our community. |
| 25 | + |
| 26 | +Contributors **must be transparent and label contributions that |
| 27 | +contain substantial amounts of tool-generated content**. |
| 28 | +The pull request and issue templates contain a checkbox for that purpose. |
| 29 | +Failing to do so, or lying when asked by a reviewer, will be considered a violation of this policy. |
| 30 | +Our policy on labeling is intended to facilitate reviews, and not to track which parts of |
| 31 | +GDAL are generated. Contributors should note tool usage in their pull request |
| 32 | +description, commit message, or wherever authorship is normally indicated for |
| 33 | +the work. For instance, use a commit message trailer like |
| 34 | +``Assisted-by: <name of code assistant>``. This transparency helps the community develop best practices |
| 35 | +and understand the role of these new tools. |
| 36 | + |
| 37 | +This policy includes, but is not limited to, the following kinds of |
| 38 | +contributions: |
| 39 | + |
| 40 | +- Code, usually in the form of a pull request |
| 41 | +- RFCs or design proposals |
| 42 | +- Issue or security vulnerability reporting |
| 43 | +- Comments and feedback on pull requests |
| 44 | + |
| 45 | +Details |
| 46 | +------- |
| 47 | + |
| 48 | +To ensure sufficient self review and understanding of the work, it is strongly |
| 49 | +recommended that contributors write PR descriptions themselves (if needed, |
| 50 | +using tools for translation or copy-editing), in particular to avoid the overly verbose |
| 51 | +descriptions that LLMs are prone to generate. The description should explain |
| 52 | +the motivation, implementation approach, expected impact, and any open |
| 53 | +questions or uncertainties to the same extent as a contribution made without |
| 54 | +tool assistance. |
| 55 | + |
| 56 | +An important implication of this policy is that it bans agents that take action |
| 57 | +in our digital spaces without human approval, such as the GitHub ``@claude`` |
| 58 | +agent. Similarly, automated review tools that |
| 59 | +publish comments without human review are not allowed. However, an opt-in |
| 60 | +review tool that **keeps a human in the loop** is acceptable under this policy. |
| 61 | +As another example, using an LLM to generate documentation, which a contributor |
| 62 | +manually reviews for correctness and relevance, edits, and then posts as a PR, |
| 63 | +is an approved use of tools under this policy. |
| 64 | + |
| 65 | +Extractive Contributions |
| 66 | +------------------------ |
| 67 | + |
| 68 | +The reason for our "human-in-the-loop" contribution policy is that processing |
| 69 | +patches, PRs, RFCs, comments, issues, and security alerts sent to GDAL is not free: |
| 70 | +it takes a lot of maintainer time and energy to review those contributions! Sending the |
| 71 | +unreviewed output of an LLM to open source project maintainers *extracts* work |
| 72 | +from them in the form of design and code review, so we call this kind of |
| 73 | +contribution an "extractive contribution". |
| 74 | + |
| 75 | +Our **golden rule** is that a contribution should be worth more to the project |
| 76 | +than the time it takes to review it. These ideas are captured by this quote |
| 77 | +from the book `Working in Public <https://press.stripe.com/working-in-public>`__ by Nadia Eghbal: |
| 78 | + |
| 79 | + When attention is being appropriated, producers need to weigh the costs and |
| 80 | + benefits of the transaction. To assess whether the appropriation of attention |
| 81 | + is net-positive, it's useful to distinguish between *extractive* and |
| 82 | + *non-extractive* contributions. Extractive contributions are those where the |
| 83 | + marginal cost of reviewing and merging that contribution is greater than the |
| 84 | + marginal benefit to the project's producers. In the case of a code |
| 85 | + contribution, it might be a pull request that's too complex or unwieldy to |
| 86 | + review, given the potential upside. |
| 87 | + |
| 88 | + -- Nadia Eghbal |
| 89 | + |
| 90 | + |
| 91 | +Prior to the advent of LLMs, open source project maintainers would often review |
| 92 | +any and all changes sent to the project simply because posting a change for |
| 93 | +review was a sign of interest from a potential long-term contributor. While new |
| 94 | +tools enable more development, they shift effort from the implementor to the |
| 95 | +reviewer, and our policy exists to ensure that we value and do not squander |
| 96 | +maintainer time. |
| 97 | + |
| 98 | +Handling Violations |
| 99 | +------------------- |
| 100 | + |
| 101 | +If a maintainer judges that a contribution doesn't comply with this policy, |
| 102 | +they should paste the following response to request changes: |
| 103 | + |
| 104 | +.. code-block:: text |
| 105 | + |
| 106 | + This PR does not appear to comply with our policy on tool-generated content, |
| 107 | + and requires additional justification for why it is valuable enough to the |
| 108 | + project for us to review it. Please see our developer policy on |
| 109 | + AI-generated contributions: |
| 110 | + https://gdal.org/community/_ai_tool_policy.html |
| 111 | + |
| 112 | +The best ways to make a change less extractive and more valuable are to reduce |
| 113 | +its size or complexity or to increase its usefulness to the community. These |
| 114 | +factors are impossible to weigh objectively, and our project policy leaves this |
| 115 | +determination up to the maintainers of the project, i.e. those who are doing |
| 116 | +the work of sustaining the project. |
| 117 | + |
| 118 | +If or when it becomes clear that a GitHub issue or PR is off-track and not |
| 119 | +moving in the right direction, maintainers should apply the ``extractive`` label |
| 120 | +to help other reviewers prioritize their review time. |
| 121 | + |
| 122 | +If a contributor fails to make their change meaningfully less extractive, |
| 123 | +maintainers may lock the conversation and/or close the pull request/issue/RFC. |
| 124 | +In case of repeated violations of our policy, the GDAL project reserves |
| 125 | +the right to temporarily or permanently ban the offending person or account. |
| 126 | + |
| 127 | +Copyright |
| 128 | +--------- |
| 129 | + |
| 130 | +Artificial intelligence systems raise many questions around copyright that have |
| 131 | +yet to be answered. Our policy on AI tools is similar to our copyright policy: |
| 132 | +Contributors are responsible for ensuring that they have the right to |
| 133 | +contribute code under the terms of our license, typically meaning that either |
| 134 | +they, their employer, or their collaborators hold the copyright. Using AI tools |
| 135 | +to regenerate copyrighted material does not remove the copyright, and |
| 136 | +contributors are responsible for ensuring that such material does not appear in |
| 137 | +their contributions. Contributions found to violate this policy will be removed |
| 138 | +just like any other offending contribution. If a reviewer has doubts about the |
| 139 | +legal aspects of a contribution, they may ask the contributor to provide more |
| 140 | +details on the origins of a particular piece of code. |
| 141 | + |
| 142 | +Credits for this document |
| 143 | +------------------------- |
| 144 | + |
| 145 | +This document is a near-direct adaptation of the |
| 146 | +`LLVM software "AI Tool Use Policy" <https://github.com/llvm/llvm-project/blob/main/llvm/docs/AIToolPolicy.md>`__, |
| 147 | +and due credit goes to its original authors: Reid Kleckner, Hubert Tong, and |
| 148 | +"maflcko". |
| 149 | + |
| 150 | +.. below is an allow-list for spelling checker. |
| 151 | + |
| 152 | +.. spelling:word-list:: |
| 153 | + Reid |
| 154 | + Kleckner |
| 155 | + Hubert |
| 156 | + Tong |
| 157 | + maflcko |
| 158 | + LLM |
| 159 | + unreviewed |
| 160 | + Eghbal |
| 161 | + implementor |