Develop policy around LLM-generated code in our package submissions #331

@lwasser

We have our first submission that acknowledges the use of LLMs in developing its codebase.

I appreciate that this is acknowledged in the codebase. I assume that many other submissions use or have used LLMs too, but don't acknowledge it.
JOSS discussed this topic but didn't land on a policy (that I can see):

openjournals/joss#1297

JOSS has also faced challenges with LLM-driven submissions: LLMs make it easier for users to generate more code more quickly, which has in part stressed the review pipeline (humans trying to review rapidly developed packages).

I see several potential challenges to consider here:

  1. Licenses: If maintainers use MIT / BSD-3 but the LLM that generated the code was trained on code under less permissive licenses (e.g., copyleft), how does that work? Is it legal?
  2. Human impact: We have humans taking time to review code that is not human-generated (in part or in full).
  3. Ethical: There are significant ethical issues with LLM-generated code. However, we also know that many people use LLMs in their daily workflows and find them helpful. We also know that LLM-generated code has the potential to be dangerous for groups that are underrepresented in open source.

We should develop a policy or at least have some language around this topic. However, I want to understand people's thoughts on this topic better first.

Other relevant links:
