Have SWE-smith challenge the model to generate *tests* #64

@ofirpress

Right now we only get the model to generate bug-fixes/feature-requests.

What if we could also get it to generate tests?

I think we should use an approach similar to SWT-bench (https://swtbench.com/): have the model generate a test, then validate it by checking that the test fails before the gold patch is applied and passes after it is applied.
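To make the fail-before / pass-after check concrete, here is a minimal sketch in Python. It is not SWE-smith or SWT-bench code: `ValidationResult`, `validate_generated_test`, and the `run_test` callback are hypothetical names made up for illustration, and the toy `buggy_add` / `fixed_add` functions stand in for a repository before and after the gold patch.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ValidationResult:
    """Outcome of running one generated test against both code states."""
    fails_before: bool  # test failed on the unpatched (buggy) code
    passes_after: bool  # test passed once the gold patch was applied

    @property
    def is_valid(self) -> bool:
        # A generated test only counts if it flips from fail to pass.
        return self.fails_before and self.passes_after


def validate_generated_test(run_test: Callable[[bool], bool]) -> ValidationResult:
    """Run the test without and with the gold patch.

    `run_test(patched)` is a hypothetical callback that returns True
    when the generated test passes in the requested code state.
    """
    passed_before = run_test(False)
    passed_after = run_test(True)
    return ValidationResult(fails_before=not passed_before,
                            passes_after=passed_after)


# Toy stand-ins for the repository before and after the gold patch.
def buggy_add(a: int, b: int) -> int:
    return a - b  # the bug


def fixed_add(a: int, b: int) -> int:
    return a + b  # the "gold patch"


def generated_test(add: Callable[[int, int], int]) -> None:
    # Stand-in for a model-generated test.
    assert add(2, 3) == 5


def run_toy_test(patched: bool) -> bool:
    impl = fixed_add if patched else buggy_add
    try:
        generated_test(impl)
    except AssertionError:
        return False
    return True


result = validate_generated_test(run_toy_test)
```

In a real setup, `run_test` would presumably check out the repository, optionally apply the gold patch, and run the generated test in a sandbox. Note that a test that passes in both states is rejected, since it does not exercise the behavior the patch changes.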

Testing is one of the most important aspects of software engineering, so this enhancement could substantially improve model performance.
