Code Generation Evaluation Framework #6

@mcelrath

This issue tracks progress on an evaluation framework for code generation.

  • Pass the RAG data (ChatBTC RAG data #3) to a large-ish LLM
  • Ingest the codebase and dependencies
  • Delete functions and ask the LLM to recreate them, using the project's own tests to evaluate the result
  • Any other coding benchmarks we can think of, with a focus on using the contextual data.
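
The "delete functions and recreate them" step could be sketched roughly as follows. This is a minimal illustration, not a design for the framework. It assumes the target project is Python and that its tests run under `unittest`. The names `mask_function`, `passes_tests`, `evaluate`, and the `generate` callable (a stand-in for whatever LLM call ends up being used) are all hypothetical.

```python
import ast
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable


def mask_function(source: str, name: str) -> str:
    """Replace the body of function `name` with `raise NotImplementedError`.

    The signature is kept, and so is the docstring when other statements
    follow it, so the model still sees the intent. Assumes the body starts
    on its own line (one-liners like `def f(): return 1` are rejected).
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            body = node.body
            has_doc = ast.get_docstring(node) is not None
            first = body[1] if has_doc and len(body) > 1 else body[0]
            if first.lineno == node.lineno:
                raise ValueError(f"{name!r} is a one-line function; not supported")
            lines = source.splitlines(keepends=True)
            start_line = lines[first.lineno - 1]
            indent = start_line[: len(start_line) - len(start_line.lstrip())]
            stub = f"{indent}raise NotImplementedError\n"
            # Keep everything before the first deleted statement and after the function.
            return "".join(lines[: first.lineno - 1] + [stub] + lines[node.end_lineno:])
    raise KeyError(f"no function named {name!r}")


def passes_tests(module_source: str, test_source: str, module_name: str = "target") -> bool:
    """Write the candidate module next to its tests and run unittest discovery."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, f"{module_name}.py").write_text(module_source)
        Path(tmp, f"test_{module_name}.py").write_text(test_source)
        result = subprocess.run(
            [sys.executable, "-m", "unittest", "discover", "-s", tmp],
            cwd=tmp,
            capture_output=True,
            timeout=60,
        )
        return result.returncode == 0


def evaluate(source: str, name: str, test_source: str,
             generate: Callable[[str], str]) -> bool:
    """Mask one function, let `generate` rewrite the module, then run the tests.

    `generate` is a placeholder for the LLM call: it receives the masked
    module source and must return a complete candidate module source.
    """
    masked = mask_function(source, name)
    candidate = generate(masked)
    return passes_tests(candidate, test_source)


# Tiny stand-in for a real project: one module and its test file.
SOURCE = '''def add(a, b):
    """Return the sum of a and b."""
    return a + b
'''

TESTS = '''import unittest
from target import add


class AddTest(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(2, 3), 5)
'''
```

Calling `evaluate(SOURCE, "add", TESTS, generate)` with a `generate` that restores a working `add` should return `True`, while one that returns the masked source unchanged should return `False`. A real harness would loop over many functions, include the RAG context in the prompt, and report a pass rate rather than a single boolean.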

Resources:
