Add QualityFlow: AI-powered test generation pipeline project #242
Conversation
@strickvl merging this in because I need it now, but please continue your review; I will fix it in another branch.
strickvl left a comment:
I think the agentic test generation bit needs a closer look, but otherwise the rest of my comments are smaller stuff...
Use the junit.xml file instead of parsing the stdout output
We already have the junit.xml files, so we should use them. Makes the parsing less brittle.
We can default to parsing the XML first and then fall back to the stdout parsing when that fails.
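A minimal sketch of that fallback order, assuming pytest was run with `--junitxml=junit.xml`; `parse_stdout_summary()` is a hypothetical stand-in for the existing stdout parser:

```python
# Minimal sketch: prefer the junit.xml report, fall back to stdout parsing.
import xml.etree.ElementTree as ET
from pathlib import Path


def parse_test_results(junit_xml: Path, stdout: str) -> dict:
    try:
        root = ET.parse(junit_xml).getroot()
        # pytest emits either <testsuite> as the root or nested under <testsuites>.
        suite = root if root.tag == "testsuite" else root.find("testsuite")
        return {
            "tests": int(suite.get("tests", 0)),
            "failures": int(suite.get("failures", 0)),
            "errors": int(suite.get("errors", 0)),
            "skipped": int(suite.get("skipped", 0)),
        }
    except (OSError, ET.ParseError, AttributeError):
        # XML missing or malformed -> fall back to the existing stdout parsing.
        return parse_stdout_summary(stdout)
```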
Duplication in evaluate_coverage.py (currently unused)
I think the logic in evaluate_coverage.py is duplicated inside the report.py step.
So it's probably best to reintroduce the evaluate_coverage step into the pipeline and remove the duplicated logic from the report step?
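A rough sketch of what the restored wiring could look like; the step names, signatures, and module paths are guesses based on the file names mentioned in this review, not copied from the project:

```python
from zenml import pipeline

from steps.analyze_code import analyze_code
from steps.evaluate_coverage import evaluate_coverage
from steps.fetch_source import fetch_source
from steps.gen_tests_agent import gen_tests_agent
from steps.report import report
from steps.run_tests import run_tests


@pipeline
def qualityflow_pipeline():
    workspace = fetch_source()
    selected_files = analyze_code(workspace)
    tests = gen_tests_agent(workspace, selected_files)
    results = run_tests(workspace, tests)
    coverage = evaluate_coverage(results)  # coverage math lives here again
    report(results, coverage)              # report only formats/aggregates
```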
README stuff
The list of pipeline steps is maybe a bit confusing. Maybe split the test execution into two separate points, and also add an 'evaluation' one (if we restore the step as per the comment above)?
Unimplemented features / things just needing a note in the README or something
- The CHANGED_FILES strategy in analyze_code.py is a stub that falls back to selecting all files. This should be made clear in the documentation to avoid confusion.
- The LLM cost estimation in steps/gen_tests_agent.py uses hardcoded price values. These will quickly go out of date. It would be better to add a comment indicating that these are estimates and link to the official pricing pages for OpenAI and Anthropic (something like the sketch below).
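For example, along these lines; the numbers here are placeholder estimates, not authoritative prices:

```python
# Illustrative only: the real dict in steps/gen_tests_agent.py may differ.
# NOTE: per-1M-token prices below are rough estimates and go stale quickly;
# always check the official pricing pages before trusting cost numbers:
#   https://openai.com/api/pricing/
#   https://www.anthropic.com/pricing
PRICE_PER_1M_TOKENS_USD = {
    "gpt-4o": {"input": 2.50, "output": 10.00},             # estimate
    "claude-3-5-sonnet": {"input": 3.00, "output": 15.00},  # estimate
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    prices = PRICE_PER_1M_TOKENS_USD[model]
    return (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1_000_000
```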
Dependencies
- requirements.txt is missing hypothesis
- also openai + anthropic are listed as optional, but I think the code will fail when they're not installed (i.e. the code doesn't handle their absence fully); see the guarded-import sketch below
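One common pattern for handling this, as a sketch; the function and provider names are assumptions, not the project's actual API:

```python
# Guarded imports so a missing optional provider fails with a clear message
# instead of an ImportError deep inside a step.
try:
    import openai
except ImportError:
    openai = None

try:
    import anthropic
except ImportError:
    anthropic = None


def get_llm_client(provider: str):
    if provider == "openai":
        if openai is None:
            raise RuntimeError("provider='openai' requires: pip install openai")
        return openai.OpenAI()
    if provider == "anthropic":
        if anthropic is None:
            raise RuntimeError("provider='anthropic' requires: pip install anthropic")
        return anthropic.Anthropic()
    raise ValueError(f"Unknown provider: {provider!r}")
```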
Test generation
I think this is probably the bit which needs the most work for it to be a bit more serious?
Both fake and baseline tests generate trivial code that doesn't actually test the source.
- Fake tests: Always pass with assertions like self.assertTrue(True)
- Baseline tests: Generate skeleton methods with just pass statements
Even when the LLM calls succeed for the agentic test generation, the current prompt templates include a commented-out import line, so the generated tests may still not import or exercise the target module unless the model decides to do so, which can leave coverage low or misleading. For contrast, a minimal non-trivial test is sketched below.
All of this makes coverage comparisons somewhat meaningless. Additionally, coverage is currently collected for the entire workspace path, which (if I'm not mistaken) can include the generated tests themselves.
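Module and function names here are purely illustrative; the point is that a generated baseline test has to import the module under test and assert on its real behaviour, rather than asserting a constant truth:

```python
import unittest

from target_package import calculator  # the import must not be commented out


class TestCalculator(unittest.TestCase):
    def test_add_returns_sum(self):
        # Exercises real code instead of self.assertTrue(True).
        self.assertEqual(calculator.add(2, 3), 5)


if __name__ == "__main__":
    unittest.main()
```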
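One way to scope it, as a sketch; the paths and the use of pytest-cov flags here are assumptions about how run_tests.py invokes pytest:

```python
# Measure the package under test, not the whole workspace, so the generated
# tests don't count toward their own coverage numbers.
import subprocess

subprocess.run(
    [
        "pytest",
        "generated_tests/",
        "--cov=src/target_package",       # code under test only
        "--cov-report=xml:coverage.xml",
        "--junitxml=junit.xml",
    ],
    check=False,  # test failures are reported in the results, not raised here
)
```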
Resource cleanup
Temporary directories aren't always cleaned up properly on errors:
- fetch_source.py creates temp dirs, but cleanup only happens on some error paths
- run_tests.py has a finally block but uses ignore_errors=True, so cleanup failures are silently swallowed
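A sketch of more defensive cleanup for fetch_source.py; the exact structure is assumed and `clone_repo()` is a hypothetical stand-in for whatever actually populates the workspace:

```python
import shutil
import tempfile


def fetch_source(repo_url: str) -> str:
    workdir = tempfile.mkdtemp(prefix="qualityflow_")
    try:
        clone_repo(repo_url, workdir)
        return workdir
    except Exception:
        # Clean up on every error path, then re-raise so the step still fails.
        shutil.rmtree(workdir, ignore_errors=True)
        raise
```

For run_tests.py, logging a warning when rmtree fails would arguably be better than silently passing ignore_errors=True.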
On the Dockerfile note, the text was changed from:

> If your project only requires Python dependencies listed in `requirements.txt`, **do not include a Dockerfile**. The projects backend will automatically build your project using the generic Dockerfile available at:
> [https://github.com/zenml-io/zenml-projects-backend/blob/main/.docker/project.Dockerfile](https://github.com/zenml-io/zenml-projects-backend/blob/main/.docker/project.Dockerfile)

to:

> If your project only requires Python dependencies listed in `requirements.txt`, **do not include a Dockerfile**. The projects backend will automatically build your project using the generic Dockerfile available at the zenml-projects-backend repo.
Suggested change:

> If your project only requires Python dependencies listed in `requirements.txt`, **do not include a Dockerfile**. The projects backend will automatically build your project using the generic Dockerfile available at the [zenml-projects-backend](https://github.com/zenml-io/zenml-projects-backend) repo.
Summary
QualityFlow Project Summary
QualityFlow is a ZenML-powered MLOps pipeline that demonstrates AI-driven test generation for Python codebases.
What it does:
Key Features:
Perfect for:
Get started in 3 steps:
A solid example of production-ready MLOps with practical business value: automated test generation at scale.
Checklist
Related Issues
Please link to any relevant issues or discussions.