
Conversation

@htahir1 htahir1 commented Aug 25, 2025

Summary

QualityFlow Project Summary

QualityFlow is a ZenML-powered MLOps pipeline that demonstrates AI-driven test generation for Python
codebases.

What it does:

  • Clones a Git repository and analyzes Python files
  • Generates unit tests using LLMs (OpenAI/Anthropic) or a fake provider
  • Creates baseline tests using heuristic analysis for comparison
  • Runs both test suites and measures code coverage
  • Produces detailed reports comparing LLM vs baseline approaches

Key Features:

  • Real LLM integration with cost tracking
  • Multiple test generation strategies (AI vs heuristic)
  • Coverage analysis with detailed metrics
  • Configurable execution (file limits, providers, prompts)
  • ZenML best practices (steps, pipelines, artifacts, metadata)

Perfect for:

  • Learning ZenML pipeline patterns
  • Understanding LLM integration in MLOps
  • Demonstrating automated testing workflows
  • Comparing AI vs traditional approaches

Get started in 3 steps:

  1. pip install -r requirements.txt
  2. export OPENAI_API_KEY="..." (optional)
  3. python run.py

A solid example of production-ready MLOps with practical business value: automated test generation at scale.

Checklist

  • I have read the contributing guidelines.
  • I have run the necessary tests and linters.
  • I have updated relevant documentation where applicable.

Related Issues

Please link to any relevant issues or discussions.

@htahir1 htahir1 requested a review from AlexejPenner August 25, 2025 09:25
@htahir1 htahir1 requested a review from strickvl August 25, 2025 11:56
@strickvl strickvl added the enhancement and internal labels Aug 25, 2025
@htahir1 htahir1 merged commit e7efccc into main Aug 25, 2025
5 checks passed
htahir1 commented Aug 25, 2025

@strickvl merging this in because I need it now, but please continue your review; I'll fix it in another branch

@strickvl strickvl left a comment

I think the agentic test generation bit needs a closer look, but otherwise the rest of my comments are smaller stuff...

Use the junit.xml file instead of parsing the stdout output

We already generate the junit.xml files, so we should use them; parsing structured XML is less brittle than scraping stdout.

We can parse the XML first and fall back to stdout parsing only when that fails.
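
A minimal sketch of that order (parse_test_results and parse_stdout_summary are illustrative names, not the actual step code):

```python
import xml.etree.ElementTree as ET


def parse_test_results(junit_path: str, stdout: str) -> dict:
    """Prefer the structured junit.xml report; fall back to stdout scraping."""
    try:
        root = ET.parse(junit_path).getroot()
        # pytest emits either a bare <testsuite> or one wrapped in <testsuites>.
        suite = root if root.tag == "testsuite" else root.find("testsuite")
        return {
            "tests": int(suite.get("tests", 0)),
            "failures": int(suite.get("failures", 0)),
            "errors": int(suite.get("errors", 0)),
            "skipped": int(suite.get("skipped", 0)),
        }
    except (FileNotFoundError, ET.ParseError, AttributeError):
        # AttributeError covers a missing <testsuite> element (suite is None).
        return parse_stdout_summary(stdout)  # the existing stdout parser
```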

Duplication in evaluate_coverage.py (currently unused)

I think the logic in evaluate_coverage.py is duplicated inside the report.py step.

So it's probably best to reintroduce the evaluate_coverage step into the pipeline and remove the duplicated logic from the report step?
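
Something like this shape, with step names taken from the files mentioned in this review (the actual signatures may differ):

```python
from zenml import pipeline

# These imports assume the step modules discussed above expose
# same-named step functions; adjust to the real names in the repo.
from steps.fetch_source import fetch_source
from steps.gen_tests_agent import gen_tests_agent
from steps.run_tests import run_tests
from steps.evaluate_coverage import evaluate_coverage
from steps.report import report


@pipeline
def qualityflow_pipeline():
    workspace = fetch_source()
    tests = gen_tests_agent(workspace)
    results = run_tests(workspace, tests)
    coverage = evaluate_coverage(results)  # restored step owns the coverage math
    report(coverage)  # report only formats results, no duplicated logic
```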

README stuff

The list of pipeline steps is maybe a bit confusing. Maybe split test execution into two separate points, and also add an 'evaluation' one (if we restore the step as per the comment above)?

Unimplemented features / things just needing a note in the README or something

  • The CHANGED_FILES strategy in analyze_code.py is a stub that falls back to selecting all files. This should be made clear in the documentation to avoid confusion.
  • The LLM cost estimation in steps/gen_tests_agent.py uses hardcoded price values, which will quickly go out of date. It would be better to add a comment indicating that these are estimates and link to the official pricing pages for OpenAI and Anthropic (see the sketch after this list).
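
Something along these lines; the model names and prices below are illustrative placeholders, not current rates:

```python
# Rough USD prices per 1M tokens, used ONLY for cost reporting. These go
# stale quickly; check the official pricing pages before trusting them:
#   https://openai.com/api/pricing/
#   https://www.anthropic.com/pricing
ESTIMATED_PRICES_USD_PER_1M_TOKENS = {
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},       # placeholder estimate
    "claude-3-5-haiku": {"input": 0.80, "output": 4.00},  # placeholder estimate
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Best-effort cost estimate; see the pricing links above."""
    prices = ESTIMATED_PRICES_USD_PER_1M_TOKENS[model]
    return (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1_000_000
```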

Dependencies

  • requirements.txt is missing hypothesis.
  • Also, openai and anthropic are listed as optional, but I think the code will fail when they're not installed (i.e. the code doesn't fully handle their absence); a guarded-import sketch follows this list.
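
One way to make the optional dependencies actually optional (a sketch; get_llm_client is a hypothetical helper, not the current code):

```python
try:
    import openai  # optional: only needed for the OpenAI provider
except ImportError:
    openai = None

try:
    import anthropic  # optional: only needed for the Anthropic provider
except ImportError:
    anthropic = None


def get_llm_client(provider: str):
    """Fail early with an actionable message instead of an ImportError later."""
    if provider == "openai":
        if openai is None:
            raise RuntimeError("provider='openai' requires: pip install openai")
        return openai.OpenAI()  # reads OPENAI_API_KEY from the environment
    if provider == "anthropic":
        if anthropic is None:
            raise RuntimeError("provider='anthropic' requires: pip install anthropic")
        return anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    raise ValueError(f"Unknown provider: {provider!r}")
```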

Test generation

I think this is probably the bit that needs the most work for the project to be taken a bit more seriously.

Both the fake and baseline generators produce trivial code that doesn't actually test the source (illustrated below):

  • Fake tests: always pass, with assertions like self.assertTrue(True)
  • Baseline tests: skeleton methods containing just pass statements
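
Reconstructed for illustration from the descriptions above (not copied from the repo), the generated output is essentially:

```python
import unittest


class TestGeneratedFake(unittest.TestCase):
    # Fake provider: tautological assertion, source module never imported.
    def test_placeholder(self):
        self.assertTrue(True)


class TestGeneratedBaseline(unittest.TestCase):
    # Baseline generator: skeletons that can neither fail nor exercise code.
    def test_some_function(self):
        pass
```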

Even when the LLM calls succeed for the agentic test generation, the current prompt templates include a commented-out import line, so the generated tests may still not import or exercise the target module unless the model decides to do so on its own, which leaves the coverage low or misleading.

This makes the coverage comparisons somewhat meaningless. Additionally, coverage is currently collected for the entire workspace path, which, if I'm not mistaken, can include the generated tests themselves.
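
One way to scope the measurement; the generated_tests/ and src/ paths are assumptions about the workspace layout:

```python
import pytest

# Run only the generated tests while measuring coverage of the source
# package alone, so the generated test files don't inflate the numbers.
pytest.main([
    "generated_tests",
    "--cov=src",             # pytest-cov: restrict measurement to the code under test
    "--cov-report=xml",
    "--junitxml=junit.xml",  # structured results for the parsing suggested above
])
```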

Resource cleanup

Temporary directories aren't always cleaned up properly on errors (see the sketch after this list):

  • fetch_source.py creates temp dirs, but cleanup only happens on some error paths
  • run_tests.py has a finally block but uses ignore_errors=True, which silently swallows cleanup failures
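
A sketch of the fetch_source cleanup, assuming GitPython for the clone (the actual step may differ):

```python
import shutil
import tempfile

from git import Repo  # GitPython


def fetch_source(repo_url: str, ref: str = "main") -> str:
    """Clone into a temp dir and guarantee cleanup on every failure path."""
    workspace = tempfile.mkdtemp(prefix="qualityflow-")
    try:
        Repo.clone_from(repo_url, workspace, branch=ref, depth=1)
        return workspace  # a later step removes it once tests have run
    except Exception:
        # Best-effort removal (ignore_errors avoids masking the original
        # exception) before re-raising, so failed runs don't leak temp dirs.
        shutil.rmtree(workspace, ignore_errors=True)
        raise
```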


If your project only requires Python dependencies listed in `requirements.txt`, **do not include a Dockerfile**. The projects backend will automatically build your project using the generic Dockerfile available at:
[https://github.com/zenml-io/zenml-projects-backend/blob/main/.docker/project.Dockerfile](https://github.com/zenml-io/zenml-projects-backend/blob/main/.docker/project.Dockerfile)

If your project only requires Python dependencies listed in `requirements.txt`, **do not include a Dockerfile**. The projects backend will automatically build your project using the generic Dockerfile available at the zenml-projects-backend repo.
Suggested change
If your project only requires Python dependencies listed in `requirements.txt`, **do not include a Dockerfile**. The projects backend will automatically build your project using the generic Dockerfile available at the zenml-projects-backend repo.
If your project only requires Python dependencies listed in `requirements.txt`, **do not include a Dockerfile**. The projects backend will automatically build your project using the generic Dockerfile available at the [zenml-projects-backend](https://github.com/zenml-io/zenml-projects-backend) repo.
