[Umbrella] Improve RayJob UX, performance, and scalability #3907

@kevin85421

Search before asking

  • I have searched the issues and found no similar feature request.

Description

  • Startup time

    • Light-weight job submitter.
  • Stability (avoid cross-Pod communication) and resource efficiency

  • Performance

    • Verify the stability of the multi-threaded reconciler.
    • Let users configure how often the Ray Dashboard is queried, instead of sending a request on every reconciliation.
    • Create a background goroutine pool that queries the Ray Dashboard for job statuses and caches the results.
  • Benchmark: Create multiple RayJob CRs, each of which requires autoscaling.

Use case

No response

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
