Releases · oracle-quickstart/oci-hpc-oke

30 Oct 19:01

OguzPastirmaci

v25.10.0

781eaba

OKE RDMA Quickstart Resource Manager template v25.10.0 Latest

Kubernetes upgrade: Added support for Kubernetes v1.34
Documentation: New guide — Deploying Prometheus & Grafana Stack with Dashboards and Alerts manually
Health checks:
- Added RCCL tests
- Added ROCm Validation Suite (RVS) gst_single for AMD validation
Grafana access link: Default domain updated to endpoint.oci-hpc.ai, configurable for custom domains
Component updates: Refreshed dependencies and minor fixes across the stack

Full Changelog: v25.9.0...v25.10.0

25 Sep 19:33

OguzPastirmaci

v25.9.0

185aceb

OKE RDMA Quickstart Resource Manager template v25.9.0

Option to provision a shared Lustre file system and a PV backed by the Lustre file system
Fully private clusters using Resource Manager Private Endpoint for deployment
Same dashboards and notifications as the Slurm stack
Option to use Oracle Linux for non-RDMA pools
Component updates

18 Jun 23:13

OguzPastirmaci

v25.5.1

a3a2d04

OKE RDMA Quickstart Resource Manager template v25.5.1

This is a hotfix release that addresses the breaking Helm provider change.

More info about the change here: hashicorp/terraform-provider-helm#1637

16 May 05:32

OguzPastirmaci

v25.5.0

0ce2acc

OKE RDMA Quickstart Resource Manager template v25.5.0

Added AMD Device Metrics Exporter
Added AMD dashboards

22 Apr 04:20

OguzPastirmaci

v25.4.0

3fa53ef

OKE RDMA Quickstart Resource Manager template v25.4.0

Added support for Kubernetes v1.32
Changed the default number of maximum pods per node to 110

31 Mar 04:54

OguzPastirmaci

v25.3.1

6bac725

OKE RDMA Quickstart Resource Manager template v25.3.1

OKE AMD GPU device plugin is enabled for the BM.GPU.MI300X.8 shape
OKE DCGM Exporter is disabled (upstream DCGM Exporter is deployed)
Helm fix for Grafana load balancer not being deleted properly on Terraform destroy
Updated the health checks for Node Problem Detector
Updated Grafana dashboards
Added the required policies for Oracle Cloud Agent GPU/RDMA monitoring

18 Mar 20:49

OguzPastirmaci

v25.3.0

6bac725

OKE RDMA Quickstart Resource Manager template v25.3.0

VCN-native pod networking is now the default option for pod networking instead of Flannel.
Node Problem Detector is now deployed as part of the stack and integrated with the Prometheus/Grafana stack for alerting.
Switched to using the upstream OKE Terraform module.

03 Mar 18:14

OguzPastirmaci

v25.3.0-beta

4abe3df

OKE RDMA Quickstart Resource Manager template v25.3.0-beta Pre-release

VCN-native pod networking is now the default option for pod networking instead of Flannel.
Node Problem Detector is now deployed as part of the stack.
Fixed a Node Exporter issue preventing metrics from being streamed from bare metal GPU nodes.

05 Feb 23:43

OguzPastirmaci

v25.2.0

cd6b384

OKE RDMA Quickstart Resource Manager template v25.2.0

The OKE GPU Device plugin is now enabled by default.
Added support for Kubernetes versions 1.30 and 1.31.

20 Oct 21:27

OguzPastirmaci

v24.10.0

b09f579

OKE RDMA Quickstart Resource Manager template v24.10.0

Important

Because we moved to Terraform v1.5, this release is a breaking change. Do not deploy this stack to your existing OKE clusters; use it only for deploying new clusters.

Updated to Terraform v1.5. The same templates can now be used for both OCI Resource Manager and regular Terraform.
The bastion and operator nodes now use Ubuntu.
Added an option to deploy the Prometheus/Grafana stack with DCGM Exporter.
Added an option to create a RAID 0 array using the local NVMe drives on the nodes and configure Kubernetes to use it for container storage.
Added options to create storage classes for FSS (File Storage Service) and high performance block volumes.
