Commit 2a31573

rename node groups, always enable materialize node group
* replace mz node group in eks module with system node group
  * this is a replace, not a rename, due to limitations on `moved` blocks.
* rename swap node group to materialize node group
* split up variables for different node groups
* remove some vars that should never be modified
* always enable materialize node group
* default to swap enabled
* move to bottlerocket on both node groups
* remove openebs and support for lgalloc scratch-fs
1 parent ccb42c8 commit 2a31573
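
The commit message says the system node group is a replace rather than a rename because of limitations on `moved` blocks. For context, here is a minimal sketch of what a `moved` block looks like for the root-level rename from `swap_node_group` to `materialize_node_group` (both module names appear in the README diff). It only illustrates the Terraform syntax; whether this commit contains such a block is not visible in the files shown here.

```hcl
# Illustrative only: a moved block records a rename so that Terraform
# updates the state addresses instead of destroying and recreating resources.
moved {
  from = module.swap_node_group
  to   = module.materialize_node_group
}
```

Without an applicable `moved` block, Terraform treats the old and new addresses as unrelated and plans a destroy plus a create, which is the "replace" behavior the commit message describes for the system node group.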

File tree

10 files changed: 101 additions & 321 deletions

README.md

Lines changed: 23 additions & 43 deletions
@@ -61,41 +61,8 @@ You can also set the `AWS_PROFILE` environment variable to the name of the profi
 export AWS_PROFILE=your-profile-name
 ```
 
-## Disk Support for Materialize
-
-This module supports configuring disk support for Materialize using NVMe instance storage and OpenEBS and lgalloc.
-
-When using disk support, you need to use instance types from the `r7gd` or `r6gd` family or other instance types with NVMe instance storage.
-
-### Enabling Disk Support
-
-To enable disk support with default settings:
-
-```hcl
-enable_disk_support = true
-```
-
-This will:
-1. Install OpenEBS via Helm
-2. Configure NVMe instance store volumes using the bootstrap script
-3. Create appropriate storage classes for Materialize
-
 ### Advanced Configuration
 
-In case that you need more control over the disk setup:
-
-```hcl
-enable_disk_support = true
-
-disk_support_config = {
-  openebs_version = "4.3.3"
-  storage_class_name = "custom-storage-class"
-  storage_class_parameters = {
-    volgroup = "custom-volume-group"
-  }
-}
-```
-
 ## `materialize_instances` variable
 
 The `materialize_instances` variable is a list of objects that define the configuration for each Materialize instance.
@@ -139,11 +106,11 @@ These flags configure default limits for clusters, connections, and tables. You
 | <a name="module_certificates"></a> [certificates](#module\_certificates) | ./modules/certificates | n/a |
 | <a name="module_database"></a> [database](#module\_database) | ./modules/database | n/a |
 | <a name="module_eks"></a> [eks](#module\_eks) | ./modules/eks | n/a |
+| <a name="module_materialize_node_group"></a> [materialize\_node\_group](#module\_materialize\_node\_group) | ./modules/eks-node-group | n/a |
 | <a name="module_networking"></a> [networking](#module\_networking) | ./modules/networking | n/a |
 | <a name="module_nlb"></a> [nlb](#module\_nlb) | ./modules/nlb | n/a |
 | <a name="module_operator"></a> [operator](#module\_operator) | github.com/MaterializeInc/terraform-helm-materialize | v0.1.35 |
 | <a name="module_storage"></a> [storage](#module\_storage) | ./modules/storage | n/a |
-| <a name="module_swap_node_group"></a> [swap\_node\_group](#module\_swap\_node\_group) | ./modules/eks-node-group | n/a |
 
 ## Resources
 
@@ -175,11 +142,9 @@ These flags configure default limits for clusters, connections, and tables. You
 | <a name="input_db_instance_class"></a> [db\_instance\_class](#input\_db\_instance\_class) | Instance class for the RDS instance. This is used for concensus and metadata and is general not bottlnecked by memory or disk. Recomended instance family m7i, m6i, m7g, and m8g | `string` | `"db.m6i.large"` | no |
 | <a name="input_db_max_allocated_storage"></a> [db\_max\_allocated\_storage](#input\_db\_max\_allocated\_storage) | Maximum storage for autoscaling (in GB) | `number` | `100` | no |
 | <a name="input_db_multi_az"></a> [db\_multi\_az](#input\_db\_multi\_az) | Enable multi-AZ deployment for RDS | `bool` | `false` | no |
-| <a name="input_disk_support_config"></a> [disk\_support\_config](#input\_disk\_support\_config) | Advanced configuration for disk support (only used when enable\_disk\_support = true) | <pre>object({<br/> install_openebs = optional(bool, true)<br/> run_disk_setup_script = optional(bool, true)<br/> create_storage_class = optional(bool, true)<br/> openebs_version = optional(string, "4.3.3")<br/> openebs_namespace = optional(string, "openebs")<br/> storage_class_name = optional(string, "openebs-lvm-instance-store-ext4")<br/> storage_class_provisioner = optional(string, "local.csi.openebs.io")<br/> storage_class_parameters = optional(object({<br/> storage = optional(string, "lvm")<br/> fsType = optional(string, "ext4")<br/> volgroup = optional(string, "instance-store-vg")<br/> }), {})<br/> })</pre> | `{}` | no |
 | <a name="input_enable_bucket_encryption"></a> [enable\_bucket\_encryption](#input\_enable\_bucket\_encryption) | Enable server-side encryption for the S3 bucket | `bool` | `true` | no |
 | <a name="input_enable_bucket_versioning"></a> [enable\_bucket\_versioning](#input\_enable\_bucket\_versioning) | Enable versioning for the S3 bucket | `bool` | `true` | no |
 | <a name="input_enable_cluster_creator_admin_permissions"></a> [enable\_cluster\_creator\_admin\_permissions](#input\_enable\_cluster\_creator\_admin\_permissions) | To add the current caller identity as an administrator | `bool` | `true` | no |
-| <a name="input_enable_disk_support"></a> [enable\_disk\_support](#input\_enable\_disk\_support) | Enable disk support for Materialize using OpenEBS and NVMe instance storage. When enabled, this configures OpenEBS, runs the disk setup script for NVMe devices, and creates appropriate storage classes. | `bool` | `true` | no |
 | <a name="input_enable_monitoring"></a> [enable\_monitoring](#input\_enable\_monitoring) | Enable CloudWatch monitoring | `bool` | `true` | no |
 | <a name="input_environment"></a> [environment](#input\_environment) | Environment name (e.g., prod, staging, dev) | `string` | n/a | yes |
 | <a name="input_helm_chart"></a> [helm\_chart](#input\_helm\_chart) | Chart name from repository or local path to chart. For local charts, set the path to the chart directory. | `string` | `"materialize-operator"` | no |
@@ -191,17 +156,15 @@ These flags configure default limits for clusters, connections, and tables. You
 | <a name="input_kubernetes_namespace"></a> [kubernetes\_namespace](#input\_kubernetes\_namespace) | The Kubernetes namespace for the Materialize resources | `string` | `"materialize-environment"` | no |
 | <a name="input_log_group_name_prefix"></a> [log\_group\_name\_prefix](#input\_log\_group\_name\_prefix) | Prefix for the CloudWatch log group name (will be combined with environment name) | `string` | `"materialize"` | no |
 | <a name="input_materialize_instances"></a> [materialize\_instances](#input\_materialize\_instances) | Configuration for Materialize instances. Due to limitations in Terraform, `materialize_instances` cannot be defined on the first `terraform apply`. | <pre>list(object({<br/> name = string<br/> namespace = optional(string)<br/> database_name = string<br/> environmentd_version = optional(string)<br/> cpu_request = optional(string, "1")<br/> memory_request = optional(string, "1Gi")<br/> memory_limit = optional(string, "1Gi")<br/> create_database = optional(bool, true)<br/> create_nlb = optional(bool, true)<br/> internal_nlb = optional(bool, true)<br/> enable_cross_zone_load_balancing = optional(bool, true)<br/> in_place_rollout = optional(bool, false)<br/> request_rollout = optional(string)<br/> force_rollout = optional(string)<br/> balancer_memory_request = optional(string, "256Mi")<br/> balancer_memory_limit = optional(string, "256Mi")<br/> balancer_cpu_request = optional(string, "100m")<br/> license_key = optional(string)<br/> authenticator_kind = optional(string, "None")<br/> external_login_password_mz_system = optional(string)<br/> environmentd_extra_args = optional(list(string), [])<br/> }))</pre> | `[]` | no |
+| <a name="input_materialize_node_group_desired_size"></a> [materialize\_node\_group\_desired\_size](#input\_materialize\_node\_group\_desired\_size) | Desired number of worker nodes | `number` | `2` | no |
+| <a name="input_materialize_node_group_instance_types"></a> [materialize\_node\_group\_instance\_types](#input\_materialize\_node\_group\_instance\_types) | Instance types for worker nodes.<br/><br/>Recommended Configuration for Running Materialize with disk:<br/>- Tested instance types: `r6gd`, `r7gd` families (ARM-based Graviton instances)<br/>- Enable disk setup when using `r7gd`<br/>- Note: Ensure instance store volumes are available and attached to the nodes for optimal performance with disk-based workloads. | `list(string)` | <pre>[<br/> "r7gd.2xlarge"<br/>]</pre> | no |
+| <a name="input_materialize_node_group_max_size"></a> [materialize\_node\_group\_max\_size](#input\_materialize\_node\_group\_max\_size) | Maximum number of worker nodes | `number` | `4` | no |
+| <a name="input_materialize_node_group_min_size"></a> [materialize\_node\_group\_min\_size](#input\_materialize\_node\_group\_min\_size) | Minimum number of worker nodes | `number` | `1` | no |
 | <a name="input_metrics_retention_days"></a> [metrics\_retention\_days](#input\_metrics\_retention\_days) | Number of days to retain CloudWatch metrics | `number` | `7` | no |
 | <a name="input_namespace"></a> [namespace](#input\_namespace) | Namespace for all resources, usually the organization or project name | `string` | n/a | yes |
 | <a name="input_network_id"></a> [network\_id](#input\_network\_id) | The ID of the VPC in which resources will be deployed. Only used if create\_vpc is false. | `string` | `""` | no |
 | <a name="input_network_private_subnet_ids"></a> [network\_private\_subnet\_ids](#input\_network\_private\_subnet\_ids) | A list of private subnet IDs in the VPC. Only used if create\_vpc is false. | `list(string)` | `[]` | no |
 | <a name="input_network_public_subnet_ids"></a> [network\_public\_subnet\_ids](#input\_network\_public\_subnet\_ids) | A list of public subnet IDs in the VPC. Only used if create\_vpc is false. | `list(string)` | `[]` | no |
-| <a name="input_node_group_ami_type"></a> [node\_group\_ami\_type](#input\_node\_group\_ami\_type) | AMI type for the node group | `string` | `"AL2023_ARM_64_STANDARD"` | no |
-| <a name="input_node_group_capacity_type"></a> [node\_group\_capacity\_type](#input\_node\_group\_capacity\_type) | Capacity type for worker nodes (ON\_DEMAND or SPOT) | `string` | `"ON_DEMAND"` | no |
-| <a name="input_node_group_desired_size"></a> [node\_group\_desired\_size](#input\_node\_group\_desired\_size) | Desired number of worker nodes | `number` | `2` | no |
-| <a name="input_node_group_instance_types"></a> [node\_group\_instance\_types](#input\_node\_group\_instance\_types) | Instance types for worker nodes.<br/><br/>Recommended Configuration for Running Materialize with disk:<br/>- Tested instance types: `r6gd`, `r7gd` families (ARM-based Graviton instances)<br/>- Enable disk setup when using `r7gd`<br/>- Note: Ensure instance store volumes are available and attached to the nodes for optimal performance with disk-based workloads. | `list(string)` | <pre>[<br/> "r7gd.2xlarge"<br/>]</pre> | no |
-| <a name="input_node_group_max_size"></a> [node\_group\_max\_size](#input\_node\_group\_max\_size) | Maximum number of worker nodes | `number` | `4` | no |
-| <a name="input_node_group_min_size"></a> [node\_group\_min\_size](#input\_node\_group\_min\_size) | Minimum number of worker nodes | `number` | `1` | no |
 | <a name="input_operator_namespace"></a> [operator\_namespace](#input\_operator\_namespace) | Namespace for the Materialize operator | `string` | `"materialize"` | no |
 | <a name="input_operator_version"></a> [operator\_version](#input\_operator\_version) | Version of the Materialize operator to install | `string` | `null` | no |
 | <a name="input_orchestratord_version"></a> [orchestratord\_version](#input\_orchestratord\_version) | Version of the Materialize orchestrator to install | `string` | `null` | no |
@@ -210,7 +173,11 @@ These flags configure default limits for clusters, connections, and tables. You
 | <a name="input_public_subnet_cidrs"></a> [public\_subnet\_cidrs](#input\_public\_subnet\_cidrs) | CIDR blocks for public subnets | `list(string)` | <pre>[<br/> "10.0.101.0/24",<br/> "10.0.102.0/24",<br/> "10.0.103.0/24"<br/>]</pre> | no |
 | <a name="input_service_account_name"></a> [service\_account\_name](#input\_service\_account\_name) | Name of the service account | `string` | `"12345678-1234-1234-1234-123456789012"` | no |
 | <a name="input_single_nat_gateway"></a> [single\_nat\_gateway](#input\_single\_nat\_gateway) | Use a single NAT Gateway for all private subnets | `bool` | `false` | no |
-| <a name="input_swap_enabled"></a> [swap\_enabled](#input\_swap\_enabled) | Enable swap for Materialize. When enabled, this configures swap on a new nodepool, and adds it to the clusterd node selectors. | `bool` | `false` | no |
+| <a name="input_swap_enabled"></a> [swap\_enabled](#input\_swap\_enabled) | Enable swap for Materialize. When enabled, this configures swap on a new nodepool, and adds it to the clusterd node selectors. | `bool` | `true` | no |
+| <a name="input_system_node_group_desired_size"></a> [system\_node\_group\_desired\_size](#input\_system\_node\_group\_desired\_size) | Desired number of worker nodes | `number` | `2` | no |
+| <a name="input_system_node_group_instance_types"></a> [system\_node\_group\_instance\_types](#input\_system\_node\_group\_instance\_types) | Instance types for system nodes. | `list(string)` | <pre>[<br/> "r7g.xlarge"<br/>]</pre> | no |
+| <a name="input_system_node_group_max_size"></a> [system\_node\_group\_max\_size](#input\_system\_node\_group\_max\_size) | Maximum number of worker nodes | `number` | `4` | no |
+| <a name="input_system_node_group_min_size"></a> [system\_node\_group\_min\_size](#input\_system\_node\_group\_min\_size) | Minimum number of worker nodes | `number` | `1` | no |
 | <a name="input_tags"></a> [tags](#input\_tags) | Default tags to apply to all resources | `map(string)` | <pre>{<br/> "Environment": "dev",<br/> "Project": "materialize",<br/> "Terraform": "true"<br/>}</pre> | no |
 | <a name="input_use_local_chart"></a> [use\_local\_chart](#input\_use\_local\_chart) | Whether to use a local chart instead of one from a repository | `bool` | `false` | no |
 | <a name="input_use_self_signed_cluster_issuer"></a> [use\_self\_signed\_cluster\_issuer](#input\_use\_self\_signed\_cluster\_issuer) | Whether to install and use a self-signed ClusterIssuer for TLS. To work around limitations in Terraform, this will be treated as `false` if no materialize instances are defined. | `bool` | `true` | no |
@@ -263,6 +230,19 @@ More advanced TLS support using user-provided CAs or per-Materialize `Issuer`s a
 
 ## Upgrade Notes
 
+#### v0.7.0
+
+Breaking changes:
+* Swap is enabled by default.
+* Support for lgalloc, our legacy spill to disk mechanism, is removed.
+* We now always use two node groups, one for system workloads and one for Materialize workloads.
+* Variables for configuring these node groups have been renamed, so they may be configured separately.
+* Both node groups are now locked to Bottlerocket AMIs and ON\_DEMAND scheduling.
+
+You must upgrade to at least v0.6.x before upgrading to v0.7.0 of this terraform code.
+
+It is strongly recommended to have enabled swap on v0.6.x before upgrading to v0.7.0 or higher.
+
 #### v0.6.1
 
 We now have some initial support for swap.

docs/footer.md

Lines changed: 13 additions & 0 deletions
@@ -25,6 +25,19 @@ More advanced TLS support using user-provided CAs or per-Materialize `Issuer`s a
 
 ## Upgrade Notes
 
+#### v0.7.0
+
+Breaking changes:
+* Swap is enabled by default.
+* Support for lgalloc, our legacy spill to disk mechanism, is removed.
+* We now always use two node groups, one for system workloads and one for Materialize workloads.
+* Variables for configuring these node groups have been renamed, so they may be configured separately.
+* Both node groups are now locked to Bottlerocket AMIs and ON_DEMAND scheduling.
+
+You must upgrade to at least v0.6.x before upgrading to v0.7.0 of this terraform code.
+
+It is strongly recommended to have enabled swap on v0.6.x before upgrading to v0.7.0 or higher.
+
 #### v0.6.1
 
 We now have some initial support for swap.

docs/header.md

Lines changed: 0 additions & 33 deletions
@@ -60,41 +60,8 @@ You can also set the `AWS_PROFILE` environment variable to the name of the profi
 export AWS_PROFILE=your-profile-name
 ```
 
-## Disk Support for Materialize
-
-This module supports configuring disk support for Materialize using NVMe instance storage and OpenEBS and lgalloc.
-
-When using disk support, you need to use instance types from the `r7gd` or `r6gd` family or other instance types with NVMe instance storage.
-
-### Enabling Disk Support
-
-To enable disk support with default settings:
-
-```hcl
-enable_disk_support = true
-```
-
-This will:
-1. Install OpenEBS via Helm
-2. Configure NVMe instance store volumes using the bootstrap script
-3. Create appropriate storage classes for Materialize
-
 ### Advanced Configuration
 
-In case that you need more control over the disk setup:
-
-```hcl
-enable_disk_support = true
-
-disk_support_config = {
-  openebs_version = "4.3.3"
-  storage_class_name = "custom-storage-class"
-  storage_class_parameters = {
-    volgroup = "custom-volume-group"
-  }
-}
-```
-
 ## `materialize_instances` variable
 
 The `materialize_instances` variable is a list of objects that define the configuration for each Materialize instance.

docs/operator-setup.md

Lines changed: 0 additions & 36 deletions
@@ -24,42 +24,6 @@ Verify the connection:
 kubectl get nodes
 ```
 
-## (Optional) Storage Configuration
-
-The Materialize Operator requires fast, locally-attached NVMe storage for optimal performance. We'll set up OpenEBS with LVM Local PV for managing local volumes.
-
-1. Install OpenEBS:
-```bash
-# Add the OpenEBS Helm repository
-helm repo add openebs https://openebs.github.io/openebs
-helm repo update
-
-# Install OpenEBS with only Local PV enabled
-helm install openebs --namespace openebs openebs/openebs \
-  --set engines.replicated.mayastor.enabled=false \
-  --create-namespace
-```
-
-2. Verify the installation:
-```bash
-kubectl get pods -n openebs -l role=openebs-lvm
-```
-
-### LVM Configuration for AWS Bottlerocket nodes
-
-TODO: Add more detailed instructions for setting up LVM on Bottlerocket nodes.
-
-If you're using the recommended Bottlerocket AMI with the Terraform module, the LVM configuration needs to be done through the Bottlerocket bootstrap container. This is automatically handled by the EKS module using the provided user data script.
-
-To verify the LVM setup:
-```bash
-kubectl debug -it node/<node-name> --image=amazonlinux:2
-chroot /host
-lvs
-```
-
-You should see a volume group named `instance-store-vg`.
-
 ## Install the Materialize Operator
 
 The Materialize Operator is installed automatically when you set the following in your Terraform configuration:

examples/simple/main.tf

Lines changed: 13 additions & 7 deletions
@@ -50,16 +50,22 @@ module "materialize_infrastructure" {
   single_nat_gateway = true
 
   # EKS Configuration
-  cluster_version                          = "1.32"
-  node_group_instance_types                = ["r7gd.2xlarge"]
-  node_group_desired_size                  = 1
-  node_group_min_size                      = 1
-  node_group_max_size                      = 2
-  node_group_capacity_type                 = "ON_DEMAND"
-  enable_cluster_creator_admin_permissions = true
+  cluster_version = "1.32"
+
+  system_node_group_instance_types = ["m7g.medium"]
+  system_node_group_desired_size   = 2
+  system_node_group_min_size       = 2
+  system_node_group_max_size       = 2
+
+  materialize_node_group_instance_types = ["r7gd.2xlarge"]
+  materialize_node_group_desired_size   = 1
+  materialize_node_group_min_size       = 1
+  materialize_node_group_max_size       = 2
 
   swap_enabled = var.swap_enabled
 
+  enable_cluster_creator_admin_permissions = true
+
   # Storage Configuration
   bucket_force_destroy = true
 

examples/simple/variables.tf

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 variable "swap_enabled" {
   description = "Enable swap for Materialize. When enabled, this configures swap on a new nodepool, and adds it to the clusterd node selectors."
   type        = bool
-  default     = false
+  default     = true
 }
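
The v0.7.0 upgrade notes added in this commit ask callers to be on v0.6.x first and strongly recommend enabling swap there before moving on. A rough sketch of that intermediate step from the caller's side follows. The `source` address and the `v0.6.1` ref are assumptions made for illustration and do not come from this diff; only `swap_enabled` and the module label `materialize_infrastructure` do.

```hcl
# Step 1 (sketch): stay on a v0.6.x release, enable swap, and apply.
module "materialize_infrastructure" {
  # Hypothetical source address and tag; substitute the ones you actually use.
  source = "github.com/MaterializeInc/terraform-aws-materialize?ref=v0.6.1"

  swap_enabled = true

  # Remaining inputs stay as they are in your existing configuration.
}

# Step 2 (not shown): once that apply succeeds, move the ref to v0.7.0 and
# rename the node_group_* inputs to their system_node_group_* and
# materialize_node_group_* counterparts, as examples/simple/main.tf does above.
```

Splitting the upgrade this way means that moving to v0.7.0, where `swap_enabled` defaults to `true`, does not also introduce swap for the first time.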
