
Commit 2e0426f

Bump releases to version v0.21.10 (#105)
1 parent 9cd79a6 commit 2e0426f


371 files changed: +30971 -11 lines changed


docs/docs/05-Concepts/03-Network/network-physical-wiring.svg

Lines changed: 1 addition & 1 deletion
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
---
slug: /release-notes/v0.21.10
title: v0.21.10
sidebar_position: 1
---
# metal-stack v0.21.10
See original release note at [https://github.com/metal-stack/releases/releases/tag/v0.21.10](https://github.com/metal-stack/releases/releases/tag/v0.21.10)
## General
* [Gardener v1.118](https://github.com/gardener/gardener/releases/tag/v1.118.0)
## Required Actions
* The loopback addresses of the switch must be known to all peers that communicate with the switch over the default VRF. BGP sessions must be established accordingly. (metal-stack/metal-core#159)
## Component Releases
### go-ipam v1.14.13
* Upgrade to go-1.25 (metal-stack/go-ipam#185) @majst01
### metal-bmc v0.6.0
* Update to go-1.25 (metal-stack/metal-bmc#86) @majst01
* Correctly set version build flags. (metal-stack/metal-bmc#82) @Gerrit91
* include sbom in container image (metal-stack/metal-bmc#81) @mac641
* Let metal-bmc configure booting from disk via redfish (metal-stack/metal-bmc#85) @simcod
### metal-api v0.42.3
* Update to go-1.25 (metal-stack/metal-api#633) @majst01
### metal-console v0.7.5
* Update to go-1.25 and update all deps (metal-stack/metal-console#60) @majst01
* include sbom in container images (metal-stack/metal-console#59) @mac641
### metal-hammer v0.13.15
* Update to go-1.25, u-root v0.15.0 (metal-stack/metal-hammer#166) @majst01
* Check if password change is necessary for BMC superuser (metal-stack/metal-hammer#163) @simcod
* Trigger re-read of partition table (metal-stack/metal-hammer#164) @simcod
* Include SBOM as release asset (metal-stack/metal-hammer#158) @mac641
* Fix missing recent nvidia gpu pci ids (metal-stack/metal-hammer#167) @majst01
### metal-roles v0.17.15
* Adaptions for g/g v1.118. (metal-stack/metal-roles#467) @Gerrit91
### pixie v0.3.7
* Update to go-1.25 (metal-stack/pixie#38) @majst01
### metal-core v0.13.1
* Update to go-1.25 and alpine 3.22 (metal-stack/metal-core#162) @majst01
### gardener-extension-provider-metal v0.26.4
* Fix firewall deployment patch update function called with empty string. (metal-stack/gardener-extension-provider-metal#472) @Gerrit91
* Allow setting explicit hash. (metal-stack/gardener-extension-provider-metal#462) @Gerrit91
# Merged Pull Requests
This is a list of pull requests that were merged since the last release. The list does not contain pull requests from release-vector-repositories.

The fact that these pull requests were merged does not necessarily imply that they have already become part of this metal-stack release.

* Update to go-1.25 (metal-stack/go-lldpd#31) @majst01
* Add new vendor Gigabyte (metal-stack/go-hal#75) @simcod
* Update dependencies (metal-stack/go-hal#77) @majst01
* Bump metal-api to version v0.42.3 (metal-stack/metal-python#158) @metal-robot[bot]
* Bump metal-api to version v0.42.3 (metal-stack/metal-go#218) @metal-robot[bot]
* Use systemd generator functionality for enabling getty instances (metal-stack/metal-images#338) @simcod
* Check for typos (metal-stack/website#93) @simcod
* Update go, kernels, lldpd, cri-tools (metal-stack/metal-images#339) @majst01
* Add information about remote access to machines and firewalls (metal-stack/website#94) @simcod
* Bump lint-staged from 16.1.5 to 16.1.6 (metal-stack/website#97) @dependabot[bot]
* Add information about storing BMC credentials (metal-stack/website#91) @simcod
* Bind to valid loopback address if no `bind_to` config option specified (metal-stack/nftables-exporter#30) @auvred
* docs: Fix some lines in network physical wiring (metal-stack/website#101) @GeertJohan
* Add partition tags to accounting tags (metal-stack/metal-lib#190) @thheinel
* Remove julia parts (metal-stack/website#100) @BotondGalxc
* Add image url query (metal-stack/api#42) @majst01
* Bump to v0.21.9 release. (metal-stack/website#99) @Gerrit91
* Next release (metal-stack/releases#251) @metal-robot[bot]

scripts/components.json

Lines changed: 7 additions & 7 deletions
@@ -30,7 +30,7 @@
     "releasePath": "docker-images.metal-stack.control-plane.ipam.tag",
     "repo": "metal-stack/go-ipam",
     "branch": "main",
-    "tag": "v1.14.12",
+    "tag": "v1.14.13",
     "position": 2,
     "withDocs": false
   },
@@ -48,7 +48,7 @@
     "releasePath": "docker-images.metal-stack.control-plane.metal-api.tag",
     "repo": "metal-stack/metal-api",
     "branch": "main",
-    "tag": "v0.42.2",
+    "tag": "v0.42.3",
     "position": 4,
     "withDocs": false
   },
@@ -57,7 +57,7 @@
     "releasePath": "docker-images.metal-stack.control-plane.metal-console.tag",
     "repo": "metal-stack/metal-console",
     "branch": "main",
-    "tag": "v0.7.4",
+    "tag": "v0.7.5",
     "position": 5,
     "withDocs": false
   }
@@ -80,7 +80,7 @@
     "releasePath": "docker-images.metal-stack.partition.metal-bmc.tag",
     "repo": "metal-stack/metal-bmc",
     "branch": "main",
-    "tag": "v0.5.8",
+    "tag": "v0.6.0",
     "position": 2,
     "withDocs": false
   },
@@ -89,7 +89,7 @@
     "releasePath": "docker-images.metal-stack.partition.metal-core.tag",
     "repo": "metal-stack/metal-core",
     "branch": "main",
-    "tag": "v0.13.0",
+    "tag": "v0.13.1",
     "position": 3,
     "withDocs": false
   },
@@ -98,7 +98,7 @@
     "releasePath": "binaries.metal-stack.metal-hammer.version",
     "repo": "metal-stack/metal-hammer",
     "branch": "main",
-    "tag": "v0.13.13",
+    "tag": "v0.13.15",
     "position": 4,
     "withDocs": false
   },
@@ -107,7 +107,7 @@
     "releasePath": "docker-images.metal-stack.partition.pixiecore.tag",
     "repo": "metal-stack/pixie",
     "branch": "main",
-    "tag": "v0.3.6",
+    "tag": "v0.3.7",
     "position": 5,
     "withDocs": false
   }

src/version.json

Lines changed: 1 addition & 3 deletions
@@ -1,3 +1 @@
-{
-  "version": "v0.21.9"
-}
+{"version": "v0.21.10"}
(Image previews omitted: 52.3 KB, 48.8 KB, 33.7 KB)

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+<mxfile host="www.draw.io" ...></mxfile> (single-line draw.io diagram export; compressed diagram data omitted)

(Image preview omitted: 30.8 KB)
Lines changed: 141 additions & 0 deletions
@@ -0,0 +1,141 @@
---
slug: /MEP-1-distributed-metal-control-plane
title: MEP-1
sidebar_position: 1
---

# Distributed Metal Control Plane

This enhancement proposal was replaced by [MEP18](../MEP18/README.md).

## Problem Statement

We face the situation that we argue for running bare metal on premises because this way the customers can control where and how their software and data are processed and stored.
On the other hand, we have currently decided that our metal-api control plane components run on a Kubernetes cluster (in our case on a cluster provided by one of the available hyperscalers).

Running the control plane on Kubernetes has the following benefits:

- Ease of deployment
- Get most, if not all, of the required infrastructure services, like (probably incomplete):
  - IPs
  - DNS
  - L7-Loadbalancing
  - Storage
  - S3 Backup
  - High Availability

Using a Kubernetes-as-a-service offering from one of the hyperscalers enables us to focus on using Kubernetes instead of also maintaining it.

## Goal

It would be much saner if metal-stack had no, or only minimal, dependencies on external services. Imagine a metal-stack deployment in a plant: it would be optimal if we only had to deliver a single rack with servers and networking gear installed and wired, plug that rack into the power supply and an internet uplink, and it is ready to go.

Have a second plant which you want to be part of all your plants? Just tell both that they are part of something bigger, and metal-api knows of two partitions.

## Possible Solutions

We can think of two different solutions to this vision:

1. Keep the central control plane approach and require some sort of Kubernetes deployment accessible from the internet. This has the downside that the user must provide a managed Kubernetes deployment in his own datacenter or use a hyperscaler. Still not optimal.
1. Install the metal-api and all its dependencies in every partition, replicate or shard the databases to every connected partition, and make them know each other. Connect the partitions over the internet with some sort of VPN to make the services visible to each other.

As we can see, the first approach does not really address the problem, therefore I will describe solution #2 in more detail.

## Central/Current setup

### Stateful services

Every distributed system suffers from handling state in a scalable, fast and correct way. To work out how to cope with the state, we first must identify which state can be seen as partition-local only and which state must be synchronous for reads and writes across partitions.

Affected states:

- masterdata: e.g. tenant and project must be present in every partition, but these are entities which are read often while updates are rare. A write can therefore become visible in a distinct partition with a decent delay without consequences.
- ipam: the prefixes and IPs allocated for machines. These entities are also read often and updated rarely. But regarding dirty reads we must differentiate between the different types. A machine network is partition-local, so IPs acquired from such a network must only be synchronous within the same partition. IPs acquired from global networks such as the internet must be synchronous across all partitions, as otherwise an internet IP could be acquired twice.
- vrf ids: they must only be unique within one partition.
- image and size configurations: read often, written seldom, so no high requirements on the storage of these entities.
- images: OS images are already replicated from a central S3 storage to a per-partition S3 service. The metal-hammer kernel and initrd are small and are always pulled from the central S3; this can be handled similarly to the OS images.
- machine and machine allocation: must only be synchronous within the partition.
- switch: must only be synchronous within the partition.
- nsq messages: do not need to cross partition boundaries. There is no need to keep the messages persistent; quite the opposite, we don't want the messages to persist for a longer period.

Now we can see that the most critical state to hold and synchronize is the IPAM data, because these entities must be guaranteed to be updated synchronously while also being updated frequently.

Datastores:

We use three different types of datastores to persist the state of the metal application.

- rethinkdb is the main datastore for almost all entities managed by metal-api
- postgresql is used for masterdata and ipam data.
- nsq uses disk and memory to store the messages.

### Stateless services

These are the easy part: all of our services which are stateless can be scaled up and down without any impact on functionality. Even the stateful services like masterdata and metal-api rely fully on the underlying datastore and can therefore also be scaled up and down to meet scalability requirements.

However, most of these services need to be placed behind a loadbalancer which does the L4/L7 balancing across the started/available replicas of the service for the clients talking to it. This is provided by Kubernetes with either service type LoadBalancer or type ClusterIP.

One exception is the `metal-console` service, which must now have the partition in its DNS name because there is no direct network connectivity between the management networks of the partitions. See "Network setup".

## Distributed setup

### State

In order to replicate certain data which must be available across all partitions, we can use one of the existing open source databases which enable this kind of setup. There are a few available out there; the following incomplete list highlights the pros and cons of each.

- RethinkDB

We already store most of our data in RethinkDB, and it already provides the ability to synchronize the data in a distributed manner with different guarantees for consistency and latency. This is described here: [Scaling, Sharding and replication](https://rethinkdb.com/docs/sharding-and-replication/). But because RethinkDB has a rough history and an unsure future, with the last release having taken more than a year, we in the team already assume that we will eventually have to move away from RethinkDB.

- Postgresql

Postgres does not offer multi-datacenter replication with writes in both directions; it can only make the remote instance store the same data.

- CockroachDB

A database engine that is Postgresql-compatible on the wire. CockroachDB gives you both ACID guarantees and geo-replication with writes allowed from all connected members. It is even possible to configure [Follow the Workload](https://www.cockroachlabs.com/docs/stable/topology-follow-the-workload) and [Geo Partitioning and Replication](https://www.cockroachlabs.com/docs/v19.2/topology-geo-partitioned-replicas).

If we migrate all metal-api entities to be stored the same way we store masterdata, we could use CockroachDB to store all metal entities in one or more databases spread across all partitions and still ensure consistency and high availability.

A simple setup of how this could look is shown here.

![Simple CockroachDB setup](Distributed.png)

go-ipam was modified in an example PR here: [PR 17](https://github.com/metal-stack/go-ipam/pull/17)
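Because CockroachDB speaks the Postgres wire protocol, code written against `database/sql` with a plain Postgres driver needs little more than a different connection string. The following is a minimal, hypothetical sketch of acquiring an IP inside a serializable transaction so that the same internet IP cannot be handed out twice from two partitions; the `ips` table and its columns are made up for illustration and are not the actual go-ipam schema:

```go
package main

import (
	"context"
	"database/sql"
	"fmt"

	_ "github.com/lib/pq" // plain Postgres driver; CockroachDB is wire-compatible
)

// acquireIP marks an IP as taken inside a serializable transaction.
// The ips table is hypothetical and only serves to illustrate the idea.
func acquireIP(ctx context.Context, db *sql.DB, prefix, ip string) error {
	tx, err := db.BeginTx(ctx, &sql.TxOptions{Isolation: sql.LevelSerializable})
	if err != nil {
		return err
	}
	defer tx.Rollback()

	var taken bool
	if err := tx.QueryRowContext(ctx,
		`SELECT EXISTS (SELECT 1 FROM ips WHERE prefix = $1 AND ip = $2)`,
		prefix, ip).Scan(&taken); err != nil {
		return err
	}
	if taken {
		return fmt.Errorf("ip %s already acquired", ip)
	}
	if _, err := tx.ExecContext(ctx,
		`INSERT INTO ips (prefix, ip) VALUES ($1, $2)`, prefix, ip); err != nil {
		return err
	}
	return tx.Commit()
}

func main() {
	// 26257 is CockroachDB's default SQL port; apart from that the DSN is plain Postgres.
	db, err := sql.Open("postgres", "postgresql://root@localhost:26257/metal?sslmode=disable")
	if err != nil {
		panic(err)
	}
	defer db.Close()

	fmt.Println(acquireIP(context.Background(), db, "10.0.0.0/22", "10.0.0.7"))
}
```

The actual integration would live in go-ipam's storage layer (see PR 17 above); the point here is only that no CockroachDB-specific driver or client library is needed.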
### API Access

In order to make the metal-api accessible for API users like `cloud-api` or `metalctl` as easily as it is today, some effort has to be taken. One possible approach would be to use an external loadbalancer which spreads the requests evenly across all metal-api endpoints in all partitions. Because all data is accessible from all partitions, an API request going to partition A asking to create a machine in partition B will still work. If, on the other hand, partition B is not in a connected state because the interconnection between both partitions is broken, then of course the request will fail.

**IMPORTANT**
The NSQ message to inform `metal-core` must end up in the correct partition.

To provide such an external loadbalancer we have several options:

- Cloudflare or a comparable CDN service.
- BGP anycast from every partition.

Another setup would place a small gateway behind the metal-api address, which forwards to the metal-api in the partition where the request must be executed. This gateway, `metal-api-router`, must inspect the payload, extract the desired partition, and forward the request without any modifications to the metal-api endpoint in that partition. This can be done for all requests or, if we want to optimize, only for write accesses.
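A minimal sketch of such a gateway could look like the following. The `partitionid` payload field and the endpoint map are assumptions for illustration, not the final metal-api contract:

```go
package main

import (
	"bytes"
	"encoding/json"
	"io"
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

// Hypothetical mapping from partition id to the metal-api endpoint running there.
var partitionEndpoints = map[string]string{
	"partition-a": "https://metal-api.partition-a.example.com",
	"partition-b": "https://metal-api.partition-b.example.com",
}

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		// Restore the body so the request can be forwarded without modification.
		r.Body = io.NopCloser(bytes.NewReader(body))

		// Peek into the payload; the field name is an assumption for illustration.
		var payload struct {
			PartitionID string `json:"partitionid"`
		}
		_ = json.Unmarshal(body, &payload)

		endpoint, ok := partitionEndpoints[payload.PartitionID]
		if !ok {
			http.Error(w, "unknown or missing partition", http.StatusBadGateway)
			return
		}

		target, err := url.Parse(endpoint)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		r.Host = target.Host // forward with the target's host header
		httputil.NewSingleHostReverseProxy(target).ServeHTTP(w, r)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

If we only route write accesses this way, read requests could simply be spread across all partitions by the external loadbalancer, since all data is readable everywhere.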
## Network setup

In order to keep the impact on the overall security concept as minimal as possible, I would not modify the current network setup. The only modifications which have to be made are:

- Allow HTTPS ingress traffic to all metal-api instances.
- Allow SSH ingress traffic to all metal-console instances.
- Allow CockroachDB replication between all partitions.
- No NSQ traffic from outside is required anymore, unless we cannot solve the topic above.

A simple setup of how this could look is shown here; it does not work, though, because of the aforementioned NSQ issue.

![API and Console Access](Distributed-API.png)

Therefore we need the `metal-api-router`:

![Working API and Console Access](Distributed-API-Working.png)

## Deployment

The deployment of our components in a partition will differ substantially from the deployment we currently have. Deploying them on Kubernetes in the partition would be very difficult to achieve because we have no sane way to deploy Kubernetes on physical machines without an underlying API.
I would therefore suggest deploying our components in the same way we do for the services running on the management server: use systemd to start docker containers.

![Deployment](Distributed-Deployment.png)
