
Commit ae6d0f2

scoopex, maxwolfs and fkr authored
Documentation of the SCS Hardware Landscape (#277)
* Rework headings
Signed-off-by: Marc Schöchlin <[email protected]>
* resize
Signed-off-by: Marc Schöchlin <[email protected]>
* add
Signed-off-by: Marc Schöchlin <[email protected]>
* fix typo
Signed-off-by: Max Wolfs <[email protected]>
* fix md errors
Signed-off-by: Max Wolfs <[email protected]>
* add
Signed-off-by: Marc Schöchlin <[email protected]>
* resize image
Signed-off-by: Marc Schöchlin <[email protected]>
* Update docs/turnkey-solution/hardware-landscape.md
Co-authored-by: Felix Kronlage-Dammers <[email protected]>
Signed-off-by: Marc Schöchlin <[email protected]>
---------
Signed-off-by: Marc Schöchlin <[email protected]>
Signed-off-by: Max Wolfs <[email protected]>
Signed-off-by: Marc Schöchlin <[email protected]>
Co-authored-by: Max Wolfs <[email protected]>
Co-authored-by: Felix Kronlage-Dammers <[email protected]>
1 parent a749e4d commit ae6d0f2

3 files changed

+122
-1
lines changed
Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
1+
---
2+
sidebar_label: Hardware-Landscape
3+
sidebar_position: 99
4+
---
5+
6+
# The SCS Hardware-Landscape
7+
8+
![An image of the SCS hardware landscape rack](images/combined_rack_visual.jpg)
9+
10+
## General information
11+
12+
The general aim of this environment is to install and operate the SCS reference implementation on hardware.
13+
In addition to the classic tasks in the area of quality assurance, the environment is also used to evaluate
14+
new concepts in the underlay/overlay network area, as a test environment for hardware-related developments,
15+
as a demonstration environment for interested parties and as a publicly accessible blueprint for users.
16+
The environment is designed for long-term use with a changing group of users.
17+
18+
The environment consists of 21 server and 12 switch components. The selection of hardware and the
19+
functions and properties used was designed so that the focus is on generally available or characteristic
20+
functions and dependency on manufacturer-specific functions is avoided. Instead of the x86 servers or SONiC
21+
switches used here, the environment could also be built with hardware from other manufacturers.
22+
23+
From 1 January 2025, the environment will be operated by the [forum SCS-Standards](https://scs.community/2024/10/23/osba-forum-scs-standards/)
24+
and the participating companies.
25+
26+
## Tasks and Objectives
27+
28+
The tasks and objectives of the environments can be summarised as follows:
29+
30+
* The division into several environments makes it possible to run a lab as well as to map a productive environment (near-live operation).
31+
* Operation of the compliance monitor (automated test for conformity with the SCS standards)
32+
* Implementation and validation of the developed standards in a reference environment
33+
* Analysis of problems in the interaction with the standards
34+
* Provision of proof-of-concept installations for interested parties who want to use, promote or further develop the project
35+
* The environment can be used by members of the SCS Standards forum and by contributors to the SCS community
36+
as a development and test environment for open-source development in connection with the further development
37+
of the SCS standards, SCS reference implementation and other relevant software components ('open-lab'/'near-live laboratory').
38+
* Continuous Integration Environment ('Zuul as a Service') - Operation of non-critical Zuul worker instances
39+
40+
## Installation details
41+
42+
The available hardware was divided into two distinct application areas:
43+
44+
* The **lab environment** consists exclusively of switch hardware used to evaluate, test and develop
45+
concepts in the area of 'Software Defined Networking'. This means that various switch models can be
46+
used to test and implement development tasks in the area of the open [SONiC](https://sonicfoundation.dev/) NOS
47+
(a network operating system based on Debian Linux) or provisioning automation tasks in the SONiC environment with the
48+
open-source system Netbox, a solution that is used primarily for IPAM and DCIM (IP Address Management, Data Center Infrastructure Management).
49+
* The **production environment** is an exemplary installation of the most relevant reference implementations with regard to an
50+
SCS system. It follows a configuration or approach that is based on the needs and circumstances of a real and much larger environment.
51+
To this end, characteristic infrastructure components were automatically installed on the manager nodes used for the installation.
52+
53+
The setup of the entire environment is designed in such a way that it can be reproducibly restored or reset.
54+
Therefore, the Ansible automation available via OSISM was used in many areas.
55+
Areas that could not be usefully automated with Ansible were implemented as Python command-line tooling stored in the Git repository.
56+
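
As an illustration of the kind of Python command-line tooling described above, the following is a minimal sketch of a power-control command that builds a Redfish `ComputerSystem.Reset` request. It is not taken from the repository: the function names, the action names `start`/`stop`/`kill`, the host name and the system ID `1` are assumptions made for this example, and the request is only printed, not sent.

```python
import argparse
import json

# Mapping of hypothetical CLI actions to standard Redfish ResetType values.
RESET_TYPES = {"start": "On", "stop": "GracefulShutdown", "kill": "ForceOff"}


def build_reset_request(bmc_host: str, action: str, system_id: str = "1"):
    """Return (url, json_body) for a Redfish ComputerSystem.Reset call.

    system_id is vendor-specific; "1" is only a placeholder assumption.
    """
    if action not in RESET_TYPES:
        raise ValueError(f"unknown action: {action}")
    url = (
        f"https://{bmc_host}/redfish/v1/Systems/{system_id}"
        "/Actions/ComputerSystem.Reset"
    )
    body = json.dumps({"ResetType": RESET_TYPES[action]})
    return url, body


def main(argv=None):
    """Parse the command line and print the request that would be sent."""
    parser = argparse.ArgumentParser(description="Sketch: server power control")
    parser.add_argument("bmc_host", help="address of the server's BMC")
    parser.add_argument("action", choices=sorted(RESET_TYPES))
    args = parser.parse_args(argv)
    url, body = build_reset_request(args.bmc_host, args.action)
    # A real tool would POST this with authentication; this sketch only prints it.
    print("POST", url, body)
```

Calling `main(["bmc01.example", "stop"])` prints the request for a graceful shutdown. Keeping the request construction separate from the network call makes the logic easy to test without access to a BMC.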
57+
## Available documentation
58+
59+
The primary point of information and orientation is the [*readme file*](https://github.com/SovereignCloudStack/hardware-landscape?tab=readme-ov-file#references)
60+
which is stored at the top level of the [configuration repository](https://github.com/SovereignCloudStack/hardware-landscape).
61+
62+
The relevant **References** section refers here to the individual documentation areas.
63+
64+
## Specific installation and configuration details
65+
66+
* Processes for access management to the environment (2 VPN gateways, SSH logins, SSH profiles, ...) have been implemented
67+
* The production and lab environments have been set up, automated and documented as described above
68+
* The complete environment is managed in a [Git repository](https://github.com/SovereignCloudStack/hardware-landscape);
69+
adjustments and further developments are handled via Git merge requests
70+
* Almost all installation steps are [documented and automated](https://github.com/SovereignCloudStack/hardware-landscape/blob/main/documentation/System_Deployment.md)
71+
based on a pure rack installation (the setup is extensively documented, in particular the few manual steps)
72+
* The entire customized setup of the nodes is [implemented by OSISM/Ansible](https://github.com/SovereignCloudStack/hardware-landscape/tree/main/environments/custom)
73+
* All secrets (e.g. passwords) of the environment are stored and versioned in the encrypted Ansible Vault in
74+
the repository (when access is transferred, rekeying can be used to change the access or the rights to it).
75+
* Far-reaching automation has been created that allows the environment, or parts of it, to
76+
be set up again with a reasonable amount of personnel.
77+
* The setup of the basic environment was implemented appropriately with Ansible and using the OSISM environment (the reference implementation)
78+
* Python tooling was created that covers areas specific to the use case of the environment and provides functions that simplify the operation of the infrastructure
79+
* Server systems
80+
* Backup and restore of the hardware configuration
81+
* Templating of the BMC configuration
82+
* Automatic installation of the operating system base image via Redfish Virtual Media
83+
* Control of the server status via command line (to stop and start the system for test, maintenance and energy-saving purposes)
84+
* Generation of base profiles for the Ansible Inventory based on the hardware key data stored in the documentation
85+
* Switches
86+
* Backup and restore of the switch configuration
87+
* Generation of base profiles for the Ansible Inventory based on the hardware key data stored in the documentation
88+
* Network setup
89+
* The two management hosts act as redundant VPN gateways, SSH jump hosts, routers and uplink routers
90+
* The system is deployed with a layer 3 underlay concept
91+
* An "eBGP router on the host" is implemented for node interconnectivity
92+
(all nodes and all switches are running FRR instances)
93+
* None of the Ceph and OpenStack nodes of the system has direct upstream routing
94+
(access is configured and provided by HTTP, NTP and DNS proxies)
95+
* For security reasons, the system itself can only be accessed via VPN.
96+
The provider network of the production environment is realized with a VXLAN which is terminated on the managers for routing
97+
('a virtual provider network').
98+
* The basic node installation was realised in such a way that specific [node images](https://github.com/osism/node-image)
99+
are created for the respective rack, which make the operation or reconfiguration of network equipment for PXE bootstrap
100+
unnecessary. (Preliminary stage for rollout via OpenStack Ironic)
101+
* The management of the hardware (BMC and switch management) is implemented with a VLAN
102+
* Routing, firewalling and NAT are managed by an nftables script which adds rules in an idempotent way to the existing rules
103+
of the manager nodes.
104+
* The [openstack workload generator](https://github.com/SovereignCloudStack/openstack-workload-generator) is used to put test workloads
105+
on the system
106+
* Automated creation of OpenStack domains, projects, servers, networks, users, etc.
107+
* Launching test workloads
108+
* Dismantling test workloads
109+
* An observability stack was built
110+
* Prometheus for metrics
111+
* OpenSearch for log aggregation
112+
* Central syslog server for the switches on the managers (recorded via the manager nodes in OpenSearch)
113+
* Specific documentation created for the project
114+
* Details of the hardware installed in the environment
115+
* The physical structure of the environment was documented in detail (rack installation and cabling)
116+
* The technical and logical structure of the environment was documented in detail
117+
* A FAQ for handling the open-source network operating system SONiC was created with relevant topics for the test environment
118+
* As part of the development, the documentation and implementation of the OSISM reference implementation was significantly improved (essentially resulting from
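
The idempotent rule handling mentioned for the nftables script in the list above can be illustrated with a short sketch. It only shows the general idea (add a rule only if an equivalent one is not already present) and is not the script from the repository. The helper names and the example rules are assumptions made for this illustration.

```python
def normalize(rule: str) -> str:
    """Collapse whitespace so formatting differences do not hide duplicates."""
    return " ".join(rule.split())


def rules_to_add(existing: list[str], desired: list[str]) -> list[str]:
    """Return the desired rules that are not yet present, in their given order."""
    seen = {normalize(rule) for rule in existing}
    missing = []
    for rule in desired:
        key = normalize(rule)
        if key and key not in seen:
            missing.append(key)
            seen.add(key)
    return missing


# Illustrative rule strings only; a real ruleset would come from the host.
existing = ["ip saddr 10.10.0.0/16 masquerade"]
desired = ["ip  saddr 10.10.0.0/16   masquerade", "tcp dport 22 accept"]

first_run = rules_to_add(existing, desired)
second_run = rules_to_add(existing + first_run, desired)
print(first_run)   # only the rule that was missing
print(second_run)  # empty: a repeated run changes nothing
```

Running the same step twice leaves the ruleset unchanged after the first run, which is the property that makes repeated deployments safe.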

sidebarsDocs.js

Lines changed: 4 additions & 1 deletion
@@ -421,7 +421,10 @@ const sidebarsDocs = {
421421
link: {
422422
type: 'generated-index'
423423
},
424-
items: ['turnkey-solution/overview']
424+
items: [
425+
'turnkey-solution/overview',
426+
'turnkey-solution/hardware-landscape'
427+
]
425428
},
426429
{
427430
type: 'category',
