29 Setting up the management cluster
29.1 Introduction
The management cluster is the part of ATIP that manages the provisioning and lifecycle of the runtime stacks. From a technical point of view, the management cluster contains the following components:
- SUSE Linux Enterprise Micro as the OS. Depending on the use case, some configurations like networking, storage, users and kernel arguments can be customized.
- RKE2 as the Kubernetes cluster. Depending on the use case, it can be configured to use specific CNI plug-ins, such as Multus, Cilium, etc.
- Rancher as the management platform to manage the lifecycle of the clusters.
- Metal3 as the component to manage the lifecycle of the bare-metal nodes.
- CAPI as the component to manage the lifecycle of the Kubernetes clusters (downstream clusters). With ATIP, the RKE2 CAPI Provider is also used to manage the lifecycle of the RKE2 clusters (downstream clusters).
With all components mentioned above, the management cluster can manage the lifecycle of downstream clusters, using a declarative approach to manage the infrastructure and applications.
For more information about SUSE Linux Enterprise Micro, see: SLE Micro (Chapter 7, SLE Micro)
For more information about RKE2, see: RKE2 (Chapter 14, RKE2)
For more information about Rancher, see: Rancher (Chapter 4, Rancher)
For more information about Metal3, see: Metal3 (Chapter 8, Metal3)
29.2 Steps to set up the management cluster
The following are the main steps to set up a single-node management cluster using a declarative approach:
- Image preparation for connected environments (Section 29.3, “Image preparation for connected environments”): The first step is to prepare the manifests and files with all the necessary configurations to be used in connected environments.
  - Directory structure for connected environments (Section 29.3.1, “Directory structure”): This step creates a directory structure to be used by Edge Image Builder to store the configuration files and the image itself.
  - Management cluster definition file (Section 29.3.2, “Management cluster definition file”): The mgmt-cluster.yaml file is the main definition file for the management cluster. It contains the following information about the image to be created:
    - Image Information: The information related to the image to be created using the base image.
    - Operating system: The operating system configurations to be used in the image.
    - Kubernetes: Helm charts and repositories, Kubernetes version, network configuration, and the nodes to be used in the cluster.
  - Custom folder (Section 29.3.3, “Custom folder”): The custom folder contains the configuration files and scripts to be used by Edge Image Builder to deploy a fully functional management cluster.
    - Files: Contains the configuration files to be used by the management cluster.
    - Scripts: Contains the scripts to be used by the management cluster.
  - Kubernetes folder (Section 29.3.4, “Kubernetes folder”): The kubernetes folder contains the configuration files to be used by the management cluster.
    - Manifests: Contains the manifests to be used by the management cluster.
    - Helm: Contains the Helm charts to be used by the management cluster.
    - Config: Contains the configuration files to be used by the management cluster.
  - Network folder (Section 29.3.5, “Networking folder”): The network folder contains the network configuration files to be used by the management cluster nodes.
- Image preparation for air-gap environments (Section 29.4, “Image preparation for air-gap environments”): This step shows the differences in preparing the manifests and files for an air-gap scenario.
  - Directory structure for air-gap environments (Section 29.4.1, “Directory structure for air-gap environments”): The directory structure must be modified to include the resources needed to run the management cluster in an air-gap environment.
  - Modifications in the definition file (Section 29.4.2, “Modifications in the definition file”): The mgmt-cluster.yaml file must be modified to include the embeddedArtifactRegistry section with the images field set to all container images to be included in the EIB output image.
  - Modifications in the custom folder (Section 29.4.3, “Modifications in the custom folder”): The custom folder must be modified to include the resources needed to run the management cluster in an air-gap environment.
    - Register script: The custom/scripts/99-register.sh script must be removed when you use an air-gap environment.
    - Air-gap resources: The custom/files/airgap-resources.tar.gz file must be included in the custom/files folder with all the resources needed to run the management cluster in an air-gap environment.
    - Scripts: The custom/scripts/99-mgmt-setup.sh script must be modified to extract and copy the airgap-resources.tar.gz file to the final location. The custom/files/metal3.sh script must be modified to use the local resources included in the airgap-resources.tar.gz file instead of downloading them from the internet.
- Image creation (Section 29.5, “Image creation”): This step covers the creation of the image using the Edge Image Builder tool (for both connected and air-gap scenarios). Check the prerequisites (Chapter 9, Edge Image Builder) to run the Edge Image Builder tool on your system.
- Management Cluster Provision (Section 29.6, “Provision the management cluster”): This step covers the provisioning of the management cluster using the image created in the previous step (for both connected and air-gap scenarios). This step can be done using a laptop, server, VM or any other x86_64 system with a USB port.
For more information about Edge Image Builder, see Edge Image Builder (Chapter 9, Edge Image Builder) and Edge Image Builder Quick Start (Chapter 3, Standalone clusters with Edge Image Builder).
29.3 Image preparation for connected environments
Edge Image Builder allows many aspects of the management cluster image to be customized, but this document covers only the minimal configuration necessary to set up the management cluster. Edge Image Builder is typically run from inside a container, so if you do not already have a way to run containers, start by installing a container runtime such as Podman or Rancher Desktop. For this guide, we assume you already have a container runtime available.
Also, as a prerequisite to deploy a highly available management cluster, you need to reserve three IPs in your network:
- apiVIP for the API VIP Address (used to access the Kubernetes API server).
- ingressVIP for the Ingress VIP Address (consumed, for example, by the Rancher UI).
- metal3VIP for the Metal3 VIP Address.
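Before going further, it can help to sanity-check the three reserved addresses. The following is a minimal sketch, not part of the product tooling: the function names `is_ipv4` and `check_vips` are hypothetical, and it assumes the three VIPs are IPv4 addresses that must differ from each other.

```shell
#!/bin/sh
# is_ipv4 ADDR: succeeds if ADDR is a dotted-quad IPv4 address with octets 0-255.
is_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])\.){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])$'
}

# check_vips API_VIP INGRESS_VIP METAL3_VIP: succeeds if all three are valid and distinct.
# (Distinctness is an assumption of this sketch, based on "reserve three IPs".)
check_vips() {
  for vip in "$1" "$2" "$3"; do
    is_ipv4 "$vip" || { echo "invalid IPv4 address: $vip" >&2; return 1; }
  done
  if [ "$1" = "$2" ] || [ "$1" = "$3" ] || [ "$2" = "$3" ]; then
    echo "the three VIPs must be distinct" >&2
    return 1
  fi
}
```

For example, `check_vips 192.168.100.10 192.168.100.11 192.168.100.12` succeeds, while repeating an address or passing an out-of-range octet fails. The addresses are placeholders; the check does not verify that the addresses are free or outside the DHCP pool.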
29.3.1 Directory structure
When running EIB, a directory is mounted from the host, so the first thing to do is to create a directory structure to be used by EIB to store the configuration files and the image itself. This directory has the following structure:
eib
├── mgmt-cluster.yaml
├── network
│   └── mgmt-cluster-node1.yaml
├── kubernetes
│   ├── manifests
│   │   ├── rke2-ingress-config.yaml
│   │   ├── neuvector-namespace.yaml
│   │   ├── ingress-l2-adv.yaml
│   │   └── ingress-ippool.yaml
│   ├── helm
│   │   └── values
│   │       ├── rancher.yaml
│   │       ├── neuvector.yaml
│   │       ├── metal3.yaml
│   │       └── certmanager.yaml
│   └── config
│       └── server.yaml
├── custom
│   ├── scripts
│   │   ├── 99-register.sh
│   │   ├── 99-mgmt-setup.sh
│   │   └── 99-alias.sh
│   └── files
│       ├── rancher.sh
│       ├── mgmt-stack-setup.service
│       ├── metal3.sh
│       └── basic-setup.sh
└── base-images
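The empty directory skeleton shown above can be created in one step. This is a minimal sketch; `EIB_DIR` is a hypothetical variable that defaults to `./eib`, and the files themselves still have to be added as described in the following sections.

```shell
#!/bin/sh
set -eu

# EIB_DIR (hypothetical name) is the host directory that will later be mounted into the EIB container.
EIB_DIR="${EIB_DIR:-./eib}"

# Create every folder of the tree; mkdir -p is safe to re-run.
mkdir -p \
  "${EIB_DIR}/network" \
  "${EIB_DIR}/kubernetes/manifests" \
  "${EIB_DIR}/kubernetes/helm/values" \
  "${EIB_DIR}/kubernetes/config" \
  "${EIB_DIR}/custom/scripts" \
  "${EIB_DIR}/custom/files" \
  "${EIB_DIR}/base-images"

# List the resulting directories for a quick visual check.
find "${EIB_DIR}" -type d | sort
```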
The image SLE-Micro.x86_64-5.5.0-Default-SelfInstall-GM2.install.iso must be downloaded from the SUSE Customer Center or the SUSE Download page, and it must be located under the base-images folder.
You should check the SHA256 checksum of the image to ensure it has not been tampered with. The checksum can be found in the same location where the image was downloaded.
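One way to perform this check from the command line is sketched below. The helper name `verify_sha256` is hypothetical, and the sketch assumes the GNU coreutils `sha256sum` tool is available; the expected value must be copied from the download location.

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED: succeeds only if the SHA256 digest of FILE equals EXPECTED.
verify_sha256() {
  # sha256sum prints "<digest>  <file>"; keep only the digest column.
  actual=$(sha256sum "$1" | awk '{print $1}')
  if [ "$actual" != "$2" ]; then
    echo "checksum mismatch for $1" >&2
    return 1
  fi
}
```

A typical call would be `verify_sha256 base-images/SLE-Micro.x86_64-5.5.0-Default-SelfInstall-GM2.install.iso "<published checksum>"`, where the second argument is the value shown next to the download.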
An example of the directory structure can be found in the SUSE Edge GitHub repository under the "telco-examples" folder.
29.3.2 Management cluster definition file
The mgmt-cluster.yaml file is the main definition file for the management cluster. It contains the following information:
apiVersion: 1.0
image:
  imageType: iso
  arch: x86_64
  baseImage: SLE-Micro.x86_64-5.5.0-Default-SelfInstall-GM2.install.iso
  outputImageName: eib-mgmt-cluster-image.iso
operatingSystem:
  isoConfiguration:
    installDevice: /dev/sda
  users:
    - username: root
      encryptedPassword: ${ROOT_PASSWORD}
  packages:
    packageList:
      - git
      - jq
    sccRegistrationCode: ${SCC_REGISTRATION_CODE}
kubernetes:
  version: ${KUBERNETES_VERSION}
  helm:
    charts:
      - name: cert-manager
        repositoryName: jetstack
        version: 1.14.2
        targetNamespace: cert-manager
        valuesFile: certmanager.yaml
        createNamespace: true
        installationNamespace: kube-system
      - name: longhorn-crd
        version: 103.3.0+up1.6.1
        repositoryName: rancher-charts
        targetNamespace: longhorn-system
        createNamespace: true
        installationNamespace: kube-system
      - name: longhorn
        version: 103.3.0+up1.6.1
        repositoryName: rancher-charts
        targetNamespace: longhorn-system
        createNamespace: true
        installationNamespace: kube-system
      - name: metal3-chart
        version: 0.7.4
        repositoryName: suse-edge-charts
        targetNamespace: metal3-system
        createNamespace: true
        installationNamespace: kube-system
        valuesFile: metal3.yaml
      - name: neuvector-crd
        version: 103.0.3+up2.7.6
        repositoryName: rancher-charts
        targetNamespace: neuvector
        createNamespace: true
        installationNamespace: kube-system
        valuesFile: neuvector.yaml
      - name: neuvector
        version: 103.0.3+up2.7.6
        repositoryName: rancher-charts
        targetNamespace: neuvector
        createNamespace: true
        installationNamespace: kube-system
        valuesFile: neuvector.yaml
      - name: rancher
        version: 2.8.8
        repositoryName: rancher-prime
        targetNamespace: cattle-system
        createNamespace: true
        installationNamespace: kube-system
        valuesFile: rancher.yaml
    repositories:
      - name: jetstack
        url: https://charts.jetstack.io
      - name: rancher-charts
        url: https://charts.rancher.io/
      - name: suse-edge-charts
        url: oci://registry.suse.com/edge
      - name: rancher-prime
        url: https://charts.rancher.com/server-charts/prime
  network:
    apiHost: ${API_HOST}
    apiVIP: ${API_VIP}
  nodes:
    - hostname: mgmt-cluster-node1
      initializer: true
      type: server
#    - hostname: mgmt-cluster-node2
#      type: server
#    - hostname: mgmt-cluster-node3
#      type: server
To explain the fields and values in the mgmt-cluster.yaml definition file, we have divided it into the following sections.
Image section (definition file):
image:
  imageType: iso
  arch: x86_64
  baseImage: SLE-Micro.x86_64-5.5.0-Default-SelfInstall-GM2.install.iso
  outputImageName: eib-mgmt-cluster-image.iso
where baseImage is the original image you downloaded from the SUSE Customer Center or the SUSE Download page, and outputImageName is the name of the new image that will be used to provision the management cluster.
Operating system section (definition file):
operatingSystem:
  isoConfiguration:
    installDevice: /dev/sda
  users:
    - username: root
      encryptedPassword: ${ROOT_PASSWORD}
  packages:
    packageList:
      - jq
    sccRegistrationCode: ${SCC_REGISTRATION_CODE}
where installDevice is the device to be used to install the operating system, username and encryptedPassword are the credentials to be used to access the system, packageList is the list of packages to be installed (jq is required internally during the installation process), and sccRegistrationCode is the registration code used to get the packages and dependencies at build time. The registration code can be obtained from the SUSE Customer Center.
The encrypted password can be generated using the openssl command as follows (the single quotes keep the shell from interpreting the ! character):
openssl passwd -6 'MyPassword!123'
This outputs something similar to:
$6$UrXB1sAGs46DOiSq$HSwi9GFJLCorm0J53nF2Sq8YEoyINhHcObHzX2R8h13mswUIsMwzx4eUzn/rRx0QPV4JIb0eWCoNrxGiKH4R31
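If you prefer not to pass the password as a command-line argument (where it can end up in the shell history), the hash can also be produced from standard input. This is a minimal sketch that assumes an OpenSSL version supporting `passwd -6` and its `-stdin` option; the helper name `hash_password` is hypothetical.

```shell
#!/bin/sh
# hash_password: reads one password per line from standard input and
# prints its SHA-512 crypt hash (the "$6$..." format shown above).
hash_password() {
  openssl passwd -6 -stdin
}
```

For example, `hash_password < password.txt` hashes a password stored in a file readable only by you. Because a random salt is used, the output differs on every run.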
Kubernetes section (definition file):
kubernetes:
  version: ${KUBERNETES_VERSION}
  helm:
    charts:
      - name: cert-manager
        repositoryName: jetstack
        version: 1.14.2
        targetNamespace: cert-manager
        valuesFile: certmanager.yaml
        createNamespace: true
        installationNamespace: kube-system
      - name: longhorn-crd
        version: 103.3.0+up1.6.1
        repositoryName: rancher-charts
        targetNamespace: longhorn-system
        createNamespace: true
        installationNamespace: kube-system
      - name: longhorn
        version: 103.3.0+up1.6.1
        repositoryName: rancher-charts
        targetNamespace: longhorn-system
        createNamespace: true
        installationNamespace: kube-system
      - name: metal3-chart
        version: 0.7.4
        repositoryName: suse-edge-charts
        targetNamespace: metal3-system
        createNamespace: true
        installationNamespace: kube-system
        valuesFile: metal3.yaml
      - name: neuvector-crd
        version: 103.0.3+up2.7.6
        repositoryName: rancher-charts
        targetNamespace: neuvector
        createNamespace: true
        installationNamespace: kube-system
        valuesFile: neuvector.yaml
      - name: neuvector
        version: 103.0.3+up2.7.6
        repositoryName: rancher-charts
        targetNamespace: neuvector
        createNamespace: true
        installationNamespace: kube-system
        valuesFile: neuvector.yaml
      - name: rancher
        version: 2.8.8
        repositoryName: rancher-prime
        targetNamespace: cattle-system
        createNamespace: true
        installationNamespace: kube-system
        valuesFile: rancher.yaml
    repositories:
      - name: jetstack
        url: https://charts.jetstack.io
      - name: rancher-charts
        url: https://charts.rancher.io/
      - name: suse-edge-charts
        url: oci://registry.suse.com/edge
      - name: rancher-prime
        url: https://charts.rancher.com/server-charts/prime
  network:
    apiHost: ${API_HOST}
    apiVIP: ${API_VIP}
  nodes:
    - hostname: mgmt-cluster-node1
      initializer: true
      type: server
#    - hostname: mgmt-cluster-node2
#      type: server
#    - hostname: mgmt-cluster-node3
#      type: server
where version is the version of Kubernetes to be installed. In our case, we are using an RKE2 cluster, so the version must be lower than 1.29 (that is, minor version 28 or earlier) to be compatible with Rancher (for example, v1.28.13+rke2r1).
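The version constraint above can be checked before building the image. The following sketch uses only POSIX shell parameter expansion; the function name `rke2_version_supported` is hypothetical, and the upper bound of 1.29 is taken from the text above.

```shell
#!/bin/sh
# rke2_version_supported VERSION: succeeds if VERSION (for example v1.28.13+rke2r1)
# is a Kubernetes 1.x release with a minor version lower than 29.
rke2_version_supported() {
  version=${1#v}                      # strip the leading "v", if any
  case "$version" in
    *.*.*) : ;;                       # require at least major.minor.patch
    *) return 1 ;;
  esac
  major=${version%%.*}                # text before the first dot
  rest=${version#*.}                  # text after the first dot
  minor=${rest%%.*}                   # text before the next dot
  case "$major$minor" in
    ''|*[!0-9]*) return 1 ;;          # reject non-numeric components
  esac
  [ "$major" -eq 1 ] && [ "$minor" -lt 29 ]
}
```

With this helper, `rke2_version_supported v1.28.13+rke2r1` succeeds and `rke2_version_supported v1.29.0+rke2r1` fails.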
The helm section contains the list of Helm charts to be installed, the repositories to be used, and the version configuration for all of them.
The network section contains the configuration for the network, like the apiHost and apiVIP to be used by the RKE2 component. The apiVIP should be an IP address that is not used in the network and should not be part of the DHCP pool (in case DHCP is used). In a multi-node cluster, the apiVIP is also used to access the Kubernetes API server. The apiHost is the host name that resolves to the apiVIP and is used by the RKE2 component.
The nodes section contains the list of nodes to be used in the cluster. In this example, a single-node cluster is being used, but it can be extended to a multi-node cluster by adding more nodes to the list (by uncommenting the lines).
- The names of the nodes must be unique in the cluster.
- Optionally, use the initializer field to specify the bootstrap host, otherwise it will be the first node in the list.
- The names of the nodes must be the same as the host names defined in the Network Folder (Section 29.3.5, “Networking folder”) when network configuration is required.
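The definition file above contains several `${...}` markers such as `${ROOT_PASSWORD}`, `${API_HOST}` and `${API_VIP}`. Assuming these are placeholders that you replace yourself before running the build, the substitution can be scripted. This is a sketch only: the function name `render_placeholders` and the file name `mgmt-cluster.yaml.tpl` are hypothetical.

```shell
#!/bin/sh
# render_placeholders TEMPLATE NAME=VALUE...: print TEMPLATE with each literal
# ${NAME} replaced by VALUE. Values must not contain "|", "&", "\" or newlines
# (this holds for IP addresses, host names and SHA-512 crypt hashes).
render_placeholders() {
  template=$1
  shift
  script=
  for pair in "$@"; do
    name=${pair%%=*}                  # text before the first "="
    value=${pair#*=}                  # text after the first "="
    # [$] matches a literal dollar sign, so ${NAME} is matched verbatim.
    script="${script}s|[\$]{${name}}|${value}|g;"
  done
  sed -e "$script" "$template"
}
```

A call such as `render_placeholders mgmt-cluster.yaml.tpl 'API_HOST=mgmt.example.com' 'API_VIP=192.168.100.10' > mgmt-cluster.yaml` writes the rendered file; the host name and address are example values. Quote each pair with single quotes so that the `$` characters in a password hash are not expanded by the shell.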
29.3.3 Custom folder
The custom folder contains the following subfolders:
...
├── custom
│   ├── scripts
│   │   ├── 99-register.sh
│   │   ├── 99-mgmt-setup.sh
│   │   └── 99-alias.sh
│   └── files
│       ├── rancher.sh
│       ├── mgmt-stack-setup.service
│       ├── metal3.sh
│       └── basic-setup.sh
...
- The custom/files folder contains the configuration files to be used by the management cluster.
- The custom/scripts folder contains the scripts to be used by the management cluster.
The custom/files folder contains the following files:
basic-setup.sh: contains the configuration parameters about the Metal3 version to be used, as well as the Rancher and MetalLB basic parameters. Only modify this file if you want to change the versions of the components or the namespaces to be used.
#!/bin/bash
# Pre-requisites. Cluster already running
export KUBECTL="/var/lib/rancher/rke2/bin/kubectl"
export KUBECONFIG="/etc/rancher/rke2/rke2.yaml"

##################
# METAL3 DETAILS #
##################
export METAL3_CHART_TARGETNAMESPACE="metal3-system"
export METAL3_CLUSTERCTLVERSION="1.6.2"
export METAL3_CAPICOREVERSION="1.6.2"
export METAL3_CAPIMETAL3VERSION="1.6.0"
export METAL3_CAPIRKE2VERSION="0.4.1"
export METAL3_CAPIPROVIDER="rke2"
export METAL3_CAPISYSTEMNAMESPACE="capi-system"
export METAL3_RKE2BOOTSTRAPNAMESPACE="rke2-bootstrap-system"
export METAL3_CAPM3NAMESPACE="capm3-system"
export METAL3_RKE2CONTROLPLANENAMESPACE="rke2-control-plane-system"
export METAL3_CAPI_IMAGES="registry.suse.com/edge"
# Or registry.opensuse.org/isv/suse/edge/clusterapi/containerfile/suse for the upstream ones

###########
# METALLB #
###########
export METALLBNAMESPACE="metallb-system"

###########
# RANCHER #
###########
export RANCHER_CHART_TARGETNAMESPACE="cattle-system"
export RANCHER_FINALPASSWORD="adminadminadmin"

die(){
  echo ${1} 1>&2
  exit ${2}
}
metal3.sh: contains the configuration for the Metal3 component to be used (no modifications needed). In future versions, this script will be replaced by Rancher Turtles to simplify the process.
#!/bin/bash
set -euo pipefail

BASEDIR="$(dirname "$0")"
source ${BASEDIR}/basic-setup.sh

METAL3LOCKNAMESPACE="default"
METAL3LOCKCMNAME="metal3-lock"

trap 'catch $? $LINENO' EXIT

catch() {
  if [ "$1" != "0" ]; then
    echo "Error $1 occurred on $2"
    ${KUBECTL} delete configmap ${METAL3LOCKCMNAME} -n ${METAL3LOCKNAMESPACE}
  fi
}

# Get or create the lock to run all those steps just in a single node
# As the first node is created WAY before the others, this should be enough
# TODO: Investigate if leases is better
if [ $(${KUBECTL} get cm -n ${METAL3LOCKNAMESPACE} ${METAL3LOCKCMNAME} -o name | wc -l) -lt 1 ]; then
  ${KUBECTL} create configmap ${METAL3LOCKCMNAME} -n ${METAL3LOCKNAMESPACE} --from-literal foo=bar
else
  exit 0
fi

# Wait for metal3
while ! ${KUBECTL} wait --for condition=ready -n ${METAL3_CHART_TARGETNAMESPACE} $(${KUBECTL} get pods -n ${METAL3_CHART_TARGETNAMESPACE} -l app.kubernetes.io/name=metal3-ironic -o name) --timeout=10s; do sleep 2 ; done

# Get the ironic IP
IRONICIP=$(${KUBECTL} get cm -n ${METAL3_CHART_TARGETNAMESPACE} ironic-bmo -o jsonpath='{.data.IRONIC_IP}')

# If LoadBalancer, use metallb, else it is NodePort
if [ $(${KUBECTL} get svc -n ${METAL3_CHART_TARGETNAMESPACE} metal3-metal3-ironic -o jsonpath='{.spec.type}') == "LoadBalancer" ]; then
  # Wait for metallb
  while ! ${KUBECTL} wait --for condition=ready -n ${METALLBNAMESPACE} $(${KUBECTL} get pods -n ${METALLBNAMESPACE} -l app.kubernetes.io/component=controller -o name) --timeout=10s; do sleep 2 ; done

  # Do not create the ippool if already created
  ${KUBECTL} get ipaddresspool -n ${METALLBNAMESPACE} ironic-ip-pool -o name || cat <<-EOF | ${KUBECTL} apply -f -
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: ironic-ip-pool
  namespace: ${METALLBNAMESPACE}
spec:
  addresses:
  - ${IRONICIP}/32
  serviceAllocation:
    priority: 100
    serviceSelectors:
    - matchExpressions:
      - {key: app.kubernetes.io/name, operator: In, values: [metal3-ironic]}
EOF

  # Same for L2 Advs
  ${KUBECTL} get L2Advertisement -n ${METALLBNAMESPACE} ironic-ip-pool-l2-adv -o name || cat <<-EOF | ${KUBECTL} apply -f -
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: ironic-ip-pool-l2-adv
  namespace: ${METALLBNAMESPACE}
spec:
  ipAddressPools:
  - ironic-ip-pool
EOF
fi

# If clusterctl is not installed, install it
if ! command -v clusterctl > /dev/null 2>&1; then
  LINUXARCH=$(uname -m)
  case $(uname -m) in
    "x86_64")
      export GOARCH="amd64" ;;
    "aarch64")
      export GOARCH="arm64" ;;
    *)
      echo "Arch not found, assuming amd64"
      export GOARCH="amd64" ;;
  esac

  # Clusterctl bin
  # Maybe just use the binary from hauler if available
  curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v${METAL3_CLUSTERCTLVERSION}/clusterctl-linux-${GOARCH} -o /usr/local/bin/clusterctl
  chmod +x /usr/local/bin/clusterctl
fi

# If rancher is deployed
if [ $(${KUBECTL} get pods -n ${RANCHER_CHART_TARGETNAMESPACE} -l app=rancher -o name | wc -l) -ge 1 ]; then
  cat <<-EOF | ${KUBECTL} apply -f -
apiVersion: management.cattle.io/v3
kind: Feature
metadata:
  name: embedded-cluster-api
spec:
  value: false
EOF

  # Disable Rancher webhooks for CAPI
  ${KUBECTL} delete mutatingwebhookconfiguration.admissionregistration.k8s.io mutating-webhook-configuration
  ${KUBECTL} delete validatingwebhookconfigurations.admissionregistration.k8s.io validating-webhook-configuration
  ${KUBECTL} wait --for=delete namespace/cattle-provisioning-capi-system --timeout=300s
fi

# Deploy CAPI
if [ $(${KUBECTL} get pods -n ${METAL3_CAPISYSTEMNAMESPACE} -o name | wc -l) -lt 1 ]; then

  # https://github.com/rancher-sandbox/cluster-api-provider-rke2#setting-up-clusterctl
  mkdir -p ~/.cluster-api
  cat <<-EOF > ~/.cluster-api/clusterctl.yaml
images:
  all:
    repository: ${METAL3_CAPI_IMAGES}
EOF

  # Try this command 3 times just in case, stolen from https://stackoverflow.com/a/33354419
  if ! (r=3; while ! clusterctl init \
    --core "cluster-api:v${METAL3_CAPICOREVERSION}"\
    --infrastructure "metal3:v${METAL3_CAPIMETAL3VERSION}"\
    --bootstrap "${METAL3_CAPIPROVIDER}:v${METAL3_CAPIRKE2VERSION}"\
    --control-plane "${METAL3_CAPIPROVIDER}:v${METAL3_CAPIRKE2VERSION}" ; do
      ((--r))||exit
      echo "Something went wrong, let's wait 10 seconds and retry"
      sleep 10;done) ; then
      echo "clusterctl failed"
      exit 1
  fi

  # Wait for capi-controller-manager
  while ! ${KUBECTL} wait --for condition=ready -n ${METAL3_CAPISYSTEMNAMESPACE} $(${KUBECTL} get pods -n ${METAL3_CAPISYSTEMNAMESPACE} -l cluster.x-k8s.io/provider=cluster-api -o name) --timeout=10s; do sleep 2 ; done

  # Wait for capm3-controller-manager, there are two pods, the ipam and the capm3 one, just wait for the first one
  while ! ${KUBECTL} wait --for condition=ready -n ${METAL3_CAPM3NAMESPACE} $(${KUBECTL} get pods -n ${METAL3_CAPM3NAMESPACE} -l cluster.x-k8s.io/provider=infrastructure-metal3 -o name | head -n1 ) --timeout=10s; do sleep 2 ; done

  # Wait for rke2-bootstrap-controller-manager
  while ! ${KUBECTL} wait --for condition=ready -n ${METAL3_RKE2BOOTSTRAPNAMESPACE} $(${KUBECTL} get pods -n ${METAL3_RKE2BOOTSTRAPNAMESPACE} -l cluster.x-k8s.io/provider=bootstrap-rke2 -o name) --timeout=10s; do sleep 2 ; done

  # Wait for rke2-control-plane-controller-manager
  while ! ${KUBECTL} wait --for condition=ready -n ${METAL3_RKE2CONTROLPLANENAMESPACE} $(${KUBECTL} get pods -n ${METAL3_RKE2CONTROLPLANENAMESPACE} -l cluster.x-k8s.io/provider=control-plane-rke2 -o name) --timeout=10s; do sleep 2 ; done

fi

# Clean up the lock cm
${KUBECTL} delete configmap ${METAL3LOCKCMNAME} -n ${METAL3LOCKNAMESPACE}
rancher.sh: contains the configuration for the Rancher component to be used (no modifications needed).
#!/bin/bash
set -euo pipefail

BASEDIR="$(dirname "$0")"
source ${BASEDIR}/basic-setup.sh

RANCHERLOCKNAMESPACE="default"
RANCHERLOCKCMNAME="rancher-lock"

if [ -z "${RANCHER_FINALPASSWORD}" ]; then
  # If there is no final password, then finish the setup right away
  exit 0
fi

trap 'catch $? $LINENO' EXIT

catch() {
  if [ "$1" != "0" ]; then
    echo "Error $1 occurred on $2"
    ${KUBECTL} delete configmap ${RANCHERLOCKCMNAME} -n ${RANCHERLOCKNAMESPACE}
  fi
}

# Get or create the lock to run all those steps just in a single node
# As the first node is created WAY before the others, this should be enough
# TODO: Investigate if leases is better
if [ $(${KUBECTL} get cm -n ${RANCHERLOCKNAMESPACE} ${RANCHERLOCKCMNAME} -o name | wc -l) -lt 1 ]; then
  ${KUBECTL} create configmap ${RANCHERLOCKCMNAME} -n ${RANCHERLOCKNAMESPACE} --from-literal foo=bar
else
  exit 0
fi

# Wait for rancher to be deployed
while ! ${KUBECTL} wait --for condition=ready -n ${RANCHER_CHART_TARGETNAMESPACE} $(${KUBECTL} get pods -n ${RANCHER_CHART_TARGETNAMESPACE} -l app=rancher -o name) --timeout=10s; do sleep 2 ; done
until ${KUBECTL} get ingress -n ${RANCHER_CHART_TARGETNAMESPACE} rancher > /dev/null 2>&1; do sleep 10; done

RANCHERBOOTSTRAPPASSWORD=$(${KUBECTL} get secret -n ${RANCHER_CHART_TARGETNAMESPACE} bootstrap-secret -o jsonpath='{.data.bootstrapPassword}' | base64 -d)
RANCHERHOSTNAME=$(${KUBECTL} get ingress -n ${RANCHER_CHART_TARGETNAMESPACE} rancher -o jsonpath='{.spec.rules[0].host}')

# Skip the whole process if things have been set already
if [ -z $(${KUBECTL} get settings.management.cattle.io first-login -ojsonpath='{.value}') ]; then
  # Add the protocol
  RANCHERHOSTNAME="https://${RANCHERHOSTNAME}"
  TOKEN=""
  while [ -z "${TOKEN}" ]; do
    # Get token
    sleep 2
    TOKEN=$(curl -sk -X POST ${RANCHERHOSTNAME}/v3-public/localProviders/local?action=login -H 'content-type: application/json' -d "{\"username\":\"admin\",\"password\":\"${RANCHERBOOTSTRAPPASSWORD}\"}" | jq -r .token)
  done

  # Set password
  curl -sk ${RANCHERHOSTNAME}/v3/users?action=changepassword -H 'content-type: application/json' -H "Authorization: Bearer $TOKEN" -d "{\"currentPassword\":\"${RANCHERBOOTSTRAPPASSWORD}\",\"newPassword\":\"${RANCHER_FINALPASSWORD}\"}"

  # Create a temporary API token (ttl=60 minutes)
  APITOKEN=$(curl -sk ${RANCHERHOSTNAME}/v3/token -H 'content-type: application/json' -H "Authorization: Bearer ${TOKEN}" -d '{"type":"token","description":"automation","ttl":3600000}' | jq -r .token)

  curl -sk ${RANCHERHOSTNAME}/v3/settings/server-url -H 'content-type: application/json' -H "Authorization: Bearer ${APITOKEN}" -X PUT -d "{\"name\":\"server-url\",\"value\":\"${RANCHERHOSTNAME}\"}"
  curl -sk ${RANCHERHOSTNAME}/v3/settings/telemetry-opt -X PUT -H 'content-type: application/json' -H 'accept: application/json' -H "Authorization: Bearer ${APITOKEN}" -d '{"value":"out"}'
fi

# Clean up the lock cm
${KUBECTL} delete configmap ${RANCHERLOCKCMNAME} -n ${RANCHERLOCKNAMESPACE}
mgmt-stack-setup.service: contains the configuration to create the systemd service to run the scripts during the first boot (no modifications needed).
[Unit]
Description=Setup Management stack components
Wants=network-online.target
# It requires rke2 or k3s running, but it will not fail if those services are not present
After=network.target network-online.target rke2-server.service k3s.service
# At least, the basic-setup.sh one needs to be present
ConditionPathExists=/opt/mgmt/bin/basic-setup.sh

[Service]
User=root
Type=forking
# Metal3 can take A LOT to download the IPA image
TimeoutStartSec=1800

ExecStartPre=/bin/sh -c "echo 'Setting up Management components...'"
# Scripts are executed in StartPre because Start can only run a single one
ExecStartPre=/opt/mgmt/bin/rancher.sh
ExecStartPre=/opt/mgmt/bin/metal3.sh
ExecStart=/bin/sh -c "echo 'Finished setting up Management components'"
RemainAfterExit=yes
KillMode=process
# Disable & delete everything
ExecStartPost=rm -f /opt/mgmt/bin/rancher.sh
ExecStartPost=rm -f /opt/mgmt/bin/metal3.sh
ExecStartPost=rm -f /opt/mgmt/bin/basic-setup.sh
ExecStartPost=/bin/sh -c "systemctl disable mgmt-stack-setup.service"
ExecStartPost=rm -f /etc/systemd/system/mgmt-stack-setup.service

[Install]
WantedBy=multi-user.target
The custom/scripts folder contains the following files:
99-alias.sh script: contains the alias to be used by the management cluster to load the kubeconfig file at first boot (no modifications needed).
#!/bin/bash
echo "alias k=kubectl" >> /etc/profile.local
echo "alias kubectl=/var/lib/rancher/rke2/bin/kubectl" >> /etc/profile.local
echo "export KUBECONFIG=/etc/rancher/rke2/rke2.yaml" >> /etc/profile.local
99-mgmt-setup.sh script: contains the configuration to copy the scripts during the first boot (no modifications needed).
#!/bin/bash

# Copy the scripts from combustion to the final location
mkdir -p /opt/mgmt/bin/
for script in basic-setup.sh rancher.sh metal3.sh; do
  cp ${script} /opt/mgmt/bin/
done

# Copy the systemd unit file and enable it at boot
cp mgmt-stack-setup.service /etc/systemd/system/mgmt-stack-setup.service
systemctl enable mgmt-stack-setup.service
99-register.sh script: contains the configuration to register the system using the SCC registration code. The ${SCC_ACCOUNT_EMAIL} and ${SCC_REGISTRATION_CODE} values have to be set properly to register the system with your account.
#!/bin/bash
set -euo pipefail

# Registration https://www.suse.com/support/kb/doc/?id=000018564
if ! which SUSEConnect > /dev/null 2>&1; then
  zypper --non-interactive install suseconnect-ng
fi
SUSEConnect --email "${SCC_ACCOUNT_EMAIL}" --url "https://scc.suse.com" --regcode "${SCC_REGISTRATION_CODE}"
29.3.4 Kubernetes folder
The kubernetes folder contains the following subfolders:
...
├── kubernetes
│   ├── manifests
│   │   ├── rke2-ingress-config.yaml
│   │   ├── neuvector-namespace.yaml
│   │   ├── ingress-l2-adv.yaml
│   │   └── ingress-ippool.yaml
│   ├── helm
│   │   └── values
│   │       ├── rancher.yaml
│   │       ├── neuvector.yaml
│   │       ├── metal3.yaml
│   │       └── certmanager.yaml
│   └── config
│       └── server.yaml
...
The kubernetes/config folder contains the following files:
server.yaml: By default, the CNI plug-in installed is Cilium, so you do not need to create this folder and file. If you need to customize the CNI plug-in, you can use the server.yaml file under the kubernetes/config folder. It contains the following information:
cni:
- multus
- cilium
This is an optional file to define certain Kubernetes customizations, like the CNI plug-ins to be used or many other options you can check in the official documentation.
The kubernetes/manifests folder contains the following files:
rke2-ingress-config.yaml: contains the configuration to create the Ingress service for the management cluster (no modifications needed).
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-ingress-nginx
  namespace: kube-system
spec:
  valuesContent: |-
    controller:
      config:
        use-forwarded-headers: "true"
        enable-real-ip: "true"
      publishService:
        enabled: true
      service:
        enabled: true
        type: LoadBalancer
        externalTrafficPolicy: Local
neuvector-namespace.yaml: contains the configuration to create the NeuVector namespace (no modifications needed).
apiVersion: v1
kind: Namespace
metadata:
  labels:
    pod-security.kubernetes.io/enforce: privileged
  name: neuvector
ingress-l2-adv.yaml: contains the configuration to create the L2Advertisement for the MetalLB component (no modifications needed).
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: ingress-l2-adv
  namespace: metallb-system
spec:
  ipAddressPools:
    - ingress-ippool
ingress-ippool.yaml: contains the configuration to create the IPAddressPool for the rke2-ingress-nginx component. The ${INGRESS_VIP} value has to be set properly to define the IP address reserved for the rke2-ingress-nginx component.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: ingress-ippool
  namespace: metallb-system
spec:
  addresses:
    - ${INGRESS_VIP}/32
  serviceAllocation:
    priority: 100
    serviceSelectors:
      - matchExpressions:
          - {key: app.kubernetes.io/name, operator: In, values: [rke2-ingress-nginx]}
The kubernetes/helm/values folder contains the following files:
rancher.yaml: contains the configuration to create the Rancher component. The ${INGRESS_VIP} value must be set properly to define the IP address to be consumed by the Rancher component. The URL to access the Rancher component will be https://rancher-${INGRESS_VIP}.sslip.io.
hostname: rancher-${INGRESS_VIP}.sslip.io
bootstrapPassword: "foobar"
replicas: 1
global.cattle.psp.enabled: "false"
neuvector.yaml: contains the configuration to create the NeuVector component (no modifications needed).
controller:
  replicas: 1
  ranchersso:
    enabled: true
manager:
  enabled: false
cve:
  scanner:
    enabled: false
    replicas: 1
k3s:
  enabled: true
crdwebhook:
  enabled: false
metal3.yaml: contains the configuration to create the Metal3 component. The ${METAL3_VIP} value must be set properly to define the IP address to be consumed by the Metal3 component.
global:
  ironicIP: ${METAL3_VIP}
  enable_vmedia_tls: false
  additionalTrustedCAs: false
metal3-ironic:
  global:
    predictableNicNames: "true"
  persistence:
    ironic:
      size: "5Gi"
The Media Server is an optional feature included in Metal3 (disabled by default). To use the Metal3 media server, configure it in the previous manifest as follows:
- Set enable_metal3_media_server to true in the global section to enable the media server feature.
- Include the following configuration about the media server, where ${MEDIA_VOLUME_PATH} is the path to the media volume on the host (for example, /home/metal3/bmh-image-cache):
metal3-media:
  mediaVolume:
    hostPath: ${MEDIA_VOLUME_PATH}
An external media server can be used to store the images. If you want to use it with TLS, modify the following configurations:
Set additionalTrustedCAs to true in the previous metal3.yaml file to enable the additional trusted CAs from the external media server.
Include the following secret configuration in the kubernetes/manifests/metal3-cacert-secret.yaml file to store the CA certificate of the external media server.
apiVersion: v1
kind: Namespace
metadata:
  name: metal3-system
---
apiVersion: v1
kind: Secret
metadata:
  name: tls-ca-additional
  namespace: metal3-system
type: Opaque
data:
  ca-additional.crt: {{ additional_ca_cert | b64encode }}
The additional_ca_cert is the base64-encoded CA certificate of the external media server. You can use the following command to encode the certificate and generate the secret manually:
kubectl -n metal3-system create secret generic tls-ca-additional --from-file=ca-additional.crt=./ca-additional.crt
certmanager.yaml: contains the configuration to create the Cert-Manager component (no modifications needed).
installCRDs: "true"
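If you fill in the tls-ca-additional manifest by hand instead of running kubectl, the value for ca-additional.crt can be produced with standard tools. The following is a minimal sketch, not part of the product: the function name encode_ca_cert is hypothetical, and it assumes the certificate is available as a local file such as ./ca-additional.crt.

```shell
#!/bin/bash
# Hypothetical helper (an illustration only): print the contents of a
# certificate file as a single-line base64 string, suitable for the
# ca-additional.crt field of the Secret manifest.
encode_ca_cert() {
  local cert_file="$1"
  # base64 may wrap long output; strip newlines to keep a single line.
  base64 < "$cert_file" | tr -d '\n'
}

# Example (assumed file name):
#   encode_ca_cert ./ca-additional.crt
```

The printed string can then be pasted as the value of ca-additional.crt in the manifest.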
29.3.5 Networking folder #
The network folder contains as many files as nodes in the management cluster. In our case, we have only one node, so we have only one file called mgmt-cluster-node1.yaml.
The name of the file must match the host name defined in the network/node section of the mgmt-cluster.yaml definition file described above.
If you need to customize the networking configuration, for example, to use a specific static IP address (DHCP-less scenario), you can use the mgmt-cluster-node1.yaml file under the network folder. It contains the following information:
${MGMT_GATEWAY}: The gateway IP address.
${MGMT_DNS}: The DNS server IP address.
${MGMT_MAC}: The MAC address of the network interface.
${MGMT_NODE_IP}: The IP address of the management cluster.
routes:
  config:
  - destination: 0.0.0.0/0
    metric: 100
    next-hop-address: ${MGMT_GATEWAY}
    next-hop-interface: eth0
    table-id: 254
dns-resolver:
  config:
    server:
    - ${MGMT_DNS}
    - 8.8.8.8
interfaces:
- name: eth0
  type: ethernet
  state: up
  mac-address: ${MGMT_MAC}
  ipv4:
    address:
    - ip: ${MGMT_NODE_IP}
      prefix-length: 24
    dhcp: false
    enabled: true
  ipv6:
    enabled: false
If you want to use DHCP to get the IP address, you can use the following configuration (the MAC address must be set properly using the ${MGMT_MAC} variable):
## This is an example of a dhcp network configuration for a management cluster
interfaces:
- name: eth0
  type: ethernet
  state: up
  mac-address: ${MGMT_MAC}
  ipv4:
    dhcp: true
    enabled: true
  ipv6:
    enabled: false
Depending on the number of nodes in the management cluster, you can create more files like mgmt-cluster-node2.yaml, mgmt-cluster-node3.yaml, etc. to configure the rest of the nodes.
The routes section is used to define the routing table for the management cluster.
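Because the network files rely on ${...} variables such as ${MGMT_MAC} and ${MGMT_NODE_IP}, it can help to confirm that none are left unreplaced before building the image. The following is a minimal sketch under that assumption; the function name list_placeholders is hypothetical and not part of Edge Image Builder.

```shell
#!/bin/bash
# Hypothetical helper (an illustration only): list the distinct ${NAME}
# placeholders that are still present in a configuration file.
list_placeholders() {
  local file="$1"
  # grep -o prints every match on its own line; sort -u removes duplicates.
  grep -o '\${[A-Z0-9_]*}' "$file" | LC_ALL=C sort -u
}

# Example (assumed path):
#   list_placeholders network/mgmt-cluster-node1.yaml
```

An empty result means that every placeholder in the file has been replaced with a concrete value.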
29.4 Image preparation for air-gap environments #
This section describes how to prepare the image for air-gap environments showing only the differences from the previous sections. The following changes to the previous section (Image preparation for connected environments (Section 29.3, “Image preparation for connected environments”)) are required to prepare the image for air-gap environments:
The mgmt-cluster.yaml file must be modified to include the embeddedArtifactRegistry section with the images field set to all container images to be included in the EIB output image.
The custom/scripts/99-register.sh script must be removed when using an air-gap environment.
The custom/files/airgap-resources.tar.gz file must be included in the custom/files folder with all the resources needed to run the management cluster in an air-gap environment.
The custom/scripts/99-mgmt-setup.sh script must be modified to extract and copy the airgap-resources.tar.gz file to the final location.
The custom/files/metal3.sh script must be modified to use the local resources included in the airgap-resources.tar.gz file instead of downloading them from the internet.
29.4.1 Directory structure for air-gap environments #
The directory structure for air-gap environments is the same as for connected environments, with the differences explained as follows:
eib
|-- base-images
|   |-- SLE-Micro.x86_64-5.5.0-Default-SelfInstall-GM2.install.iso
|-- custom
|   |-- files
|   |   |-- airgap-resources.tar.gz
|   |   |-- basic-setup.sh
|   |   |-- metal3.sh
|   |   |-- mgmt-stack-setup.service
|   |   |-- rancher.sh
|   |-- scripts
|       |-- 99-alias.sh
|       |-- 99-mgmt-setup.sh
|-- kubernetes
|   |-- config
|   |   |-- server.yaml
|   |-- helm
|   |   |-- values
|   |       |-- certmanager.yaml
|   |       |-- metal3.yaml
|   |       |-- neuvector.yaml
|   |       |-- rancher.yaml
|   |-- manifests
|       |-- neuvector-namespace.yaml
|-- mgmt-cluster.yaml
|-- network
    |-- mgmt-cluster-network.yaml
The image SLE-Micro.x86_64-5.5.0-Default-SelfInstall-GM2.install.iso must be downloaded from the SUSE Customer Center or the SUSE Download page, and it must be located under the base-images folder before starting the process.
You should check the SHA256 checksum of the image to ensure it has not been tampered with. The checksum can be found in the same location where the image was downloaded.
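The checksum comparison can be scripted. The following is a minimal sketch that assumes the sha256sum tool is available on the build host; the function name verify_sha256 is hypothetical, and the expected value must be copied from the download location.

```shell
#!/bin/bash
# Hypothetical helper (an illustration only): succeed when the SHA256 digest
# of a file equals the expected hexadecimal value, fail otherwise.
verify_sha256() {
  local file="$1" expected="$2" actual
  # sha256sum prints "<digest>  <file>"; keep only the digest.
  actual="$(sha256sum "$file" | awk '{print $1}')"
  [ "$actual" = "$expected" ]
}

# Example (the expected value is a placeholder to be filled in):
#   verify_sha256 base-images/SLE-Micro.x86_64-5.5.0-Default-SelfInstall-GM2.install.iso "<published checksum>" \
#     && echo "checksum OK" || echo "checksum mismatch"
```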
An example of the directory structure can be found in the SUSE Edge GitHub repository under the "telco-examples" folder.
29.4.2 Modifications in the definition file #
The mgmt-cluster.yaml file must be modified to include the embeddedArtifactRegistry section. Its images field must list all container images to be included in the EIB output image. The following is an example of the mgmt-cluster.yaml file with the embeddedArtifactRegistry section included:
apiVersion: 1.0
image:
  imageType: iso
  arch: x86_64
  baseImage: SLE-Micro.x86_64-5.5.0-Default-SelfInstall-GM2.install.iso
  outputImageName: eib-mgmt-cluster-image.iso
operatingSystem:
  isoConfiguration:
    installDevice: /dev/sda
  users:
  - username: root
    encryptedPassword: ${ROOT_PASSWORD}
  packages:
    packageList:
    - jq
    sccRegistrationCode: ${SCC_REGISTRATION_CODE}
kubernetes:
  version: ${KUBERNETES_VERSION}
  helm:
    charts:
      - name: cert-manager
        repositoryName: jetstack
        version: 1.14.2
        targetNamespace: cert-manager
        valuesFile: certmanager.yaml
        createNamespace: true
        installationNamespace: kube-system
      - name: longhorn-crd
        version: 103.3.0+up1.6.1
        repositoryName: rancher-charts
        targetNamespace: longhorn-system
        createNamespace: true
        installationNamespace: kube-system
      - name: longhorn
        version: 103.3.0+up1.6.1
        repositoryName: rancher-charts
        targetNamespace: longhorn-system
        createNamespace: true
        installationNamespace: kube-system
      - name: metal3-chart
        version: 0.7.4
        repositoryName: suse-edge-charts
        targetNamespace: metal3-system
        createNamespace: true
        installationNamespace: kube-system
        valuesFile: metal3.yaml
      - name: neuvector-crd
        version: 103.0.3+up2.7.6
        repositoryName: rancher-charts
        targetNamespace: neuvector
        createNamespace: true
        installationNamespace: kube-system
        valuesFile: neuvector.yaml
      - name: neuvector
        version: 103.0.3+up2.7.6
        repositoryName: rancher-charts
        targetNamespace: neuvector
        createNamespace: true
        installationNamespace: kube-system
        valuesFile: neuvector.yaml
      - name: rancher
        version: 2.8.8
        repositoryName: rancher-prime
        targetNamespace: cattle-system
        createNamespace: true
        installationNamespace: kube-system
        valuesFile: rancher.yaml
    repositories:
      - name: jetstack
        url: https://charts.jetstack.io
      - name: rancher-charts
        url: https://charts.rancher.io/
      - name: suse-edge-charts
        url: oci://registry.suse.com/edge
      - name: rancher-prime
        url: https://charts.rancher.com/server-charts/prime
  network:
    apiHost: ${API_HOST}
    apiVIP: ${API_VIP}
  nodes:
    - hostname: mgmt-cluster-node1
      initializer: true
      type: server
#   - hostname: mgmt-cluster-node2
#     type: server
#   - hostname: mgmt-cluster-node3
#     type: server
embeddedArtifactRegistry:
  images:
    - name: registry.rancher.com/rancher/backup-restore-operator:v4.0.3
    - name: registry.rancher.com/rancher/calico-cni:v3.27.4-rancher1
    - name: registry.rancher.com/rancher/cis-operator:v1.0.15
    - name: registry.rancher.com/rancher/coreos-kube-state-metrics:v1.9.7
    - name: registry.rancher.com/rancher/coreos-prometheus-config-reloader:v0.38.1
    - name: registry.rancher.com/rancher/coreos-prometheus-operator:v0.38.1
    - name: registry.rancher.com/rancher/flannel-cni:v0.3.0-rancher9
    - name: registry.rancher.com/rancher/fleet-agent:v0.9.9
    - name: registry.rancher.com/rancher/fleet:v0.9.9
    - name: registry.rancher.com/rancher/gitjob:v0.9.13
    - name: registry.rancher.com/rancher/grafana-grafana:7.1.5
    - name: registry.rancher.com/rancher/hardened-addon-resizer:1.8.20-build20240410
    - name: registry.rancher.com/rancher/hardened-calico:v3.28.1-build20240806
    - name: registry.rancher.com/rancher/hardened-cluster-autoscaler:v1.8.10-build20240124
    - name: registry.rancher.com/rancher/hardened-cni-plugins:v1.5.1-build20240805
    - name: registry.rancher.com/rancher/hardened-coredns:v1.11.1-build20240305
    - name: registry.rancher.com/rancher/hardened-dns-node-cache:1.22.28-build20240125
    - name: registry.rancher.com/rancher/hardened-etcd:v3.5.13-k3s1-build20240531
    - name: registry.rancher.com/rancher/hardened-flannel:v0.25.5-build20240801
    - name: registry.rancher.com/rancher/hardened-k8s-metrics-server:v0.7.1-build20240401
    - name: registry.rancher.com/rancher/hardened-kubernetes:v1.28.13-rke2r1-build20240815
    - name: registry.rancher.com/rancher/hardened-multus-cni:v4.0.2-build20240612
    - name: registry.rancher.com/rancher/hardened-node-feature-discovery:v0.15.4-build20240513
    - name: registry.rancher.com/rancher/hardened-whereabouts:v0.7.0-build20240429
    - name: registry.rancher.com/rancher/helm-project-operator:v0.2.1
    - name: registry.rancher.com/rancher/istio-kubectl:1.5.10
    - name: registry.rancher.com/rancher/jimmidyson-configmap-reload:v0.3.0
    - name: registry.rancher.com/rancher/k3s-upgrade:v1.28.13-k3s1
    - name: registry.rancher.com/rancher/klipper-helm:v0.8.4-build20240523
    - name: registry.rancher.com/rancher/klipper-lb:v0.4.9
    - name: registry.rancher.com/rancher/kube-api-auth:v0.2.1
    - name: registry.rancher.com/rancher/kubectl:v1.28.12
    - name: registry.rancher.com/rancher/library-nginx:1.19.2-alpine
    - name: registry.rancher.com/rancher/local-path-provisioner:v0.0.28
    - name: registry.rancher.com/rancher/machine:v0.15.0-rancher116
    - name: registry.rancher.com/rancher/mirrored-cluster-api-controller:v1.4.4
    - name: registry.rancher.com/rancher/nginx-ingress-controller:v1.10.4-hardened2
    - name: registry.rancher.com/rancher/pause:3.6
    - name: registry.rancher.com/rancher/prom-alertmanager:v0.21.0
    - name: registry.rancher.com/rancher/prom-node-exporter:v1.0.1
    - name: registry.rancher.com/rancher/prom-prometheus:v2.18.2
    - name: registry.rancher.com/rancher/prometheus-auth:v0.2.2
    - name: registry.rancher.com/rancher/prometheus-federator:v0.3.4
    - name: registry.rancher.com/rancher/pushprox-client:v0.1.3-rancher2-client
    - name: registry.rancher.com/rancher/pushprox-proxy:v0.1.3-rancher2-proxy
    - name: registry.rancher.com/rancher/rancher-agent:v2.8.8
    - name: registry.rancher.com/rancher/rancher-csp-adapter:v3.0.1
    - name: registry.rancher.com/rancher/rancher-webhook:v0.4.11
    - name: registry.rancher.com/rancher/rancher:v2.8.8
    - name: registry.rancher.com/rancher/rke-tools:v0.1.102
    - name: registry.rancher.com/rancher/rke2-cloud-provider:v1.29.3-build20240515
    - name: registry.rancher.com/rancher/rke2-runtime:v1.28.13-rke2r1
    - name: registry.rancher.com/rancher/rke2-upgrade:v1.28.13-rke2r1
    - name: registry.rancher.com/rancher/security-scan:v0.2.17
    - name: registry.rancher.com/rancher/shell:v0.1.26
    - name: registry.rancher.com/rancher/system-agent-installer-k3s:v1.28.13-k3s1
    - name: registry.rancher.com/rancher/system-agent-installer-rke2:v1.28.13-rke2r1
    - name: registry.rancher.com/rancher/system-agent:v0.3.9-suc
    - name: registry.rancher.com/rancher/system-upgrade-controller:v0.13.4
    - name: registry.rancher.com/rancher/ui-plugin-catalog:2.1.0
    - name: registry.rancher.com/rancher/ui-plugin-operator:v0.1.1
    - name: registry.rancher.com/rancher/webhook-receiver:v0.2.5
    - name: registry.rancher.com/rancher/kubectl:v1.20.2
    - name: registry.rancher.com/rancher/shell:v0.1.24
    - name: registry.rancher.com/rancher/mirrored-ingress-nginx-kube-webhook-certgen:v1.4.1
    - name: registry.rancher.com/rancher/mirrored-ingress-nginx-kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6
    - name: registry.rancher.com/rancher/mirrored-ingress-nginx-kube-webhook-certgen:v20230312-helm-chart-4.5.2-28-g66a760794
    - name: registry.rancher.com/rancher/mirrored-ingress-nginx-kube-webhook-certgen:v20231011-8b53cabe0
    - name: registry.rancher.com/rancher/mirrored-ingress-nginx-kube-webhook-certgen:v20231226-1a7112e06
    - name: registry.rancher.com/rancher/mirrored-longhornio-csi-attacher:v4.4.2
    - name: registry.rancher.com/rancher/mirrored-longhornio-csi-provisioner:v3.6.2
    - name: registry.rancher.com/rancher/mirrored-longhornio-csi-resizer:v1.9.2
    - name: registry.rancher.com/rancher/mirrored-longhornio-csi-snapshotter:v6.3.2
    - name: registry.rancher.com/rancher/mirrored-longhornio-csi-node-driver-registrar:v2.9.2
    - name: registry.rancher.com/rancher/mirrored-longhornio-livenessprobe:v2.12.0
    - name: registry.rancher.com/rancher/mirrored-longhornio-backing-image-manager:v1.6.1
    - name: registry.rancher.com/rancher/mirrored-longhornio-longhorn-engine:v1.6.1
    - name: registry.rancher.com/rancher/mirrored-longhornio-longhorn-instance-manager:v1.6.1
    - name: registry.rancher.com/rancher/mirrored-longhornio-longhorn-manager:v1.6.1
    - name: registry.rancher.com/rancher/mirrored-longhornio-longhorn-share-manager:v1.6.1
    - name: registry.rancher.com/rancher/mirrored-longhornio-longhorn-ui:v1.6.1
    - name: registry.rancher.com/rancher/mirrored-longhornio-support-bundle-kit:v0.0.36
    - name: registry.suse.com/edge/cluster-api-provider-rke2-bootstrap:v0.4.1
    - name: registry.suse.com/edge/cluster-api-provider-rke2-controlplane:v0.4.1
    - name: registry.suse.com/edge/cluster-api-controller:v1.6.2
    - name: registry.suse.com/edge/cluster-api-provider-metal3:v1.6.0
    - name: registry.suse.com/edge/ip-address-manager:v1.6.0
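With a list this long, it is easy to miss or duplicate an entry. The following is a minimal sketch for reviewing the list from the command line; the function name embedded_images is hypothetical, and it assumes the image entries appear as "- name: ..." lines after the embeddedArtifactRegistry key, as in the example above.

```shell
#!/bin/bash
# Hypothetical helper (an illustration only): print the image references
# listed under embeddedArtifactRegistry in an EIB definition file.
embedded_images() {
  local definition="$1"
  awk '
    # Start collecting after the embeddedArtifactRegistry key.
    /^embeddedArtifactRegistry:/ { in_registry = 1; next }
    # Any other top-level key ends the section.
    /^[^[:space:]#-]/            { in_registry = 0 }
    in_registry && $1 == "-" && $2 == "name:" { print $3 }
  ' "$definition"
}

# Examples (assumed file name):
#   embedded_images mgmt-cluster.yaml | wc -l            # number of images
#   embedded_images mgmt-cluster.yaml | sort | uniq -d   # duplicated entries
```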
29.4.3 Modifications in the custom folder #
The custom/scripts/99-register.sh script must be removed when using an air-gap environment. As you can see in the directory structure, the 99-register.sh script is not included in the custom/scripts folder.
The custom/scripts/99-mgmt-setup.sh script must be modified to extract and copy the airgap-resources.tar.gz file to the final location. The following is an example of the 99-mgmt-setup.sh script with the modifications to extract and copy the airgap-resources.tar.gz file:
#!/bin/bash
# Copy the scripts from combustion to the final location
mkdir -p /opt/mgmt/bin/
for script in basic-setup.sh rancher.sh metal3.sh; do
  cp ${script} /opt/mgmt/bin/
done
# Copy the systemd unit file and enable it at boot
cp mgmt-stack-setup.service /etc/systemd/system/mgmt-stack-setup.service
systemctl enable mgmt-stack-setup.service
# Extract the airgap resources
tar zxf airgap-resources.tar.gz
# Copy the clusterctl binary to the final location
cp airgap-resources/clusterctl /opt/mgmt/bin/ && chmod +x /opt/mgmt/bin/clusterctl
# Copy the clusterctl.yaml and override
mkdir -p /root/cluster-api
cp -r airgap-resources/clusterctl.yaml airgap-resources/overrides /root/cluster-api/
The custom/files/metal3.sh script must be modified to use the local resources included in the airgap-resources.tar.gz file instead of downloading them from the internet. The following is an example of the metal3.sh script with the modifications to use the local resources:
#!/bin/bash
set -euo pipefail
BASEDIR="$(dirname "$0")"
source ${BASEDIR}/basic-setup.sh
METAL3LOCKNAMESPACE="default"
METAL3LOCKCMNAME="metal3-lock"
trap 'catch $? $LINENO' EXIT
catch() {
  if [ "$1" != "0" ]; then
    echo "Error $1 occurred on $2"
    ${KUBECTL} delete configmap ${METAL3LOCKCMNAME} -n ${METAL3LOCKNAMESPACE}
  fi
}
# Get or create the lock to run all those steps just in a single node
# As the first node is created WAY before the others, this should be enough
# TODO: Investigate if leases is better
if [ $(${KUBECTL} get cm -n ${METAL3LOCKNAMESPACE} ${METAL3LOCKCMNAME} -o name | wc -l) -lt 1 ]; then
  ${KUBECTL} create configmap ${METAL3LOCKCMNAME} -n ${METAL3LOCKNAMESPACE} --from-literal foo=bar
else
  exit 0
fi
# Wait for metal3
while ! ${KUBECTL} wait --for condition=ready -n ${METAL3_CHART_TARGETNAMESPACE} $(${KUBECTL} get pods -n ${METAL3_CHART_TARGETNAMESPACE} -l app.kubernetes.io/name=metal3-ironic -o name) --timeout=10s; do sleep 2 ; done
# If rancher is deployed
if [ $(${KUBECTL} get pods -n ${RANCHER_CHART_TARGETNAMESPACE} -l app=rancher -o name | wc -l) -ge 1 ]; then
  cat <<-EOF | ${KUBECTL} apply -f -
apiVersion: management.cattle.io/v3
kind: Feature
metadata:
  name: embedded-cluster-api
spec:
  value: false
EOF
  # Disable Rancher webhooks for CAPI
  ${KUBECTL} delete mutatingwebhookconfiguration.admissionregistration.k8s.io mutating-webhook-configuration
  ${KUBECTL} delete validatingwebhookconfigurations.admissionregistration.k8s.io validating-webhook-configuration
  ${KUBECTL} wait --for=delete namespace/cattle-provisioning-capi-system --timeout=300s
fi
# Deploy CAPI
if [ $(${KUBECTL} get pods -n ${METAL3_CAPISYSTEMNAMESPACE} -o name | wc -l) -lt 1 ]; then
  # Try this command 3 times just in case, stolen from https://stackoverflow.com/a/33354419
  if ! (r=3; while ! /opt/mgmt/bin/clusterctl init \
    --core "cluster-api:v${METAL3_CAPICOREVERSION}"\
    --infrastructure "metal3:v${METAL3_CAPIMETAL3VERSION}"\
    --bootstrap "${METAL3_CAPIPROVIDER}:v${METAL3_CAPIRKE2VERSION}"\
    --control-plane "${METAL3_CAPIPROVIDER}:v${METAL3_CAPIRKE2VERSION}"\
    --config /root/cluster-api/clusterctl.yaml ; do
      ((--r))||exit
      echo "Something went wrong, let's wait 10 seconds and retry"
      sleep 10;done) ; then
    echo "clusterctl failed"
    exit 1
  fi
  # Wait for capi-controller-manager
  while ! ${KUBECTL} wait --for condition=ready -n ${METAL3_CAPISYSTEMNAMESPACE} $(${KUBECTL} get pods -n ${METAL3_CAPISYSTEMNAMESPACE} -l cluster.x-k8s.io/provider=cluster-api -o name) --timeout=10s; do sleep 2 ; done
  # Wait for capm3-controller-manager, there are two pods, the ipam and the capm3 one, just wait for the first one
  while ! ${KUBECTL} wait --for condition=ready -n ${METAL3_CAPM3NAMESPACE} $(${KUBECTL} get pods -n ${METAL3_CAPM3NAMESPACE} -l cluster.x-k8s.io/provider=infrastructure-metal3 -o name | head -n1 ) --timeout=10s; do sleep 2 ; done
  # Wait for rke2-bootstrap-controller-manager
  while ! ${KUBECTL} wait --for condition=ready -n ${METAL3_RKE2BOOTSTRAPNAMESPACE} $(${KUBECTL} get pods -n ${METAL3_RKE2BOOTSTRAPNAMESPACE} -l cluster.x-k8s.io/provider=bootstrap-rke2 -o name) --timeout=10s; do sleep 2 ; done
  # Wait for rke2-control-plane-controller-manager
  while ! ${KUBECTL} wait --for condition=ready -n ${METAL3_RKE2CONTROLPLANENAMESPACE} $(${KUBECTL} get pods -n ${METAL3_RKE2CONTROLPLANENAMESPACE} -l cluster.x-k8s.io/provider=control-plane-rke2 -o name) --timeout=10s; do sleep 2 ; done
fi
# Clean up the lock cm
${KUBECTL} delete configmap ${METAL3LOCKCMNAME} -n ${METAL3LOCKNAMESPACE}
The custom/files/airgap-resources.tar.gz file must be included in the custom/files folder with all the resources needed to run the management cluster in an air-gap environment. This file must be prepared manually by downloading all resources and compressing them into this single file. The airgap-resources.tar.gz file contains the following resources:
|-- clusterctl
|-- clusterctl.yaml
|-- overrides
    |-- bootstrap-rke2
    |   |-- v0.4.1
    |       |-- bootstrap-components.yaml
    |       |-- metadata.yaml
    |-- cluster-api
    |   |-- v1.6.2
    |       |-- core-components.yaml
    |       |-- metadata.yaml
    |-- control-plane-rke2
    |   |-- v0.4.1
    |       |-- control-plane-components.yaml
    |       |-- metadata.yaml
    |-- infrastructure-metal3
        |-- v1.6.0
            |-- cluster-template.yaml
            |-- infrastructure-components.yaml
            |-- metadata.yaml
The clusterctl.yaml file contains the configuration to specify the images location and the overrides to be used by the clusterctl tool. The overrides folder contains the yaml manifest files to be used instead of downloading them from the internet.
providers:
  # override a pre-defined provider
  - name: "cluster-api"
    url: "/root/cluster-api/overrides/cluster-api/v1.6.2/core-components.yaml"
    type: "CoreProvider"
  - name: "metal3"
    url: "/root/cluster-api/overrides/infrastructure-metal3/v1.6.0/infrastructure-components.yaml"
    type: "InfrastructureProvider"
  - name: "rke2"
    url: "/root/cluster-api/overrides/bootstrap-rke2/v0.4.1/bootstrap-components.yaml"
    type: "BootstrapProvider"
  - name: "rke2"
    url: "/root/cluster-api/overrides/control-plane-rke2/v0.4.1/control-plane-components.yaml"
    type: "ControlPlaneProvider"
images:
  all:
    repository: registry.suse.com/edge
The clusterctl binary and the rest of the files included in the overrides folder can be downloaded using the following curl commands:
# clusterctl binary
curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.6.2/clusterctl-linux-${GOARCH} -o /usr/local/bin/clusterctl
# bootstrap-components (bootstrap-rke2)
curl -L https://github.com/rancher-sandbox/cluster-api-provider-rke2/releases/download/v0.4.1/bootstrap-components.yaml
curl -L https://github.com/rancher-sandbox/cluster-api-provider-rke2/releases/download/v0.4.1/metadata.yaml
# control-plane-components (control-plane-rke2)
curl -L https://github.com/rancher-sandbox/cluster-api-provider-rke2/releases/download/v0.4.1/control-plane-components.yaml
curl -L https://github.com/rancher-sandbox/cluster-api-provider-rke2/releases/download/v0.4.1/metadata.yaml
# cluster-api components
curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.6.2/core-components.yaml
curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.6.2/metadata.yaml
# infrastructure-components (infrastructure-metal3)
curl -L https://github.com/metal3-io/cluster-api-provider-metal3/releases/download/v1.6.0/infrastructure-components.yaml
curl -L https://github.com/metal3-io/cluster-api-provider-metal3/releases/download/v1.6.0/metadata.yaml
If you want to use different versions of the components, you can change the version in the URL to download the specific version of the components.
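To keep versions in one place and to store each file in the overrides layout shown earlier, the download list can be generated instead of typed. The following is only a sketch: the function name airgap_download_plan and the environment variables CAPI_VERSION, RKE2_PROVIDER_VERSION and CAPM3_VERSION are hypothetical, and the defaults are the versions used in this section.

```shell
#!/bin/bash
# Hypothetical helper (an illustration only): print "URL DESTINATION" pairs
# for the provider manifests, one pair per line.
airgap_download_plan() {
  # Assumed override variables; defaults match the versions in this section.
  local capi="${CAPI_VERSION:-v1.6.2}"
  local rke2="${RKE2_PROVIDER_VERSION:-v0.4.1}"
  local capm3="${CAPM3_VERSION:-v1.6.0}"
  local gh="https://github.com"
  local f
  for f in core-components.yaml metadata.yaml; do
    echo "${gh}/kubernetes-sigs/cluster-api/releases/download/${capi}/${f} overrides/cluster-api/${capi}/${f}"
  done
  for f in infrastructure-components.yaml metadata.yaml; do
    echo "${gh}/metal3-io/cluster-api-provider-metal3/releases/download/${capm3}/${f} overrides/infrastructure-metal3/${capm3}/${f}"
  done
  for f in bootstrap-components.yaml metadata.yaml; do
    echo "${gh}/rancher-sandbox/cluster-api-provider-rke2/releases/download/${rke2}/${f} overrides/bootstrap-rke2/${rke2}/${f}"
  done
  for f in control-plane-components.yaml metadata.yaml; do
    echo "${gh}/rancher-sandbox/cluster-api-provider-rke2/releases/download/${rke2}/${f} overrides/control-plane-rke2/${rke2}/${f}"
  done
}

# Example: download every file into its destination path.
#   airgap_download_plan | while read -r url dest; do
#     mkdir -p "$(dirname "$dest")"
#     curl -L "$url" -o "$dest"
#   done
```

If you change a version, remember that the paths in clusterctl.yaml include the version directory and must be updated to match.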
With the previous resources downloaded, you can compress them into a single file using the following command:
tar -czvf airgap-resources.tar.gz clusterctl clusterctl.yaml overrides
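Before building the image, it is worth confirming that the archive contains the expected entries. Note that the 99-mgmt-setup.sh example reads from an airgap-resources/ directory after extraction, so the paths inside your archive must match what your script expects. The following is a minimal sketch; the function name check_airgap_archive and its optional prefix argument are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (an illustration only): succeed when the archive
# contains clusterctl, clusterctl.yaml and an overrides/ directory.
# The optional second argument is a path prefix inside the archive,
# for example "airgap-resources/".
check_airgap_archive() {
  local archive="$1" prefix="${2:-}" listing entry
  listing="$(tar -tzf "$archive")" || return 1
  for entry in clusterctl clusterctl.yaml; do
    if ! printf '%s\n' "$listing" | grep -qxF "${prefix}${entry}"; then
      echo "missing: ${prefix}${entry}" >&2
      return 1
    fi
  done
  if ! printf '%s\n' "$listing" | grep -q "^${prefix}overrides/"; then
    echo "missing: ${prefix}overrides/" >&2
    return 1
  fi
}

# Example:
#   check_airgap_archive airgap-resources.tar.gz && echo "archive looks complete"
```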
29.5 Image creation #
Once the directory structure is prepared following the previous sections (for both connected and air-gap scenarios), run the following command to build the image:
podman run --rm --privileged -it -v $PWD:/eib \
 registry.suse.com/edge/edge-image-builder:1.0.2 \
 build --definition-file mgmt-cluster.yaml
This creates the ISO output image file that, in our case, based on the image definition described above, is eib-mgmt-cluster-image.iso.
29.6 Provision the management cluster #
The previous image contains all components explained above, and it can be used to provision the management cluster using a virtual machine or a bare-metal server (using the virtual-media feature).