Applies to SUSE CaaS Platform 4.2.3

3 Bootstrapping the Cluster

Bootstrapping the cluster is the initial process of starting up the cluster and defining which of the nodes are masters and which are workers. For maximum automation of this process, SUSE CaaS Platform uses the skuba package.

3.1 Preparation

3.1.1 Install skuba

First you need to install skuba on a management machine, like your local workstation:

  1. Add the SLE15 SP1 extension containing skuba. This also requires the "containers" module.

    SUSEConnect -p sle-module-containers/15.1/x86_64
    SUSEConnect -p caasp/4.0/x86_64 -r <PRODUCT_KEY>
  2. Install the management pattern with:

    zypper in -t pattern SUSE-CaaSP-Management
Tip

Example deployment configuration files for each deployment scenario are installed under /usr/share/caasp/terraform/ or, in the case of the bare metal deployment, under /usr/share/caasp/autoyast/.

3.1.2 Container Runtime Proxy

Important

CRI-O proxy settings must be adjusted manually on all nodes before joining the cluster!

In some environments you must configure the container runtime to access the container registries through a proxy. In this case, please refer to: SUSE CaaS Platform Admin Guide: Configuring HTTP/HTTPS Proxy for CRI-O
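
A minimal sketch of what such a proxy configuration can look like, using a generic systemd drop-in for the crio service. The proxy addresses below are placeholders and this is only an illustration; the authoritative procedure is the one in the Admin Guide linked above:

# Create a systemd drop-in so CRI-O pulls images through the proxy
sudo mkdir -p /etc/systemd/system/crio.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/crio.service.d/proxy.conf
[Service]
# Replace the example proxy URLs and exclusion list with your own values
Environment="HTTP_PROXY=http://proxy.example.com:3128"
Environment="HTTPS_PROXY=http://proxy.example.com:3128"
Environment="NO_PROXY=localhost,127.0.0.1,.example.com"
EOF
sudo systemctl daemon-reload
sudo systemctl restart crio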

3.2 Cluster Deployment

Make sure you have added the SSH identity (corresponding to the public SSH key distributed above) to the ssh-agent on your workstation. For instructions on how to add the SSH identity, refer to Section 2.1.1, “Basic SSH Key Configuration”.

This is a requirement for skuba (https://github.com/SUSE/skuba#prerequisites).
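
For example, assuming the key pair lives at the default location ~/.ssh/id_rsa, the identity can be loaded like this:

eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
# verify that the identity is available to skuba
ssh-add -l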

By default skuba connects to the nodes as root user. A different user can be specified by the following flags:

--sudo --user <USERNAME>
Important

You must configure sudo for the user to be able to authenticate without a password. Replace <USERNAME> with the user you created during installation. As root, run:

echo "<USERNAME> ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers

3.2.1 Initializing the Cluster

Warning: Secure configuration files access

The directory created during this step contains configuration files that allow full administrator access to your cluster. Apply best practices for access control to this folder.

Now you can initialize the cluster on the deployed machines. As --control-plane, enter the IP/FQDN of your load balancer. If you do not use a load balancer, use your first master node.

Important

If you are deploying on a cloud provider, you must enable the vendor-specific integrations (CPI). Please refer to Section 3.2.3.1, “Enabling Cloud Provider Integration” below.

skuba cluster init --control-plane <LB_IP/FQDN> my-cluster

cluster init generates the folder named my-cluster and initializes the directory that will hold the configuration (kubeconfig) for the cluster.
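
As mentioned in the warning above, restrict access to this folder. One simple approach, assuming a single administrative user on the management machine, is:

chmod 700 my-cluster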

Important

The IP/FQDN must be reachable by every node of the cluster and therefore 127.0.0.1/localhost cannot be used.

3.2.1.1 Transitioning from Docker to CRI-O

The default configuration of SUSE CaaS Platform 4.2.3 uses the CRI-O container engine in conjunction with Docker's Linux capabilities. This means SUSE CaaS Platform 4.2.3 containers run on top of CRI-O with the following additional Linux capabilities: audit_write, setfcap, and mknod. This measure ensures a transparent transition, seamless compatibility with workloads running on previous SUSE CaaS Platform versions, and out-of-the-box Docker compatibility.

If you wish to use unmodified CRI-O, use the --strict-capability-defaults option when you run skuba cluster init during the initial setup, which will create the vanilla CRI-O configuration:

skuba cluster init --strict-capability-defaults

Please be aware that this might result in incompatibility with your previously running workloads, unless you explicitly define the additional Linux capabilities required on top of CRI-O defaults.

Important

After the bootstrap of the Kubernetes cluster there will be no easy way to revert this modification. Please choose wisely.

3.2.2 Configuring Kubernetes Services

Inspect the kubeadm-init.conf file inside your cluster definition and set extra configuration settings supported by kubeadm. The latest supported version is v1beta1. Later, when you run skuba node bootstrap, kubeadm will read kubeadm-init.conf and will forcefully set certain settings to the ones required by SUSE CaaS Platform.
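
As a hedged illustration, extra settings are added as regular kubeadm v1beta1 fields. The audit-log-maxage argument below is only an example of such an extra setting, not a SUSE CaaS Platform requirement; keep the rest of the generated file unchanged:

# Excerpt of the ClusterConfiguration document in kubeadm-init.conf
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
apiServer:
  extraArgs:
    audit-log-maxage: "30"   # example extra kube-apiserver argument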

3.2.2.1 Network Settings

The default network settings inside kubeadm-init.conf are viable for production clusters, and adjusting them is optional. If, however, you wish to change the pod and service subnets, it is important that you do so before the bootstrap. The subnet ranges must be planned carefully, because the settings cannot be adjusted after deployment is complete. The default settings are the following:

networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12

The podSubnet IP range must be big enough to contain all IP addresses for all pods planned for the cluster. The subnet also must not conflict with services from outside the cluster, such as external databases or file services. The same holds for serviceSubnet: the IP range must not conflict with external services and needs to be broad enough for all services planned for the cluster.
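
For example, to use custom ranges you would edit the networking section of kubeadm-init.conf before the bootstrap. The CIDRs below are placeholders chosen only so that they do not overlap with each other; pick ranges that fit your own infrastructure:

networking:
  podSubnet: 10.128.0.0/16
  serviceSubnet: 10.112.0.0/12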

3.2.3 Cluster Configuration

Before bootstrapping the cluster, it is advisable to perform some additional configuration.

3.2.3.1 Enabling Cloud Provider Integration

Enable cloud provider integration to take advantage of the underlying cloud platforms and automatically manage resources like the Load Balancer, Nodes (Instances), Network Routes and Storage services.

If you want to enable cloud provider integration with different cloud platforms, initialize the cluster with the flag --cloud-provider <CLOUD PROVIDER>. The only currently available options are openstack, aws and vsphere, but more options are planned.

Important: Cleanup

By enabling CPI providers, your Kubernetes cluster will be able to provision cloud resources on its own (e.g. load balancers, persistent volumes). You will have to clean up these resources manually before you destroy the cluster with Terraform.

Not removing resources like Load Balancers created by the CPI will result in Terraform timing out during destroy operations.

Persistent volumes created with the retain policy will exist inside of the external cloud infrastructure even after the cluster is removed.

3.2.3.1.1 OpenStack CPI

Define the cluster using the following command:

skuba cluster init --control-plane <LB_IP/FQDN> --cloud-provider openstack my-cluster

Running the above command will create a directory my-cluster/cloud/openstack with a README.md and an openstack.conf.template in it. Copy openstack.conf.template to openstack.conf, or create an openstack.conf file inside my-cluster/cloud/openstack from scratch, according to the supported format. The supported format and content can be found in the official Kubernetes documentation:

https://v1-17.docs.kubernetes.io/docs/concepts/cluster-administration/cloud-providers/#openstack

Warning

The file my-cluster/cloud/openstack/openstack.conf must not be freely accessible. Please remember to set proper file permissions for it, for example 600.
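
For example, assuming you start from the generated template:

cd my-cluster/cloud/openstack
cp openstack.conf.template openstack.conf
chmod 600 openstack.conf
# now edit openstack.conf and fill in the values described below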

3.2.3.2 Example OpenStack Cloud Provider Configuration

You can find the required parameters in OpenStack RC File v3.
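
For example, assuming your RC file is named <PROJECT_NAME>-openrc.sh, you can source it and read the OS_* variables referenced in the configuration below:

source <PROJECT_NAME>-openrc.sh
echo "$OS_AUTH_URL" "$OS_PROJECT_ID" "$OS_REGION_NAME"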

[Global]
auth-url=<OS_AUTH_URL> 1
username=<OS_USERNAME> 2
password=<OS_PASSWORD> 3
tenant-id=<OS_PROJECT_ID> 4
domain-name=<OS_USER_DOMAIN_NAME> 5
region=<OS_REGION_NAME> 6
ca-file="/etc/pki/trust/anchors/SUSE_Trust_Root.pem" 7
[LoadBalancer]
lb-version=v2 8
subnet-id=<PRIVATE_SUBNET_ID> 9
floating-network-id=<PUBLIC_NET_ID> 10
create-monitor=yes 11
monitor-delay=1m 12
monitor-timeout=30s 13
monitor-max-retries=3 14
[BlockStorage]
bs-version=v2 15
ignore-volume-az=true 16

1

(required) Specifies the URL of the Keystone API used to authenticate the user. This value can be found in Horizon (the OpenStack control panel) under Project > Access and Security > API Access > Credentials.

2

(required) Refers to the username of a valid user set in Keystone.

3

(required) Refers to the password of a valid user set in Keystone.

4

(required) Used to specify the ID of the project where you want to create your resources.

5

(optional) Used to specify the name of the domain your user belongs to.

6

(optional) Used to specify the identifier of the region to use when running on a multi-region OpenStack cloud. A region is a general division of an OpenStack deployment.

7

(optional) Used to specify the path to your custom CA file.

8

(optional) Used to override automatic version detection. Valid values are v1 or v2. When no value is provided, automatic detection will select the highest supported version exposed by the underlying OpenStack cloud.

9

(optional) Used to specify the ID of the subnet you want to create your load balancer on. Can be found at Network > Networks. Click on the respective network to get its subnets.

10

(optional) If specified, will create a floating IP for the load balancer.

11

(optional) Indicates whether or not to create a health monitor for the Neutron load balancer. Valid values are true and false. The default is false. When true is specified then monitor-delay, monitor-timeout, and monitor-max-retries must also be set.

12

(optional) The time between sending probes to members of the load balancer. Ensure that you specify a valid time unit.

13

(optional) Maximum time for a monitor to wait for a ping reply before it times out. The value must be less than the delay value. Ensure that you specify a valid time unit.

14

(optional) Number of permissible ping failures before changing the load balancer member’s status to INACTIVE. Must be a number between 1 and 10.

15

(optional) Used to override automatic version detection. Valid values are v1, v2, v3 and auto. When auto is specified, automatic detection will select the highest supported version exposed by the underlying OpenStack cloud.

16

(optional) Influences availability zone use when attaching Cinder volumes. When Nova and Cinder have different availability zones, this should be set to true.

After setting options in the openstack.conf file, please proceed with Section 3.2.5, “Cluster Bootstrap”.

Important

When cloud provider integration is enabled, it is very important to bootstrap and join nodes with the same node names that they have inside OpenStack, as these names will be used by the OpenStack cloud controller manager to reconcile node metadata.

3.2.3.2.1 Amazon Web Services (AWS) CPI

Define the cluster using the following command:

skuba cluster init --control-plane <LB_IP/FQDN> --cloud-provider aws my-cluster

Running the above command will create a directory my-cluster/cloud/aws with a README.md file in it. No further configuration files are needed.

The supported format and content can be found in the official Kubernetes documentation.

Important

When cloud provider integration is enabled, it’s very important to bootstrap and join nodes with the same node names that they have inside AWS, as these names will be used by the AWS cloud controller manager to reconcile node metadata.

You can use the "private dns" values provided by the Terraform output.
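
For example, a sketch assuming you deployed with the provided Terraform configuration; the exact name of the output containing the private DNS values depends on that configuration:

cd <PATH_TO_AWS_TERRAFORM_DEPLOYMENT>
terraform output
# use the listed private DNS names as the <NODE_NAME> values when bootstrapping and joining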

3.2.3.2.2 vSphere In-tree CPI (VCP)

Define the cluster using the following command:

skuba cluster init --control-plane <LB_IP/FQDN> --cloud-provider vsphere my-cluster

Running the above command will create a directory my-cluster/cloud/vsphere with a README.md and a vsphere.conf.template in it. Copy vsphere.conf.template to vsphere.conf, or create a vsphere.conf file inside my-cluster/cloud/vsphere from scratch, according to the supported format.

The supported format and content can be found in the official Kubernetes documentation.

Warning

The file my-cluster/cloud/vsphere/vsphere.conf must not be freely accessible. Please remember to set proper file permissions for it, for example 600.
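
For example, assuming you start from the generated template:

cd my-cluster/cloud/vsphere
cp vsphere.conf.template vsphere.conf
chmod 600 vsphere.conf
# now edit vsphere.conf and fill in the values described below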

3.2.3.3 Example vSphere Cloud Provider Configuration

[Global]
user = "<VC_ADMIN_USERNAME>" 1
password = "<VC_ADMIN_PASSWORD>" 2
port = "443" 3
insecure-flag = "1" 4
[VirtualCenter "<VC_IP_OR_FQDN>"] 5
datacenters = "<VC_DATACENTERS>" 6
[Workspace]
server = "<VC_IP_OR_FQDN>" 7
datacenter = "<VC_DATACENTER>" 8
default-datastore = "<VC_DATASTORE>" 9
resourcepool-path = "<VC_RESOURCEPOOL_PATH>" 10
folder = "<VC_VM_FOLDER>" 11
[Disk]
scsicontrollertype = pvscsi 12
[Network]
public-network = "VM Network" 13
[Labels] 14
region = "<VC_DATACENTER_TAG>" 15
zone = "<VC_CLUSTER_TAG>" 16

1

(required) Refers to the vCenter username for vSphere cloud provider to authenticate with.

2

(required) Refers to the vCenter password for vCenter user specified with user.

3

(optional) The vCenter Server Port. The default is 443 if not specified.

4

(optional) Set to 1 if vCenter uses a self-signed certificate.

5

(required) The IP address of the vCenter server.

6

(required) The datacenter name in vCenter where Kubernetes nodes reside.

7

(required) The IP address of the vCenter server for storage provisioning. Usually the same as VirtualCenter.

8

(required) The datacenter to provision temporary VMs for volume provisioning.

9

(required) The default datastore to provision temporary VMs for volume provisioning.

10

(required) The resource pool to provision temporary VMs for volume provisioning.

11

(required) The vCenter VM folder that the Kubernetes nodes are in.

12

(required) Defines the SCSI controller in use on the VMs. Almost always set to pvscsi.

13

(optional) The network in vCenter where Kubernetes nodes should join. The default is "VM Network" if not specified.

14

(optional) The feature flag for zone and region support.

Important

The zone and region tags must exist and be assigned to the datacenter and cluster before bootstrap. For instructions on how to tag zones and regions, refer to: https://vmware.github.io/vsphere-storage-for-kubernetes/documentation/zones.html#tag-zones-and-regions-in-vcenter.

15

(optional) The category name of the tag assigned to the vCenter datacenter.

16

(optional) The category name of the tag assigned to the vCenter cluster.

After setting options in the vsphere.conf file, please proceed with Section 3.2.5, “Cluster Bootstrap”.

Important: Set vSphere virtual machine hostnames

When cloud provider integration is enabled, it is very important to bootstrap and join nodes with node names that are the same as the vSphere virtual machines' hostnames. These names will be used by the vSphere cloud controller manager to reconcile node metadata.

Important: Enable disk.EnableUUID

Each virtual machine must have disk.EnableUUID enabled to successfully mount its virtual disks.

Clusters provisioned following Deploying VMs from the Template with cpi_enable = true automatically have disk.EnableUUID enabled.

For clusters provisioned by any other method, ensure virtual machines are set to use disk.EnableUUID.

For more information, refer to: Configure Kubernetes Cluster Virtual Machines.
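
A hedged sketch using the govc CLI (not shipped with SUSE CaaS Platform; its use here is an assumption) to check and enable the setting, with GOVC_URL, GOVC_USERNAME and GOVC_PASSWORD exported for your vCenter; the VM path is a placeholder:

# show the current extra configuration of the VM
govc vm.info -e "<VC_VM_FOLDER>/<VM_NAME>" | grep -i enableuuid
# enable disk.EnableUUID (change it while the VM is powered off)
govc vm.change -vm "<VC_VM_FOLDER>/<VM_NAME>" -e disk.enableUUID=1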

Important: Create a Folder For Your Virtual Machines

All virtual machines must exist in a folder, and you must provide the name of that folder as the folder variable in vsphere.conf before bootstrap.

For clusters provisioned following Deploying VMs from the Template with cpi_enable = true, all cluster node virtual machines are automatically created and placed inside a *-cluster folder.

For clusters provisioned by any other method, make sure to create a folder and move all cluster node virtual machines into it.
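
A hedged sketch, again using govc as an assumed tool, to create such a folder and move existing node VMs into it; datacenter, folder and VM names are placeholders:

# create the VM folder and move each cluster node VM into it
govc folder.create "/<VC_DATACENTER>/vm/<VC_VM_FOLDER>"
govc object.mv "/<VC_DATACENTER>/vm/<NODE_VM_NAME>" "/<VC_DATACENTER>/vm/<VC_VM_FOLDER>"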

3.2.3.4 Integrate External LDAP TLS

  1. Based on the manifest in my-cluster/addons/dex/base/dex.yaml, provide a kustomize patch in my-cluster/addons/dex/patches/custom.yaml, in the form of a strategic merge patch or a JSON 6902 patch.

  2. Adapt the ConfigMap by adding the LDAP configuration to the connectors section of the custom.yaml file. For detailed configurations for the LDAP connector, refer to https://github.com/dexidp/dex/blob/v2.23.0/Documentation/connectors/ldap.md.

Read https://kubernetes-sigs.github.io/kustomize/api-reference/glossary/#patchstrategicmerge and https://kubernetes-sigs.github.io/kustomize/api-reference/glossary/#patchjson6902 to get more information.

# Example LDAP connector

connectors:
- type: ldap
  id: 389ds
  name: 389ds
  config:
    host: ldap.example.org:636 1 2
    rootCAData: <BASE64_ENCODED_PEM_FILE> 3
    bindDN: cn=user-admin,ou=Users,dc=example,dc=org 4
    bindPW: <BIND_DN_PASSWORD> 5
    usernamePrompt: Email Address 6
    userSearch:
      baseDN: ou=Users,dc=example,dc=org 7
      filter: "(objectClass=person)" 8
      username: mail 9
      idAttr: DN 10
      emailAttr: mail 11
      nameAttr: cn 12

1

Host name of LDAP server reachable from the cluster.

2

The port on which to connect to the host (for example StartTLS: 389, TLS: 636).

3

LDAP server base64 encoded root CA certificate file (for example cat <root-ca-pem-file> | base64 | awk '{print}' ORS='' && echo)

4

Bind DN of user that can do user searches.

5

Password of the user.

6

Label of LDAP attribute users will enter to identify themselves (for example username).

7

BaseDN where users are located (for example ou=Users,dc=example,dc=org).

8

Filter to specify type of user objects (for example "(objectClass=person)").

9

Attribute users will enter to identify themselves (for example mail).

10

Attribute used to identify user within the system (for example DN).

11

Attribute containing the user’s email.

12

Attribute used as username within OIDC tokens.

Besides the LDAP connector you can also set up other connectors. For additional connectors, refer to the available connector configurations in the Dex repository: https://github.com/dexidp/dex/tree/v2.16.0/Documentation/connectors.
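
If you adjust the Dex configuration again after the cluster has been bootstrapped, the patched addon can presumably be re-applied with kustomize, analogous to the kured example shown below:

kubectl apply -k my-cluster/addons/dex/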

3.2.3.5 Prevent Nodes Running Special Workloads from Being Rebooted

Some nodes might run specially treated workloads (pods).

To prevent downtime of those workloads and the respective node, it is possible to select such pods via --blocking-pod-selector=<POD_NAME>. Any node running this workload will not be rebooted via kured and needs to be rebooted manually.

  1. Based on the manifest in my-cluster/addons/kured/base/kured.yaml, provide a kustomize patch in my-cluster/addons/kured/patches/custom.yaml, in the form of a strategic merge patch or a JSON 6902 patch. Read https://github.com/kubernetes-sigs/kustomize/blob/master/docs/glossary.md#patchstrategicmerge and https://github.com/kubernetes-sigs/kustomize/blob/master/docs/glossary.md#patchjson6902 to get more information.

  2. Adapt the DaemonSet by adding one of the following flags to the command section of the kured container:

    ---
    apiVersion: apps/v1
    kind: DaemonSet
    ...
    spec:
      ...
        ...
          ...
          containers:
            ...
              command:
                - /usr/bin/kured
                - --blocking-pod-selector=name=<POD_NAME>

You can add any key/value labels to this selector:

--blocking-pod-selector=<LABEL_KEY_1>=<LABEL_VALUE_1>,<LABEL_KEY_2>=<LABEL_VALUE_2>

Alternatively, you can also adapt the kured DaemonSet later at runtime (after bootstrap) by editing my-cluster/addons/kured/patches/custom.yaml and executing:

kubectl apply -k my-cluster/addons/kured/

This will restart all kured pods with the additional configuration flags.

3.2.4 Prevent Nodes with Any Prometheus Alerts from Being Rebooted

Note

By default, any Prometheus alert blocks a node from rebooting. However, you can filter specific alerts to be ignored via the --alert-filter-regexp flag.

  1. Based on the manifest in my-cluster/addons/kured/base/kured.yaml, provide a kustomize patch in my-cluster/addons/kured/patches/custom.yaml, in the form of a strategic merge patch or a JSON 6902 patch. Read https://github.com/kubernetes-sigs/kustomize/blob/master/docs/glossary.md#patchstrategicmerge and https://github.com/kubernetes-sigs/kustomize/blob/master/docs/glossary.md#patchjson6902 to get more information.

  2. Adapt the DaemonSet by adding one of the following flags to the command section of the kured container:

    ---
    apiVersion: apps/v1
    kind: DaemonSet
    ...
    spec:
      ...
        ...
          ...
          containers:
            ...
              command:
                - /usr/bin/kured
                - --prometheus-url=<PROMETHEUS_SERVER_URL>
                - --alert-filter-regexp=^(RebootRequired|AnotherBenignAlert|...)$
Important

The <PROMETHEUS_SERVER_URL> needs to contain the protocol (http:// or https://).

Alternatively, you can also adapt the kured DaemonSet later at runtime (after bootstrap) by editing my-cluster/addons/kured/patches/custom.yaml and executing:

kubectl apply -k my-cluster/addons/kured/

This will restart all kured pods with the additional configuration flags.

3.2.5 Cluster Bootstrap

  1. Switch to the new directory.

  2. Now bootstrap a master node. For --target enter the FQDN of your first master node. Replace <NODE_NAME> with a unique identifier, for example, "master-one".

    Note: Log retention

    By default skuba will only display the events of the bootstrap process in the terminal during execution. The examples in the following sections will use the tee tool to store a copy of the outputs in a file of your choosing.

    For more information on the different logging approaches utilized by SUSE CaaS Platform components please refer to: SUSE CaaS Platform - Admin Guide: Logging.

    Tip: Custom Trusted CA Certificate

    During cluster bootstrap, skuba automatically generates a root CA certificate. You can however also deploy the Kubernetes cluster with your own custom root CA certificate.

    Please refer to the SUSE CaaS Platform Administration Guide for more information on custom certificates.

    Warning

    Please plan carefully when deploying with a custom root CA certificate. This certificate cannot be reconfigured once deployed, and replacing it requires a full re-installation of the cluster.

    cd my-cluster
    skuba node bootstrap --user sles --sudo --target <IP/FQDN> <NODE_NAME> | tee <NODE_NAME>-skuba-node-bootstrap.log

    This will bootstrap the specified node as the first master in the cluster. The process will generate authentication certificates and the admin.conf file that is used for authentication against the cluster. The files will be stored in the my-cluster directory specified in step one.

  3. Add additional master nodes to the cluster.

    Replace <IP/FQDN> with the IP or FQDN of the machine. Replace <NODE_NAME> with a unique identifier, for example, "master-two".

    skuba node join --role master --user sles --sudo --target <IP/FQDN> <NODE_NAME> | tee <NODE_NAME>-skuba-node-join.log
  4. Add a worker to the cluster:

    Replace <IP/FQDN> with the IP or FQDN of the machine. Replace <NODE_NAME> with a unique identifier, for example, "worker-one".

    skuba node join --role worker --user sles --sudo --target <IP/FQDN> <NODE_NAME> | tee <NODE_NAME>-skuba-node-join.log
  5. Verify that the nodes have been added:

    skuba cluster status

    The output should look like this:

    NAME      STATUS    ROLE     OS-IMAGE                              KERNEL-VERSION           KUBELET-VERSION   CONTAINER-RUNTIME   HAS-UPDATES   HAS-DISRUPTIVE-UPDATES   CAASP-RELEASE-VERSION
    master0   Ready     master   SUSE Linux Enterprise Server 15 SP1   4.12.14-197.29-default   v1.16.2           cri-o://1.16.1      no            no                       4.2.1
    master1   Ready     master   SUSE Linux Enterprise Server 15 SP1   4.12.14-197.29-default   v1.16.2           cri-o://1.16.1      no            no                       4.2.1
    master2   Ready     master   SUSE Linux Enterprise Server 15 SP1   4.12.14-197.29-default   v1.16.2           cri-o://1.16.1      no            no                       4.2.1
    worker0   Ready     worker   SUSE Linux Enterprise Server 15 SP1   4.12.14-197.29-default   v1.16.2           cri-o://1.16.1      no            no                       4.2.1
    worker1   Ready     worker   SUSE Linux Enterprise Server 15 SP1   4.12.14-197.29-default   v1.16.2           cri-o://1.16.1      no            no                       4.2.1
    worker2   Ready     worker   SUSE Linux Enterprise Server 15 SP1   4.12.14-197.29-default   v1.16.2           cri-o://1.16.1      no            no                       4.2.1
Important

The IP/FQDN must be reachable by every node of the cluster and therefore 127.0.0.1/localhost cannot be used.

3.3 Using kubectl

You can install and use kubectl by installing the kubernetes-client package from the SUSE CaaS Platform extension.

sudo zypper in kubernetes-client
Tip

Alternatively you can install from upstream: https://v1-17.docs.kubernetes.io/docs/tasks/tools/install-kubectl/.

To talk to your cluster, you must be in the my-cluster directory when running commands so it can find the admin.conf file.

Tip: Setting up kubeconfig

To make usage of Kubernetes tools easier, you can store a copy of the admin.conf file as kubeconfig.

mkdir -p ~/.kube
cp admin.conf ~/.kube/config
Warning

The configuration file contains sensitive information and must be handled in a secure fashion. Copying it to a shared user directory might grant access to unwanted users.
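
For example, you can restrict the copied file to your own user:

chmod 600 ~/.kube/config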

You can run commands against your cluster like usual. For example:

  • kubectl get nodes -o wide

    or

  • kubectl get pods --all-namespaces

    # kubectl get pods --all-namespaces
    
    NAMESPACE     NAME                                    READY     STATUS    RESTARTS   AGE
    kube-system   coredns-86c58d9df4-5zftb                1/1       Running   0          2m
    kube-system   coredns-86c58d9df4-fct4m                1/1       Running   0          2m
    kube-system   etcd-my-master                          1/1       Running   0          1m
    kube-system   kube-apiserver-my-master                1/1       Running   0          1m
    kube-system   kube-controller-manager-my-master       1/1       Running   0          1m
    kube-system   cilium-operator-7d6ddddbf5-dmbhv        1/1       Running   0          51s
    kube-system   cilium-qjt9h                            1/1       Running   0          53s
    kube-system   cilium-szkqc                            1/1       Running   0          2m
    kube-system   kube-proxy-5qxnt                        1/1       Running   0          2m
    kube-system   kube-proxy-746ws                        1/1       Running   0          53s
    kube-system   kube-scheduler-my-master                1/1       Running   0          1m
    kube-system   kured-ztnfj                             1/1       Running   0          2m
    kube-system   kured-zv696                             1/1       Running   0          2m
    kube-system   oidc-dex-55fc689dc-b9bxw                1/1       Running   0          2m
    kube-system   oidc-gangway-7b7fbbdbdf-ll6l8           1/1       Running   0          2m