Applies to SUSE CaaS Platform 4.2.2

2 Deployment Instructions

Important

If you are installing over one of the previous milestones, you must remove the RPM repository. SUSE CaaS Platform is now distributed as an extension for SUSE Linux Enterprise and no longer requires the separate repository.

If you do not remove the repository before installation, there might be conflicts with the package dependencies that could render your installation nonfunctional.

Note

Due to a naming convention conflict, all versions of SUSE CaaS Platform 4.x will be released in the 4.0 module.

2.1 Deployment Preparations

In order to deploy SUSE CaaS Platform you need a workstation running SUSE Linux Enterprise 15 SP1 or a comparable openSUSE release. This workstation is called the "Management machine". Important files are generated and must be maintained on this machine, but it is not a member of the SUSE CaaS Platform cluster.

2.1.1 Basic SSH Key Configuration

In order to successfully deploy SUSE CaaS Platform, you need to have SSH keys loaded into an SSH agent, because the installation tools skuba and terraform rely on it.

Note

The use of ssh-agent comes with some implications for security that you should take into consideration.

The pitfalls of using ssh-agent

To avoid these risks, please make sure to either use ssh-agent -t <TIMEOUT> and specify a time after which the agent will self-terminate, or terminate the agent yourself before logging out by running ssh-agent -k.

To log in to the created cluster nodes from the Management machine, you need to configure an SSH key pair. This key pair needs to be trusted by the user account you will use to log in to each cluster node; that user is called "sles" by default. In order to use the installation tools terraform and skuba, this trusted key pair must be loaded into the SSH agent.

  1. If you do not have an existing ssh keypair to use, run:

    ssh-keygen -t ecdsa
  2. The ssh-agent or a compatible program is sometimes started automatically by graphical desktop environments. If that is not your situation, run:

    eval "$(ssh-agent)"

    This will start the agent and set environment variables used for agent communication within the current session. This has to be the same terminal session that you run the skuba commands in. A new terminal usually requires a new ssh-agent. In some desktop environments the ssh-agent will also automatically load the SSH keys. To add an SSH key manually, use the ssh-add command:

    ssh-add <PATH_TO_KEY>
    Tip

    If you are adding the SSH key manually, specify the full path. For example: /home/sles/.ssh/id_rsa

You can load multiple keys into your agent using the ssh-add <PATH_TO_KEY> command. Keys should be password protected as a security measure. The ssh-add command will prompt for your password, then the agent caches the decrypted key material for a configurable lifetime. The -t lifetime option to ssh-add specifies a maximum time to cache the specific key. See man ssh-add for more information.
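
To see which identities are currently loaded into the agent, you can list them at any time:

ssh-add -l

Each line shows the key size, fingerprint, and comment of a loaded key. ssh-add -D removes all loaded identities from the agent.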

Warning: Specify a key expiration time

The ssh key is decrypted when loaded into the key agent. Though the key itself is not accessible from the agent, anyone with access to the agent’s control socket file can use the private key contents to impersonate the key owner. By default, socket access is limited to the user who launched the agent. Nonetheless, it is good security practice to specify an expiration time for the decrypted key using the -t option. For example: ssh-add -t 1h30m $HOME/.ssh/id.ecdsa would expire the decrypted key in 1.5 hours. Alternatively, ssh-agent can also be launched with -t to specify a default timeout. For example: eval $( ssh-agent -t 120s ) would default to a two minute (120 second) timeout for keys added. If timeouts are specified for both programs, the timeout from ssh-add is used. See man ssh-agent and man ssh-add for more information.

Note: Usage of multiple identities with ssh-agent

Skuba will try all the identities loaded into the ssh-agent until one of them grants access to the node, or until the SSH server’s maximum authentication attempts are exhausted. This could lead to undesired messages in SSH or other security/authentication logs on your local machine.

2.1.1.1 Forwarding the Authentication Agent Connection

It is also possible to forward the authentication agent connection from one host to another, which can be useful if you intend to run skuba on a "jump host" and do not want to copy your private key to that host. This can be achieved using the ssh -A command. Please refer to the man page of ssh to learn about the security implications of using this feature.
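
For example, where <USER> and <JUMP_HOST> are placeholders for your jump host login, you can connect with agent forwarding and confirm that your keys are visible there:

ssh -A <USER>@<JUMP_HOST>
ssh-add -l

If ssh-add -l on the jump host lists your keys, skuba and terraform can be run there without a local copy of the private key.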

2.1.2 Registration Code

Note

The registration code for SUSE CaaS Platform 4 also contains the activation permissions for the underlying SUSE Linux Enterprise operating system. You can use your SUSE CaaS Platform registration code to activate the SUSE Linux Enterprise 15 SP1 subscription during installation.

You need a subscription registration code to use SUSE CaaS Platform. You can retrieve your registration code from SUSE Customer Center.

  • Log in to https://scc.suse.com

  • Navigate to MY ORGANIZATIONS → <YOUR ORG>

  • Select the Subscriptions tab from the menu bar at the top

  • Search for "CaaS Platform"

  • Select the version you wish to deploy (should be the highest available version)

  • Click on the Link in the Name column

  • The registration code should be displayed as the first line under "Subscription Information"

Tip

If you cannot find SUSE CaaS Platform in the list of subscriptions, please contact your local administrator responsible for software subscriptions or SUSE support.

2.1.3 Unique Machine IDs

During deployment of the cluster nodes, each machine will be assigned a unique ID in the /etc/machine-id file by Terraform or AutoYaST. If you are using any (semi-)manual method of deployment that involves cloning machines and deploying from templates, you must make sure to delete this file before creating the template.

If two nodes are deployed with the same machine-id, they will not be correctly recognized by skuba.

Important: Regenerating Machine ID

In case you are not using Terraform or AutoYaST you must regenerate machine IDs manually.

During the template preparation you will have removed the machine ID from the template image. This ID is required for proper functionality in the cluster and must be (re-)generated on each machine.

Log in to each virtual machine created from the template and run:

rm /etc/machine-id
dbus-uuidgen --ensure
systemd-machine-id-setup
systemctl restart systemd-journald

This will regenerate the machine id values for DBUS (/var/lib/dbus/machine-id) and systemd (/etc/machine-id) and restart the logging service to make use of the new IDs.
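
To verify that no two nodes ended up with the same ID, you can compare the values across the cluster; a minimal check, where the node addresses below are placeholders for your actual nodes:

for node in <NODE_IP_1> <NODE_IP_2> <NODE_IP_3>; do
  ssh sles@${node} cat /etc/machine-id
done | sort | uniq -d

Any ID printed by this loop appears on more than one node and must be regenerated on the affected machines.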

2.1.4 Installation Tools

For any deployment type you will need skuba and Terraform. These packages are available from the SUSE CaaS Platform package sources. They are provided as an installation "pattern" that will install dependencies and other required packages in one simple step.

Access to the packages requires the SUSE CaaS Platform and Containers extension modules. Enable the modules during the operating system installation or activate them using SUSE Connect.

sudo SUSEConnect -r  <CAASP_REGISTRATION_CODE> 1
sudo SUSEConnect -p sle-module-containers/15.1/x86_64 2
sudo SUSEConnect -p caasp/4.0/x86_64 -r <CAASP_REGISTRATION_CODE> 3

1

Activate SUSE Linux Enterprise

2

Add the free Containers module

3

Add the SUSE CaaS Platform extension with your registration code

Install the required tools:

sudo zypper in -t pattern SUSE-CaaSP-Management

This will install the skuba command line tool and Terraform, as well as various default configurations and examples.
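
To confirm that the tools were installed correctly, you can, for example, print their versions:

skuba version
terraform version

Both commands should print version information without errors.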

Note: Using a Proxy Server

Sometimes you need a proxy server to be able to connect to the SUSE Customer Center. If you have not already configured a system-wide proxy, you can temporarily do so for the duration of the current shell session like this:

  1. Expose the environmental variable http_proxy:

    export http_proxy=http://<PROXY_IP_FQDN>:<PROXY_PORT>
  2. Replace <PROXY_IP_FQDN> with the IP address or fully qualified domain name (FQDN) of the proxy server and <PROXY_PORT> with its port.

  3. If you use a proxy server with basic authentication, create the file $HOME/.curlrc with the following content:

    --proxy-user "<USER>:<PASSWORD>"

    Replace <USER> and <PASSWORD> with the credentials of an allowed user for the proxy server, and consider limiting access to the file (chmod 0600).
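
    Depending on your environment you may also need the https_proxy and no_proxy variables; whether these are required is an assumption about your proxy setup, so adjust as needed:

    export https_proxy=http://<PROXY_IP_FQDN>:<PROXY_PORT>
    export no_proxy=localhost,127.0.0.1,<INTERNAL_DOMAINS>

    <INTERNAL_DOMAINS> is a placeholder for hosts and domains that should bypass the proxy.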

2.1.5 Load Balancer

Important

Setting up a load balancer is mandatory in any production environment.

SUSE CaaS Platform requires a load balancer to distribute workload between the deployed master nodes of the cluster. A failure-tolerant SUSE CaaS Platform cluster will always use more than one control plane node as well as more than one load balancer, so there isn’t a single point of failure.

There are many ways to configure a load balancer. This documentation cannot describe all possible combinations of load balancer configurations and thus does not aim to do so. Please apply your organization’s load balancing best practices.

For SUSE OpenStack Cloud, the Terraform configurations shipped with this version will automatically deploy a suitable load balancer for the cluster.

For bare metal, KVM, or VMware, you must configure a load balancer manually and allow it access to all master nodes created during Chapter 3, Bootstrapping the Cluster.

The load balancer should be configured before the actual deployment. It is needed during the cluster bootstrap, and also during upgrades. To simplify configuration, you can reserve the IPs needed for the cluster nodes and pre-configure these in the load balancer.

The load balancer needs access to port 6443 on the apiserver (all master nodes) in the cluster. It also needs access to Gangway port 32001 and Dex port 32000 on all master and worker nodes in the cluster for RBAC authentication.

We recommend performing regular HTTPS health checks on each master node’s /healthz endpoint to verify that the node is responsive. This is particularly important during upgrades, when a master node restarts the apiserver. During this rather short time window, all requests have to go to another master node’s apiserver. The master node that is being upgraded must be marked INACTIVE in the load balancer pool at least during the restart of the apiserver. We provide reasonable defaults for that in our default OpenStack load balancer Terraform configuration.
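
To check the health endpoint of an individual master node by hand, you can query it directly; a quick test, where <MASTER_IP> is a placeholder for one of your master nodes:

curl -k https://<MASTER_IP>:6443/healthz

A responsive apiserver answers with ok and HTTP status 200, which is what the health checks in the examples below expect.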

The following contains examples for possible load balancer configurations based on SUSE Linux Enterprise 15 SP1 and nginx or HAProxy.

Important

The load balancer should fulfill your RTO and fault tolerance requirements.

The level of redundancy and how to configure specific environment load balancers is beyond the scope of this document.

For production environments, we recommend the use of SUSE Linux Enterprise High Availability Extension 15.

2.1.5.1 HAProxy TCP Load Balancer with Active Checks

Warning: Package Support

HAProxy is available as a supported package with a SUSE Linux Enterprise High Availability Extension 15 subscription.

Alternatively, you can install HAProxy from SUSE Package Hub but you will not receive product support for this component.

HAProxy is a very powerful load balancer application which is suitable for production environments. Unlike the open source version of nginx shown in the example below, HAProxy supports active health checking, which is a vital function for reliable cluster health monitoring.

The version used at the time of writing is 1.8.7.

Important

The configuration of an HA cluster is out of the scope of this document.

The default mechanism is round-robin so each request will be distributed to a different server.

The health checks are executed every two seconds. If a connection fails, the check will be retried twice with a timeout of five seconds for each attempt. If no connection succeeds within this interval (2x5s), the node will be marked as DOWN and no traffic will be sent to it until the checks succeed again.

2.1.5.1.1 Configuring the Load Balancer
  1. Register SLES and enable the "Server Applications" module:

    SUSEConnect -r CAASP_REGISTRATION_CODE
    SUSEConnect --product sle-module-server-applications/15.1/x86_64
  2. Enable the source for the haproxy package:

    • If you are using the SUSE Linux Enterprise High Availability Extension

      SUSEConnect --product sle-ha/15.1/x86_64 -r ADDITIONAL_REGCODE
    • If you want the free (unsupported) package:

      SUSEConnect --product PackageHub/15.1/x86_64
  3. Configure /dev/log for HAProxy chroot (optional)

    This step is only required when HAProxy is configured to run in a jail directory (chroot). This is highly recommended since it increases the security of HAProxy.

    Since HAProxy is chrooted, it’s necessary to make the log socket available inside the jail directory so HAProxy can send logs to the socket.

    mkdir -p /var/lib/haproxy/dev/ && touch /var/lib/haproxy/dev/log

    This systemd service will take care of mounting the socket in the jail directory.

    cat > /etc/systemd/system/bindmount-dev-log-haproxy-chroot.service <<EOF
    [Unit]
    Description=Mount /dev/log in HAProxy chroot
    After=systemd-journald-dev-log.socket
    Before=haproxy.service
    
    [Service]
    Type=oneshot
    ExecStart=/bin/mount --bind /dev/log /var/lib/haproxy/dev/log
    
    [Install]
    WantedBy=multi-user.target
    EOF

    Enabling the service will make the changes persistent after a reboot.

    systemctl enable --now bindmount-dev-log-haproxy-chroot.service
  4. Install HAProxy:

    zypper in haproxy
  5. Write the configuration in /etc/haproxy/haproxy.cfg:

    Note

    Replace the individual <MASTER_XX_IP_ADDRESS> with the IP of your actual master nodes (one entry each) in the server lines. Feel free to leave the name argument in the server lines (master00, master01, and so on) as is; it only serves as a label that will show up in the HAProxy logs.

    global
      log /dev/log local0 info 1
      chroot /var/lib/haproxy 2
      user haproxy
      group haproxy
      daemon
    
    defaults
      mode       tcp
      log        global
      option     tcplog
      option     redispatch
      option     tcpka
      retries    2
      http-check     expect status 200 3
      default-server check check-ssl verify none
      timeout connect 5s
      timeout client 5s
      timeout server 5s
      timeout tunnel 86400s 4
    
    listen stats 5
      bind    *:9000
      mode    http
      stats   hide-version
      stats   uri       /stats
    
    listen apiserver 6
      bind   *:6443
      option httpchk GET /healthz
      server master00 <MASTER_00_IP_ADDRESS>:6443
      server master01 <MASTER_01_IP_ADDRESS>:6443
      server master02 <MASTER_02_IP_ADDRESS>:6443
    
    listen dex 7
      bind   *:32000
      option httpchk GET /healthz
      server master00 <MASTER_00_IP_ADDRESS>:32000
      server master01 <MASTER_01_IP_ADDRESS>:32000
      server master02 <MASTER_02_IP_ADDRESS>:32000
    
    listen gangway 8
      bind   *:32001
      option httpchk GET /
      server master00 <MASTER_00_IP_ADDRESS>:32001
      server master01 <MASTER_01_IP_ADDRESS>:32001
      server master02 <MASTER_02_IP_ADDRESS>:32001

    1

    Forward the logs to systemd journald, the log level can be set to debug to increase verbosity.

    2

    Define if it will run in a chroot.

    3

    The performed health checks will expect a 200 return code

    4

    This timeout is set to 24h in order to allow long connections when accessing pod logs or port forwarding.

    5

    URL to expose HAProxy stats on port 9000; it is accessible at http://loadbalancer:9000/stats

    6

    Kubernetes apiserver listening on port 6443, the checks are performed against https://MASTER_XX_IP_ADDRESS:6443/healthz

    7

    Dex listening on port 32000, it must be accessible through the load balancer for RBAC authentication, the checks are performed against https://MASTER_XX_IP_ADDRESS:32000/healthz

    8

    Gangway listening on port 32001, it must be accessible through the load balancer for RBAC authentication, the checks are performed against https://MASTER_XX_IP_ADDRESS:32001/

  6. Configure firewalld to open up ports 6443, 32000, and 32001. As root, run:

    firewall-cmd --zone=public --permanent --add-port=6443/tcp
    firewall-cmd --zone=public --permanent --add-port=32000/tcp
    firewall-cmd --zone=public --permanent --add-port=32001/tcp
    firewall-cmd --reload
  7. Start and enable HAProxy. As root, run:

    systemctl enable --now haproxy
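
    If HAProxy fails to start, you can first validate the configuration file syntax; a quick check, assuming the default configuration path:

    haproxy -c -f /etc/haproxy/haproxy.cfg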
2.1.5.1.2 Verifying the Load Balancer
Important

The SUSE CaaS Platform cluster must be up and running for this to produce any useful results. This step can only be performed after Chapter 3, Bootstrapping the Cluster is completed successfully.

To verify that the load balancer works, you can run a simple command to repeatedly retrieve cluster information from the master nodes. Each request should be forwarded to a different master node.

From your workstation, run:

while true; do skuba cluster status; sleep 1; done;

The skuba cluster status command should run without interruption.

On the load balancer virtual machine, check the logs to validate that each request is correctly distributed in a round robin way.

# journalctl -flu haproxy
haproxy[2525]: 10.0.0.47:59664 [30/Sep/2019:13:33:20.578] apiserver apiserver/master00 1/0/578 9727 -- 18/18/17/3/0 0/0
haproxy[2525]: 10.0.0.47:59666 [30/Sep/2019:13:33:22.476] apiserver apiserver/master01 1/0/747 9727 -- 18/18/17/7/0 0/0
haproxy[2525]: 10.0.0.47:59668 [30/Sep/2019:13:33:24.522] apiserver apiserver/master02 1/0/575 9727 -- 18/18/17/7/0 0/0
haproxy[2525]: 10.0.0.47:59670 [30/Sep/2019:13:33:26.386] apiserver apiserver/master00 1/0/567 9727 -- 18/18/17/3/0 0/0
haproxy[2525]: 10.0.0.47:59678 [30/Sep/2019:13:33:28.279] apiserver apiserver/master01 1/0/575 9727 -- 18/18/17/7/0 0/0
haproxy[2525]: 10.0.0.47:59682 [30/Sep/2019:13:33:30.174] apiserver apiserver/master02 1/0/571 9727 -- 18/18/17/7/0 0/0

2.1.5.2 Nginx TCP Load Balancer with Passive Checks

For TCP load balancing, we can use the ngx_stream_module module (available since version 1.9.0). In this mode, nginx will just forward the TCP packets to the master nodes.

The default mechanism is round-robin so each request will be distributed to a different server.

Warning

The open source version of Nginx referred to in this guide only allows the use of passive health checks. Nginx will mark a node as unresponsive only after a failed request. The original request is lost and not forwarded to an available alternative server.

This load balancer configuration is therefore only suitable for testing and proof-of-concept (POC) environments.

2.1.5.2.1 Configuring the Load Balancer
  1. Register SLES and enable the "Server Applications" module:

    SUSEConnect -r CAASP_REGISTRATION_CODE
    SUSEConnect --product sle-module-server-applications/15.1/x86_64
  2. Install Nginx:

    zypper in nginx
  3. Write the configuration in /etc/nginx/nginx.conf:

    user  nginx;
    worker_processes  auto;
    
    load_module /usr/lib64/nginx/modules/ngx_stream_module.so;
    
    error_log  /var/log/nginx/error.log;
    error_log  /var/log/nginx/error.log  notice;
    error_log  /var/log/nginx/error.log  info;
    
    events {
        worker_connections  1024;
        use epoll;
    }
    
    stream {
        log_format proxy '$remote_addr [$time_local] '
                         '$protocol $status $bytes_sent $bytes_received '
                         '$session_time "$upstream_addr"';
    
        error_log  /var/log/nginx/k8s-masters-lb-error.log;
        access_log /var/log/nginx/k8s-masters-lb-access.log proxy;
    
        upstream k8s-masters {
            #hash $remote_addr consistent; 1
            server master00:6443 weight=1 max_fails=2 fail_timeout=5s; 2
            server master01:6443 weight=1 max_fails=2 fail_timeout=5s;
            server master02:6443 weight=1 max_fails=2 fail_timeout=5s;
        }
        server {
            listen 6443;
            proxy_connect_timeout 5s;
            proxy_timeout 30s;
            proxy_pass k8s-masters;
        }
    
        upstream dex-backends {
            #hash $remote_addr consistent; 3
            server master00:32000 weight=1 max_fails=2 fail_timeout=5s; 4
            server master01:32000 weight=1 max_fails=2 fail_timeout=5s;
            server master02:32000 weight=1 max_fails=2 fail_timeout=5s;
        }
        server {
            listen 32000;
            proxy_connect_timeout 5s;
            proxy_timeout 30s;
            proxy_pass dex-backends; 5
        }
    
        upstream gangway-backends {
            #hash $remote_addr consistent; 6
            server master00:32001 weight=1 max_fails=2 fail_timeout=5s; 7
            server master01:32001 weight=1 max_fails=2 fail_timeout=5s;
            server master02:32001 weight=1 max_fails=2 fail_timeout=5s;
        }
        server {
            listen 32001;
            proxy_connect_timeout 5s;
            proxy_timeout 30s;
            proxy_pass gangway-backends; 8
        }
    }

    1 3 6

    Note: To enable session persistence, uncomment the hash option so the same client will always be redirected to the same server unless that server is unavailable.

    2 4 7

    Replace the individual masterXX with the IP/FQDN of your actual master nodes (one entry each) in each of the upstream sections.

    5 8

    Dex port 32000 and Gangway port 32001 must be accessible through the load balancer for RBAC authentication.

  4. Configure firewalld to open up ports 6443, 32000, and 32001. As root, run:

    firewall-cmd --zone=public --permanent --add-port=6443/tcp
    firewall-cmd --zone=public --permanent --add-port=32000/tcp
    firewall-cmd --zone=public --permanent --add-port=32001/tcp
    firewall-cmd --reload
  5. Start and enable Nginx. As root, run:

    systemctl enable --now nginx
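
    If Nginx fails to start, you can first validate the configuration file syntax:

    nginx -t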
2.1.5.2.2 Verifying the Load Balancer
Important

The SUSE CaaS Platform cluster must be up and running for this to produce any useful results. This step can only be performed after Chapter 3, Bootstrapping the Cluster is completed successfully.

To verify that the load balancer works, you can run a simple command to repeatedly retrieve cluster information from the master nodes. Each request should be forwarded to a different master node.

From your workstation, run:

while true; do skuba cluster status; sleep 1; done;

The skuba cluster status command should run without interruption.

On the load balancer virtual machine, check the logs to validate that each request is correctly distributed in a round robin way.

# tail -f /var/log/nginx/k8s-masters-lb-access.log
10.0.0.47 [17/May/2019:13:49:06 +0000] TCP 200 2553 1613 1.136 "10.0.0.145:6443"
10.0.0.47 [17/May/2019:13:49:08 +0000] TCP 200 2553 1613 0.981 "10.0.0.148:6443"
10.0.0.47 [17/May/2019:13:49:10 +0000] TCP 200 2553 1613 0.891 "10.0.0.7:6443"
10.0.0.47 [17/May/2019:13:49:12 +0000] TCP 200 2553 1613 0.895 "10.0.0.145:6443"
10.0.0.47 [17/May/2019:13:49:15 +0000] TCP 200 2553 1613 1.157 "10.0.0.148:6443"
10.0.0.47 [17/May/2019:13:49:17 +0000] TCP 200 2553 1613 0.897 "10.0.0.7:6443"

2.2 Deployment on SUSE OpenStack Cloud

Note: Preparation Required

You must have completed Section 2.1, “Deployment Preparations” to proceed.

You will use Terraform to deploy the required master and worker cluster nodes (plus a load balancer) to SUSE OpenStack Cloud and then use the skuba tool to bootstrap the Kubernetes cluster on top of those.

  1. Download the SUSE OpenStack Cloud RC file.

    1. Log in to SUSE OpenStack Cloud.

    2. Click on your username in the upper right hand corner to reveal the drop-down menu.

    3. Click on Download OpenStack RC File v3.

    4. Save the file to your workstation.

    5. Load the file into your shell environment using the following command, replacing <DOWNLOADED_RC_FILE> with the name of your file:

      source <DOWNLOADED_RC_FILE>.sh
    6. Enter the password for the RC file. These should be the same credentials that you use to log in to SUSE OpenStack Cloud.

  2. Get the SLES15-SP1 image.

    1. Download the pre-built image of SUSE SLES15-SP1 for SUSE OpenStack Cloud from https://download.suse.com/Download?buildid=OE-3enq3uys~.

    2. Upload the image to your SUSE OpenStack Cloud.

Note: The default user is 'sles'

The SUSE SLES15-SP1 images for SUSE OpenStack Cloud come with the predefined user sles, which you use to log in to the cluster nodes. This user has been configured for password-less sudo and is the recommended user for Terraform and skuba.

2.2.1 Deploying the Cluster Nodes

  1. Find the Terraform template files for SUSE OpenStack Cloud in /usr/share/caasp/terraform/openstack (which was installed as part of the management pattern - sudo zypper in -t pattern SUSE-CaaSP-Management). Copy this folder to a location of your choice as the files need adjustment.

    mkdir -p ~/caasp/deployment/
    cp -r /usr/share/caasp/terraform/openstack/ ~/caasp/deployment/
    cd ~/caasp/deployment/openstack/
  2. Once the files are copied, rename the terraform.tfvars.example file to terraform.tfvars:

    mv terraform.tfvars.example terraform.tfvars
  3. Edit the terraform.tfvars file and add/modify the following variables:

    # Name of the image to use
    image_name = "SLE-15-SP1-JeOS-GMC"
    
    # Identifier to make all your resources unique and avoid clashes with other users of this Terraform project
    stack_name = "testing" 1
    
    # Name of the internal network to be created
    internal_net = "testing-net" 2
    
    # Name of the internal subnet to be created
    # IMPORTANT: If this variable is not set or empty,
    # then it will be generated following a schema like
    # internal_subnet = "${var.internal_net}-subnet"
    internal_subnet = "testing-subnet"
    
    # Name of the internal router to be created
    # IMPORTANT: If this variable is not set or empty,
    # then it will be generated following a schema like
    # internal_router = "${var.internal_net}-router"
    internal_router = "testing-router"
    
    # Name of the external network to be used, the one used to allocate floating IPs
    external_net = "floating"
    
    # CIDR of the subnet for the internal network
    subnet_cidr = "172.28.0.0/24"
    
    # Number of master nodes
    masters = 3 3
    
    # Number of worker nodes
    workers = 2 4
    
    # Size of the master nodes
    master_size = "t2.large"
    
    # Size of the worker nodes
    worker_size = "t2.large"
    
    # Attach persistent volumes to workers
    workers_vol_enabled = 0
    
    # Size of the worker volumes in GB
    workers_vol_size = 5
    
    # Name of DNS domain
    dnsdomain = "testing.example.com"
    
    # Set DNS Entry (0 is false, 1 is true)
    dnsentry = 0
    
    # Optional: Define the repositories to use
    # repositories = {
    #   repository1 = "http://repo.example.com/repository1/"
    #   repository2 = "http://repo.example.com/repository2/"
    # }
    repositories = {} 5
    
    # Define required packages to be installed/removed. Do not change those.
    packages = [  6
      "kernel-default",
      "-kernel-default-base",
      "new-package-to-install"
    ]
    
    # ssh keys to inject into all the nodes
    authorized_keys = [ 7
      ""
    ]
    
    # IMPORTANT: Replace these ntp servers with ones from your infrastructure
    ntp_servers = ["0.example.ntp.org", "1.example.ntp.org", "2.example.ntp.org", "3.example.ntp.org"] 8

    1

    stack_name: Prefix for all machines of the cluster spawned by terraform.

    2

    internal_net: the internal network name that will be created/used for the cluster in SUSE OpenStack Cloud. Note: This string will be used to generate the human readable IDs in SUSE OpenStack Cloud. If you use a generic term, deployment is very likely to fail because the term is already in use by someone else. It’s a good idea to use your username or some other unique identifier.

    3

    masters: Number of master nodes to be deployed.

    4

    workers: Number of worker nodes to be deployed.

    5

    repositories: A list of additional repositories to be added on each machine. Leave empty if no additional packages need to be installed.

    6

    packages: Additional packages to be installed on the node. Note: Do not remove any of the pre-filled values in the packages section. This can render your cluster unusable. You can add more packages but do not remove any of the default packages listed.

    7

    authorized_keys: List of SSH public keys that will be injected into the cluster nodes, allowing you to log in to them via SSH as the sles user. Copy and paste the contents of the keyname.pub file here, not the private key. At least one of the keys must match a key loaded into your ssh-agent.

    8

    ntp_servers: A list of ntp servers you would like to use with chrony.

    Tip

    You can set the timezone before deploying the nodes by modifying the following file:

    • ~/caasp/deployment/openstack/cloud-init/common.tpl

  4. (Optional) If you absolutely need to be able to SSH into your cluster nodes using password authentication instead of key-based authentication, this is the best time to set it globally for all of your nodes. If you do this later, you will have to do it manually. To set this, modify the cloud-init configuration and comment out the related SSH configuration in ~/caasp/deployment/openstack/cloud-init/common.tpl:

    # Workaround for bsc#1138557 . Disable root and password SSH login
    # - sed -i -e '/^PermitRootLogin/s/^.*$/PermitRootLogin no/' /etc/ssh/sshd_config
    # - sed -i -e '/^#ChallengeResponseAuthentication/s/^.*$/ChallengeResponseAuthentication no/' /etc/ssh/sshd_config
    # - sed -i -e '/^#PasswordAuthentication/s/^.*$/PasswordAuthentication no/' /etc/ssh/sshd_config
    # - systemctl restart sshd
  5. Register your nodes by using the SUSE CaaSP Product Key or by registering them against a local SUSE Repository Mirroring Server in ~/caasp/deployment/openstack/registration.auto.tfvars:

    Substitute <CAASP_REGISTRATION_CODE> for the code from Section 2.1.2, “Registration Code”.

    ## To register the CaaSP product please use one of the following methods:
    # - register against SUSE Customer Service, with SUSE CaaSP Product Key
    # - register against local SUSE Repository Mirroring Server
    
    # SUSE CaaSP Product Key
    caasp_registry_code = "<CAASP_REGISTRATION_CODE>"
    
    # SUSE Repository Mirroring Server Name (FQDN)
    #rmt_server_name = "rmt.example.com"

    This is required so all the deployed nodes can automatically register with SUSE Customer Center and retrieve packages.

  6. You can also enable Cloud Provider Integration with OpenStack in ~/caasp/deployment/openstack/cpi.auto.tfvars:

    # Enable CPI integration with OpenStack
    cpi_enable = true
    
    # Used to specify the path to your custom CA file located in /etc/pki/trust/anchors/.
    # Upload CUSTOM_CA_FILE to this path on nodes before joining them to your cluster.
    #ca_file = "/etc/pki/trust/anchors/<CUSTOM_CA_FILE>"
  7. Now you can deploy the nodes by running:

    terraform init
    terraform plan
    terraform apply

    Check the output for the actions to be taken. Type "yes" and confirm with Enter when ready. Terraform will now provision all the machines and network infrastructure for the cluster.

    Important: Note down IP/FQDN for nodes

    The IP addresses of the generated machines will be displayed in the Terraform output during the cluster node deployment. You need these IP addresses to deploy SUSE CaaS Platform to the cluster.

    If you need to find an IP address later on, you can run terraform output from the directory you performed the deployment in (the ~/caasp/deployment/openstack directory), or perform the following steps:

    1. Log in to SUSE OpenStack Cloud and click on Network › Load Balancers. Find the one with the string you entered in the Terraform configuration above, for example "testing-lb".

    2. Note down the "Floating IP". If you have configured an FQDN for this IP, use the host name instead.

      deploy loadbalancer ip
    3. Now click on Compute › Instances.

    4. Switch the filter dropdown box to Instance Name and enter the string you specified for stack_name in the terraform.tfvars file.

    5. Find the floating IPs on each of the nodes of your cluster.

2.2.2 Logging in to the Cluster Nodes

  1. Connecting to the cluster nodes can be accomplished only via SSH key-based authentication thanks to the SSH public key injection done earlier via Terraform. You can use the predefined sles user to log in.

    If the ssh-agent is running in the background, run:

    ssh sles@<NODE_IP_ADDRESS>

    Without the ssh-agent running, run:

    ssh sles@<NODE_IP_ADDRESS> -i <PATH_TO_YOUR_SSH_PRIVATE_KEY>
  2. Once connected, you can execute commands using password-less sudo. In addition to that, you can also set a password if you prefer to.

    To set the root password, run:

    sudo passwd

    To set the sles user’s password, run:

    sudo passwd sles
Important: Password authentication has been disabled

Under the default settings you always need your SSH key to access the machines. Even after setting a password for either the root or the sles user, you will be unable to log in via SSH using their respective passwords. You will most likely receive a Permission denied (publickey) error. This mechanism has been deliberately disabled in line with security best practices. However, if this setup does not fit your workflows, you can change it at your own risk by modifying the SSH configuration in /etc/ssh/sshd_config:

To allow password SSH authentication, set:

PasswordAuthentication yes

To allow login as root via SSH, set:

PermitRootLogin yes

For the changes to take effect, you need to restart the SSH service by running:

sudo systemctl restart sshd.service

2.2.3 Container Runtime Proxy

Important

CRI-O proxy settings must be adjusted on all nodes before joining the cluster!

Please refer to: https://documentation.suse.com/suse-caasp/4.2/single-html/caasp-admin/#_configuring_httphttps_proxy_for_cri_o

In some environments you must configure the container runtime to access the container registries through a proxy. In this case, please refer to: SUSE CaaS Platform Admin Guide: Configuring HTTP/HTTPS Proxy for CRI-O
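
As a rough sketch only (the linked admin guide is authoritative), the proxy for CRI-O is typically configured through environment variables in the CRI-O sysconfig file; the file location and variable names below are assumptions, so verify them against the admin guide before use:

# Assumed location: /etc/sysconfig/crio
HTTP_PROXY="http://<PROXY_IP_FQDN>:<PROXY_PORT>"
HTTPS_PROXY="http://<PROXY_IP_FQDN>:<PROXY_PORT>"
NO_PROXY="localhost,127.0.0.1,<CLUSTER_INTERNAL_RANGES>"

After adjusting the settings, restart the container runtime (for example with systemctl restart crio) on the affected node.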

2.3 Deployment on VMware

Note: Preparation Required

You must have completed Section 2.1, “Deployment Preparations” to proceed.

2.3.1 Environment Description

Note

These instructions are based on VMware ESXi 6.7.

Important

These instructions currently do not describe how to set up a load balancer in a VMware environment.

This will be added in future versions. You must provide your own load balancing solution that directs access to the master nodes.

Important

VMware vSphere does not offer a load balancer solution. Please expose port 6443 for the Kubernetes apiservers on the master nodes on a local load balancer using round-robin 1:1 port forwarding.

2.3.2 Choose a Setup Type

You can deploy SUSE CaaS Platform onto existing VMware instances using AutoYaST, or create a VMware template from which the instances are then created. You must choose one method to deploy the entire cluster!

Please follow the instructions for your chosen method below and ignore the instructions for the other method.

2.3.3 Setup with AutoYaST

Note

If you choose the AutoYaST method, please ignore all the following steps for the VMware template creation.

For each VM deployment, follow the AutoYaST installation method used for deployment on bare metal machines as described in Section 2.4, “Deployment on Bare Metal or KVM”.

2.3.4 Setup Using the VMware Template

2.3.4.1 Choose a Disk Format for the Template

Before creating the template, it is important to select the right format of the root hard disk for the node. This format is then used by default when creating new instances with Terraform.

For the majority of cases, we recommend using Thick Provision Lazy Zeroed. This format is quick to create, provides good performance, and avoids the risk of running out of disk space due to over-provisioning.

Note

It is not possible to resize a disk when using Thick Provision Eager Zeroed. For this reason, the Terraform variables master_disk_size and worker_disk_size must be set to the exact same size as in the original template.

Official VMware documentation describes these formats as follows:

Thick Provision Lazy Zeroed

Creates a virtual disk in a default thick format. Space required for the virtual disk is allocated when the disk is created. Data remaining on the physical device is not erased during creation, but is zeroed out on demand later on first write from the virtual machine. Virtual machines do not read stale data from the physical device.

Thick Provision Eager Zeroed

A type of thick virtual disk that supports clustering features such as Fault Tolerance. Space required for the virtual disk is allocated at creation time. In contrast to the thick provision lazy zeroed format, the data remaining on the physical device is zeroed out when the virtual disk is created. It might take longer to create virtual disks in this format than to create other types of disks. Increasing the size of an Eager Zeroed Thick virtual disk causes a significant stun time for the virtual machine.

Thin Provision

Use this format to save storage space. For the thin disk, you provision as much datastore space as the disk would require based on the value that you enter for the virtual disk size. However, the thin disk starts small and at first, uses only as much datastore space as the disk needs for its initial operations. If the thin disk needs more space later, it can grow to its maximum capacity and occupy the entire datastore space provisioned to it.

Important: Select the Disk Format Thoughtfully

It is not possible to change the format in Terraform later. Once you have selected one format, you can only create instances with that format and it is not possible to switch.

2.3.4.2 VM Preparation for Creating a Template

  1. Upload the ISO image SLE-15-SP1-Installer-DVD-x86_64-GM-DVD1.iso to the desired VMware datastore.

Now you can create a new base VM for SUSE CaaS Platform within the designated resource pool through the vSphere WebUI:

  1. Create a "New Virtual Machine".

  2. Define a name for the virtual machine (VM).

    vmware step1
  3. Select the folder where the VM will be stored.

  4. Select a Compute Resource that will run the VM.

    vmware step2
  5. Select the storage to be used by the VM.

    vmware step3
  6. Select ESXi 6.7 and later from compatibility.

    vmware step4
  7. Select Guest OS Family › Linux and Guest OS Version › SUSE Linux Enterprise 15 (64-bit).

    Note: You will manually select the correct installation medium in the next step.

    vmware step5
  8. Now customize the hardware settings.

    vmware step6
    1. Select CPU › 2.

    2. Select Memory › 4096 MB.

    3. Select New Hard disk › 40 GB and New Hard disk › Disk Provisioning; see Section 2.3.4.1, “Choose a Disk Format for the Template” to select the appropriate disk format. For Thick Provision Eager Zeroed, use this value for the Terraform variables master_disk_size and worker_disk_size.

    4. Select New SCSI Controller › LSI Logic Parallel SCSI controller (default) and change it to "VMware Paravirtualized".

    5. Select New Network › VM Network, New Network › Adapter Type › VMXNET3.

      ("VM Network" sets up a bridged network which provides a public IP address reachable within a company.)

    6. Select New CD/DVD › Datastore ISO File.

    7. Check the box New CD/DVD › Connect At Power On to be able to boot from ISO/DVD.

    8. Then click on "Browse" next to the CD/DVD Media field to select the downloaded ISO image on the desired datastore.

    9. Go to the VM Options tab.

      vmware step6b
    10. Select Boot Options.

    11. Select Firmware › BIOS.

    12. Confirm the process with Next.

2.3.4.2.1 SUSE Linux Enterprise Server Installation

Power on the newly created VM and install the system over graphical remote console:

  1. Enter registration code for SUSE Linux Enterprise in YaST.

  2. Confirm the update repositories prompt with "Yes".

  3. Remove the check mark in the "Hide Development Versions" box.

  4. Make sure the following modules are selected on the "Extension and Module Selection" screen:

    vmware extension
    • SUSE CaaS Platform 4.0 x86_64

      Note

      Due to a naming convention conflict, all versions of SUSE CaaS Platform 4.x will be released in the 4.0 module.

    • Basesystem Module

    • Containers Module (this will automatically be checked when you select SUSE CaaS Platform)

    • Public Cloud Module

  5. Enter the registration code to unlock the SUSE CaaS Platform extension.

  6. Select System Role › Minimal on the "System Role" screen.

  7. Click on "Expert Partitioner" to redesign the default partition layout.

  8. Select "Start with current proposal".

    vmware step8
    1. Keep sda1 as BIOS partition.

    2. Remove the root / partition.

      Select the device in "System View" on the left (default: /dev/sda2) and click "Delete". Confirm with "Yes".

      vmware step9
    3. Remove the /home partition.

    4. Remove the swap partition.

  9. Select the /dev/sda/ device in "System View" and then click Partitions › Add Partition.

    vmware step10
  10. Accept the default maximum size (remaining size of the hard disk defined earlier without the boot partition).

    vmware step11
    1. Confirm with "Next".

    2. Select Role › Operating System

      vmware step12
    3. Confirm with "Next".

    4. Accept the default settings.

      vmware step13
      • Filesystem: BtrFS

      • Enable Snapshots

      • Mount Device

      • Mount Point /

  11. You should be left with two partitions. Now click "Accept".

    vmware step7
  12. Confirm the partitioning changes.

    vmware step14
  13. Click "Next".

  14. Configure your timezone and click "Next".

  15. Create a user with the username sles and specify a password.

    1. Check the box Local User › Use this password for system administrator.

      vmware step15
  16. Click "Next".

  17. On the "Installation Settings" screen:

    1. In the "Security" section:

      1. Disable the Firewall (click on (disable)).

      2. Enable the SSH service (click on (enable)).

    2. Scroll to the kdump section of the software description and click on the title.

  18. In the "Kdump Start-Up" screen, select Enable/Disable Kdump › Disable Kdump.

    Note

    Kdump needs to be disabled because it defines a certain memory limit. If you later wish to deploy the template on a machine with different memory allocation (e.g. template created for 4GB, new machine has 2GB), the results of Kdump will be useless.

    You can always configure Kdump on the machine after deploying from the template.

    Refer to: SUSE Linux Enterprise Server 15 SP1 System Analysis and Tuning Guide: Basic Kdump Configuration

    1. Confirm with "OK".

      vmware step16
  19. Click "Install". Confirm the installation by clicking "Install" in the pop-up dialog.

  20. Finish the installation and confirm system reboot with "OK".

    vmware step17

2.3.4.3 Preparation of the VM as a Template

In order to run SUSE CaaS Platform on the created VMs, you must configure and install some additional packages like sudo, cloud-init and open-vm-tools.

Tip: Activate extensions during SUSE Linux Enterprise installation with YaST

Steps 1-4 may be skipped if they were already performed in YaST during the SUSE Linux Enterprise installation.

  1. Register the SLES15-SP1 system. Substitute <CAASP_REGISTRATION_CODE> for the code from Section 2.1.2, “Registration Code”.

    SUSEConnect -r CAASP_REGISTRATION_CODE
  2. Register the Containers module (free of charge):

    SUSEConnect -p sle-module-containers/15.1/x86_64
  3. Register the Public Cloud module for basic cloud-init package (free of charge):

    SUSEConnect -p sle-module-public-cloud/15.1/x86_64
  4. Register the SUSE CaaS Platform module. Substitute <CAASP_REGISTRATION_CODE> for the code from Section 2.1.2, “Registration Code”.

    SUSEConnect -p caasp/4.0/x86_64 -r CAASP_REGISTRATION_CODE
  5. Install required packages. As root, run:

    zypper in sudo cloud-init cloud-init-vmware-guestinfo open-vm-tools
  6. Enable the installed cloud-init services. As root, run:

    systemctl enable cloud-init cloud-init-local cloud-config cloud-final
  7. Deregister from scc:

    SUSEConnect -d; SUSEConnect --cleanup
  8. Do a cleanup of the SLE image for converting into a VMware template. As root, run:

    rm /etc/machine-id /var/lib/zypp/AnonymousUniqueId \
    /var/lib/systemd/random-seed /var/lib/dbus/machine-id \
    /var/lib/wicked/*
  9. Clean up btrfs snapshots and create one with initial state:

    snapper list
    snapper delete <list_of_nums_of_unneeded_snapshots>
    snapper create -d "Initial snapshot for caasp template" -t single
  10. Power down the VM. As root, run:

    shutdown -h now

2.3.4.4 Creating the VMware Template

Now you can convert the VM into a template in VMware (or repeat this action block for each VM).

  1. In the vSphere WebUI, right-click on the VM and select Template › Convert to Template. Name it reasonably so you can later identify the template. The template will be created.

2.3.4.5 Deploying VMs from the Template

2.3.4.5.1 Using Terraform
  1. Find the Terraform template files for VMware in /usr/share/caasp/terraform/vmware, which were installed as part of the management pattern (sudo zypper in patterns-caasp-Management). Copy this folder to a location of your choice, as the files need to be adjusted.

    mkdir -p ~/caasp/deployment/
    cp -r /usr/share/caasp/terraform/vmware/ ~/caasp/deployment/
    cd ~/caasp/deployment/vmware/
  2. Once the files are copied, rename the terraform.tfvars.example file to terraform.tfvars:

    mv terraform.tfvars.example terraform.tfvars
  3. Edit the terraform.tfvars file and add/modify the following variables:

# datastore to use in vSphere
vsphere_datastore = "STORAGE-0" 1

# datastore_ cluster to use in vSphere
vsphere_datastore_cluster = "STORAGE-CLUSTER-0" 2

# datacenter to use in vSphere
vsphere_datacenter = "DATACENTER" 3

# network to use in vSphere
vsphere_network = "VM Network" 4

# resource pool the machines will be running in
vsphere_resource_pool = "esxi1/Resources" 5

# template name the machines will be copied from
template_name = "sles15-sp1-caasp" 6

# IMPORTANT: Replace by "efi" string in case your template was created by using EFI firmware
firmware = "bios"

# prefix that all of the booted machines will use
# IMPORTANT: please enter unique identifier below as value of
# stack_name variable to not interfere with other deployments
stack_name = "caasp-v4" 7

# Number of master nodes
masters = 1 8

# Optional: Size of the root disk in GB on master node
master_disk_size = 50 9

# Number of worker nodes
workers = 2 10

# Optional: Size of the root disk in GB on worker node
worker_disk_size = 40 11

# Username for the cluster nodes. Must exist on base OS.
username = "sles" 12

# Optional: Define the repositories to use
# repositories = {
#   repository1 = "http://repo.example.com/repository1/"
#   repository2 = "http://repo.example.com/repository2/"
# }
repositories = {} 13

# Minimum required packages. Do not remove them.
# Feel free to add more packages
packages = [ 14
]

# ssh keys to inject into all the nodes
authorized_keys = [ 15
  "ssh-rsa <example_key> example@example.com"
]

# IMPORTANT: Replace these ntp servers with ones from your infrastructure
ntp_servers = ["0.example.ntp.org", "1.example.ntp.org", "2.example.ntp.org", "3.example.ntp.org"] 16

# Controls whether or not the guest network waiter waits for a routable address.
# Default is True and should not be changed unless you hit the upstream bug: https://github.com/hashicorp/terraform-provider-vsphere/issues/1127
wait_for_guest_net_routable = true 17
Important

Only one of vsphere_datastore or vsphere_datastore_cluster can be set at the same time. Comment out or delete the unused one from your terraform.tfvars file.

1

vsphere_datastore: The datastore to use. This option is mutually exclusive with vsphere_datastore_cluster.

2

vsphere_datastore_cluster: The datastore cluster to use. This option is mutually exclusive with vsphere_datastore.

3

vsphere_datacenter: The datacenter to use.

4

vsphere_network: The network to use.

5

vsphere_resource_pool: The root resource pool or a user-created child resource pool. Refer to https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.resmgmt.doc/GUID-60077B40-66FF-4625-934A-641703ED7601.html for detailed information.

6

template_name: The name of the template created according to instructions.

7

stack_name: Prefix for all machines of the cluster spawned by terraform. Note: This string will be used to generate the human readable IDs of the created machines. If you use a generic term, deployment is very likely to fail because the term is already in use by someone else. It’s a good idea to use your username or some other unique identifier.

8

masters: Number of master nodes to be deployed.

9

master_disk_size: Size of the root disk in GB. Note: The value must be at least the same size as the source template. It is only possible to increase the size of a disk.

10

workers: Number of worker nodes to be deployed.

11

worker_disk_size: Size of the root disk in GB. Note: The value must be at least the same size as the source template. It is only possible to increase the size of a disk.

12

username: Login username for the nodes. Note: Leave this as the default sles. The username must exist on the used base operating system. It will not be created.

13

repositories: A list of additional repositories to be added on each machine. Leave empty if no additional packages need to be installed.

14

packages: Additional packages to be installed on the node. Note: Do not remove any of the pre-filled values in the packages section. This can render your cluster unusable. You can add more packages but do not remove any of the default packages listed.

15

authorized_keys: List of SSH public keys that will be injected into the deployed machines, allowing you to log in to them via SSH.

16

ntp_servers: A list of ntp servers you would like to use with chrony.

17

wait_for_guest_net_routable: true or false to disable the routable check. Default is true. Note: This should only be changed if terraform times out while creating VMs as mentioned on the upstream bug: https://github.com/hashicorp/terraform-provider-vsphere/issues/1127

  1. Enter the registration code for your nodes in ~/caasp/deployment/vmware/registration.auto.tfvars:

    Substitute <CAASP_REGISTRATION_CODE> for the code from Section 2.1.2, “Registration Code”.

    # SUSE CaaSP Product Key
    caasp_registry_code = "CAASP_REGISTRATION_CODE"

    This is required so all the deployed nodes can automatically register with SUSE Customer Center and retrieve packages.

  2. You can also enable Cloud Provider Integration with vSphere.

    # Enable CPI integration with vSphere
    cpi_enable = true
  3. You can also disable setting the node hostnames from the DHCP server. This is recommended when you have enabled CPI integration with vSphere. Each vSphere virtual machine’s hostname is then set following a naming convention ("<stack_name>-master-<index>" or "<stack_name>-worker-<index>") and can be used as the node name when bootstrapping or joining nodes with skuba.

    Important

    Each virtual machine’s hostname must match its cluster node name.

    # Set node's hostname from DHCP server
    hostname_from_dhcp = false

    Once the files are adjusted, terraform needs to know about the vSphere server and the login details for it; these can be exported as environment variables or entered every time terraform is invoked.

  4. Additionally, the ssh-key that is specified in the tfvars file must be added to the key agent so the machine running skuba can ssh into the machines:

    export VSPHERE_SERVER="<server_address>"
    export VSPHERE_USER="<username>"
    export VSPHERE_PASSWORD="<password>"
    export VSPHERE_ALLOW_UNVERIFIED_SSL=true # In case you are using a custom certificate for accessing the vSphere API
    
    ssh-add <path_to_private_ssh_key_from_tfvars>
    Warning: Specify a key expiration time

    The ssh key is decrypted when loaded into the key agent. Though the key itself is not accessible, anyone with access to the agent’s control socket file can use the private key contents to impersonate the key owner. By default, socket access is limited to the user who launched the agent. Nonetheless, it is still good security practice to specify an expiration time for the decrypted key using the -t option.

    For example: ssh-add -t 1h30m $HOME/.ssh/id.ecdsa would expire the decrypted key in 1.5 hours. See man ssh-agent and man ssh-add for more information.

  5. Run Terraform to create the required machines for use with skuba:

    terraform init
    terraform plan
    terraform apply
2.3.4.5.2 Setup by Hand
Note

Full instructions for the manual setup and configuration are currently not in scope of this document.

Deploy the template to your created VMs. After that, boot into the node and configure the OS as needed.

  1. Power on the newly created VMs

  2. Generate new machine IDs on each node

  3. You need to know the FQDN/IP for each of the created VMs during the bootstrap process

  4. Continue with bootstrapping/joining of nodes

Tip

To manually generate the unique machine-id please refer to: Important: Regenerating Machine ID.

2.3.5 Container Runtime Proxy

Important

CRI-O proxy settings must be adjusted on all nodes before joining the cluster!

In some environments you must configure the container runtime to access the container registries through a proxy. In this case, please refer to: SUSE CaaS Platform Admin Guide: Configuring HTTP/HTTPS Proxy for CRI-O

2.4 Deployment on Bare Metal or KVM

Note: Preparation Required

You must have completed Section 2.1, “Deployment Preparations” to proceed.

Note

If deploying on KVM virtual machines, you may use a tool such as virt-manager to configure the virtual machines and begin the SUSE Linux Enterprise 15 SP1 installation.

2.4.1 Environment Description

Note

You must have a load balancer configured as described in Section 2.1.5, “Load Balancer”.

Note

The AutoYaST file found in skuba is a template. It has the base requirements. This AutoYaST file should act as a guide and should be updated with your company’s standards.

Note

To account for hardware/platform-specific setup criteria (legacy BIOS vs. (U)EFI, drive partitioning, networking, etc.), you must adjust the AutoYaST file to your needs according to the requirements.

Refer to the official AutoYaST documentation for more information: AutoYaST Guide.

2.4.1.1 Machine Configuration Prerequisites

Deployment with AutoYaST will require a minimum disk size of 40 GB. That space is reserved for container images without any workloads (10 GB), for the root partition (30 GB) and the EFI system partition (200 MB).

2.4.2 AutoYaST Preparation

  1. On the management machine, get the example AutoYaST file from /usr/share/caasp/autoyast/bare-metal/autoyast.xml (which was installed earlier as part of the management pattern with sudo zypper in -t pattern SUSE-CaaSP-Management).

  2. Copy the file to a suitable location to modify it. Name the file autoyast.xml.

  3. Modify the following places in the AutoYaST file (and any additional places as required by your specific configuration/environment):

    1. <ntp-client>

      Change the pre-filled value to your organization’s NTP server. Provide multiple servers if possible by adding new <ntp_server> subentries.

    2. <timezone>

      Adjust the timezone your nodes will be set to. Refer to: SUSE Linux Enterprise Server AutoYaST Guide: Country Settings

    3. <username>sles</username>

      Insert your authorized key in the placeholder field.

    4. <users>

      You can add additional users by creating new blocks in the configuration containing their data.

      Note
      Note

      If the users are configured without a password, as in the example, make sure the system’s sudoers file is updated. Without updating the sudoers file, the user will only be able to perform basic operations, and many administrative tasks will be unavailable.

      The default AutoYaST file provides examples for a disabled root user and a sles user with authorized key SSH access.

      The password for root can be enabled by using the passwd command.

    5. <suse_register>

      Insert the email address and SUSE CaaS Platform registration code in the placeholder fields. This activates SUSE Linux Enterprise 15 SP1.

    6. <addon>

      Insert the SUSE CaaS Platform registration code in the placeholder field. This enables the SUSE CaaS Platform extension module. Update the AutoYaST file with your registration keys and your company’s best practices and hardware configurations.

      Note
      Note

      Your SUSE CaaS Platform registration key can be used to both activate SUSE Linux Enterprise 15 SP1 and enable the extension.

    Refer to the official AutoYaST documentation for more information: AutoYaST Guide.

  4. Host the AutoYaST file on a web server reachable from the network in which you are installing the cluster, for example as sketched below.
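
Any web server reachable by the nodes works. For a quick test from the management machine, a minimal sketch using Python's built-in HTTP server (directory and port are placeholders; use your organization's web server for production):

cd ~/autoyast                 # directory containing autoyast.xml
python3 -m http.server 8080   # file is then reachable at http://<management-machine-ip>:8080/autoyast.xml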

2.4.2.1 Deploying with local Repository Mirroring Tool (RMT) server

In order to use a local Repository Mirroring Tool (RMT) server for deployment of packages, you need to specify the server configuration in your AutoYaST file. To do so add the following section:

<suse_register>
<do_registration config:type="boolean">true</do_registration>
<install_updates config:type="boolean">true</install_updates>

<reg_server>https://rmt.example.org</reg_server> 1
<reg_server_cert>https://rmt.example.org/rmt.crt</reg_server_cert> 2
<reg_server_cert_fingerprint_type>SHA1</reg_server_cert_fingerprint_type>
<reg_server_cert_fingerprint>0C:A4:A1:06:AD:E2:A2:AA:D0:08:28:95:05:91:4C:07:AD:13:78:FE</reg_server_cert_fingerprint> 3
<slp_discovery config:type="boolean">false</slp_discovery>
<addons config:type="list">
  <addon>
    <name>sle-module-containers</name>
    <version>15.1</version>
    <arch>x86_64</arch>
  </addon>
  <addon>
    <name>caasp</name>
    <version>4.0</version>
    <arch>x86_64</arch>
  </addon>
</addons>
</suse_register>

1

Provide FQDN of the Repository Mirroring Tool (RMT) server

2

Provide the location on the server where the certificate can be found

3

Provide the certificate fingerprint for the Repository Mirroring Tool (RMT) server
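
If you do not yet know the fingerprint of your RMT server certificate, one way to obtain it is to download the certificate from the URL above and query it with openssl; a sketch:

curl -sO https://rmt.example.org/rmt.crt
openssl x509 -in rmt.crt -noout -fingerprint -sha1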

2.4.3 Provisioning the Cluster Nodes

Once the AutoYaST file is available in the network that the machines will be configured in, you can start deploying machines.

The default production scenario consists of 6 nodes:

  • 1 load balancer

  • 3 masters

  • 2 workers

Depending on the type of load balancer you wish to use, you can instead deploy at least 5 machines to serve as cluster nodes and provide the load balancer from the environment.

The load balancer must point at the machines that are assigned to be used as master nodes in the future cluster.

Tip
Tip

If you do not wish to use infrastructure load balancers, please deploy additional machines and refer to Section 2.1.5, “Load Balancer”.

Install SUSE Linux Enterprise 15 SP1 from your preferred medium and follow the steps for Invoking the Auto-Installation Process.

Provide autoyast=https://[webserver/path/to/autoyast.xml] as a boot parameter during the SUSE Linux Enterprise 15 SP1 installation, for example as shown below.
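
For example, the parameter appended at the installer boot prompt could look similar to this (host name and path are placeholders; add further installer options such as network configuration as required):

autoyast=https://webserver.example.com/autoyast/autoyast.xml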

2.4.3.1 SUSE Linux Enterprise Server Installation

Note
Note

Use AutoYaST, and make sure to use a staged, frozen patch level via RMT/SUSE Manager to ensure a fully reproducible setup. See the RMT Guide.

Once the machines have been installed using the AutoYaST file, you are ready to proceed with Chapter 3, Bootstrapping the Cluster.

2.4.4 Container Runtime Proxy

Important
Important

CRI-O proxy settings must be adjusted on all nodes before joining the cluster!

In some environments you must configure the container runtime to access the container registries through a proxy. In this case, please refer to: SUSE CaaS Platform Admin Guide: Configuring HTTP/HTTPS Proxy for CRI-O (https://documentation.suse.com/suse-caasp/4.2/single-html/caasp-admin/#_configuring_httphttps_proxy_for_cri_o).

2.5 Deployment on Existing SLES Installation

If you already have a running SUSE Linux Enterprise 15 SP1 installation, you can add SUSE CaaS Platform to this installation using SUSE Connect. You also need to enable the "Containers" module because it contains some dependencies required by SUSE CaaS Platform.

2.5.1 Requirements

Note
Note: Preparation Required

You must have completed Section 2.1, “Deployment Preparations” to proceed.

2.5.1.1 Dedicated Cluster Nodes

Important
Important

Adding a machine with an existing use case (e.g. web server) as a cluster node is not supported!

SUSE CaaS Platform requires dedicated machines as cluster nodes.

The instructions in this document are meant to add SUSE CaaS Platform to an existing SUSE Linux Enterprise installation that has no other active use case.

For example: You have installed a machine with SUSE Linux Enterprise but it has not yet been commissioned to run a specific application and you decide now to make it a SUSE CaaS Platform cluster node.

2.5.1.2 Disabling Swap

On a pre-existing SUSE Linux Enterprise installation, swap is typically enabled. You must disable swap on all cluster nodes before performing the cluster bootstrap.

On all nodes that are meant to join the cluster, run:

sudo swapoff -a

Then modify /etc/fstab on each node to remove (or comment out) the swap entries, for example as sketched below.
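
A minimal sketch of both steps on a single node; the sed expression simply comments out any fstab line that mounts swap, so review the file afterwards:

sudo swapoff -a                             # disable swap immediately
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab  # comment out swap entries
swapon --show                               # prints nothing if swap is fully disabled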

Important
Important

It is recommended to reboot the machine to finalize these changes and prevent accidental reactivation of swap during an automated reboot of the machine later on.

2.5.2 Adding SUSE CaaS Platform repositories

Retrieve your SUSE CaaS Platform registration code and run the following commands. Replace <CAASP_REGISTRATION_CODE> with the code from Section 2.1.2, “Registration Code”.

SUSEConnect -p sle-module-containers/15.1/x86_64

SUSEConnect -p caasp/4.0/x86_64 -r <CAASP_REGISTRATION_CODE>
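
You can then verify that the Containers module and the SUSE CaaS Platform extension are active on the node, for example with:

SUSEConnect --status-text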

Repeat all preparation steps for any cluster nodes you wish to join. You can then proceed with Chapter 3, Bootstrapping the Cluster.

2.6 Deployment on Amazon Web Services (AWS)

Deployment on Amazon Web Services (AWS) is currently a tech preview.

Note
Note: Preparation Required

You must have completed Section 2.1, “Deployment Preparations” to proceed.

You will use Terraform to deploy the whole infrastructure described in Section 2.6.1, “AWS Deployment”. Then you will use the skuba tool to bootstrap the Kubernetes cluster on top of it.

2.6.1 AWS Deployment

The Terraform template files create the infrastructure described in the following paragraphs.

2.6.1.1 Network

All of the infrastructure is created inside a user-specified AWS region. All resources are currently located inside the same availability zone.

The Terraform template files create a dedicated Amazon Virtual Private Cloud (VPC) with two subnets: "public" and "private". Instances inside the public subnet have Elastic IP addresses associated with them and are therefore reachable from the internet. Instances inside the private subnet are not reachable from the internet, but they can still reach external resources; for example, they can download updates and pull container images from external container registries. Communication between the public and the private subnet is allowed. All control plane instances are currently located inside the public subnet. Worker instances are inside the private subnet.

Both control plane and worker nodes have tailored Security Groups assigned to them. These are based on the networking requirements described in Section 8, “Networking”.

2.6.1.2 Load Balancer

The Terraform template files take care of creating a Classic Load Balancer which exposes the Kubernetes API service deployed on the control plane nodes.

The load balancer exposes the following ports:

  • 6443: Kubernetes API server

  • 32000: Dex (OpenID Connect)

  • 32001: Gangway (RBAC Authenticate)

2.6.1.3 Join Already Existing VPCs

The Terraform template files allow the user to have the SUSE CaaS Platform VPC join one or more existing VPCs.

This is achieved by the creation of VPC peering links and dedicated Route tables.

This feature allows SUSE CaaS Platform to access and be accessed by resources defined inside of other VPCs. For example, this capability can be used to register all the SUSE CaaS Platform instances against a SUSE Manager server running inside of a private VPC.

Current limitations:

  • The VPCs must belong to the same AWS region.

  • The VPCs must be owned by the same user who is creating the SUSE CaaS Platform infrastructure via Terraform.

2.6.1.4 IAM Profiles

The AWS Cloud Provider integration for Kubernetes requires special IAM profiles to be associated with the control plane and worker instances. Terraform can create these profiles or can leverage existing ones. It all depends on the rights of the user invoking Terraform.

The Terraform AWS provider requires your credentials. These can be obtained by following these steps:

  • Log in to the AWS console.

  • Click on your username in the upper right hand corner to reveal the drop-down menu.

  • Click on My Security Credentials.

  • Click Create Access Key on the "Security Credentials" tab.

  • Note down the newly created Access and Secret keys.
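
If you prefer not to store the keys in terraform.tfvars, Terraform can read any input variable from an environment variable with the TF_VAR_ prefix; a sketch using the variable names from the example file in the next section:

export TF_VAR_aws_access_key="<your-access-key>"
export TF_VAR_aws_secret_key="<your-secret-key>"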

2.6.2 Deploying the Infrastructure

On the management machine, find the Terraform template files for AWS in /usr/share/caasp/terraform/aws. These files have been installed as part of the management pattern (sudo zypper in -t pattern SUSE-CaaSP-Management).

Copy this folder to a location of your choice as the files need adjustment.

mkdir -p ~/caasp/deployment/
cp -r /usr/share/caasp/terraform/aws/ ~/caasp/deployment/
cd ~/caasp/deployment/aws/

Once the files are copied, create a terraform.tfvars file from the provided example:

cp terraform.tfvars.example terraform.tfvars

Edit the terraform.tfvars file and add/modify the following variables:

# prefix that all of the booted machines will use
# IMPORTANT, please enter unique identifier below as value of
# stack_name variable to not interfere with other deployments
stack_name = "caasp-v4" 1

# AWS region
aws_region = "eu-central-1"  2

# AWS availability zone
aws_az = "eu-central-1a" 3

# access key for AWS services
aws_access_key = "AKIXU..."  4

# secret key used for AWS services
aws_secret_key = "ORd..." 5

# Number of master nodes
masters = 1 6

# Number of worker nodes
workers = 2 7

# ssh keys to inject into all the nodes
# EXAMPLE:
# authorized_keys = [
#   "ssh-rsa <key-content>"
# ]
authorized_keys = [ 8
  "ssh-rsa <example_key> example@example.com"
]

# To register CaaSP product please use ONLY ONE of the following method
#
# SUSE CaaSP Product Product Key:
#caasp_registry_code = ""  9
#
# SUSE Repository Mirroring Server Name (FQDN):
#rmt_server_name = "rmt.example.com"  10

# List of VPC IDs to join via VPC peer link
#peer_vpc_ids = ["vpc-id1", "vpc-id2"] 11

# Name of the IAM profile to associate to control plane nodes
# Leave empty to have terraform create one.
# This is required to have AWS CPI support working properly.
#
# Note well: you must  have the right set of permissions.
# iam_profile_master = "caasp-k8s-master-vm-profile" 12

# Name of the IAM profile to associate to worker nodes.
# Leave empty to have terraform create one.
# This is required to have AWS CPI support working properly.
#
# Note well: you must  have the right set of permissions.
#iam_profile_worker = "caasp-k8s-worker-vm-profile" 13

1

stack_name: Prefix for all machines of the cluster spawned by terraform.

2

aws_region: The region in AWS.

3

aws_az: The availability zone in AWS.

4

aws_access_key: AWS access key.

5

aws_secret_key: AWS secret key.

6

masters: Number of master nodes to be deployed.

7

workers: Number of worker nodes to be deployed.

8

authorized_keys: List of ssh-public-keys that will be able to log into the deployed machines.

9

caasp_registry_code: SUSE CaaS Platform Product Key for registering the product against SUSE Customer Center.

10

rmt_server_name: Register against a local SUSE Repository Mirroring Tool (RMT) server.

11

peer_vpc_ids: List of already existing VPCs to join via dedicated VPC peering links.

12

iam_profile_master: Name of the IAM profile to associate with the control plane instances. Leave empty to have Terraform create it.

13

iam_profile_worker: Name of the IAM profile to associate with the worker instances. Leave empty to have Terraform create it.

Tip
Tip

You can set timezone and other parameters before deploying the nodes by modifying the cloud-init template:

  • ~/caasp/deployment/aws/cloud-init/cloud-init.yaml.tpl

You can enter the registration code for your nodes in ~/caasp/deployment/aws/registration.auto.tfvars instead of the terraform.tfvars file.

Replace <CAASP_REGISTRATION_CODE> with the code from Section 2.1.2, “Registration Code”.

# SUSE CaaSP Product Key
caasp_registry_code = "<CAASP_REGISTRATION_CODE>"

This is required so all the deployed nodes can automatically register with SUSE Customer Center and retrieve packages.

Now you can deploy the nodes by running:

terraform init
terraform plan
terraform apply

Check the output for the actions to be taken. Type "yes" and confirm with Enter when ready. Terraform will now provision all the cluster infrastructure.

Important
Important: Public IPs for Nodes

skuba currently cannot access nodes through a bastion host, so all the nodes in the cluster must be directly reachable from the machine where skuba is being run. skuba could be run from one of the master nodes or from a pre-existing bastion host located inside of a joined VPC as described in Section 2.6.1.3, “Join Already Existing VPCs”.

Important
Important: Note Down IP/FQDN For the Nodes

The IP addresses and FQDNs of the generated machines will be displayed in the Terraform output during the cluster node deployment. You need this information later to deploy SUSE CaaS Platform.

This information can be obtained at any time by executing the terraform output command within the directory from which you executed Terraform.
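
For example, from the directory used in Section 2.6.2:

cd ~/caasp/deployment/aws/
terraform output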

2.6.3 Logging into the Cluster Nodes

You can connect to the cluster nodes only via SSH key-based authentication, thanks to the SSH public key injection done earlier via cloud-init. Use the predefined ec2-user account to log in.

If the ssh-agent is running in the background, run:

ssh ec2-user@<node-ip-address>

Without the ssh-agent running, run:

ssh ec2-user@<node-ip-address> -i <path-to-your-ssh-private-key>

Once connected, you can execute commands using password-less sudo.
