Troubleshooting Clients

1. Autoinstallation

Depending on your base channel, new autoinstallation profiles might be subscribed to a channel that is missing required packages.

For autoinstallation to work, these packages are required:

  • pyOpenSSL

  • rhnlib

  • libxml2-python

  • spacewalk-koan

To resolve this issue, check these things first:

  • Check that the tools software channel related to the base channel in your autoinstallation profile is available to your organization and your user.

  • Check that the tools channel is available to your SUSE Manager as a child channel.

  • Check that the required packages and any dependencies are available in the associated channels.

2. Bare Metal Systems

If a bare metal system on the network is not automatically added to the Systems list, check these things first:

  • You must have the pxe-default-image package installed.

  • File paths and parameters must be configured correctly. Check that the vmlinuz0 and initrd0.img files, which are provided by pxe-default-image, are in the locations specified in the rhn.conf configuration file.

  • Ensure the networking equipment connecting the bare metal system to the SUSE Manager server is working correctly, and that you can reach the SUSE Manager server IP address from the server.

  • The bare metal system to be provisioned must have PXE booting enabled in the boot sequence, and must not be attempting to boot an operating system.

  • The DHCP server must be responding to DHCP requests during boot. Check the PXE boot messages to ensure that:

    • the DHCP server is assigning the expected IP address

    • the DHCP server is assigning the the SUSE Manager server IP address as next-server for booting.

  • Ensure Cobbler is running, and that the Discovery feature is enabled.

If you see a blue Cobbler menu shortly after booting, discovery has started. If it does not complete successfully, temporarily disable automatic shutdown to help diagnose the problem. To disable automatic shutdown:

  1. Select pxe-default-profile in the Cobbler menu with the arrow keys, and press the Tab key before the timer expires.

  2. Add the kernel boot parameter spacewalk-finally=running using the integrated editor, and press Enter to continue booting.

  3. Enter a shell with the username root and password linux to continue debugging.

Duplicate profiles

Due to a technical limitation, it is not possible to reliably distinguish a new bare metal system from a system that has previously been discovered. Therefore, we recommended that you do not power on bare metal systems multiple times, as this results in duplicate profiles.

3. Bootstrap Repository for End-of-Life Products

When supported products are synchronized, bootstrap repositories are automatically created and regenerated on the SUSE Manager Server. When a product reaches end-of-life and becomes unsupported, bootstrap repositories must be created manually if you want to continue using the product.

For more information about bootstrap repositories, see Bootstrap Repository.

Procedure: Creating Bootstrap Repositories for End-Of-Life Products
  1. At the command prompt on the SUSE Manager Server, as root, list the available unsupported bootstrap repositories with the --force option, for example:

    mgr-create-bootstrap-repo --list --force
    1. SLE-12-SP2-x86_64
    2. SLE-12-SP3-x86_64
  2. Create the bootstrap repository, using the appropriate repository name as the product label:

    mgr-create-bootstrap-repo --create SLE-12-SP2-x86_64 --force

If you do not want to create bootstrap repositories manually, you can check whether LTSS is available for the product and bootstrap repository you need.

4. Cloned Salt Clients

If you have used your hypervisor clone utility, and attempted to register the cloned Salt client, you might get this error:

We're sorry, but the system could not be found.

This is caused by the new, cloned, system having the same machine ID as an existing, registered, system. You can adjust this manually to correct the error and register the cloned system successfully.

For more information and instructions, see Troubleshooting Registering Cloned Clients.

5. Disabling the FQDNS grain

The FQDNS grain returns the list of all the fully qualified DNS services in the system. Collecting this information is usually a fast process, but if the DNS settings have been misconfigured, it could take a much longer time. In some cases, the client could become unresponsive, or crash.

To prevent this problem, you can disable the FQDNS grain with a Salt flag. If you disable the grain, you can use a network module to provide FQDNS services, without the risk of the client becoming unresponsive.

This only applies to older Salt clients. If you registered your Salt client recently, the FQDNS grain is disabled by default.

On the SUSE Manager Server, at the command prompt, use this command to disable the FQDNS grain:

salt '*' state.sls util.mgr_disable_fqdns_grain

This command restarts each client and generate Salt events that the server needs to process. If you have a large number of clients, you can execute the command in batch mode instead:

salt --batch-size 50 '*' state.sls util.mgr_disable_fqdns_grain

Wait for the batch command to finish executing. Do not interrupt the process with Ctrl+C.

6. Mounting /tmp with noexec

Salt runs remote commands from /tmp on the client’s file system. Therefore you must not mount /tmp with the noexec option. The other way to solve this issue is to override temporary directory path with the TMPDIR environment variable specified for the Salt service to make it pointing to the directory with no noexec option set. It is recommended to use systemd drop-in configuration file /etc/systemd/system/venv-salt-minion.service.d/10-TMPDIR.conf if Salt Bundle is used, or /etc/systemd/system/salt-minion.service.d/10-TMPDIR.conf if salt-minion is used on the client. The example of the drop-in configuration file content:

[Service]
Environment=TMPDIR=/var/tmp

7. Mounting /var/tmp with noexec

Salt SSH is using /var/tmp to deploy Salt Bundle to and execute Salt commands on the client with the bundled Python. Therefore you must not mount /var/tmp with the noexec option. It is not possible to bootstrap the clients, which have /var/tmp mounted with noexec option, with the Web UI because the bootstrap process is using Salt SSH to reach a client.

8. Passing Grains to a Start Event

Every time a Salt client starts, it passes the machine_id grain to SUSE Manager. SUSE Manager uses this grain to determine if the client is registered. This process requires a synchronous Salt call. Synchronous Salt calls block other processes, so if you have a lot of clients start at the same time, the process could create significant delays.

To overcome this problem, a new feature has been introduced in Salt to avoid making a separate synchronous Salt call.

To use this feature, you can add a configuration parameter to the client configuration, on clients that support it.

To make this process easier, you can use the mgr_start_event_grains.sls helper Salt state.

This only applies to already registered clients. If you registered your Salt client recently, this config parameter is added by default.

On the SUSE Manager Server, at the command prompt, use this command to enable the start_event_grains configuration helper:

salt '*' state.sls util.mgr_start_event_grains

This command adds the required configuration into the client’s configuration file, and applies it when the client is restarted. If you have a large number of clients, you can execute the command in batch mode instead:

salt --batch-size 50 '*' state.sls mgr_start_event_grains

9. Proxy Connections and FQDN

Sometimes clients connected through a proxy appear in the Web UI, but do not show that they are connected through a proxy. This can occur if you are not using the fully qualified domain name (FQDN) to connect, and the proxy is not known to SUSE Manager.

To correct this behavior, specify additional FQDNs as grains in the client configuration file on the proxy:

grains:
  susemanager:
    custom_fqdns:
      - name.one
      - name.two

10. Red Hat CDN Channel and Multiple Certificates

The Red Hat content delivery network (CDN) channels sometimes provide multiple certificates, but the SUSE Manager Web UI can only import a single certificate. If CDN presents a certificate that is different to the one the SUSE Manager Web UI knows about, validation fails and permission to access the repository is denied, even though the certificate is accurate. The error message received is:

[error]
Repository '<repo_name>' is invalid.
<repo.pem> Valid metadata not found at specified URL
History:
 - [|] Error trying to read from '<repo.pem>'
 - Permission to access '<repo.pem>' denied.
Please check if the URIs defined for this repository are pointing to a valid repository.
Skipping repository '<repo_nam' because of the above error.
Could not refresh the repositories because of errors.
HH:MM:SS RepoMDError: Cannot access repository. Maybe repository GPG keys are not imported

To resolve this issue, merge all valid certificates into a single .pem file, and rebuild the certificates for use by SUSE Manager:

Procedure: Resolving Multiple Red Hat CDN Certificates
  1. On the Red Hat client, at the command prompt, as root, gather all current certificates from /etc/pki/entitlement/ in a single rh-cert.pem file:

    cat 866705146090697087.pem 3539668047766796506.pem redhat-entitlement-authority.pem > rh-cert.pem
  2. Gather all current keys from /etc/pki/entitlement/ in a single rh-key.pem file:

    cat 866705146090697087-key.pem 3539668047766796506-key.pem > rh-key.pem

You can now import the new certificates to the SUSE Manager Server, using the instructions in Registering Red Hat Enterprise Linux Clients with CDN.

11. Registration from Web UI fails and does not show any errors

For the initial registration from the Web UI, all Salt clients are using Salt SSH.

Because of its nature, Salt SSH clients do not report errors back to the server.

However, the Salt SSH clients store a log locally at /var/log/salt-ssh.log that can be inspected for errors.

12. Salt clients shown as down and DNS settings

Even if the Salt client is running, actions such as package refresh or apply states can be marked as failed with the message:

Minion is down or could not be contacted.

In this case try rescheduling the action. If rescheduling succeeds, the cause of the problem can be a wrong DNS configuration.

When the Salt client is restarted, or in case the grains are refreshed, the client calculates its FQDN grains, and it is unresponsive until the grains are proceeded. When a scheduled action on SUSE Manager Server is going to be executed, SUSE Manager Server performs a test.ping to the client before the actual action to ensure the client is actually running and the action can be triggered.

By default, SUSE Manager Server waits for 5 seconds to get the response from test.ping command. If the response is not received within 5 seconds, then the action is set to fail with the message that the client is down or could not be contacted.

To correct this, fix the DNS resolution on the client, so the client does not get stuck for 5 seconds while solving its FQDN.

If this is not possible, try to increase the value for java.salt_presence_ping_timeout in the /etc/rhn/rhn.conf file on the SUSE Manager Server to a value higher than 4.

For example:

java.salt_presence_ping_timeout = 6

After that, restart spacewalk-services with:

spacewalk-services restart

Increasing this value will cause SUSE Manager Server to take longer to check if a minion is unreachable or unresponsive, causing the SUSE Manager Server to be slower or less responsive overall.

13. Salt 3000 to Salt Bundle Migration

13.1. Switch SUSE Linux Enterprise Server 12, Red Hat Enterprise Linux 7, or CentOS 7 minions (Salt 3000 EOL) to Salt Bundle

Procedure: Switching with util.mgr_switch_to_venv_minion state to venv-salt-minion
  1. Apply util.mgr_switch_to_venv_minion with no pillar specified first. This will result in the switch to venv-salt-minion with copying configuration files in etc. It will not clean up the original salt-minion configurations and its packages.

    salt <minion_id> state.apply util.mgr_switch_to_venv_minion
  2. Apply util.mgr_switch_to_venv_minion with mgr_purge_non_venv_salt set to True to remove salt-minion and with mgr_purge_non_venv_salt_files set to True to remove all the files related to salt-minion. This second step ensures the first step was processed, and then removes the old configuration files and the now obsolete salt-minion package.

    salt <minion_id> state.apply util.mgr_switch_to_venv_minion pillar='{"mgr_purge_non_venv_salt_files": True, "mgr_purge_non_venv_salt": True}'

13.2. Switch SUSE Linux Enterprise Server 12, Red Hat Enterprise Linux 7, or CentOS 7 SSH minions (Salt 3000 EOL) to Salt Bundle

Procedure: Enabling the Salt Bundle with Salt SSH support
  1. Specify in /etc/rhn/rhn.conf:

    web.ssh_salt_pre_flight_script = /usr/share/susemanager/salt-ssh/preflight.sh
    
    web.ssh_use_salt_thin = false
  2. Create the extra drop-in configuration file for Salt Master /etc/salt/master.d/ssh-preflight.conf with ssh_run_pre_flight set to true:

    ssh_run_pre_flight: true
  3. Restart the services using spacewalk-service restart.