Troubleshooting Clients
- 1. Autoinstallation
- 2. Bare Metal Systems
- 3. Bootstrap Repository for End-of-Life Products
- 4. Cloned Salt Clients
- 5. Disabling the FQDNS grain
- 6. Mounting /tmp with noexec
- 7. Mounting /var/tmp with noexec
- 8. Passing Grains to a Start Event
- 9. Proxy Connections and FQDN
- 10. Red Hat CDN Channel and Multiple Certificates
- 11. Registration from Web UI fails and does not show any errors
- 12. Salt clients shown as down and DNS settings
- 13. Salt 3000 to Salt Bundle Migration
1. Autoinstallation
Depending on your base channel, new autoinstallation profiles might be subscribed to a channel that is missing required packages.
For autoinstallation to work, these packages are required:
-
pyOpenSSL
-
rhnlib
-
libxml2-python
-
spacewalk-koan
To resolve this issue, check these things first:
-
Check that the tools software channel related to the base channel in your autoinstallation profile is available to your organization and your user.
-
Check that the tools channel is available to your SUSE Manager as a child channel.
-
Check that the required packages and any dependencies are available in the associated channels.
2. Bare Metal Systems
If a bare metal system on the network is not automatically added to the Systems
list, check these things first:
-
You must have the pxe-default-image package installed.
-
File paths and parameters must be configured correctly. Check that the vmlinuz0 and initrd0.img files, which are provided by pxe-default-image, are in the locations specified in the rhn.conf configuration file.
-
Ensure the networking equipment connecting the bare metal system to the SUSE Manager server is working correctly, and that you can reach the SUSE Manager server IP address from the server.
-
The bare metal system to be provisioned must have PXE booting enabled in the boot sequence, and must not be attempting to boot an operating system.
-
The DHCP server must be responding to DHCP requests during boot. Check the PXE boot messages to ensure that:
-
the DHCP server is assigning the expected IP address
-
the DHCP server is assigning the SUSE Manager server IP address as next-server for booting.
-
Ensure Cobbler is running, and that the Discovery feature is enabled.
If you see a blue Cobbler menu shortly after booting, discovery has started. If it does not complete successfully, temporarily disable automatic shutdown to help diagnose the problem. To disable automatic shutdown:
-
Select pxe-default-profile in the Cobbler menu with the arrow keys, and press the Tab key before the timer expires.
-
Add the kernel boot parameter spacewalk-finally=running using the integrated editor, and press Enter to continue booting.
-
Enter a shell with the username root and password linux to continue debugging.
Duplicate profiles
Due to a technical limitation, it is not possible to reliably distinguish a new bare metal system from a system that has previously been discovered. Therefore, we recommend that you do not power on bare metal systems multiple times, as this results in duplicate profiles.
3. Bootstrap Repository for End-of-Life Products
When supported products are synchronized, bootstrap repositories are automatically created and regenerated on the SUSE Manager Server. When a product reaches end-of-life and becomes unsupported, bootstrap repositories must be created manually if you want to continue using the product.
For more information about bootstrap repositories, see Bootstrap Repository.
-
At the command prompt on the SUSE Manager Server, as root, list the available unsupported bootstrap repositories with the --force option, for example:
mgr-create-bootstrap-repo --list --force
1. SLE-12-SP2-x86_64
2. SLE-12-SP3-x86_64
-
Create the bootstrap repository, using the appropriate repository name as the product label:
mgr-create-bootstrap-repo --create SLE-12-SP2-x86_64 --force
If you do not want to create bootstrap repositories manually, you can check whether LTSS is available for the product and bootstrap repository you need.
4. Cloned Salt Clients
If you have used your hypervisor clone utility, and attempted to register the cloned Salt client, you might get this error:
We're sorry, but the system could not be found.
This is caused by the new, cloned, system having the same machine ID as an existing, registered, system. You can adjust this manually to correct the error and register the cloned system successfully.
For more information and instructions, see Troubleshooting Registering Cloned Clients.
5. Disabling the FQDNS grain
The FQDNS grain returns the list of all the fully qualified DNS services in the system. Collecting this information is usually a fast process, but if the DNS settings have been misconfigured, it could take a much longer time. In some cases, the client could become unresponsive, or crash.
To prevent this problem, you can disable the FQDNS grain with a Salt flag. If you disable the grain, you can use a network module to provide FQDNS services, without the risk of the client becoming unresponsive.
This only applies to older Salt clients. If you registered your Salt client recently, the FQDNS grain is disabled by default.
On the SUSE Manager Server, at the command prompt, use this command to disable the FQDNS grain:
salt '*' state.sls util.mgr_disable_fqdns_grain
This command restarts each client and generates Salt events that the server needs to process. If you have a large number of clients, you can execute the command in batch mode instead:
salt --batch-size 50 '*' state.sls util.mgr_disable_fqdns_grain
Wait for the batch command to finish executing. Do not interrupt the process with Ctrl+C.
6. Mounting /tmp with noexec
Salt runs remote commands from /tmp on the client’s file system. Therefore, you must not mount /tmp with the noexec option.
Alternatively, override the temporary directory path by setting the TMPDIR environment variable for the Salt service, so that it points to a directory that is not mounted with the noexec option.
We recommend using a systemd drop-in configuration file: /etc/systemd/system/venv-salt-minion.service.d/10-TMPDIR.conf if the Salt Bundle is used, or /etc/systemd/system/salt-minion.service.d/10-TMPDIR.conf if salt-minion is used on the client.
Example content of the drop-in configuration file:
[Service]
Environment=TMPDIR=/var/tmp
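As a minimal sketch, the drop-in file could be created with a small shell helper. The function name write_tmpdir_dropin and its arguments are hypothetical and not part of SUSE Manager; only the file name and the file content are taken from the example above.

```shell
# Hypothetical helper: write a 10-TMPDIR.conf drop-in into a systemd drop-in directory.
# $1: drop-in directory, for example /etc/systemd/system/venv-salt-minion.service.d
# $2: directory to use as TMPDIR; it must not be mounted with noexec
write_tmpdir_dropin() {
    dropin_dir=$1
    tmpdir=$2
    # Create the drop-in directory if it does not exist yet.
    mkdir -p "$dropin_dir" || return 1
    # Write the two-line unit override.
    printf '[Service]\nEnvironment=TMPDIR=%s\n' "$tmpdir" > "$dropin_dir/10-TMPDIR.conf"
}
```

For example, as root on the client, write_tmpdir_dropin /etc/systemd/system/venv-salt-minion.service.d /var/tmp writes the file shown above. systemd only picks up the change after systemctl daemon-reload and a restart of the Salt service.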
7. Mounting /var/tmp with noexec
Salt SSH uses /var/tmp to deploy the Salt Bundle and to execute Salt commands on the client with the bundled Python. Therefore, you must not mount /var/tmp with the noexec option.
Clients that have /var/tmp mounted with the noexec option cannot be bootstrapped with the Web UI, because the bootstrap process uses Salt SSH to reach the client.
8. Passing Grains to a Start Event
Every time a Salt client starts, it passes the machine_id grain to SUSE Manager. SUSE Manager uses this grain to determine if the client is registered.
This process requires a synchronous Salt call. Synchronous Salt calls block other processes, so if many clients start at the same time, the process could create significant delays.
To overcome this problem, Salt provides a feature that avoids making a separate synchronous Salt call.
To use this feature, add a configuration parameter to the client configuration, on clients that support it.
To make this process easier, you can use the mgr_start_event_grains.sls helper Salt state.
This only applies to already registered clients. If you registered your Salt client recently, this configuration parameter is added by default.
On the SUSE Manager Server, at the command prompt, use this command to enable the start_event_grains configuration helper:
salt '*' state.sls util.mgr_start_event_grains
This command adds the required configuration into the client’s configuration file, and applies it when the client is restarted. If you have a large number of clients, you can execute the command in batch mode instead:
salt --batch-size 50 '*' state.sls util.mgr_start_event_grains
9. Proxy Connections and FQDN
Sometimes clients connected through a proxy appear in the Web UI, but do not show that they are connected through a proxy. This can occur if you are not using the fully qualified domain name (FQDN) to connect, and the proxy is not known to SUSE Manager.
To correct this behavior, specify additional FQDNs as grains in the client configuration file on the proxy:
grains:
  susemanager:
    custom_fqdns:
      - name.one
      - name.two
10. Red Hat CDN Channel and Multiple Certificates
The Red Hat content delivery network (CDN) channels sometimes provide multiple certificates, but the SUSE Manager Web UI can only import a single certificate. If CDN presents a certificate that is different to the one the SUSE Manager Web UI knows about, validation fails and permission to access the repository is denied, even though the certificate is accurate. The error message received is:
[error] Repository '<repo_name>' is invalid.
<repo.pem> Valid metadata not found at specified URL
History:
 - [|] Error trying to read from '<repo.pem>'
 - Permission to access '<repo.pem>' denied.
Please check if the URIs defined for this repository are pointing to a valid repository.
Skipping repository '<repo_nam' because of the above error.
Could not refresh the repositories because of errors.
HH:MM:SS RepoMDError: Cannot access repository. Maybe repository GPG keys are not imported
To resolve this issue, merge all valid certificates into a single .pem file, and rebuild the certificates for use by SUSE Manager:
-
On the Red Hat client, at the command prompt, as root, gather all current certificates from /etc/pki/entitlement/ in a single rh-cert.pem file:
cat 866705146090697087.pem 3539668047766796506.pem redhat-entitlement-authority.pem > rh-cert.pem
-
Gather all current keys from /etc/pki/entitlement/ in a single rh-key.pem file:
cat 866705146090697087-key.pem 3539668047766796506-key.pem > rh-key.pem
You can now import the new certificates to the SUSE Manager Server, using the instructions in Registering Red Hat Enterprise Linux Clients with CDN.
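When the entitlement directory holds more files than the two in the example, typing every file name becomes error-prone. The following sketch merges the files automatically. It assumes that key files end in -key.pem, as in the example above, and that every other .pem file in the directory is a certificate. The function name merge_entitlement_pems is hypothetical.

```shell
# Hypothetical helper: merge certificates and keys into rh-cert.pem and rh-key.pem.
# $1: source directory, for example /etc/pki/entitlement
# $2: output directory; it must differ from the source directory,
#     otherwise the output files would be merged into themselves
merge_entitlement_pems() {
    src=$1
    out=$2
    # Start with empty output files.
    : > "$out/rh-cert.pem"
    : > "$out/rh-key.pem"
    for f in "$src"/*.pem; do
        # Skip the unexpanded pattern when the directory has no .pem files.
        [ -e "$f" ] || continue
        case "$f" in
            *-key.pem) cat "$f" >> "$out/rh-key.pem" ;;
            *)         cat "$f" >> "$out/rh-cert.pem" ;;
        esac
    done
}
```

Check the resulting files before importing them, because the naming assumption might not hold on every client.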
11. Registration from Web UI fails and does not show any errors
For the initial registration from the Web UI, all Salt clients use Salt SSH.
By their nature, Salt SSH clients do not report errors back to the server.
However, the Salt SSH clients store a log locally at /var/log/salt-ssh.log that can be inspected for errors.
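A quick way to inspect the log is to filter it for lines that look like errors. This is a sketch under assumptions: the helper name show_log_errors is hypothetical, and the search words are a guess at what failure lines typically contain, not a documented log format.

```shell
# Hypothetical helper: print the last lines of a log file that mention an error.
# $1: log file, for example /var/log/salt-ssh.log
# $2: maximum number of lines to print (default: 20)
show_log_errors() {
    # Case-insensitive match on common failure keywords, newest lines last.
    grep -iE 'error|critical|traceback' "$1" | tail -n "${2:-20}"
}
```

For example, show_log_errors /var/log/salt-ssh.log 50 prints up to 50 matching lines. If nothing matches, read the end of the full log instead.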
12. Salt clients shown as down and DNS settings
Even if the Salt client is running, actions such as package refresh or apply states can be marked as failed with the message:
Minion is down or could not be contacted.
In this case try rescheduling the action. If rescheduling succeeds, the cause of the problem can be a wrong DNS configuration.
When the Salt client is restarted, or when the grains are refreshed, the client calculates its FQDN grains, and it is unresponsive until the grains have been processed.
Before SUSE Manager Server executes a scheduled action, it sends a test.ping to the client to ensure that the client is running and the action can be triggered.
By default, SUSE Manager Server waits 5 seconds for a response to the test.ping command. If no response is received within that time, the action is set to failed with the message that the client is down or could not be contacted.
To correct this, fix the DNS resolution on the client, so that the client does not stall while resolving its FQDN.
If this is not possible, try increasing the value of java.salt_presence_ping_timeout in the /etc/rhn/rhn.conf file on the SUSE Manager Server to a value higher than the default.
For example:
java.salt_presence_ping_timeout = 6
After that, restart the SUSE Manager services with:
spacewalk-service restart
Increasing this value makes SUSE Manager Server take longer to determine that a minion is unreachable or unresponsive, which can make the server slower or less responsive overall.
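Before changing the timeout, it can help to check whether it has already been set. The following sketch reads the parameter from a configuration file. The helper name get_ping_timeout is hypothetical, and the sketch assumes the key = value layout shown in the example above.

```shell
# Hypothetical helper: print the configured java.salt_presence_ping_timeout value.
# Prints nothing if the parameter is not set in the file, in which case the default applies.
# $1: configuration file, for example /etc/rhn/rhn.conf
get_ping_timeout() {
    # Strip the key and the equals sign; if the key appears more than once, keep the last value.
    sed -n 's/^[[:space:]]*java\.salt_presence_ping_timeout[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}
```

On the SUSE Manager Server, get_ping_timeout /etc/rhn/rhn.conf prints the current value. On the client, timing a command such as hostname -f can give a rough indication of how long FQDN resolution takes.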
13. Salt 3000 to Salt Bundle Migration
13.1. Switch SUSE Linux Enterprise Server 12, Red Hat Enterprise Linux 7, or CentOS 7 minions (Salt 3000 EOL) to Salt Bundle
Use the util.mgr_switch_to_venv_minion state to switch to venv-salt-minion:
-
Apply util.mgr_switch_to_venv_minion with no pillar specified first. This switches the client to venv-salt-minion and copies the configuration files in etc. It does not clean up the original salt-minion configuration or its packages.
salt <minion_id> state.apply util.mgr_switch_to_venv_minion
-
Apply util.mgr_switch_to_venv_minion with mgr_purge_non_venv_salt set to True to remove salt-minion, and with mgr_purge_non_venv_salt_files set to True to remove all the files related to salt-minion. This second step ensures the first step was processed, and then removes the old configuration files and the now obsolete salt-minion package.
salt <minion_id> state.apply util.mgr_switch_to_venv_minion pillar='{"mgr_purge_non_venv_salt_files": True, "mgr_purge_non_venv_salt": True}'
13.2. Switch SUSE Linux Enterprise Server 12, Red Hat Enterprise Linux 7, or CentOS 7 SSH minions (Salt 3000 EOL) to Salt Bundle
-
Specify in /etc/rhn/rhn.conf:
web.ssh_salt_pre_flight_script = /usr/share/susemanager/salt-ssh/preflight.sh
web.ssh_use_salt_thin = false
-
Create the extra drop-in configuration file for Salt Master /etc/salt/master.d/ssh-preflight.conf with ssh_run_pre_flight set to true:
ssh_run_pre_flight: true
-
Restart the services using spacewalk-service restart.