Operation Recommendations
This section contains a range of recommendations for large scale deployments.
Always start small and scale up gradually. Monitor the server as you scale to identify problems early.
1. Salt Client Onboarding Rate
The rate at which SUSE Manager can onboard clients is limited and depends on hardware resources. Onboarding clients faster than SUSE Manager is configured for builds up a backlog of unprocessed keys. This slows down the process and can exhaust resources. We recommend that you limit the key acceptance rate programmatically. A safe starting point is to onboard one client every 15 seconds. You can do that with this command:
for k in $(salt-key -l un|grep -v Unaccepted); do salt-key -y -a $k; sleep 15; done
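The one-liner above can also be wrapped in a small reusable script. The following is only a sketch under stated assumptions: the function names filter_pending_keys and onboard_pending_keys are invented for illustration, and the script relies on the same salt-key -l un and salt-key -y -a calls as the command above.

```shell
#!/bin/bash
# Sketch: accept pending Salt keys one at a time with a pause between them.
# Function names are illustrative and not part of SUSE Manager or Salt.

# Read `salt-key -l un` output on stdin and print only the key IDs,
# dropping the "Unaccepted Keys:" header and any blank lines.
filter_pending_keys() {
    grep -v 'Unaccepted' | grep -v '^[[:space:]]*$'
}

# Accept every pending key, sleeping DELAY seconds after each one.
# DELAY defaults to the 15 seconds suggested as a starting point.
onboard_pending_keys() {
    local delay="${1:-15}"
    local key
    salt-key -l un | filter_pending_keys | while read -r key; do
        # Redirect stdin so the accept call cannot consume the key list.
        salt-key -y -a "$key" < /dev/null
        sleep "$delay"
    done
}
```

After sourcing the script, running onboard_pending_keys 30 would accept one key every 30 seconds. Raise or lower the delay depending on how the server copes with the load.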
2. Salt Clients and the RNG
All communication to and from Salt clients is encrypted. During client onboarding, Salt uses asymmetric cryptography, which requires available entropy from the Random Number Generator (RNG) facility in the kernel. If sufficient entropy is not available from the RNG, it will significantly slow down communications. This is especially true in virtualized environments. Ensure enough entropy is present, or change the virtualization host options.
You can check the amount of available entropy with the cat /proc/sys/kernel/random/entropy_avail command. The value should never fall below 100-200.
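The entropy check can be scripted so that it is easy to run from a monitoring job. This is a minimal sketch: the function names and the default threshold of 200 (the upper end of the range given above) are assumptions chosen for illustration.

```shell
#!/bin/bash
# Sketch: compare the kernel's available entropy against a threshold.
# Function names and the default threshold are illustrative assumptions.

# Print "low" when VALUE is below THRESHOLD, otherwise "ok".
entropy_status() {
    local value="$1"
    local threshold="${2:-200}"
    if [ "$value" -lt "$threshold" ]; then
        echo "low"
    else
        echo "ok"
    fi
}

# Read the current value from the kernel and report it with its status.
check_entropy() {
    local avail
    avail="$(cat /proc/sys/kernel/random/entropy_avail)"
    echo "entropy_avail=$avail status=$(entropy_status "$avail" "${1:-200}")"
}
```

Calling check_entropy on a client prints the current reading and whether it is below the threshold. Pass a different threshold as the first argument if 200 does not suit your environment.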
3. Clients Running with Unaccepted Salt Keys
Idle clients which have not been onboarded, that is clients running with unaccepted Salt keys, consume more resources than idle clients that have been onboarded. Generally, this consumes about an extra 2.5 Kb/s of inbound network bandwidth per client. For example, 1000 idle clients will consume about 2.5 Mb/s extra. This consumption will reduce almost to zero when onboarding has been completed for all clients. Limit the number of non-onboarded clients for optimal performance.
4. Disabling the Salt Mine
In older versions, SUSE Manager used a tool called Salt mine to check client availability. The Salt mine would cause clients to contact the server every hour, which created significant load. With the introduction of a more efficient mechanism in SUSE Manager 3.2, the Salt mine is no longer required. Instead, the SUSE Manager Server uses Taskomatic to ping only the clients that appear to have been offline for twelve hours or more, with all clients being contacted at least once in every twenty-four-hour period by default. You can adjust this by changing the web.system_checkin_threshold parameter in rhn.conf. The value is expressed in days, and the default value is 1.
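Before changing the threshold, it can help to see what is currently configured. The sketch below assumes that rhn.conf is at /etc/rhn/rhn.conf and that entries use the form web.system_checkin_threshold = 2. The function name is invented for illustration.

```shell
#!/bin/bash
# Sketch: print the configured check-in threshold in days.
# Falls back to the documented default of 1 when the parameter is not set.
# The function name and default file path are assumptions.
checkin_threshold_days() {
    local conf="${1:-/etc/rhn/rhn.conf}"
    local value
    # Strip "web.system_checkin_threshold =" and keep the last match.
    value="$(sed -n 's/^[[:space:]]*web\.system_checkin_threshold[[:space:]]*=[[:space:]]*//p' "$conf" | tail -n 1)"
    echo "${value:-1}"
}
```

Running checkin_threshold_days with no arguments reads the assumed default path. Pass another file name to inspect a copy of the configuration.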
Newly registered Salt clients will have the Salt mine disabled by default. If the Salt mine is running on your system, you can reduce load by disabling it. This is especially effective if you have a large number of clients.
Disable the Salt mine by running this command on the server:
salt '*' state.sls util.mgr_mine_config_clean_up
This will restart the clients and generate some Salt events to be processed by the server. If you have a large number of clients, handling these events could create excessive load. To avoid this, you can execute the command in batch mode with this command:
salt --batch-size 50 '*' state.sls util.mgr_mine_config_clean_up
You will need to wait for this command to finish executing. Do not end the process with Ctrl+C.
5. Disable Unnecessary Taskomatic Jobs
To minimize wasted resources, you can disable non-essential or unused Taskomatic jobs.
You can see the list of Taskomatic jobs in the SUSE Manager Web UI. To disable a job, click the name of the job you want to disable, select Disable Schedule, and click Update Schedule.
To delete a job, click the name of the job you want to delete, and click Delete Schedule.
We recommend disabling these jobs:
- Daily comparison of configuration files: compare-configs-default
- Hourly synchronization of Cobbler files: cobbler-sync-default
- Daily gatherer and subscription matcher: gatherer-matcher-default
Do not attempt to disable any other jobs, as it could prevent SUSE Manager from functioning correctly.
6. Swap and Monitoring
It is especially important in large scale deployments that you keep your SUSE Manager Server constantly monitored and backed up.
Swap space use can have significant impacts on performance. If significant non-transient swap usage is detected, you can increase the available hardware RAM.
You can also consider tuning the Server to consume less memory. For more information on tuning, see Scaling Minions (Large Scale Deployments).
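To judge whether swap usage is transient, it helps to sample it several times rather than once. The following sketch reads the standard SwapTotal and SwapFree fields from /proc/meminfo. The function names, sample count, and interval are assumptions for illustration, not SUSE Manager tooling.

```shell
#!/bin/bash
# Sketch: report swap in use so sustained (non-transient) usage is visible.
# Function names and default values are illustrative assumptions.

# Print swap in use, in kB, from /proc/meminfo-style input on stdin.
swap_used_kb() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END { print total - free }'
}

# Print COUNT timestamped samples, INTERVAL seconds apart.
sample_swap() {
    local count="${1:-5}"
    local interval="${2:-60}"
    local i=0
    while [ "$i" -lt "$count" ]; do
        echo "$(date '+%F %T') swap_used_kb=$(swap_used_kb < /proc/meminfo)"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

If sample_swap keeps reporting a large value across all samples, the usage is probably not transient, and adding RAM or tuning the server is worth considering.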
7. AES Key Rotation
Communication from the Salt Master to clients is encrypted with a single AES key. The key is rotated when:
- The salt-master process is restarted, or
- Any minion key is deleted (for example, when a client is deleted from SUSE Manager)
After the AES key has been rotated, all clients must re-authenticate to the master. By default, this happens the next time a client receives a message. If you have a large number of clients (several thousand), this can cause a high CPU load on the SUSE Manager Server. If the CPU load is excessive, we recommend that you delete keys in batches, and in off-peak hours if possible, to avoid overloading the server.
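Batched deletion can be sketched as a generic helper that pauses between groups of clients. This is an illustration only: run_in_batches is an invented name, and the command that performs the deletion is left as a placeholder that you supply, because the right way to delete a client depends on your setup.

```shell
#!/bin/bash
# Sketch: run a command once per client ID, pausing between batches.
# run_in_batches is an illustrative name, not SUSE Manager tooling.
#
# Usage: run_in_batches BATCH_SIZE PAUSE_SECONDS COMMAND [ARGS...] < ids.txt
# Each client ID read from stdin is appended to COMMAND as its last argument.
run_in_batches() {
    local batch_size="$1"
    local pause="$2"
    shift 2
    local id
    local n=0
    while read -r id; do
        # Redirect stdin so COMMAND cannot consume the list of IDs.
        "$@" "$id" < /dev/null
        n=$((n + 1))
        # Pause after each full batch so the server can absorb the
        # re-authentication load before more keys are deleted.
        if [ $((n % batch_size)) -eq 0 ]; then
            sleep "$pause"
        fi
    done
}
```

For example, run_in_batches 50 300 delete_client < ids.txt would process 50 clients, wait five minutes, and continue. Here delete_client stands for whatever hypothetical command or wrapper you use to remove a client, and ids.txt is an assumed file with one client ID per line.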
For more information, see: