documentation.suse.com / SUSE Linux Enterprise Micro Documentation / Administration Guide
SUSE Linux Enterprise Micro 5.2

Administration Guide

Publication Date: September 29, 2024

This guide describes the administration of SUSE Linux Enterprise Micro.

1 Snapshots

Warning: Snapshots are mandatory

As snapshots are crucial for the correct functioning of SLE Micro, do not disable the feature, and ensure that the root partition is big enough to store the snapshots.

When a snapshot is created, both the snapshot and the original point to the same blocks in the file system. So, initially a snapshot does not occupy additional disk space. If data in the original file system is modified, changed data blocks are copied while the old data blocks are kept for the snapshot.

Snapshots always reside on the same partition or subvolume on which the snapshot has been taken. It is not possible to store snapshots on a different partition or subvolume. As a result, partitions containing snapshots need to be larger than partitions which do not contain snapshots. The exact amount depends strongly on the number of snapshots you keep and the amount of data modifications. As a rule of thumb, give partitions twice as much space as you normally would. To prevent disks from running out of space, old snapshots are automatically cleaned up.

Snapshots that are known to be working properly are marked as important.

1.1 Directories excluded from snapshots

As some directories store user-specific or volatile data, these directories are excluded from snapshots:

/home

Contains users' data. Excluded so that the data will not be included in snapshots and thus potentially overwritten by a rollback operation.

/root

Contains root's data. Excluded so that the data will not be included in snapshots and thus potentially overwritten by a rollback operation.

/opt

Third-party products usually get installed to /opt. Excluded so that these applications are not uninstalled during rollbacks.

/srv

Contains data for Web and FTP servers. Excluded in order to avoid data loss on rollbacks.

/usr/local

This directory is used when manually installing software. It is excluded to avoid uninstalling these installations on rollbacks.

/var

This directory contains many variable files, including logs, temporary caches, third-party products in /var/opt, and is the default location for virtual machine images and databases. Therefore, a separate subvolume is created with Copy-On-Write disabled, so as to exclude all of this variable data from snapshots.

/tmp

The directory contains temporary data.

the architecture-specific /boot/grub2 directory

Rollback of the boot loader binaries is not supported.

1.2 Showing exclusive disk space used by snapshots

Snapshots share data for efficient use of storage space, so ordinary commands such as du and df do not measure used disk space accurately. When you want to free up disk space on Btrfs with quotas enabled, you need to know how much exclusive disk space is used by each snapshot, rather than shared space. The btrfs command provides a view of space used by snapshots:

# btrfs qgroup show -p /
qgroupid         rfer         excl parent  
--------         ----         ---- ------  
0/5          16.00KiB     16.00KiB ---     
[...]    
0/272         3.09GiB     14.23MiB 1/0     
0/273         3.11GiB    144.00KiB 1/0     
0/274         3.11GiB    112.00KiB 1/0     
0/275         3.11GiB    128.00KiB 1/0     
0/276         3.11GiB     80.00KiB 1/0     
0/277         3.11GiB    256.00KiB 1/0     
0/278         3.11GiB    112.00KiB 1/0     
0/279         3.12GiB     64.00KiB 1/0     
0/280         3.12GiB     16.00KiB 1/0     
1/0           3.33GiB    222.95MiB ---

The qgroupid column displays the identification number of each subvolume as a qgroup level/ID combination.

The rfer column displays the total amount of data referred to in the subvolume.

The excl column displays the exclusive data in each subvolume.

The parent column shows the parent qgroup of the subvolumes.

The final item, 1/0, shows the totals for the parent qgroup. In the above example, 222.95 MiB will be freed if all subvolumes are removed. Run the following command to see which snapshots are associated with each subvolume:

# btrfs subvolume list -st /

2 Administration using transactional updates

SLE Micro was designed to use a read-only root file system. This means that after the deployment is complete, you are not able to perform direct modifications to the root file system, e.g. by using zypper. Instead, SUSE Linux Enterprise Micro introduces the concept of transactional updates which enables you to modify your system and keep it up to date.

The key features of transactional updates are the following:

  • They are atomic - the update is applied only if it completes successfully.

  • Changes are applied in a separate snapshot and so do not influence the running system.

  • Changes can easily be rolled back.

Each time you call the transactional-update command to change your system—either to install a package, perform an update or apply a patch—the following actions take place:

Procedure 1: Modifying the root file system
  1. A new read-write snapshot is created from your current root file system, or from a snapshot that you specified.

  2. All changes are applied (updates, patches or package installation).

  3. The snapshot is switched back to read-only mode.

  4. The new root file system snapshot is prepared, so that it will be active after you reboot.

  5. After rebooting, the new root file system is set as the default snapshot.

    Note

    Bear in mind that without rebooting your system, the changes will not be applied.

Warning

In case you do not reboot your machine before performing further changes, the transactional-update command will create a new snapshot from the current root file system. This means that you will end up with several parallel snapshots, each including that particular change but not changes from the other invocations of the command. After reboot, the most recently created snapshot will be used as your new root file system, and it will not include changes done in the previous snapshots.

2.1 transactional-update usage

The transactional-update command enables atomic installation or removal of updates; updates are applied only if all of them can be successfully installed. transactional-update creates a snapshot of your system and uses it to update the system. Later you can restore this snapshot. All changes become active only after reboot.

The transactional-update command syntax is as follows:

transactional-update [option] [general_command] [package_command] standalone_command
Note: Running transactional-update without arguments

If you do not specify any command or option while running the transactional-update command, the system updates itself.
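For example, to update the system and then activate the changes with a reboot:

# transactional-update
# reboot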

The possible command parameters are described below.

transactional-update options
--interactive, -i

Can be used along with a package command to turn on interactive mode.

--non-interactive, -n

Can be used along with a package command to turn on non-interactive mode.
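For example, to install a package without any interactive prompts, which is useful in scripts (package_name is a placeholder):

# transactional-update --non-interactive pkg install package_name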

--continue [number], -c

The --continue option is for making multiple changes to an existing snapshot without rebooting.

The default transactional-update behavior is to create a new snapshot from the current root file system. If you forget something, such as installing a new package, you have to reboot to apply your previous changes, run transactional-update again to install the forgotten package, and reboot again. You cannot run the transactional-update command multiple times without rebooting to add more changes to the snapshot, because this will create separate independent snapshots that do not include changes from the previous snapshots.

Use the --continue option to make as many changes as you want without rebooting. A separate snapshot is made each time, and each snapshot contains all the changes you made in the previous snapshots, plus your new changes. Repeat this process as many times as you want, and when the final snapshot includes everything you want, reboot the system, and your final snapshot becomes the new root file system.

Another useful feature of the --continue option is you may select any existing snapshot as the base for your new snapshot. The following example demonstrates running transactional-update to install a new package in a snapshot based on snapshot 13, and then running it again to install another package:

# transactional-update pkg install package_1
# transactional-update --continue 13 pkg install package_2
--no-selfupdate

Disables self updating of transactional-update.

--drop-if-no-change, -d

Discards the snapshot created by transactional-update if there were no changes to the root file system. If there are changes to the /etc directory, those changes are merged back into the current file system.
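For example, the following illustrative invocation discards the new snapshot if an update does not change anything:

# transactional-update --drop-if-no-change up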

--quiet

The transactional-update command will not output to stdout.

--help, -h

Prints help for the transactional-update command.

--version

Displays the version of the transactional-update command.

The general commands are the following:

General commands
cleanup-snapshots

The command marks all unused snapshots for removal.

cleanup-overlays

The command removes all unused overlay layers of /etc.

cleanup

The command combines the cleanup-snapshots and cleanup-overlays commands. For more details refer to Section 2.2, “Snapshots cleanup”.

grub.cfg

Use this command to rebuild the GRUB boot loader configuration file.

bootloader

The command reinstalls the boot loader.

initrd

Use the command to rebuild initrd.

kdump

If you have made changes to your hardware or storage, you may need to rebuild the kdump initrd.

shell

Opens a read-write shell in the new snapshot before exiting. The command is typically used for debugging purposes.

reboot

The system reboots after the transactional-update is complete.

run <command>

Runs the provided command in a new snapshot.

setup-selinux

Installs and enables targeted SELinux policy.

The package commands are the following:

Important
Important: Installing packages outside of the official SLE Micro repositories

The installation of packages from repositories other than the official ones (for example, the SUSE Linux Enterprise Server repositories) is not supported and not recommended. To use the tools available for SUSE Linux Enterprise Server, run the toolbox container and install the tools inside the container. For details about the toolbox container, refer to Section 5, “toolbox for SLE Micro debugging”.

Package commands
dup

Performs an upgrade of your system. The default option for this command is --non-interactive.

migration

The command migrates your system to a selected target. Typically it is used to upgrade your system if it has been registered via SUSE Customer Center.

patch

Checks for available patches and installs them. The default option for this command is --non-interactive.

pkg install

Installs individual packages from the available channels using the zypper install command. This command can also be used to install Program Temporary Fix (PTF) RPM files. The default option for this command is --interactive.

# transactional-update pkg install package_name

or

# transactional-update pkg install rpm1 rpm2
pkg remove

Removes individual packages from the active snapshot using the zypper remove command. This command can also be used to remove PTF RPM files. The default option for this command is --interactive.

# transactional-update pkg remove package_name
pkg update

Updates individual packages from the active snapshot using the zypper update command. Only packages that are part of the snapshot of the base file system can be updated. The default option for this command is --interactive.

# transactional-update pkg update package_name
register

The register command enables you to register/deregister your system. For a complete usage description, refer to Section 2.1.1, “The register command”.

up

Updates installed packages to newer versions. The default option for this command is --non-interactive.

The standalone commands are the following:

Standalone commands
rollback <snapshot number>

This sets the default root file system subvolume. If you do not specify a snapshot number, the current system is set as the new default root file system. If you specify a number, that snapshot is used as the default root file system. On a read-only file system, the command does not create any additional snapshots.

# transactional-update rollback snapshot_number
rollback last

This command sets the last known to be working snapshot as the default.

status

This prints a list of available snapshots. The currently booted snapshot is marked with an asterisk, and the default snapshot is marked with a plus sign.

2.1.1 The register command

The register command enables you to handle all tasks regarding registration and subscription management. You can supply the following options:

--list-extensions

With this option, the command will list available extensions for your system. You can use the output to find a product identifier for product activation.
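For example:

# transactional-update register --list-extensions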

-p, --product

Use this option to specify a product for activation. The product identifier has the following format: <name>/<version>/<architecture>, for example sle-module-live-patching/15.3/x86_64. The appropriate command will then be the following:

# transactional-update register -p sle-module-live-patching/15.3/x86_64
-r, --regcode

Register your system with the provided registration code. The command will register the subscription and enable software repositories.
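For example, to register the system with a registration code (REGISTRATION_CODE is a placeholder):

# transactional-update register -r REGISTRATION_CODE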

-d, --de-register

The option deregisters the system, or when used along with the -p option, deregisters an extension.
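For example, to deregister an extension, using the same illustrative product identifier as above:

# transactional-update register -d -p sle-module-live-patching/15.3/x86_64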

-e, --email

Specify an email address that will be used in SUSE Customer Center for registration.

--url

Specify the URL of your registration server. The URL is stored in the configuration and will be used in subsequent command invocations. For example:

# transactional-update register --url https://scc.suse.com
-s, --status

Displays the current registration status in JSON format.

--write-config

Writes the values of the provided options to the /etc/SUSEConnect configuration file.

--cleanup

Removes old system credentials.

--version

Prints the version.

--help

Displays usage of the command.

2.2 Snapshots cleanup

If you run the command transactional-update cleanup, all old snapshots without a cleanup algorithm will have one set. All important snapshots are also marked. The command also removes all unreferenced (and thus unused) /etc overlay directories in /var/lib/overlay.

Snapshots with the number cleanup algorithm set are deleted according to the rules configured in /etc/snapper/configs/root by the following parameters:

NUMBER_MIN_AGE

Defines the minimum age of a snapshot (in seconds) that can be automatically removed.

NUMBER_LIMIT/NUMBER_LIMIT_IMPORTANT

Defines the maximum count of stored snapshots. The cleaning algorithms delete snapshots above the specified maximum value, without taking the snapshot and file system space into account. The algorithms also delete snapshots above the minimum value until the limits for the snapshot and file system are reached.
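An illustrative fragment of /etc/snapper/configs/root showing these parameters (the values are examples only, not necessarily your defaults):

NUMBER_MIN_AGE="1800"
NUMBER_LIMIT="2-10"
NUMBER_LIMIT_IMPORTANT="4-10"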

The snapshot cleanup is also performed regularly by systemd.

2.3 System rollback

GRUB 2 enables booting from Btrfs snapshots and thus allows you to use any older functional snapshot in case the new snapshot does not work correctly.

When booting a snapshot, the parts of the file system included in the snapshot are mounted read-only; all other file systems and parts that are excluded from snapshots are mounted read-write and can be modified.

Tip: Rolling back to a specific installation state

An initial bootable snapshot is created at the end of the initial system installation. You can go back to that state at any time by booting this snapshot. The snapshot can be identified by the description "after installation".

There are two methods of performing a system rollback.

In case your current snapshot is functional, you can use the following procedure for system rollback.

Procedure 2: Rollback from a running system
  1. To choose the snapshot that should be set as the default, run:

    # transactional-update status

    to get a list of available snapshots. Note the number of the snapshot to be set as default.

  2. Set the snapshot as the default by running:

    # transactional-update rollback snapshot_number

    If you omit the snapshot number, the current snapshot will be set as default.

  3. Reboot your system to boot into the new default snapshot.

The following procedure is used in case the current snapshot is broken and you are not able to boot into it.

Procedure 3: Rollback to a working snapshot
  1. Reboot your system and select Start bootloader from a read-only snapshot

  2. Choose a snapshot to boot. The snapshots are sorted according to the date of creation, with the latest one at the top.

  3. Log in to your system and check whether everything works as expected. Data written to directories excluded from the snapshots will stay untouched.

  4. If the snapshot you booted into is not suitable for rollback, reboot your system and choose another one.

    If the snapshot works as expected, you can perform rollback by running the following command:

    # transactional-update rollback

    And reboot afterwards.

2.4 Managing automatic transactional updates

Automatic updates are controlled by a systemd timer that runs once per day. The timer applies all updates and informs rebootmgrd that the machine should be rebooted. You can adjust the time when the update runs; see the systemd.timer(5) documentation.
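For example, assuming the timer uses an OnCalendar schedule (as daily timers typically do), you can create a drop-in override to change the time; the time shown below is only an example:

# systemctl edit transactional-update.timer

Then add the following lines in the editor:

[Timer]
OnCalendar=
OnCalendar=*-*-* 02:30:00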

You can disable automatic transactional updates with this command:

# systemctl --now disable transactional-update.timer

3 Health checker

Health checker is a program delivered with SLE Micro that checks whether services are running properly while your system is booting.

During the boot process, systemd calls Health checker, which in turn calls its plugins. Each plugin checks a particular service or condition. If each check passes, a status file (/var/lib/misc/health-checker.state) is created. The status file marks the current root file system as correct.

If any of the health checker plugins reports an error, the action taken depends on a particular condition, as described below:

The snapshot is booted for the first time.

If the current snapshot is different from the last one that worked properly, an automatic rollback to the last working snapshot is performed. This means that the last change performed to the file system broke the snapshot.

The snapshot has already booted correctly in the past.

The problem could be only temporary, so the system is rebooted automatically.

The reboot of a previously correctly booted snapshot has failed.

If there was already a problem during boot and an automatic reboot has been triggered, but the problem still persists, then the system is kept running to enable the administrator to fix the problem. The services that are tested by the health checker plugins are stopped if possible.

3.1 Adding custom plugins

Health checker supports the addition of your own plugins to check services during the boot process. Each plugin is a bash script that must fulfill the following requirements:

  • Plugins must be located in the /usr/libexec/health-checker directory.

  • The service that will be checked by the particular plugin must be defined in the Unit section of the /usr/lib/systemd/system/health-checker.service file. For example, the etcd service is defined as follows:

    [Unit]
    ...
    After=etcd.service
    ...
  • Each plugin must define functions called run_checks and stop_services. The run_checks function checks whether a particular service has started properly. Bear in mind that a service that has not been enabled by systemd should be ignored. The stop_services function is called to stop the particular service in case the service has not started properly. You can use the plugin template for reference; a minimal illustrative sketch follows this list.
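The following is a minimal, hypothetical plugin sketch for a service called my-service.service (the service name and file name are assumptions, not part of the official plugin template); adapt the checks to your service:

#!/bin/bash
# Hypothetical plugin: /usr/libexec/health-checker/my-service.sh

run_checks() {
    # A service that has not been enabled by systemd should be ignored
    systemctl is-enabled --quiet my-service.service || return 0
    # Report a failure if the service did not start properly
    systemctl is-active --quiet my-service.service || return 1
    return 0
}

stop_services() {
    # Stop the service because it did not start properly
    systemctl stop my-service.service
}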

4 SLE Micro administration using Cockpit

Cockpit is a web-based graphical interface that enables you to manage your SLE Micro deployments from one place. Cockpit is included in the delivered pre-built images, or can be installed if you are installing your own instances manually. For details regarding the manual installation, refer to Section 12.9.2, “Software”.

Though Cockpit is present in the pre-built images by default, the plugin for administration of virtual machines needs to be installed manually. You can do so by installing the microos-cockpit pattern as described below. Use the command below as well if Cockpit is not installed on your system.

# transactional-update pkg install -t pattern microos-cockpit

Reboot your machine to switch to the latest snapshot.

Note: Cockpit's plugins installed from the microos-cockpit pattern may differ according to technologies installed on your system

The Podman containers plugin is installed only if the Containers Runtime for non-clustered systems pattern is installed on your system. Similarly, the Virtual Machines plugin is installed only if the KVM Virtualization Host pattern is installed on your system.

Before running Cockpit on your machine, you need to enable the cockpit socket in systemd by running:

# systemctl enable --now cockpit.socket

In case you have enabled the firewall, you also must open the firewall for Cockpit as follows:

# firewall-cmd --permanent --zone=public --add-service=cockpit

And then reload the firewall configuration by running:

# firewall-cmd --reload

Now you can access the Cockpit web interface by opening the following address in your web browser:

https://IP_ADDRESS_OF_MACHINE:9090

A login screen opens. To log in, use the same credentials that you use to log in to your machine via console or SSH.

Figure 1: Cockpit login screen

After successful login, the Cockpit web console opens. Here you can view and administer your system's performance, network interfaces, Podman containers, your virtual machines, services, accounts and logs. You can also access your machine using shell in a terminal emulator.

Figure 2: Cockpit dashboard

4.1 Users administration

Note: Users administration only for server administrators

Only users with the Server administrator role can edit other users.

Cockpit enables you to manage users of your system. Click Accounts to open the user administration page. Here you can create a new account by clicking Create new account or manage already existing accounts by clicking on the particular account.

Figure 3: The Accounts screen

4.1.1 Creating new accounts

Click Create new account to open the window that enables you to add a new user. Fill in the user's login and/or full name and password, then confirm the form by clicking Create.

To add authorized SSH keys for the new user or set the Server administrator role, edit the already created account by clicking on it. For details, refer to Section 4.1.2, “Modifying accounts”.

4.1.2 Modifying accounts

After clicking the user icon in the Accounts page, the user details view opens and you can edit the user.

Figure 4: User details

In the user's details view, you can perform the following actions.

Delete the user

Click Delete to remove the user from the system.

Terminate user's session

By clicking Terminate session, you can log out the particular user from the system.

Change user's role

By checking/unchecking the Server administrator check box, you can assign or remove the administrator role from the user.

Manage access of the account

You can lock the account or you can set a date when the account will expire.

Manage the user's password

Click Set password to set a new password for the account.

Clicking Force change requires the user to change the password at the next login.

Click edit to set whether or when the password expires.

Add SSH key

You can add an SSH key for passwordless authentication via SSH. Click Add key, paste the contents of the public SSH key and confirm it by clicking Add.

5 toolbox for SLE Micro debugging

SLE Micro uses the transactional-update command to apply changes to the system, but the changes are applied only after reboot. That solution has several benefits, but it also has some disadvantages. If you need to debug your system and install a new tool, the tool will be available only after reboot. Therefore you are not able to debug the currently running system. For this reason a utility called toolbox has been developed.

toolbox is a small script that pulls a container image and runs a privileged container based on that image. In the toolbox container you can install any tool you want with zypper and then use the tool without rebooting your system.

To start the toolbox container, run the following:

# /usr/bin/toolbox

If the script completes successfully, you will see the toolbox container prompt.

Note: Obtaining the toolbox image

You can also use Podman or Cockpit to pull the toolbox image and start a container based on that image.

6 Monitoring performance

For performance monitoring purposes, SLE Micro provides a container image that enables you to run the Performance Co-Pilot (PCP) analysis toolkit in a container. The toolkit comprises tools for gathering and processing performance information collected either in real time or from PCP archive logs.

The performance data are collected by performance metrics domain agents and passed to the pmcd daemon. The daemon coordinates the gathering and exporting of performance statistics in response to requests from the PCP monitoring tools. pmlogger is then used to log the metrics. For details, refer to the PCP documentation.

6.1 Getting the PCP container image

The PCP container image is based on the BCI-Init container, which utilizes systemd to manage the PCP services.

You can pull the container image using podman or from the Cockpit web management console. To pull the image by using podman, run the following command:

# podman pull registry.suse.com/suse/pcp:latest

To get the container image using Cockpit, go to Podman containers, click Get new image, and search for pcp. Then select the image from the registry.suse.com for SLE 15 SP3 and download it.

6.2 Running the PCP container

The following command shows minimal options that you need to use to run a PCP container:

# podman run -d  \
  --systemd always \
  -p HOST_IP:HOST_PORT:CONTAINER_PORT \
  -v HOST_DIR:/var/log/pcp/pmlogger \
  PCP_CONTAINER_IMAGE

where the options have the following meaning:

-d

Runs the container in detached mode: podman starts the container in the background and returns.

--systemd always

Runs the container in the systemd mode. All services needed to run in the PCP container will be started automatically by systemd in the container.

--privileged

The container runs with extended privileges. Use this option if your system has SELinux enabled, otherwise the collected metrics will be incomplete.

-v HOST_DIR:/var/log/pcp/pmlogger

Creates a bind mount so that pmlogger archives are written to the HOST_DIR on the host. By default, pmlogger stores the collected metrics in /var/log/pcp/pmlogger.

PCP_CONTAINER_IMAGE

Is the downloaded PCP container image.

Other useful options of the podman run command follow:

Other options
-p HOST_IP:HOST_PORT:CONTAINER_PORT

Publishes container ports by mapping a container port onto a host port. If you do not specify HOST_IP, the ports are mapped on the local host. If you omit the HOST_PORT value, a random port number is used. By default, the pmcd daemon listens on port 44321 and exposes the PMAPI there to receive metrics, so it is recommended to map this port to the same port number on the host. The pmproxy daemon listens on port 44322 by default and exposes the REST PMWEBAPI there to access metrics, so it is recommended to map this port to the same host port number.
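For example, to map both default ports to the same port numbers on the host, the relevant fragment of the podman run command shown above could look as follows:

  -p 44321:44321 \
  -p 44322:44322 \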

--net host

The container uses the host's network. Use this option if you want to collect metrics from the host's network interfaces.

-e

The option enables you to set the following environment variables:

PCP_SERVICES

Is a comma-separated list of services to start by systemd in the container.

Default services are: pmcd, pmie, pmlogger, pmproxy.

You can use this variable, if you want to run a container with a list of services that is different from the default one, for example, only with pmlogger:

# podman run -d \
  --name pmlogger \
  --systemd always \
  -e PCP_SERVICES=pmlogger  \
  -v pcp-archives:/var/log/pcp/pmlogger  \
  registry.suse.com/suse/pcp:latest
HOST_MOUNT

Is a path inside the container to the bind mount of the host's root file system. The default value is not set.

REDIS_SERVERS

Specifies a connection to a Redis server. In a non-clustered setup, provide a comma-separated list of host specs. In a clustered setup, provide any individual cluster host; other hosts in the cluster are discovered automatically. The default value is: localhost:6379.

If you need to use a different configuration than the one provided by the environment variables, proceed as described in Section 6.3, “Configuring PCP services”.

6.3 Configuring PCP services

All services that run inside the PCP container have a default configuration that might not suit your needs. If you need a custom configuration that cannot be covered by the environment variables described above, create configuration files for the PCP services and pass them to the container using a bind mount as follows:

# podman run -d \
  --name CONTAINER_NAME \
  --systemd always \
  -v $HOST_CONFIG:CONTAINER_CONFIG_PATH:z \
  -v HOST_LOGS_PATH:/var/log/pcp/pmlogger  \
  registry.suse.com/suse/pcp:latest

Where:

CONTAINER_NAME

Is an optional container name.

HOST_CONFIG

Is an absolute path to the configuration file you created on the host machine. You can choose any file name you want.

CONTAINER_CONFIG_PATH

Is an absolute path to a particular configuration file inside the container. Each available configuration file is described in the corresponding sections below.

HOST_LOGS_PATH

Is a host directory that is bind-mounted to the container's log directory.

For example, a container called pcp, with the configuration file pmcd on the host machine and the pcp-archives directory for logs on the host machine, is run by the following command:

# podman run -d \
  --name pcp  \
  --systemd always \
  -v $(pwd)/pcp-archives:/var/log/pcp/pmlogger \
  -v $(pwd)/pmcd:/etc/sysconfig/pmcd \
  registry.suse.com/suse/pcp:latest

6.3.1 Custom pmcd daemon configuration

The pmcd daemon configuration is stored in the /etc/sysconfig/pmcd file. The file stores environment variables that modify the behavior of the pmcd daemon.

6.3.1.1 The /etc/sysconfig/pmcd file

You can add the following variables to the file to configure the pmcd daemon:

PMCD_LOCAL

Defines whether the remote host can connect to the pmcd daemon. If set to 0, remote connections to the daemon are allowed. If set to 1, the daemon listens only on the local host. The default value is 0.

PMCD_MAXPENDING

Defines the maximum count of pending connections to the agent. The default value is 5.

PMCD_ROOT_AGENT

If pmdaroot is enabled (the value is set to 1), adding a new PMDA does not trigger a restart of other PMDAs. If pmdaroot is not enabled, pmcd requires a restart of all PMDAs when a new PMDA is added. The default value is 1.

PMCD_RESTART_AGENTS

If set to 1, the pmcd daemon tries to restart any exited PMDA. Enable this option only if you have enabled pmdaroot, as pmcd itself does not have privileges to restart PMDAs.

PMCD_WAIT_TIMEOUT

Defines the maximum time, in seconds, that pmcd can wait to accept a connection. After this time, the connection is reported as failed. The default value is 60.

PCP_NSS_INIT_MODE

Defines the mode in which pmcd initializes the NSS certificate database when secured connections are used. The default value is readonly. You can set the mode to readwrite, but if the initialization fails, the default value is used as a fallback.

An example follows:

      PMCD_LOCAL=0
      PMCD_MAXPENDING=5
      PMCD_ROOT_AGENT=1
      PMCD_RESTART_AGENTS=1
      PMCD_WAIT_TIMEOUT=70
      PCP_NSS_INIT_MODE=readwrite

6.3.2 Custom pmlogger configuration

The custom configuration for the pmlogger is stored in the following configuration files:

  • /etc/sysconfig/pmlogger

  • /etc/pcp/pmlogger/control.d/local

6.3.2.1 The /etc/sysconfig/pmlogger file

You can use the following attributes to configure the pmlogger:

PMLOGGER_LOCAL

Defines whether pmlogger allows connections from remote hosts. If set to 1, pmlogger allows connections from local host only.

PMLOGGER_MAXPENDING

Defines the maximum count of pending connections. The default value is 5.

PMLOGGER_INTERVAL

Defines the default sampling interval pmlogger uses. The default value is 60 s. Keep in mind that this value can be overridden by the pmlogger command line.

PMLOGGER_CHECK_SKIP_LOGCONF

Setting this option to yes disables the regeneration and checking of the pmlogger configuration if the pmlogger configuration comes from pmlogconf. The default behavior is to regenerate the configuration files and check for changes every time pmlogger is started.

An example follows:

PMLOGGER_LOCAL=1
PMLOGGER_MAXPENDING=5
PMLOGGER_INTERVAL=10
PMLOGGER_CHECK_SKIP_LOGCONF=yes
6.3.2.2 The /etc/pcp/pmlogger/control.d/local file

The file /etc/pcp/pmlogger/control.d/local stores specifications of the host, which metrics should be logged, the logging frequency (default is 24 hours), and pmlogger options. For example:

# === VARIABLE ASSIGNMENTS ===
#
# DO NOT REMOVE OR EDIT THE FOLLOWING LINE
$version=1.1

# Uncomment one of the lines below to enable/disable compression behaviour
# that is different to the pmlogger_daily default.
# Value is days before compressing archives, 0 is immediate compression,
# "never" or "forever" suppresses compression.
#
#$PCP_COMPRESSAFTER=0 
#$PCP_COMPRESSAFTER=3
#$PCP_COMPRESSAFTER=never
    
# === LOGGER CONTROL SPECIFICATIONS ===
#   
#Host           P?  S?  directory                       args

# local primary logger
LOCALHOSTNAME   y   n   PCP_ARCHIVE_DIR/LOCALHOSTNAME   -r -T24h10m -c config.default -v 100Mb
Note: Defaults point to local host

If you run the pmlogger in a container on a different machine than the one that runs the pmcd (a client), change the following line to point to the client:

# local primary logger
CLIENT_HOSTNAME   y   n   PCP_ARCHIVE_DIR/CLIENT_HOSTNAME   -r -T24h10m -c config.default -v 100Mb

For example, for the slemicro_1 host name, the line should look as follows:

# local primary logger
slemicro_1   y   n   PCP_ARCHIVE_DIR/slemicro_1   -r -T24h10m -c config.default -v 100Mb

6.4 Starting the PCP container automatically on boot

After you run the PCP container, you can configure systemd to start the container on boot. To do so, follow the procedure below:

  1. Create a unit file for the container by using the podman generate systemd command:

    # podman generate systemd --name CONTAINER_NAME > /etc/systemd/system/container-CONTAINER_NAME.service

    where CONTAINER_NAME is the name of the PCP container you used when running the container from the container image.

  2. Enable the service in systemd:

    # systemctl enable container-CONTAINER_NAME

6.5 Metrics management

6.5.1 Listing available performance metrics

From within the container, you can use the command pminfo to list metrics. For example, to list all available performance metrics, run:

# pminfo

You can list a group of related metrics by specifying the metrics prefix:

# pminfo METRIC_PREFIX

For example, to list all metrics related to disks, use:

# pminfo disk

disk.dev.r_await
disk.dm.await
disk.dm.r_await
disk.md.await
disk.md.r_await
...

You can also specify additional strings to narrow down the list of metrics, for example:

# pminfo disk.dev

disk.dev.read
disk.dev.write
disk.dev.total
disk.dev.blkread
disk.dev.blkwrite
disk.dev.blktotal
...

To get online help text of a particular metric, use the -t option followed by the metric, for example:

# pminfo -t kernel.cpu.util.user

kernel.cpu.util.user [percentage of user time across all CPUs, including guest CPU time]

To display a description text of a particular metric, use the -T option followed by the metric, for example:

# pminfo -T kernel.cpu.util.user

Help:
percentage of user time across all CPUs, including guest CPU time

6.5.2 Checking local metrics

After you start the PCP container, you can verify that metrics are being recorded properly by running the following command inside the container:

# pcp

Performance Co-Pilot configuration on localhost:

 platform: Linux localhost 5.3.18-150300.59.68-default #1 SMP Wed May 4 11:29:09 UTC 2022 (ea30951) x86_64
 hardware: 1 cpu, 1 disk, 1 node, 1726MB RAM
 timezone: UTC
 services: pmcd pmproxy
     pmcd: Version 5.2.2-1, 9 agents, 4 clients
     pmda: root pmcd proc pmproxy xfs linux mmv kvm jbd2
 pmlogger: primary logger: /var/log/pcp/pmlogger/localhost/20220607.09.24
     pmie: primary engine: /var/log/pcp/pmie/localhost/pmie.log

Now check if the logs are written to a proper destination:

# ls PATH_TO_PMLOGGER_LOGS

where PATH_TO_PMLOGGER_LOGS should be /var/log/pcp/pmlogger/localhost/ in this case.

6.5.3 Recording metrics from remote systems

You can deploy collector containers that collect metrics on remote systems other than the one where the pmlogger container is running. Each remote collector system needs the pmcd daemon and a set of PMDAs. To deploy several collectors with a centralized monitoring system, proceed as follows.

  1. On each system you want to collect metrics from (clients), run a container with the pmcd daemon:

    # podman run -d \
        --name pcp-pmcd \
        --privileged \
        --net host \
        --systemd always \
        -e PCP_SERVICES=pmcd \
        -e HOST_MOUNT=/host \
        -v /:/host:ro,rslave \
        registry.suse.com/suse/pcp:latest
  2. On the monitoring system, create a pmlogger configuration file named control.CLIENT for each client, with the following content:

    $version=1.1
     
    CLIENT_HOSTNAME n n PCP_ARCHIVE_DIR/CLIENT -N -r -T24h10m -c config.default -v 100Mb

    Keep in mind that the CLIENT_HOSTNAME must be resolvable in DNS. You can use IP addresses or fully qualified domain names (FQDN) instead.

  3. On the monitoring system, create a directory for each client to store the recorded logs:

    # mkdir /root/pcp-archives/CLIENT

    For example, for slemicro_1:

    # mkdir /root/pcp-archives/slemicro_1
  4. On the monitoring system, run a container with pmlogger for each client:

    # podman run -d \
        --name pcp-pmlogger-CLIENT \
        --systemd always \
        -e PCP_SERVICES=pmlogger \
        -v /root/pcp-archives/CLIENT:/var/log/pcp/pmlogger:z \
        -v $(pwd)/control.CLIENT:/etc/pcp/pmlogger/control.d/local:z \
        registry.suse.com/suse/pcp:latest

    For example, for a client called slemicro_1:

    # podman run -d \
        --name pcp-pmlogger-slemicro_1 \
        --systemd always \
        -e PCP_SERVICES=pmlogger \
        -v /root/pcp-archives/slemicro_1:/var/log/pcp/pmlogger:z \
        -v $(pwd)/control.slemicro_1:/etc/pcp/pmlogger/control.d/local:z \
        registry.suse.com/suse/pcp:latest
    Note

    The second bind mount points to the configuration file created in Step 2 and replaces the default pmlogger configuration. If you do not create this bind mount, pmlogger uses the default /etc/pcp/pmlogger/control.d/local file and logging from clients fails, as the default configuration points to the local host. For details about the configuration file, refer to Section 6.3.2.2, “The /etc/pcp/pmlogger/control.d/local file”.

  5. To check if the logs collection is working properly, run:

    # ls -l pcp-archives/CLIENT/CLIENT

    For example:

    # ls -l pcp-archives/slemicro_1/slemicro_1
    
    total 1076
    -rw-r--r--. 1 systemd-network systemd-network 876372 Jun  8 11:24 20220608.10.58.0
    -rw-r--r--. 1 systemd-network systemd-network    312 Jun  8 11:22 20220608.10.58.index
    -rw-r--r--. 1 systemd-network systemd-network 184486 Jun  8 10:58 20220608.10.58.meta
    -rw-r--r--. 1 systemd-network systemd-network    246 Jun  8 10:58 Latest
    -rw-r--r--. 1 systemd-network systemd-network  24595 Jun  8 10:58 pmlogger.log

7 User management

You can define users during the deployment process of SLE Micro. When deploying pre-built images, you can define users as you want, but during the manual installation you define only the root user. Therefore, you might want to add users other than those created during the installation process. There are two ways to add users to an already installed system:

  • using the CLI, with the useradd command. Run the following command for usage:

    # useradd --help

    Bear in mind that a user who should have the server administrator role must be included in the wheel group; see the example after this list.

  • using Cockpit; for details refer to Section 4.1, “Users administration”.
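A minimal example of adding a user with the server administrator role via the CLI (new_user is a placeholder for the user name):

# useradd -m -G wheel new_user
# passwd new_user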