Check and Monitor CPU Temperature
- WHAT?
Step-by-step explanation of how to configure CPU temperature monitoring.
- WHY?
You want to reduce your electricity bill and make sure that the hardware runs optimally.
- EFFORT
10 minutes to read the article and 20 minutes to install and configure the required tool.
- GOAL
Put in place a mechanism for checking and monitoring CPU temperature.
- REQUIREMENTS
Root permissions to install the required package
The sensors package
1 Introduction
Checking and monitoring CPU temperature has several benefits.
Energy savings and cost reduction. When a CPU runs at full speed, it consumes more energy than when it is idling. In addition, keeping CPUs cool is a critical cost factor, especially in data centers.
Identifying and monitoring processes that consume too much CPU power. Doing that can help to free your CPU resources and increase the CPU's responsiveness.
Easier detection of cooling issues. A CPU temperature of 80°C or higher may indicate a problem with the cooling system or the fan, or thermal paste that was not applied correctly.
A long-term reduction of the carbon footprint can be achieved by adjusting the cooling parameters.
The sensors package is available on all architectures except IBM Z.
2 Installing and configuring hardware sensors
To measure the CPU temperature, install and configure the sensors tool that can access and read the hardware sensors.
Install the required package:
> sudo zypper install sensors
To detect all the sensors in the system, run the following command as root:
> sudo sensors-detect --auto
The --auto option runs the script non-interactively and accepts the default answer to every question, so all hardware monitoring chips are probed at once without you having to confirm each probe. When finished, the script shows a summary of the detected chips:
Now follows a summary of the probes I have just done.

Driver `coretemp':
  * Chip `Intel digital thermal sensor' (confidence: 9)

Driver `to-be-written':
  * ISA bus, address 0xa40
    Chip `ITE IT8686E Super IO Sensors' (confidence: 9)

Do you want to generate /etc/sysconfig/lm_sensors? (YES/no):
Confirm to generate the file /etc/sysconfig/lm_sensors. After confirmation, the script creates a systemd service (/usr/lib/systemd/system/lm_sensors.service) that is enabled by default.
Check the status of the systemd service:
> sudo systemctl status lm_sensors
● lm_sensors.service - Initialize hardware monitoring sensors
   Loaded: loaded (/usr/lib/systemd/system/lm_sensors.service; enabled; vendor preset: disabled)
   Active: active (exited) since Fri 2021-09-10 16:57:55 CEST; 2min 23s ago
  Process: 32552 ExecStart=/usr/bin/sensors -s (code=exited, status=0/SUCCESS)
  Process: 32551 ExecStart=/sbin/modprobe -qab $BUS_MODULES $HWMON_MODULES (code=exited, status=0/SUCCESS)
 Main PID: 32552 (code=exited, status=0/SUCCESS)
    Tasks: 0
   CGroup: /system.slice/lm_sensors.service

Sep 10 16:57:55 edison systemd[1]: Starting Initialize hardware monitoring sensors...
Sep 10 16:57:55 edison systemd[1]: Started Initialize hardware monitoring sensors.
After you have completed these steps, your computer has detected all sensors and has started to monitor them.
3 Getting temperature data
To obtain a snapshot of the current temperature, run the following command:
> sensors
[...]
nvme-pci-0700 (1)
Adapter: PCI adapter (2)
Composite:    +36.9°C  (low  = -273.1°C, high = +83.8°C) (3)
                       (crit = +83.8°C)
Sensor 1:     +36.9°C  (low  = -273.1°C, high = +65261.8°C) (4)
Sensor 2:     +43.9°C  (low  = -273.1°C, high = +65261.8°C) (5) (6)

Adapter: ACPI device
temp1:        +16.8°C  (crit = +18.8°C)
temp2:        +27.8°C  (crit = +119.0°C)
temp3:        +29.8°C  (crit = +119.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +43.0°C  (high = +82.0°C, crit = +100.0°C)
Core 0:        +41.0°C  (high = +82.0°C, crit = +100.0°C)
Core 1:        +41.0°C  (high = +82.0°C, crit = +100.0°C)
Core 2:        +43.0°C  (high = +82.0°C, crit = +100.0°C)
Core 3:        +41.0°C  (high = +82.0°C, crit = +100.0°C)
Core 4:        +41.0°C  (high = +82.0°C, crit = +100.0°C)
Core 5:        +40.0°C  (high = +82.0°C, crit = +100.0°C)

(1) Specific hardware component or sensor chip being monitored.
(2) The descriptive name for the specific sensor on the chip.
(3) Aggregate temperature measurement from several sensors.
(4) This is a stand-alone sensor on the motherboard that is currently reading at 36.9 degrees Celsius.
(5) This is another stand-alone sensor on the motherboard reading at 43.9 degrees Celsius.
(6) The value of +65261.8°C is a placeholder or a default maximum value, indicating that the sensor is not programmed to measure temperatures above that level. Since the actual reading (+36.9°C) is far below this value, we can ignore the anomalously high maximum.
The output of the sensors command depends on the type of hardware installed on your machine, as different hardware components have different sensors.
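The readings in the sensors output can also be filtered by a script, for example, to flag values at or above the 80°C mark mentioned in the introduction. The following is a minimal sketch and not part of the sensors package: the function name warn_hot_readings and the warning text are hypothetical, and it assumes reading lines of the form shown in the example output (a label, a colon, then a positive value such as +41.0°C).

```shell
#!/bin/sh
# Hypothetical helper: read sensors-style text on standard input and print
# one warning line for each reading at or above the limit (degrees Celsius)
# passed as the first argument. Threshold values in parentheses are ignored.
warn_hot_readings() {
    awk -v limit="$1" -F': *' '
        # Reading lines look like "Core 0:   +41.0°C  (high = +82.0°C, ...)".
        $2 ~ /^\+[0-9.]+°C/ {
            temp = $2
            sub(/^\+/, "", temp)     # drop the leading plus sign
            sub(/°C.*/, "", temp)    # drop the unit and any thresholds
            if (temp + 0 >= limit + 0)
                printf "WARNING: %s is at %s°C (limit %s°C)\n", $1, temp, limit
        }'
}
```

A possible invocation is: sensors | warn_hot_readings 80. Because labels and available sensors differ between machines, treat the pattern as a starting point and adapt it to the output on your hardware.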
4 Monitoring CPU temperature in real time
To monitor the temperature in real time, run the watch command:
> watch sensors
The watch command is a standard Linux utility that runs a user-defined command at regular intervals. Its combination with sensors is useful if you need to keep an eye on your system's temperatures or voltages. The result looks as follows:
Every 2.0s: sensors

iwlwifi_1-virtual-0
Adapter: Virtual device
temp1:        +56.0°C

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +57.8°C

amdgpu-pci-0600
Adapter: PCI adapter
vddgfx:       +0.73 V
vddnb:        +0.74 V
edge:         +50.0°C
PPT:          0.00 W
By default, the watch command updates the output every two seconds. You can change this interval by using the -n option followed by the number of seconds. For example, to change the interval to 5 seconds, use:
> watch -n 5 sensors
Press Ctrl–C to stop the watch command.
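The watch command only refreshes the screen and does not keep a history. If you want a record to look at later, a small loop can take samples at a fixed interval and prefix each line with a timestamp. The following is a sketch under stated assumptions: the function name sample_with_timestamps and the timestamp format are hypothetical choices, not something provided by watch or sensors.

```shell
#!/bin/sh
# Hypothetical helper: run a command COUNT times, INTERVAL seconds apart,
# and prefix every output line with the time the sample was taken.
# Usage: sample_with_timestamps COUNT INTERVAL COMMAND [ARGS...]
sample_with_timestamps() {
    count="$1"
    interval="$2"
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        ts=$(date '+%Y-%m-%d %H:%M:%S')
        # Run the command and tag each line of its output.
        "$@" | while IFS= read -r line; do
            printf '%s %s\n' "$ts" "$line"
        done
        i=$((i + 1))
        # Wait between samples, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, sample_with_timestamps 12 5 sensors >> temps.log would take twelve samples five seconds apart and append them to a file named temps.log (the file name is only an illustration).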
5 Troubleshooting
This part describes potential problems when monitoring CPU temperatures and their solutions.
5.1 No sensors were detected
On laptops, the sensors-detect command may provide the following output:
Sorry, no sensors were detected.
This is relatively common on laptops, where thermal management is handled by ACPI rather than the OS.
This message is displayed when sensors-detect cannot find any hardware sensors on your laptop because most laptops handle thermal management through ACPI (Advanced Configuration and Power Interface), not the operating system.
Despite the message about the failure to detect sensors, the sensors command may still work and provide the expected results.
You can check the CPU temperature using the tools that read from the ACPI interface.
- Check if the acpi package is installed. This package provides an interface for the hardware's embedded controller via ACPI, allowing you to check battery status, thermal zone temperature, and more. To install it, run the following command:
  > sudo zypper install acpi
- Check the CPU temperature directly from the /sys file system. The CPU temperature is located in /sys/class/thermal/thermal_zone*/temp. Below is an example of the command with its output:
  > cat /sys/class/thermal/thermal_zone*/temp
  41000
  The temperature is displayed in millidegrees Celsius. To get the temperature in degrees Celsius, divide the output by 1000. In this example, the result is 41°C.
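The division by 1000 can be left to a short helper. This is a minimal sketch: the function name millideg_to_celsius is hypothetical, and it simply formats the value with one decimal place.

```shell
#!/bin/sh
# Hypothetical helper: convert a reading in millidegrees Celsius,
# as found in /sys/class/thermal/thermal_zone*/temp, to degrees Celsius.
millideg_to_celsius() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}
```

With the value from the example above, millideg_to_celsius 41000 prints 41.0.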
For more information about ACPI, refer to https://documentation.suse.com/sles/html/SLES-all/cha-power-mgmt.html#sec-power-mgmt-acpi.
Mainframes do not have the same power management needs as desktops, laptops and servers, and so they do not typically use ACPI. Instead, mainframes use different architectures and technologies for their configuration and management.
5.2 The displayed temperatures are unrealistic
If you suspect that the displayed temperature is too low or too high, you can try the following:
- Check whether the sensors are detected correctly. Rerun the sensors-detect command to redetect the sensors:
  > sudo sensors-detect
  Then, run the sensors command again to see if the temperature readings are more realistic.
- Check the raw thermal data in the /sys/class/thermal/ directory. See whether the raw data matches the output of the sensors command:
  > cat /sys/class/thermal/thermal_zone*/temp
- Use a different tool to read the CPU temperature, for example, Hardinfo, which is a system profiler and benchmark tool. It can gather information about your system's hardware and operating system, perform benchmarks, and generate printable reports. It can also show the CPU temperature. To install Hardinfo, use the following command:
  > sudo zypper install hardinfo
  Then, you can launch Hardinfo from the app menu.
If none of these recommendations solves the issue, the problem might be due to unsupported or faulty hardware. In this case, you need to seek help from your hardware manufacturer.
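When comparing the raw data with the output of the sensors command, it helps to know which thermal zone each value belongs to. Each thermal_zone directory in /sys/class/thermal/ also contains a type file that names the zone. The following sketch prints the zone, its type and the temperature in degrees Celsius on one line. The function name list_thermal_zones and the optional directory argument are hypothetical and exist only to make the example easy to try out.

```shell
#!/bin/sh
# Hypothetical helper: print "zone type temperature" for each thermal zone.
# The optional first argument overrides the base directory
# (default: /sys/class/thermal).
list_thermal_zones() {
    base="${1:-/sys/class/thermal}"
    for zone in "$base"/thermal_zone*; do
        # Skip an unexpanded pattern (no zones) and zones without a value.
        [ -r "$zone/temp" ] || continue
        zone_type=$(cat "$zone/type" 2>/dev/null)
        # Some zones report an error on read; skip those as well.
        raw=$(cat "$zone/temp" 2>/dev/null) || continue
        awk -v z="${zone##*/}" -v t="${zone_type:-unknown}" -v m="$raw" \
            'BEGIN { printf "%s %s %.1f°C\n", z, t, m / 1000 }'
    done
}
```

Run list_thermal_zones without arguments to read the live values. Which zone types appear depends on the hardware and firmware, so a zone is not necessarily a CPU sensor.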
5.3 The displayed temperature is too high
If the CPU temperature is too high, here are actions you can take:
Verify that the CPU cooling system, such as the fan or heat sink, works correctly. Ensure that the fan is spinning properly and that the heat sink is making proper contact with the CPU. If necessary, you may need to replace the thermal paste between the CPU and the heat sink to improve heat transfer.
Adjust the power settings on your system to reduce heat generation. Lowering the CPU frequency or enabling power-saving features can help keep the temperature in check. For more information about lowering CPU frequency, see https://documentation.suse.com/sbp/all/single-html/SBP-performance-tuning/index.html#sec-cpupower-tool.
Monitor the system load and CPU usage. High CPU usage for extended periods can lead to increased temperatures. Identify any resource-intensive processes and consider optimizing or limiting their usage. For more information, refer to https://www.suse.com/support/kb/doc/?id=000016916.
6 For more information
For more information about lm_sensors, see the repository's README at https://github.com/lm-sensors/lm-sensors.
For more information about programs, tools and utilities that you can use to examine the status of your system and monitor power-consuming processes, see System monitoring utilities.
For more information about making your computing more environmentally sustainable, see https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009324.
7 Legal Notice
Copyright © 2006–2024 SUSE LLC and contributors. All rights reserved.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled “GNU Free Documentation License”.
For SUSE trademarks, see https://www.suse.com/company/legal/. All other third-party trademarks are the property of their respective owners. Trademark symbols (®, ™ etc.) denote trademarks of SUSE and its affiliates. Asterisks (*) denote third-party trademarks.
All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its affiliates, the authors, nor the translators shall be held liable for possible errors or the consequences thereof.