Chapter 33. Power Management

Contents

33.1. Power Saving Functions
33.2. Advanced Configuration and Power Interface (ACPI)
33.3. Rest for the Hard Disk
33.4. Troubleshooting
33.5. For More Information

Power management is especially important on laptop computers, but is also useful on other systems. ACPI (Advanced Configuration and Power Interface) is available on all modern computers (laptops, desktops, and servers). Power management technologies require suitable hardware and BIOS routines. Most laptops and many modern desktops and servers meet these requirements. It is also possible to control CPU frequency scaling to save power or decrease noise.

33.1. Power Saving Functions

Power saving functions are not only significant for the mobile use of laptops, but also for desktop systems. The main functions and their use in ACPI are:

Standby

This mode is not supported.

Suspend (to memory)

This mode writes the entire system state to the RAM. Subsequently, the entire system except the RAM is put to sleep. In this state, the computer consumes very little power. The advantage of this state is the possibility of resuming work at the same point within a few seconds without having to boot and restart applications. This function corresponds to the ACPI state S3. The support of this state is still under development and therefore largely depends on the hardware.

Hibernation (suspend to disk)

In this operating mode, the entire system state is written to the hard disk and the system is powered off. There must be a swap partition at least as big as the RAM to write all the active data. Reactivation from this state takes about 30 to 90 seconds. The state prior to the suspend is restored. Some manufacturers offer useful hybrid variants of this mode, such as RediSafe in IBM Thinkpads. The corresponding ACPI state is S4. In Linux, suspend to disk is performed by kernel routines that are independent from ACPI.

Battery Monitor

ACPI checks the battery charge status and provides information about it. Additionally, it coordinates actions to perform when a critical charge status is reached.

Automatic Power-Off

Following a shutdown, the computer is powered off. This is especially important when an automatic shutdown is performed shortly before the battery is empty.

Processor Speed Control

In connection with the CPU, energy can be saved in three different ways: frequency and voltage scaling (also known as PowerNow! or Speedstep), throttling and putting the processor to sleep (C-states). Depending on the operating mode of the computer, these methods can also be combined.

33.2. Advanced Configuration and Power Interface (ACPI)

ACPI was designed to enable the operating system to set up and control the individual hardware components. ACPI supersedes both Power Management Plug and Play (PnP) and Advanced Power Management (APM). It delivers information about the battery, AC adapter, temperature, fan, and system events, such as closing the lid or a low battery.

The BIOS provides tables containing information about the individual components and hardware access methods. The operating system uses this information for tasks like assigning interrupts or activating and deactivating components. Because the operating system executes commands stored in the BIOS, the functionality depends on the BIOS implementation. The tables ACPI can detect and load are reported in /var/log/boot.msg. See Section 33.2.3, “Troubleshooting” for more information about troubleshooting ACPI problems.

33.2.1. Controlling the CPU Performance

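The CPU can save energy in three ways:

  • Frequency and voltage scaling

  • Throttling the clock frequency (T-states)

  • Putting the processor to sleep (C-states)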

Depending on the operating mode of the computer, these methods can be combined. Saving energy also means that the system heats up less and the fans are activated less frequently.

Frequency scaling and throttling are only relevant if the processor is busy, because the most economic C-state is applied anyway when the processor is idle. If the CPU is busy, frequency scaling is the recommended power saving method. Often the processor only works with a partial load. In this case, it can be run with a lower frequency. Usually, dynamic frequency scaling controlled by the kernel on-demand governor is the best approach.

Throttling should be used as the last resort, for example, to extend the battery operation time despite a high system load. However, some systems do not run smoothly when they are throttled too much. Moreover, CPU throttling does not make sense if the CPU has little to do.

33.2.1.1. Frequency and Voltage Scaling

PowerNow! and Speedstep are the designations AMD and Intel use for this technology. However, this technology is also applied in processors of other manufacturers. The clock frequency of the CPU and its core voltage are reduced at the same time, resulting in more than linear energy savings. This means that when the frequency is halved (half performance), far less than half of the energy is consumed. This technology is independent from ACPI.
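The more-than-linear saving can be illustrated with a common rule of thumb: dynamic CPU power is roughly proportional to the square of the core voltage multiplied by the clock frequency. The following sketch is not taken from the text above. The approximation, the function name relative_power, and the example ratios are assumptions chosen for illustration only.

```shell
# Rough estimate of relative dynamic power, assuming P ~ V^2 * f.
# Both arguments are ratios relative to full speed (hypothetical values).
relative_power() {
    freq_ratio=$1
    volt_ratio=$2
    awk -v f="$freq_ratio" -v v="$volt_ratio" 'BEGIN { printf "%.2f\n", v * v * f }'
}

# Half the frequency at 80% of the core voltage:
relative_power 0.5 0.8   # prints 0.32
```

Under these assumptions, halving the frequency while lowering the voltage to 80% leaves about a third of the full-speed dynamic power, which is well below one half.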

There are two main approaches to CPU frequency scaling: by the kernel itself (CPUfreq infrastructure with in-kernel governors) or by a userspace application. The in-kernel governors are policy governors that change the CPU frequency based on different criteria (a kind of preconfigured power scheme for the CPU). The following governors are available with the CPUfreq subsystem:

Performance Governor

The CPU frequency is statically set to the highest possible for maximum performance. Consequently, saving power is not the focus of this governor.

Powersave Governor

The CPU frequency is statically set to the lowest possible. This can have severe impact on the performance, as the system will never rise above this frequency no matter how busy the processors are.

On-demand Governor

The kernel implementation of a dynamic CPU frequency policy: The governor monitors the processor utilization. As soon as it exceeds a certain threshold, the governor sets the frequency to the highest available. If the utilization is below the threshold, the next lower frequency is used. If the system continues to be underutilized, the frequency is reduced again until the lowest available frequency is reached.

Conservative Governor

Similar to the on-demand implementation, this governor also dynamically adjusts frequencies based on processor utilization, except that it allows for a more gradual increase in power. If processor utilization exceeds a certain threshold, the governor does not immediately switch to the highest available frequency (as the on-demand governor does), but only to the next higher frequency available.

The relevant files for the kernel governors are located at /sys/devices/system/cpu/cpu*/cpufreq/. If your machine has more than one CPU, /sys/devices/system/cpu/ will hold a subdirectory for each processor: cpu0, cpu1, etc. If your system currently uses the on-demand or conservative governor, you will see a separate subdirectory for those governors in cpufreq, containing the parameters for the governors.
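The directory layout described above can be inspected with a short shell function. This is a minimal sketch, not an official tool. The file names scaling_governor and scaling_available_governors are standard CPUfreq sysfs entries that are not named in the text above, and the function name show_governors is made up for this example.

```shell
# Print the active governor and the available governors for every CPU.
# The optional argument overrides the sysfs base directory (useful for testing).
show_governors() {
    base=${1:-/sys/devices/system/cpu}
    for dir in "$base"/cpu[0-9]*/cpufreq; do
        # Skip when the glob did not match or CPUfreq is not active.
        [ -r "$dir/scaling_governor" ] || continue
        cpu=${dir%/cpufreq}
        cpu=${cpu##*/}
        printf '%s: %s\n' "$cpu" "$(cat "$dir/scaling_governor")"
        if [ -r "$dir/scaling_available_governors" ]; then
            printf '  available: %s\n' "$(cat "$dir/scaling_available_governors")"
        fi
    done
}

show_governors
```

On a machine without CPUfreq support the function prints nothing.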

33.2.1.2. Throttling the Clock Frequency (T-states)

This technology omits a certain percentage of the clock signal impulses for the CPU. At 25% throttling, every fourth impulse is omitted. At 87.5%, only every eighth impulse reaches the processor. However, the energy savings are a little less than linear. Normally, throttling is only used if frequency scaling is not available or to maximize power savings. This technology must be controlled by a special process, as well. The system interface for Processor Throttling States (T-states) is /proc/acpi/processor/*/throttling.

33.2.1.3. Putting the Processor to Sleep (C-states)

Modern processors have several power saving modes called C-states. They reflect the capability of an idle processor to turn off unused components in order to save power. The operating system puts the processor to sleep whenever there is no activity. In this case, the operating system sends the CPU a halt command. There are three idle states: C1, C2, and C3. In the most economic state, C3, even the synchronization of the processor cache with the main memory is halted. Therefore, this state can only be applied if no other device modifies the contents of the main memory via bus master activity. Some drivers prevent the use of C3. The current state is displayed in /proc/acpi/processor/*/power.

33.2.2. Tools

To view or adjust the current settings of the CPUfreq subsystem, use the tools provided by the cpufrequtils package. After you have installed cpufrequtils, use the cpufreq-info command to retrieve CPUfreq kernel information. The cpufreq-set command can be used to modify CPUfreq settings. For example, run the following command as root to activate the on-demand governor at runtime:

cpufreq-set -g ondemand

For more details and the available options, refer to the cpufreq-info and the cpufreq-set man pages or run cpufreq-info --help or cpufreq-set --help, respectively.
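On systems with several CPUs, it may be necessary to repeat the command for each processor by using the -c option of cpufreq-set. The following sketch loops over the cpufreq directories in sysfs. The function name set_governor_all and the DRY_RUN variable are inventions for this example, and whether your version of cpufreq-set needs a per-CPU call should be checked against its man page.

```shell
# Apply one governor to every CPU that exposes a cpufreq directory.
# Set DRY_RUN=1 to print the commands instead of executing them.
set_governor_all() {
    governor=$1
    base=${2:-/sys/devices/system/cpu}
    for dir in "$base"/cpu[0-9]*/cpufreq; do
        [ -d "$dir" ] || continue
        cpu=${dir%/cpufreq}
        cpu=${cpu##*/cpu}      # keep only the CPU number
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "cpufreq-set -c $cpu -g $governor"
        else
            cpufreq-set -c "$cpu" -g "$governor"
        fi
    done
}
```

Run set_governor_all ondemand as root to apply the governor, or prefix the call with DRY_RUN=1 to preview the commands first.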

A useful tool for monitoring system power consumption is powerTOP, available after installation of the powertop package. It helps you to identify the reasons for unnecessarily high power consumption (for example, processes that are mainly responsible for waking up a processor from its idle state) and to optimize your system settings to avoid these. It supports both Intel and AMD processors. For detailed information, refer to the powerTOP project page at http://www.lesswatts.org/projects/powertop/.

Apart from the tools above, the following ACPI utilities are available:

  • To merely display information, like the battery charge level and the temperature, you can use the acpi command. For a list of available options, run acpi --help.

  • For editing the ACPI tables in the BIOS, install the acpica package.

33.2.3. Troubleshooting

There are two different types of problems. On the one hand, the ACPI code of the kernel may contain bugs that were not detected in time. In this case, a solution will be made available for download. More often, however, the problems are caused by the BIOS. Sometimes, deviations from the ACPI specification are purposely integrated in the BIOS to circumvent errors in the ACPI implementation of other widespread operating systems. Hardware components that have serious errors in the ACPI implementation are recorded in a blacklist that prevents the Linux kernel from using ACPI for these components.

The first thing to do when problems are encountered is to update the BIOS. If the computer does not boot at all, one of the following boot parameters may be helpful:

pci=noacpi

Do not use ACPI for configuring the PCI devices.

acpi=ht

Only perform a simple resource configuration. Do not use ACPI for other purposes.

acpi=off

Disable ACPI.

[Warning]Problems Booting without ACPI

Some newer machines (especially SMP systems and AMD64 systems) need ACPI for configuring the hardware correctly. On these machines, disabling ACPI can cause problems.
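After experimenting with these parameters, it can be useful to confirm which of them the running kernel was actually booted with. A minimal sketch, assuming the kernel command line is readable from /proc/cmdline; the function name acpi_boot_params is illustrative only.

```shell
# List the ACPI-related boot parameters found on a kernel command line.
# The optional argument names the file to read (default: /proc/cmdline).
acpi_boot_params() {
    cmdline_file=${1:-/proc/cmdline}
    # One parameter per line, then keep only acpi=... and pci=noacpi.
    tr ' ' '\n' < "$cmdline_file" | grep -E '^(acpi=|pci=noacpi$)' || true
}

acpi_boot_params
```

No output means that none of the parameters listed above is active.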

Sometimes, the machine is confused by hardware that is attached over USB or FireWire. If a machine refuses to boot, unplug all unneeded hardware and try again.

Monitor the boot messages of the system with the command dmesg | grep -2i acpi (or all messages, because the problem may not be caused by ACPI) after booting. If an error occurs while parsing an ACPI table, the most important table—the DSDT (Differentiated System Description Table)—can be replaced with an improved version. In this case, the faulty DSDT of the BIOS is ignored. The procedure is described in Section 33.4.1, “ACPI Activated with Hardware Support but Functions Do Not Work”.

In the kernel configuration, there is a switch for activating ACPI debug messages. If a kernel with ACPI debugging is compiled and installed, experts searching for an error can be supported with detailed information.

If you experience BIOS or hardware problems, it is always advisable to contact the manufacturer. Manufacturers that do not routinely provide assistance for Linux in particular should be made aware of the problems, because they will only take the issue seriously if they realize that an adequate number of their customers use Linux.

33.2.3.1. For More Information

33.3. Rest for the Hard Disk

In Linux, the hard disk can be put to sleep entirely if it is not needed or it can be run in a more economic or quieter mode. On modern laptops, you do not need to switch off the hard disks manually, because they automatically enter an economic operating mode whenever they are not needed. However, if you want to maximize power savings, test some of the following methods, using the hdparm command.

The hdparm command can be used to modify various hard disk settings. The option -y instantly switches the hard disk to standby mode, and -Y puts it to sleep. hdparm -S x causes the hard disk to be spun down after a certain period of inactivity. Replace x as follows: 0 disables this mechanism, causing the hard disk to run continuously. Values from 1 to 240 are multiplied by 5 seconds. Values from 241 to 251 correspond to 1 to 11 times 30 minutes.
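The encoding of the -S value is easy to get wrong, so the following sketch converts a value into the resulting timeout. It only covers the ranges described above; values outside them are deliberately reported as not covered. The function name spindown_seconds is made up for this example.

```shell
# Convert an hdparm -S value into the spin-down timeout in seconds,
# following the ranges described in the text. Other values are not handled.
spindown_seconds() {
    value=$1
    if [ "$value" -eq 0 ]; then
        echo "disabled"
    elif [ "$value" -ge 1 ] && [ "$value" -le 240 ]; then
        echo $(( value * 5 ))
    elif [ "$value" -ge 241 ] && [ "$value" -le 251 ]; then
        echo $(( (value - 240) * 30 * 60 ))
    else
        echo "not covered"
        return 1
    fi
}

spindown_seconds 120   # 120 * 5 s = 600 s (10 minutes)
spindown_seconds 242   # 2 * 30 min = 3600 s (1 hour)
```

For example, hdparm -S 120 followed by the device name would request a spin-down after ten minutes of inactivity.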

Internal power saving options of the hard disk can be controlled with the option -B. Select a value from 1 (maximum power saving) to 254 (maximum throughput); 255 disables the disk's Advanced Power Management altogether. The result depends on the hard disk used and is difficult to assess. To make a hard disk quieter, use the option -M. Select a value from 128 to 254 for quiet to fast.

Often, it is not so easy to put the hard disk to sleep. In Linux, numerous processes write to the hard disk, waking it up repeatedly. Therefore, it is important to understand how Linux handles data that needs to be written to the hard disk. First, all data is buffered in the RAM. This buffer is monitored by the pdflush daemon. When the data reaches a certain age limit or when the buffer is filled to a certain degree, the buffer content is flushed to the hard disk. The buffer size is dynamic and depends on the size of the memory and the system load. By default, pdflush is set to short intervals to achieve maximum data integrity. It checks the buffer every 5 seconds and writes the data to the hard disk. The following variables are interesting:

/proc/sys/vm/dirty_writeback_centisecs

Contains the delay until a pdflush thread wakes up (in hundredths of a second).

/proc/sys/vm/dirty_expire_centisecs

Defines the maximum age of a dirty page before it must be written out. Default is 3000, which means 30 seconds.

/proc/sys/vm/dirty_background_ratio

Maximum percentage of dirty pages until pdflush begins to write them. Default is 5%.

/proc/sys/vm/dirty_ratio

When the dirty pages exceed this percentage of the total memory, processes are forced to write dirty buffers during their time slice instead of continuing to write.

[Warning]Impairment of the Data Integrity

Changes to the pdflush daemon settings endanger the data integrity.
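Before changing any of these variables, it is worth recording their current values. The following read-only sketch prints the four settings listed above. The function name show_writeback_settings and the optional directory argument are assumptions made for this example.

```shell
# Print the current writeback settings without modifying them.
# The optional argument overrides the directory (default: /proc/sys/vm).
show_writeback_settings() {
    vm_dir=${1:-/proc/sys/vm}
    for name in dirty_writeback_centisecs dirty_expire_centisecs \
                dirty_background_ratio dirty_ratio; do
        if [ -r "$vm_dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$vm_dir/$name")"
        fi
    done
}

show_writeback_settings
```

Keeping this output makes it possible to restore the original values if a changed setting turns out to be unsuitable.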

Apart from these processes, journaling file systems, like ReiserFS, Ext3, Ext4 and others write their metadata independently from pdflush, which also prevents the hard disk from spinning down. To avoid this, a special kernel extension has been developed for mobile devices. To make use of the extension, install the laptop-mode-tools package and see /usr/src/linux/Documentation/laptops/laptop-mode.txt for details.

Another important factor is the way active programs behave. For example, good editors regularly write hidden backups of the currently modified file to the hard disk, causing the disk to wake up. Features like this can be disabled at the expense of data integrity.

In this connection, the mail daemon postfix makes use of the variable POSTFIX_LAPTOP. If this variable is set to yes, postfix accesses the hard disk far less frequently.

33.4. Troubleshooting

All error messages and alerts are logged in the file /var/log/messages. The following sections cover the most common problems.

33.4.1. ACPI Activated with Hardware Support but Functions Do Not Work

If you experience problems with ACPI, search the output of dmesg for ACPI-specific messages by using the command dmesg | grep -i acpi.

A BIOS update may be required to resolve the problem. Go to the home page of your laptop manufacturer, look for an updated BIOS version, and install it. Ask the manufacturer to comply with the latest ACPI specification. If the errors persist after the BIOS update, proceed as follows to replace the faulty DSDT table in your BIOS with an updated DSDT:

Procedure 33.1. Updating the DSDT Table in the BIOS

For the procedure below, make sure the following packages are installed: kernel-source, acpica, and mkinitrd.

  1. Download the DSDT for your system from http://acpi.sourceforge.net/dsdt/index.php. Check if the file is decompressed and compiled as shown by the file extension .aml (ACPI machine language). If this is the case, continue with step 3.

  2. If the file extension of the downloaded table is .asl (ACPI source language) instead, compile it by executing the following command:

    iasl -sa file.asl

  3. Copy the (resulting) file DSDT.aml to any location (/etc/DSDT.aml is recommended).

  4. Edit /etc/sysconfig/kernel and adapt the path to the DSDT file accordingly.

  5. Start mkinitrd. Whenever you install the kernel and use mkinitrd to create an initrd file, the modified DSDT is integrated and loaded when the system is booted.

33.4.2. CPU Frequency Does Not Work

Refer to the kernel sources to see if your processor is supported. You may need a special kernel module or module option to activate CPU frequency control. If the kernel-source package is installed, this information is available in /usr/src/linux/Documentation/cpu-freq/*.

33.4.3. Suspend and Standby Do Not Work

ACPI systems may have problems with suspend and standby due to a faulty DSDT implementation (BIOS). If this is the case, update the BIOS.

When the system tries to unload faulty modules, the system may hang or the suspend event may not be triggered. The same can also happen if you do not unload modules or stop services that prevent a successful suspend. In both cases, try to identify the faulty module that prevented the sleep mode. The log file /var/log/pm-suspend.log contains detailed information about what is going on and where possible errors are. Modify the SUSPEND_MODULES variable in /usr/lib/pm-utils/defaults to unload problematic modules prior to a suspend or standby.
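As an illustration, the SUSPEND_MODULES setting might look like the following fragment. The module names are placeholders, not real modules, and the space-separated list format is an assumption; replace the names with the modules identified in the log file.

```shell
# Hypothetical example: unload these modules before suspend or standby.
# example_module_a and example_module_b are placeholders only.
SUSPEND_MODULES="example_module_a example_module_b"
```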

Refer to http://old-en.opensuse.org/Pm-utils and http://en.opensuse.org//SDB:Suspend_to_RAM to get more detailed information on how to modify the suspend and resume process.

33.5. For More Information