Linux Intel Pstate driver. Energy. WWW.Smythies.com

This web page has some energy versus pstate data for various processors, as well as the procedure and tools to acquire such data.
There is also a digression on some tests of Stratos' ondemand patches for the acpi-cpufreq driver, because the basic procedure is the same and that test also needs to be done on multiple processors.

Concept
The load tool has been modified so that the amount of work per cycle is fixed, rather than the load being fixed, which is more representative of real life periodic workflows. This gives insight into the energy tradeoffs of proposed algorithms: if the processor finishes the work faster, it gets to sleep longer before the next cycle. Fixed load methods cannot be used for energy aware testing.

Note that herein the program is always calibrated so that 100% load means 100% load at the forced minimum pstate (16 for the i7-2600K). That means it should be able to go to about 230% before reaching a real 100% load at the highest pstate (38 for the i7-2600K). The percentage is much higher (about 500%) for a Haswell processor.
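
The headroom figures above follow from simple arithmetic. Here is a minimal sketch, under the idealized assumption that work throughput scales linearly with pstate (real processors will deviate somewhat). The function names are hypothetical and are not part of the consume program.

```shell
#!/bin/sh
# Idealized sketch: assumes work throughput scales linearly with pstate.

# Highest calibrated load (in percent) that fits at the maximum pstate,
# given the minimum and maximum pstate numbers (or percentages).
max_load_pct() {
    min_pstate=$1
    max_pstate=$2
    echo $(( 100 * max_pstate / min_pstate ))
}

# Busy time (in milliseconds) per cycle for a calibrated 100% work packet
# at a given pstate; the rest of the period is available for sleep.
busy_ms() {
    hertz=$1
    min_pstate=$2
    pstate=$3
    echo $(( 1000 * min_pstate / (hertz * pstate) ))
}

max_load_pct 16 38    # i7-2600K: prints 237, i.e. "about 230%"
max_load_pct 20 100   # minimum of 20 percent: prints 500
busy_ms 5 16 16       # prints 200: the whole 200 ms period is busy
busy_ms 5 16 38       # prints 84: about 116 ms left to sleep
```

The last two lines show the point of the fixed work packet: at 5 Hertz the same work that fills the whole period at the minimum pstate leaves more than half of the period for sleep at the maximum pstate.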

Here is the energy versus max percent graph (basically energy versus pstate) for the i7-2600K processor.

A graph

Here is the energy versus max percent graph (basically energy versus pstate) for the i7-4690K processor.
Note: the computer was a desktop, not a server, and the graph only made some sense once the GUI stuff was shut down.

A graph

Similar procedures can be used to create energy versus load graphs. Here is an example using the ondemand mode with the acpi-cpufreq driver and some proposed patches (before and after the patches):

A graph

And again for an i7-3770 (Stratos'):

A graph

Calibration
The program needs to be calibrated for the user's processor. It can be calibrated in any convenient way, but by convention, and for comparable test results, calibrate so that 100% load means 100% at minimum pstate and at 5 Hertz.
From the references below, get and compile the program.

cc consume1.c -o consume1

Force the system to use only the minimum pstate. For the acpi-cpufreq driver, this means powersave mode. For the intel_pstate driver, read min_perf_pct and force max_perf_pct to that value for the duration of the calibration phase. Example:

 
sudo cat /sys/devices/system/cpu/intel_pstate/min_perf_pct   # gives 42 for i7-2600K (it will be 20 for a Haswell)
echo 42 | sudo tee /sys/devices/system/cpu/intel_pstate/max_perf_pct
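
The two commands above can be wrapped so that the previous limit is restored once calibration is done. This is a minimal sketch, not part of the referenced scripts: the function names and the PSTATE_DIR variable are hypothetical, and on the real sysfs directory it must be run as root.

```shell
#!/bin/sh
# Sketch: pin intel_pstate to its minimum pstate during calibration,
# then restore the previous limit. PSTATE_DIR can be overridden for dry runs.
PSTATE_DIR=${PSTATE_DIR:-/sys/devices/system/cpu/intel_pstate}

# Set max_perf_pct to min_perf_pct and print the old max_perf_pct value
# so that the caller can restore it later.
force_min_pstate() {
    old_max=$(cat "$PSTATE_DIR/max_perf_pct") || return 1
    min_pct=$(cat "$PSTATE_DIR/min_perf_pct") || return 1
    echo "$min_pct" > "$PSTATE_DIR/max_perf_pct" || return 1
    echo "$old_max"
}

# Restore a previously saved max_perf_pct value.
restore_max_pstate() {
    echo "$1" > "$PSTATE_DIR/max_perf_pct"
}
```

Typical use would be saved=$(force_min_pstate), then the calibration runs, then restore_max_pstate "$saved".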

Now run the program at 100% and 5 Hertz for 10 seconds and time it. Check for overruns and use the timing information to iterate the calibration value, recompiling and checking again each time.
Example starting with a CALIBRATION of 180 (in the program: "#define CALIBRATION 180"):

doug@s15:~/c$ time ./consume1 100 5 10
consume: 100 5 10  PID: 7711  Elapsed: 10157504  Now: 1405373989668273  Overruns: 43

real    0m10.159s
user    0m10.146s
sys     0m0.006s

There were many overruns, so back off on the CALIBRATION. There were supposed to be 50 loops run (5 Hertz for 10 seconds), but only around 43 got done, so approximate by 180 * 43 / 50 = 155. However, this method is crude, so back off some more and try 140.
Look for this line in the consume1.c program: "#define CALIBRATION 180" and change it to "#define CALIBRATION 140". Then recompile and check again:
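
The first guess and the source edit can be sketched as below. The helper names are hypothetical, and set_calibration assumes the "#define CALIBRATION" line starts at the beginning of a line in consume1.c, as quoted above.

```shell
#!/bin/sh
# Sketch: first-guess CALIBRATION after a run with overruns.

# Scale the old value by the fraction of expected loops that completed,
# rounded to the nearest integer.
estimate_from_loops() {
    old=$1
    loops_done=$2
    loops_expected=$3
    echo $(( (old * loops_done + loops_expected / 2) / loops_expected ))
}

# Rewrite the "#define CALIBRATION" line of the given source file in place
# (via a temporary file, so it does not depend on sed -i).
set_calibration() {
    file=$1
    value=$2
    sed "s/^#define CALIBRATION .*/#define CALIBRATION $value/" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

estimate_from_loops 180 43 50   # prints 155
```

For example, set_calibration consume1.c 140 followed by the cc command would apply the backed-off value.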

doug@s15:~/c$ time ./consume1 100 5 10
consume: 100 5 10  PID: 7724  Elapsed: 10000097  Now: 1405374307540703  Overruns: 0

real    0m10.001s
user    0m9.240s
sys     0m0.007s

There were no overruns. Use the times to calculate a new CALIBRATION: the program was busy for (9.240 + 0.007) / 10.001 = 92.46% of the elapsed time, so
NEW CALIBRATION = OLD CALIBRATION * 100 / 92.46 = 151
Again, change the line "#define CALIBRATION 140" in the consume1.c program to "#define CALIBRATION 151", then recompile and check again.
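
The same refinement as a small helper, for repeated iterations. This is only a sketch: the function name is hypothetical, and the arguments are the old CALIBRATION followed by the real, user and sys times reported by time.

```shell
#!/bin/sh
# Sketch: refine CALIBRATION from the "time" output of an overrun-free run.
# new = old * real / (user + sys), rounded to the nearest integer.
refine_calibration() {
    awk -v old="$1" -v real="$2" -v user="$3" -v sys="$4" \
        'BEGIN { printf "%d\n", old * real / (user + sys) + 0.5 }'
}

refine_calibration 140 10.001 9.240 0.007   # prints 151
```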

doug@s15:~/temp$ time ../c/consume1 100 5 10
consume: 100 5 10  PID: 7676  Elapsed: 10000062  Now: 1405373458929635  Overruns: 0

real    0m10.001s
user    0m9.948s
sys     0m0.006s

To within an integer value, that is as close as we can get. It isn't that important anyhow, as the program is usually used to compare different algorithms, and any calibration error is present in every run in the same way.

Procedure
The scripts in the references are used to automate, somewhat, the data acquisition process.
They need to be modified for wherever the user has the program and for whatever processor the user has.

IMPORTANT NOTE: Other than running the script, the computer must be as idle as possible.
Server computers are the best for these types of tests because desktop computers have a lot of extra tasks and GUI stuff running. "Idle" on a server computer is actually considerably more idle than "idle" on a desktop computer.
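
To give an idea of the shape of a forced pstate sweep, here is a minimal dry-run sketch. It is not one of the referenced scripts: the helper name and the step size of 10 are arbitrary assumptions, and the actual load and energy measurement steps are left as a printed placeholder.

```shell
#!/bin/sh
# Sketch: list the max_perf_pct values for a forced-pstate sweep.
# Hypothetical helper; the step size is an arbitrary choice.
pct_steps() {
    pct=$1
    max=$2
    step=$3
    while [ "$pct" -lt "$max" ]; do
        echo "$pct"
        pct=$(( pct + step ))
    done
    echo "$max"   # always finish at the upper limit
}

# Dry run: print what would be done at each step instead of doing it.
for p in $(pct_steps 42 100 10); do
    echo "set max_perf_pct=$p, then run the load and record energy"
done
```

The starting value of 42 is the min_perf_pct of the i7-2600K, which is one reason the scripts need editing for each processor.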

To Do:
After this write-up, it occurs to me that it might make more sense to calibrate to the max pstate instead of the min pstate; then some of the scripts (well, one) would become processor invariant.

References:

Hacked consume Program - for fixed work packet - as a text file. (Note: must be calibrated for the user's processor)
Hacked consume Program - for fixed work packet. Stratos improved version. (Note: must be calibrated for the user's processor)
Energy curve script, forced pstates.
Energy curve script, forced pstates. pqwoerituytrueiwoq improved version.
Energy curve script, not forced pstates.
Turbostat (save it to your computer)
Joules extraction parser (as a text file).

Linux Intel Pstate driver. Energy. WWW.Smythies.com emaildoesnotwork@smythies.com 2014.07.14 Updated 2014.07.15