Linux Intel Pstate driver. Energy. WWW.Smythies.com
This web page has some energy versus pstate data for various processors, as well as the procedure and tools to acquire such data.
There is also a digression included for some tests on Stratos' ondemand patches for the acpi-cpufreq driver, because the basic procedure is the same, and that test needs to be done on multiple processors also.
Concept
The load tool is modified to be more representative of real-life periodic workflows: the amount of work to do per cycle is fixed, rather than the load being fixed. This gives insight into the energy tradeoffs of proposed algorithms.
If the processor finishes the work faster, it gets to sleep longer before the next cycle. Fixed-load methods cannot be used for energy-aware testing.
Note that herein the program is always calibrated so that 100% load means 100% load at forced minimum pstate (16 for i7-2600K).
That means it should be able to go to about 230% before reaching a real 100% load at the highest pstate (38 for i7-2600K). The percentage is much higher (500%) for a Haswell processor.
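As a rough illustration of the numbers above, here is a minimal sketch that assumes the work rate scales linearly with the pstate number. That linear model is an assumption made for illustration, not something measured here, and the function names are hypothetical. Under that assumption the ceiling for the i7-2600K comes out slightly above the "about 230%" quoted above.

```python
def max_load_percent(min_pstate: int, max_pstate: int) -> float:
    """Load (calibrated at the minimum pstate) that would fill 100% of
    the time at the maximum pstate, assuming linear scaling."""
    return 100.0 * max_pstate / min_pstate


def busy_percent(load_percent: float, min_pstate: int, pstate: int) -> float:
    """Share of each period spent working at `pstate` for a fixed work
    packet calibrated at `min_pstate`, assuming linear scaling."""
    return load_percent * min_pstate / pstate


# i7-2600K: minimum pstate 16, maximum pstate 38.
print(round(max_load_percent(16, 38), 1))      # 237.5
print(round(busy_percent(100.0, 16, 38), 1))   # 42.1
```

The second number shows the fixed-work idea: a packet that keeps the processor 100% busy at pstate 16 would, under this model, keep it busy only about 42% of each period at pstate 38, leaving the rest for sleep.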
Here is the energy versus max percent graph (basically energy versus pstate) for the i7-2600K processor.

Here is the energy versus max percent graph (basically energy versus pstate) for the i7-4690K processor.
Note: the computer was a desktop, not a server, and the graph only made some sense once the GUI stuff was shut down.

Similar procedures can be used to create energy versus load graphs. Here is an example using the ondemand mode with the acpi-cpufreq driver and some proposed patches (before and after the patches):

And again for an i7-3770 (Stratos'):

Calibration
The program needs to be calibrated for the user's processor. It can be calibrated in any convenient way, but by convention, and for comparable test results, calibrate so that 100% load means 100% at minimum pstate and at 5 Hertz.
From the references link below, get and compile the program.
cc consume1.c -o consume1
Force the system to only use the minimum pstate. For the acpi-cpufreq driver this means powersave mode. For the intel pstate driver, determine what min_pct is and force the maximum to that value for the duration of the calibration phase. Example:
sudo cat /sys/devices/system/cpu/intel_pstate/min_perf_pct   <<< gives 42 for i7-2600K (it will be 20 for a Haswell).
echo 42 | sudo tee /sys/devices/system/cpu/intel_pstate/max_perf_pct
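The same two steps can be sketched as a small script. This is only an illustration, assuming the intel_pstate sysfs files shown above are present and that the script runs as root. The function names and the `base` parameter are hypothetical, the latter added so the logic can be tried against a scratch directory.

```python
from pathlib import Path

# sysfs directory used in the commands above.
PSTATE_DIR = Path("/sys/devices/system/cpu/intel_pstate")


def set_max_perf_pct(pct: int, base: Path = PSTATE_DIR) -> None:
    """Write a new max_perf_pct value (needs root on a real system)."""
    (base / "max_perf_pct").write_text(f"{pct}\n")


def pin_to_min_pstate(base: Path = PSTATE_DIR) -> int:
    """Read min_perf_pct and force max_perf_pct down to the same value,
    so only the minimum pstate is used. Returns the value written."""
    min_pct = int((base / "min_perf_pct").read_text().strip())
    set_max_perf_pct(min_pct, base)
    return min_pct
```

Calling `pin_to_min_pstate()` before calibrating, and `set_max_perf_pct(100)` afterwards, would bracket the calibration phase. The value 100 assumes the maximum was not limited beforehand.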
Now run the program at 100% and 5 Hertz for 10 seconds and time it:
Check for overruns and use the timing information to iterate the calibration value. Recompile and check again.
Example starting with a CALIBRATION of 180 ( in the program: "#define CALIBRATION 180" )
doug@s15:~/c$ time ./consume1 100 5 10
consume: 100 5 10 PID: 7711 Elapsed: 10157504 Now: 1405373989668273 Overruns: 43
real 0m10.159s
user 0m10.146s
sys 0m0.006s
There were many overruns, so back off on the CALIBRATION. There were supposed to be 50 loops run, but only around 43 got done, so approximate by 180 * 43 / 50 = 155. However, this method is crude, so back off some more. Try 140.
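The crude first estimate is just a proportional scaling, which can be sketched as follows. The function name is hypothetical, and the expected loop count is taken as frequency times duration (5 Hertz for 10 seconds).

```python
def crude_calibration(old_calibration: int, loops_expected: int, loops_done: int) -> int:
    """Scale CALIBRATION by the fraction of loops that actually completed."""
    return round(old_calibration * loops_done / loops_expected)


loops_expected = 5 * 10  # 5 Hertz for 10 seconds
print(crude_calibration(180, loops_expected, 43))  # 155
```

This only gives a starting point, which is why a lower value is tried next.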
So look for this line in the consume1.c program: "#define CALIBRATION 180" and change it to "#define CALIBRATION 140". Then recompile and check again.
doug@s15:~/c$ time ./consume1 100 5 10
consume: 100 5 10 PID: 7724 Elapsed: 10000097 Now: 1405374307540703 Overruns: 0
real 0m10.001s
user 0m9.240s
sys 0m0.007s
There were no overruns. Use the times to calculate a new CALIBRATION: (9.240 + 0.007) / 10.001 = 92.46%
NEW CALIBRATION = OLD CALIBRATION * 100 / 92.46 = 151
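The arithmetic of this step can be written out as a short sketch. The function names are hypothetical, and rounding the result down is an assumption chosen here so that the estimate errs on the side of no overruns.

```python
import math


def utilization_percent(user_s: float, sys_s: float, real_s: float) -> float:
    """CPU time (user + sys) as a percentage of wall-clock time."""
    return 100.0 * (user_s + sys_s) / real_s


def rescale_calibration(old_calibration: int, util_pct: float) -> int:
    """NEW CALIBRATION = OLD CALIBRATION * 100 / utilization, rounded down."""
    return math.floor(old_calibration * 100.0 / util_pct)


util = utilization_percent(9.240, 0.007, 10.001)
print(f"{util:.2f}")                   # 92.46
print(rescale_calibration(140, util))  # 151
```

Only runs with no overruns should be fed into this calculation, since with overruns the program is already saturated and the times no longer show how much headroom is left.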
Again change this line in the consume1.c program "#define CALIBRATION 140" to this "#define CALIBRATION 151" and then recompile and check again.
doug@s15:~/temp$ time ../c/consume1 100 5 10
consume: 100 5 10 PID: 7676 Elapsed: 10000062 Now: 1405373458929635 Overruns: 0
real 0m10.001s
user 0m9.948s
sys 0m0.006s
To within an integer value, that is as close as we can get. It isn't that important anyhow, as the program is usually used to compare different algorithms, and any calibration error is present in each run in the same way.
Procedure
The scripts in the references are used to somewhat automate the data acquisition process.
They need to be modified for wherever the user has the program and for whatever processor the user has.
IMPORTANT NOTE: Other than running the script, the computer must be as idle as possible.
Server computers are the best for these types of tests because desktop computers have a lot of extra tasks and GUI stuff running. "Idle" on a server computer is actually considerably more idle than "idle" on a desktop computer.
To Do:
After this write-up, it occurs to me that it might make more sense to calibrate to the max pstate instead of the min pstate; then some of the scripts (well, one) would become processor invariant.
References:
Hacked consume Program - for fixed work packet - as a text file. (Note: must be calibrated for the user's processor)
Hacked consume Program - for fixed work packet. Stratos improved version. (Note: must be calibrated for the user's processor)
Energy curve script, forced pstates.
Energy curve script, forced pstates. pqwoerituytrueiwoq improved version.
Energy curve script, not forced pstates.
Turbostat (save it to your computer)
Joules extraction parser (as a text file).