BD-PROCHOT utilities for Windows

DOWNLOAD (the binaries are for Win64)

PROCHOT# = Processor Hot, active low

It is both a pin on the processor package and an internal signal.
Originally, this signal was asserted by the CPU itself, upon exceeding some temperature, as measured and compared internally, on die.
However, in x86 CPU's, the signal pin can be made bi-directional, using a pull-up and a "wired-OR" of open collectors pulling to GND. Thus, it can also be asserted by the motherboard.
The CPU cores are supposed to respond by throttling, i.e. applying a T-state of less than 100%, i.e. PWM-gating the CPU clock, with a period of about 16 clock ticks.

The bi-directional feature can be suppressed, thus making the CPU immune to external PROCHOT activation.

There is additional status and configuration information to learn, and some potential shenanigans to pull off.

Hence this humble toolkit.

Tools in this package

bd_off.exe: switch off the "bi-directional" feature, just once, at runtime
check.exe: list and parse the most interesting MSR's into the terminal window (stdout), just once. The listing includes the current state of the BD-PROCHOT configuration bit and a few other bits.
throttle.exe: manipulate on-demand throttling (T-state) in the CPU hardware, via the IA32_CLOCK_MODULATION MSR. You can disable throttling, or set a particular duty cycle. The utility iterates across all the CPU cores automatically.
rdmsr.exe: read a single MSR on a single or all the cores
wrmsr.exe: write a single MSR on a single or all the cores
modmsr.exe: bitbang a single MSR on a single or all the cores (takes a value and a bitmask)
dumpmsr.exe: given a range of MSR addresses, iterates across the range, reading each MSR on all four cores. I.e. package-scoped MSR's just appear four times. Non-existent MSRs are reported with a failure, or can optionally be ignored using the "compact" arg. Apart from stdout, the utility produces a file called dumpmsr.csv in the current directory, containing a simple machine-readable output.
bd_svc.exe: a service that disables BD-PROCHOT on startup and keeps running, keeping an eye on things, keeping the BD-PROCHOT disabled if possible. Interesting stuff gets reported into the Windows event log ("Application" department). The service does watch and report SW-invoked throttling (clockmod) but does not manipulate it.

The service checks and disables the BD-PROCHOT on startup, reporting what it found and whether the "disable" succeeded. It then goes to sleep and wakes up once every second to perform a round of checks.

Only weird behaviors of the BD-PROCHOT config bit are reported, and only changes to the THERM/CLOCKMOD status flags. This is to make the event log somewhat useful.
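
The "report only changes" idea can be sketched roughly as follows. This is just an illustration in Python, not the actual code of the service; the function name changed_flags and the labels in NAMES are made up for this example.

```python
def changed_flags(previous, current, names):
    """Return (name, new_state) for every named flag bit that differs between two polls."""
    diff = previous ^ current                      # a set bit marks a flag that flipped
    return [(name, bool((current >> bit) & 1))
            for bit, name in sorted(names.items())
            if (diff >> bit) & 1]

# Hypothetical labels for two status bits, for illustration only.
NAMES = {0: "thermal_status", 2: "prochot_event"}

if __name__ == "__main__":
    # One poll saw nothing, the next poll sees both flags raised.
    print(changed_flags(0b0000, 0b0101, NAMES))
    # Two identical polls: nothing to report, so the event log stays quiet.
    print(changed_flags(0b0101, 0b0101, NAMES))
```

A polling loop would keep the previous register value around, call something like this once per second, and only write an event when the returned list is non-empty.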

Still, on flaky hardware or in wild scenarios, status flags can keep going up and down, or the "BD-PROCHOT bit" can resist disabling, etc. Any of this would result in a waterfall of messages landing in the event log.

To avoid flooding the log, the service has a limit on the number of events (messages):

  1. produced in a row (reset by any "tick of a second" when there was nothing to report). Compile-time default: 15
  2. produced per day (reset every midnight). Compile-time default: 50
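
The two limits above can be modelled along these lines. Again a minimal sketch in Python rather than the service's real implementation: the class EventLimiter and its method names are hypothetical, and I am assuming that the daily counter is reset by noticing a change of the calendar date.

```python
import datetime

class EventLimiter:
    """Caps events 'in a row' and per calendar day (hypothetical helper)."""

    def __init__(self, max_in_a_row=15, max_per_day=50):
        self.max_in_a_row = max_in_a_row
        self.max_per_day = max_per_day
        self.in_a_row = 0
        self.today_count = 0
        self.day = None

    def quiet_tick(self):
        """Call on a one-second tick that had nothing to report."""
        self.in_a_row = 0

    def allow(self, now):
        """Return True if an event at datetime 'now' may be logged."""
        if now.date() != self.day:          # date changed: fresh daily budget
            self.day = now.date()
            self.today_count = 0
        if (self.in_a_row >= self.max_in_a_row
                or self.today_count >= self.max_per_day):
            return False                    # suppressed, log stays readable
        self.in_a_row += 1
        self.today_count += 1
        return True
```

The service loop would call allow() before writing each message, and quiet_tick() whenever a round of checks found nothing worth reporting.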

My CPU is probably throttling... what do I do?

Before running the service (which does its thing, but its reports are pretty condensed and spartan), I suggest that you try the command-line tools first. Open a terminal window (run CMD), CD into your BDPROCHOT directory and type

   check.exe
If you see BD-PROCHOT enabled (at the top of the listing), try the one-off
   bd_off.exe
which should tell you that it has disabled the bi-directional feature of the PROCHOT# pin.

After that, I suggest that you try

   check.exe
once again, to see if the config hack sticks.

If the output of check.exe is telling you that On-demand throttling is ON, you can try

   throttle off
or set your own throttling factor between 1..15, and check again.

If you're able to do all this on a running throttled system, you might see the throttling symptoms improve immediately.

BEWARE:
If those are valid symptoms of a serious hardware problem, your system may well become unstable / freakin steamin hot.

NOTE:
From a practical example, I have learned that these measures and MSR flags do not give the complete picture. It looks like the configuration of P-states can also play a role. See e.g. the configuration of power profiles in Windows.

See also prior art by a different author:
1) ThrottleStop
2) the MSR Tool
both by unclewebb.
They have a nice GUI, but no service for unattended startup.

Hence my primary inspiration to write a Win32 service.

Background

(an intro, really)

On modern Intel CPU's, PROCHOT is a particular pin, a discrete signal. The purpose of this signal is to convey the information that the CPU or some other part of the system is overheating, or running out of juice in the battery, or some such, and that it is highly desirable for the CPU (or other components) to enter an aggressive power saving mode. The power saving happens by gating the CPU clocks in a PWM fashion, with a period of 8 or 16 clock ticks. This is called "throttling", also known as T-states (not to be confused with P-states or C-states).

The CPU itself contains temperature sensors (one per CPU core), and further "sources of PROCHOT alarm" can be on the motherboard: typically the VRM and the Embedded Controller (in mobile platforms). Therefore, the respective CPU pin can be bi-directional: input or output, as required by the situation unfolding.

The PROCHOT signal has a weak resistive pull-up to some power rail (e.g. 1.05V around the Haswell generation of CPU's) and is active low, i.e. the CPU or some peripheral can short it to ground, thus making the CPU throttle its clock unconditionally.
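
As a purely logical model (no electrical detail), the active-low wired-OR and the effect of the bi-directional feature can be written down like this. It is a simplified sketch of my understanding; the function names and the bd_enabled parameter are invented for the illustration.

```python
def prochot_line_low(cpu_pulls_low, external_pulls_low):
    """Active-low wired-OR: the line is low as soon as any source pulls it to GND."""
    return cpu_pulls_low or any(external_pulls_low)

def cpu_throttles(cpu_hot, external_pulls_low, bd_enabled=True):
    """The CPU always obeys its own sensors; it obeys the pin only if it listens to it."""
    return cpu_hot or (bd_enabled and any(external_pulls_low))

if __name__ == "__main__":
    vrm, ec = False, True                                      # the EC pulls the line low
    print(prochot_line_low(False, [vrm, ec]))                  # line is asserted
    print(cpu_throttles(False, [vrm, ec]))                     # CPU throttles
    print(cpu_throttles(False, [vrm, ec], bd_enabled=False))   # CPU ignores the pin
```

Note that the pull-up only defines the idle (high) level; it takes just one source pulling to ground to assert the signal.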

While the CPU's own sensors are quite an authoritative alarm source to start throttling the CPU clock, external on-motherboard sources can be less trustworthy.

Under some circumstances, the PC system admin may deem it inappropriate for the discrete PROCHOT signal (its input aspect) to throttle the CPU, for instance when the symptoms indicate an electrical or EC firmware design glitch, rather than an actual overheat or brownout condition. Or, let's say the PC owner is willing to run the risk of thermal damage, at their own discretion.

The key point of this software piece is to disable the "bi-directional" character of the PROCHOT# signal pin on the CPU. If this BD-PROCHOT configuration flag is cleared, the signal pin becomes a pure output. (Note: Haswell datasheets claim "pure input", which appears to be incorrect.) By doing that, external on-motherboard sources of PROCHOT are eliminated, and the CPU stops throttling.
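
In terms of register arithmetic, clearing the flag is a plain read-modify-write with a bitmask. The sketch below only computes the new register value; the actual read and write need ring 0 access and are not shown. The helper names modify_bits and bd_prochot_disabled are hypothetical, and the example register value is made up. The address and bit position are the ones given in the MSR list further down.

```python
MSR_POWER_CTL = 0x1FC
BD_PROCHOT_ENABLE = 1 << 0          # bit 0 of MSR_POWER_CTL
MASK_64 = 0xFFFFFFFFFFFFFFFF        # MSRs are 64 bits wide

def modify_bits(old, value, mask):
    """Return 'old' with the bits selected by 'mask' replaced by those of 'value'."""
    return ((old & ~mask) | (value & mask)) & MASK_64

def bd_prochot_disabled(power_ctl):
    """New MSR_POWER_CTL value with only the BD-PROCHOT enable bit cleared."""
    return modify_bits(power_ctl, 0, BD_PROCHOT_ENABLE)

if __name__ == "__main__":
    old = 0x2C005D                  # made-up example value with bit 0 set
    print(hex(bd_prochot_disabled(old)))
```

All the other bits of the register are preserved, which matters because the same MSR carries unrelated power management settings.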

This configuration adjustment is temporary: it lasts until the PC reboots, or goes to sleep and wakes up, or until some other software re-enables that particular MSR flag at runtime.

On computer startup, the PC BIOS (or UEFI) sets the BD-PROCHOT configuration bit. Some motherboard vendors make this choice available as a menu item in the BIOS SETUP, but most do not and just leave the BD-PROCHOT enabled.

Therefore, it might be a good idea to have a software "service" whose primary purpose it would be to "keep an eye" on the BD-PROCHOT config bit.

And maybe some other status flags related to PROCHOT and throttling - to give the system admin a better idea of what's going on.

A particular catch is that, apart from the hard-wired PROCHOT, the CPU can also be asked to do "on-demand" throttling, i.e. to enter a T-state, by writing the respective CLOCKMOD MSR (IA32_CLOCK_MODULATION). This can be done e.g. by the BIOS in an interrupt service, or as part of handling some ACPI ritual callbacks. Obviously, the software-solicited on-demand throttling does not get disabled by suppressing the discrete PROCHOT# input. We might try disabling throttling via the CLOCKMOD MSR. And maybe the BIOS would not notice and let it be, or maybe our override would only last a couple of milliseconds... We definitely can and do watch the contents of the CLOCKMOD register, which should at least provide a clue as to what's going on.
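
To make the CLOCKMOD contents less abstract, here is a small encode/decode sketch for the low five bits of IA32_CLOCK_MODULATION. It assumes the extended encoding, where bits 3:0 hold a duty factor in steps of 1/16 and bit 4 is the enable flag; on older CPUs without the extended bit, only bits 3:1 are used (steps of 1/8). The function names are made up, and whether throttle.exe maps its 1..15 argument exactly this way is my assumption here.

```python
CLOCKMOD_ENABLE = 1 << 4       # on-demand clock modulation enable
CLOCKMOD_DUTY_MASK = 0x0F      # bits 3:0, assuming the extended 1/16-step encoding

def encode_clockmod(factor):
    """factor 0 or None -> throttling off; 1..15 -> duty cycle of factor/16."""
    if not factor:
        return 0
    if not 1 <= factor <= 15:
        raise ValueError("throttling factor must be in 1..15")
    return CLOCKMOD_ENABLE | factor

def decode_clockmod(value):
    """Return (enabled, duty cycle in percent or None when not throttling)."""
    enabled = bool(value & CLOCKMOD_ENABLE)
    factor = value & CLOCKMOD_DUTY_MASK
    duty = factor * 100 / 16 if enabled and factor else None
    return enabled, duty

if __name__ == "__main__":
    print(hex(encode_clockmod(8)))    # enable bit plus factor 8
    print(decode_clockmod(0x18))      # half of the clock ticks pass through
    print(decode_clockmod(0x00))      # throttling off
```

When writing the real register, the remaining bits should be preserved by a read-modify-write rather than overwritten with zeros.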

The particular MSR registers of interest are:

MSR_POWER_CTL             at 0x1FC  (BD-PROCHOT en = bit 0)
IA32_PACKAGE_THERM_STATUS at 0x1B1  (flags in bit 0..3)
IA32_THERM_STATUS         at 0x19C  (flags in bit 0..3)
IA32_CLOCK_MODULATION     at 0x19A  (flags in bit 0..4)
The latter two registers are per-core.
Around Haswell, at least three additional relevant MSR's have surfaced:
MSR_CORE_PERF_LIMIT_REASONS      at 0x690
MSR_GRAPHICS_PERF_LIMIT_REASONS  at 0x6B0
MSR_RING_PERF_LIMIT_REASONS      at 0x6B1
which, curiously to me, predate the addition of HWP by Intel (in the Skylake generation). Also note that these three are "per package", i.e. even the MSR_CORE_PERF_LIMIT_REASONS is not "per core".
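
For the two thermal status registers listed above, the four low flag bits can be decoded roughly like this. The bit labels are paraphrased from my reading of the Intel documentation (in the per-core register, bits 2 and 3 cover PROCHOT# or FORCEPR#), so double-check them against the SDM for your CPU; the function name is hypothetical.

```python
IA32_THERM_STATUS = 0x19C
IA32_PACKAGE_THERM_STATUS = 0x1B1

# Labels for bits 0..3; the same layout is assumed for both registers.
THERM_STATUS_FLAGS = {
    0: "thermal status",
    1: "thermal status log",
    2: "PROCHOT# event",
    3: "PROCHOT# log",
}

def decode_therm_status(value):
    """Return the labels of the flag bits (0..3) that are set in 'value'."""
    return [name for bit, name in sorted(THERM_STATUS_FLAGS.items())
            if (value >> bit) & 1]

if __name__ == "__main__":
    # Higher bits carry other fields (e.g. the temperature readout) and are ignored here.
    print(decode_therm_status(0x88410000 | 0b0100))
    print(decode_therm_status(0b1010))
```

The "log" bits are sticky: they stay set after the event has passed, until software clears them, which is why they are useful for catching short throttling episodes between two polls.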

I've actually stumbled over a known Haswell bug, where just one bit in MSR_CORE_PERF_LIMIT_REASONS would indicate throttling due to overheating, even though no overheating was in fact happening.

Apparently, IA32_PERF_STATUS, and maybe other MSRs, would deserve further investigation.

The MSR registers can only be accessed by dedicated privileged instructions (rdmsr/wrmsr), available in "ring 0": directly in DOS, but only in the kernel space in Windows or Linux. Hence the need for a minimal kernel-space driver plus its companion user-space API library. Together, these are called WinRing0. Fortunately, such a driver exists and a signed build is available.