Overview of watchdog drivers in Linux, suggested patches

By: Frank Rysanek of FCC Prumyslove systemy s.r.o., Czech Republic
e-mail: rysanek AT fccps.cz

DOWNLOAD (for Linux 2.4, around 2.4.26-2.4.27)

Introduction

This web page was set up primarily to cover the several different Advantech watchdogs in some detail. Nevertheless, it also provides a good deal of information about other platforms. Some patches are suggested, mainly to clear up the "menuconfig" entry labels and the accompanying help entries.

What is a watchdog

A watchdog in a PC is a device intended to detect if a system hangs and reset it. It is mostly useful in servers and industrial control machines of various sorts.

There are add-on watchdog boards for the ISA bus, but many PC systems nowadays have watchdogs onboard or even on-chip. Watchdogs are omnipresent on embedded PC platforms and industrial automation server gear.

A generic Linux kernel has drivers for several popular watchdogs - the drivers share a uniform interface known as /dev/watchdog (character device, major 10, minor 130). It works in a very simple way: in addition to the hardware part, there has to be a software counterpart "feeding" the watchdog at regular intervals. If the watchdog doesn't get its food within a preset interval, it barks - it resets the PC. The feeding can be done by a critical application itself, but most commonly it is done by a dedicated daemon, so that the watchdog also triggers when the system scheduler stops working and the daemon no longer gets to run. To be completely precise, the watchdog daemon opens /dev/watchdog and that's when the watchdog starts counting down - then the daemon has to write a character to the device every few seconds to keep it satisfied (to refill its counter). See /usr/src/linux/Documentation/watchdog.txt for a simple software example.
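The open-then-feed sequence described above can be sketched in a few lines of user-space C. This is only an illustration, not the example from the kernel documentation: the function name wd_feed_loop and its parameters are made up for this page, and the loop is bounded (a real daemon would loop forever) so that the sketch can be tried harmlessly against /dev/null.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a watchdog device and feed it `rounds`
 * times, pausing `interval_s` seconds between feeds.
 * Returns the number of successful feeds, or -1 if the device
 * cannot be opened.
 */
int wd_feed_loop(const char *path, unsigned interval_s, unsigned rounds)
{
    unsigned done = 0;
    int fd = open(path, O_WRONLY);   /* opening starts the countdown */

    if (fd < 0)
        return -1;

    while (done < rounds) {
        /* Writing any single byte refills the watchdog's counter. */
        if (write(fd, "\0", 1) != 1)
            break;
        done++;
        if (interval_s > 0)
            sleep(interval_s);
    }

    /*
     * Whether close() stops the timer depends on the driver and on
     * how the kernel was configured, so do not rely on it.
     */
    close(fd);
    return (int)done;
}
```

On a real system the path would be /dev/watchdog and the interval would be chosen comfortably shorter than the watchdog's deadline, for instance a feed every 10 seconds against a 60-second deadline.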

Common watchdog hardware

Advantech hardware (like that of other vendors) has had a watchdog for a very long time. The details of its implementation really come down to the PC chipset (north bridge, south bridge) used to build the particular platform at hand.

The Port 443 watchdog

Older platforms, say from a 486 up to and including a Pentium III, require some special additional circuitry to perform the watchdog function - this is usually a generic gate array (Altera, Xilinx), working as a counter, hooked up to the platform's ISA bus or to some available SuperIO GPIO port. A very typical representative of this family of watchdogs is the "port 443" watchdog, used by several vendors in very similar arrangements.

On older Advantech platforms, this "port 443" watchdog can count from 1 to 62 seconds. On newer platforms, typically a PIII with the W83977 ISA SuperIO (using its general-purpose address decoder), it can count from 1 to 63 seconds. On some platforms though, from Advantech and others, again with a W83977 or W83877, the watchdog is only capable of a fixed 1.6-second interval - it appears that a general-purpose address decoder (a GP feature of the SuperIO chip) is hooked up straight to a dumb MAX706 single-shot style circuit. For most applications, this 1.6s style watchdog requires a slightly more intelligent driver that does the counting in software and is thus able to provide watchdog deadlines of up to a minute or so.
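The "counting in software" idea for the fixed 1.6s watchdog can be sketched as follows. This is a simplified model, not code from any of the existing drivers: a periodic timer (assumed here to fire more often than every 1.6 seconds) keeps pinging the hardware as long as a software counter has not run out, and user space refills that counter whenever it feeds the device. All names (soft_wdt, soft_wdt_feed, soft_wdt_tick) are hypothetical.

```c
/* Hypothetical state of a software countdown layered on top of a
 * short, fixed-interval hardware watchdog. */
struct soft_wdt {
    unsigned ticks_left;   /* timer ticks left before we stop pinging */
};

/*
 * Called when user space feeds the watchdog: reload the software
 * counter. tick_ms is the period of the driver's timer and must be
 * shorter than the hardware interval (1.6 s in the case above).
 */
void soft_wdt_feed(struct soft_wdt *w, unsigned timeout_ms, unsigned tick_ms)
{
    w->ticks_left = (tick_ms > 0) ? timeout_ms / tick_ms : 0;
}

/*
 * Called from the periodic timer. Returns 1 if the caller should ping
 * the hardware now, or 0 if the software deadline has passed and the
 * hardware should be left alone so that it expires and resets the PC.
 */
int soft_wdt_tick(struct soft_wdt *w)
{
    if (w->ticks_left == 0)
        return 0;
    w->ticks_left--;
    return 1;
}
```

With a 1-second timer tick and a 60-second deadline, the hardware gets pinged 60 times after the last feed from user space; after that the driver stops pinging and the hardware resets the machine within its own 1.6 seconds.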

The aforementioned development was observed on Advantech hardware but, by the look of the drivers present in the vanilla kernel, it would seem that other vendors are using similar arrangements. The timing interval patterns are the same, but the precise sequencing of hardware reads/writes necessary to start/feed/stop the watchdog is slightly different. Hence, port 443 watchdogs from different vendors need slightly different drivers, but it's usually not a problem to find an existing driver and adapt it to your hardware. All you need is the vendor's watchdog programming documentation.

The W83627HF watchdog

The recent generation of Advantech gear for the Pentium 4, based on Intel 845/865 chipsets, features a new kind of watchdog - this one is a standard subsystem of the Winbond W83627HF SuperIO chip. It is reported to work on many other brands of hardware, too - it's probably completely generic and hard-wired on-chip, so that a single driver works on hardware from various vendors. The WDT can be set to any interval from 1 to 255 seconds or from 1 to 255 minutes, which probably makes it the most flexible watchdog available today.
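To illustrate the "1 to 255 seconds or minutes" range, here is a small sketch of how a driver might translate a requested timeout into a unit and a count. It deliberately says nothing about the chip's registers; the type and function names (wdt_setting, wdt_pick_setting) are invented for this example, and rounding up to whole minutes is just one possible policy.

```c
enum wdt_unit { WDT_SECONDS, WDT_MINUTES };

/* Hypothetical result: which unit to select and what count to load. */
struct wdt_setting {
    enum wdt_unit unit;
    unsigned char count;   /* 1..255 */
};

/*
 * Pick the finest unit that can express the requested timeout.
 * Timeouts above 255 seconds are rounded up to whole minutes.
 * Returns 0 on success, -1 if the timeout is out of range.
 */
int wdt_pick_setting(unsigned timeout_s, struct wdt_setting *out)
{
    if (timeout_s == 0 || timeout_s > 255u * 60u)
        return -1;

    if (timeout_s <= 255u) {
        out->unit = WDT_SECONDS;
        out->count = (unsigned char)timeout_s;
    } else {
        out->unit = WDT_MINUTES;
        out->count = (unsigned char)((timeout_s + 59u) / 60u);
    }
    return 0;
}
```

A request for 30 seconds thus ends up as 30 units of one second, while a request for 300 seconds ends up as 5 units of one minute.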

All Intel ICH south bridges accompanying the Pentium 4 (see a more detailed paragraph in the chapter on the ICH WDC) feature a narrow bus called LPC (ISA on diet and plastic surgery), used for connecting BIOS Flash EEPROMs and SuperIO chips. The W83627HF is one of the few LPC SuperIO chips available. On its own name-brand motherboards, Intel tends to use its own SuperIO chip - nevertheless, most other manufacturers, especially those in eastern Asia, tend to prefer the W83627HF. Hence, if you have a Taiwanese motherboard with an Intel P4 chipset, chances are high that you have the W83627HF in the system.

As the author of the respective driver has pointed out, the W83627HF datasheet is crippled to the extent that it doesn't describe how to enter the "enhanced mode" that's necessary to control the watchdog. Bad job on the part of Winbond... Fortunately some people remember the previous Winbond SuperIO chips or have read the Advantech manual, where the complete algorithm is well documented.

Please note one other interesting feature of the W83627HF, its health monitoring subsystem. The chip has a built-in "sensor", monitoring up to three fans and temperatures and several power supply voltages. This sensor is fairly standard - it's a superset of the venerable LM78, it's similar to the stand-alone W83781/2 family, it can be reached both via LPC (=ISA) and via external I2C/SMBus and, most importantly, the lm-sensors package contains a native driver for it.

The W83977 watchdog (disabled on Advantech hardware)

The ISA-based W83977 SuperIO chip does contain something that Winbond calls a "watchdog" in the datasheets - nevertheless, in contrast to the W83627HF (see above), this "watchdog" doesn't have its output hooked up to the system-wide RESET line internally. Thus, it's up to the system designer whether or not to make that connection externally - and that's why this inherent W83977 watchdog is non-functional on many hardware models.

Please note that several drivers in the vanilla kernel seem to be labeled vaguely "The W83977 watchdog". Such a label is usually misleading - the W83977 provides only the general purpose address decoder, the watchdog itself is a separate custom chip! Consequently, the precise usage algorithm differs from vendor to vendor.

There's only one driver in the vanilla kernel that addresses the inherent W83977 "watchdog" - it's the wdc977.c. The attached patch provides another driver that is written to be more generic and better commented.

The Intel P4 watchdog (I810 wdt, ICH4/5/6 wdt)

Intel has released several chipsets built around the same south bridge family: the i810 and i815 (still from the Pentium III era) and, for the Pentium 4, the i845, i865, i875, i7501, i915 and i925, just to name the most important. All of them have a south bridge nicknamed ICH (...ICH4, 5, 6) - to a great degree, the several generations of the ICH are mutually software compatible. Intel uses a proprietary bus called the HubLink (HL) between its north bridge and south bridge, which precludes mixing Intel north bridges (GMCH) with third-party south bridges (and vice versa) - so if you have one of those i8xx/9xx north bridges, you also have an ICH south bridge.

Apologies for the lengthy prologue. The point is that all the ICH generations contain another watchdog. Its timer can be set to 4 to 63 ticks of 0.6 seconds each, and it has to wrap around twice to reset the PC - which results in a watchdog deadline of about 5 to 75 seconds (2 x 4 x 0.6 s = 4.8 s up to 2 x 63 x 0.6 s = 75.6 s). Pretty useful for most purposes.
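The deadline arithmetic for the ICH watchdog can be written down as a pair of small helpers. This is merely a sketch of the numbers quoted above (0.6-second ticks, a 4 to 63 tick range, two wrap-arounds before reset), not an excerpt from the i810-tco driver; the macro and function names are made up for this page.

```c
#define ICH_TICK_MS    600u  /* one tick = 0.6 s */
#define ICH_MIN_TICKS  4u
#define ICH_MAX_TICKS  63u
#define ICH_WRAPS      2u    /* timer must expire twice before reset */

/* Deadline in milliseconds for a given tick count, 0 if out of range. */
unsigned ich_deadline_ms(unsigned ticks)
{
    if (ticks < ICH_MIN_TICKS || ticks > ICH_MAX_TICKS)
        return 0;
    return ticks * ICH_TICK_MS * ICH_WRAPS;
}

/*
 * Smallest tick count whose deadline is at least timeout_ms.
 * Returns 0 if the requested timeout is longer than the timer allows.
 */
unsigned ich_ticks_for_ms(unsigned timeout_ms)
{
    unsigned per_tick = ICH_TICK_MS * ICH_WRAPS;   /* 1200 ms */
    unsigned ticks = (timeout_ms + per_tick - 1u) / per_tick;

    if (ticks < ICH_MIN_TICKS)
        ticks = ICH_MIN_TICKS;
    if (ticks > ICH_MAX_TICKS)
        return 0;
    return ticks;
}
```

For example, a 30-second deadline corresponds to 25 ticks under these assumptions, and the two ends of the range give 4.8 and 75.6 seconds respectively.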

To sum up, if you have a P4 with an Intel chipset, quite probably you have an on-chip ICH watchdog.

If you believe that you have an Intel P4-class ICH and yet the i810-tco driver fails to insmod, please provide a listing of 'lspci' and 'lspci -n'. The driver may be missing some PCI IDs. If you understand a bit of C, check i810-tco.c below pci_for_each_dev() to see what the driver is looking for.

Summary: watchdogs on Intel P4 platforms

The most important conclusion is that if you happen to have the (quite frequent) combination of an Intel P4 chipset and a W83627HF, you have two capable watchdogs in the system. In particular, this is the case with all P4-based Advantech processor boards with Intel chipsets.

One last explicit note, about dual-P4 chipsets: if you have an Intel 7501, you have an ICH, and possibly also the W83627HF. That would be two watchdogs and neat health monitoring. This is the case with the Advantech RS-200-RPS-D.

If, on the other hand, you have a ServerWorks chipset, GC-LE or some such, you don't have an Intel ICH - which means no ICH watchdog, no LPC, no W83627HF, no W83627HF watchdog and no on-chip health monitoring. The GC-LE south bridge (OSB4 or CSB5) can be combined with ISA-based SuperIO chips, maybe with the W83977 (see above) - so maybe you have the "port 443 watchdog" or the W83977 watchdog. The OSB4/CSB5 are somewhat compatible with the Intel PIIX4, at least as far as their IDE and I2C interfaces are concerned - thus, maybe you have some health monitoring sensors on the I2C bus.

Suggested patches

In recent years, the Linux kernel has accumulated an impressive set of watchdog drivers. Unfortunately, the set is somewhat messy - the filenames, menuconfig menu entries and their documentation may be misleading. The W83627HF is not supported in the vanilla kernel (as of Linux 2.4.27).

The suggested patch set, available for download here, aims to improve all that. Plus, it adds a few actual code improvements, including support for the W83627HF contributed by Mr. Padraig Brady.

DOWNLOAD HERE (for Linux 2.4, around 2.4.26-2.4.27)

Appendix - watchdogs on Advantech PC hardware

GIF: A table of watchdogs available on various Advantech processor boards

Revision history