PCI: document PCIe fundamental reset interfaces
The attached patch updates the Documentation/PCI/pci-error-recovery.txt file with changes related to this new bit field, as well a few unrelated updates. Signed-off-by: Linas Vepstas <linasvepstas@gmail.com> Signed-off-by: Mike Mason <mmlnx@us.ibm.com> Signed-off-by: Richard Lary <rlary@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This commit is contained in:
parent
260d703adc
commit
fe14acd4e7
1 changed files with 77 additions and 42 deletions
|
@ -4,15 +4,17 @@
|
|||
February 2, 2006
|
||||
|
||||
Current document maintainer:
|
||||
Linas Vepstas <linas@austin.ibm.com>
|
||||
Linas Vepstas <linasvepstas@gmail.com>
|
||||
updated by Richard Lary <rlary@us.ibm.com>
|
||||
and Mike Mason <mmlnx@us.ibm.com> on 27-Jul-2009
|
||||
|
||||
|
||||
Many PCI bus controllers are able to detect a variety of hardware
|
||||
PCI errors on the bus, such as parity errors on the data and address
|
||||
busses, as well as SERR and PERR errors. Some of the more advanced
|
||||
chipsets are able to deal with these errors; these include PCI-E chipsets,
|
||||
and the PCI-host bridges found on IBM Power4 and Power5-based pSeries
|
||||
boxes. A typical action taken is to disconnect the affected device,
|
||||
and the PCI-host bridges found on IBM Power4, Power5 and Power6-based
|
||||
pSeries boxes. A typical action taken is to disconnect the affected device,
|
||||
halting all I/O to it. The goal of a disconnection is to avoid system
|
||||
corruption; for example, to halt system memory corruption due to DMA's
|
||||
to "wild" addresses. Typically, a reconnection mechanism is also
|
||||
|
@ -37,10 +39,11 @@ is forced by the need to handle multi-function devices, that is,
|
|||
devices that have multiple device drivers associated with them.
|
||||
In the first stage, each driver is allowed to indicate what type
|
||||
of reset it desires, the choices being a simple re-enabling of I/O
|
||||
or requesting a hard reset (a full electrical #RST of the PCI card).
|
||||
If any driver requests a full reset, that is what will be done.
|
||||
or requesting a slot reset.
|
||||
|
||||
After a full reset and/or a re-enabling of I/O, all drivers are
|
||||
If any driver requests a slot reset, that is what will be done.
|
||||
|
||||
After a reset and/or a re-enabling of I/O, all drivers are
|
||||
again notified, so that they may then perform any device setup/config
|
||||
that may be required. After these have all completed, a final
|
||||
"resume normal operations" event is sent out.
|
||||
|
@ -101,7 +104,7 @@ if it implements any, it must implement error_detected(). If a callback
|
|||
is not implemented, the corresponding feature is considered unsupported.
|
||||
For example, if mmio_enabled() and resume() aren't there, then it
|
||||
is assumed that the driver is not doing any direct recovery and requires
|
||||
a reset. If link_reset() is not implemented, the card is assumed as
|
||||
a slot reset. If link_reset() is not implemented, the card is assumed to
|
||||
not care about link resets. Typically a driver will want to know about
|
||||
a slot_reset().
|
||||
|
||||
|
@ -111,7 +114,7 @@ sequence described below.
|
|||
|
||||
STEP 0: Error Event
|
||||
-------------------
|
||||
PCI bus error is detect by the PCI hardware. On powerpc, the slot
|
||||
A PCI bus error is detected by the PCI hardware. On powerpc, the slot
|
||||
is isolated, in that all I/O is blocked: all reads return 0xffffffff,
|
||||
all writes are ignored.
|
||||
|
||||
|
@ -139,7 +142,7 @@ The driver must return one of the following result codes:
|
|||
a chance to extract some diagnostic information (see
|
||||
mmio_enable, below).
|
||||
- PCI_ERS_RESULT_NEED_RESET:
|
||||
Driver returns this if it can't recover without a hard
|
||||
Driver returns this if it can't recover without a
|
||||
slot reset.
|
||||
- PCI_ERS_RESULT_DISCONNECT:
|
||||
Driver returns this if it doesn't want to recover at all.
|
||||
|
@ -169,11 +172,11 @@ is STEP 6 (Permanent Failure).
|
|||
|
||||
>>> The current powerpc implementation doesn't much care if the device
|
||||
>>> attempts I/O at this point, or not. I/O's will fail, returning
|
||||
>>> a value of 0xff on read, and writes will be dropped. If the device
|
||||
>>> driver attempts more than 10K I/O's to a frozen adapter, it will
|
||||
>>> assume that the device driver has gone into an infinite loop, and
|
||||
>>> it will panic the kernel. There doesn't seem to be any other
|
||||
>>> way of stopping a device driver that insists on spinning on I/O.
|
||||
>>> a value of 0xff on read, and writes will be dropped. If more than
|
||||
>>> EEH_MAX_FAILS I/O's are attempted to a frozen adapter, EEH
|
||||
>>> assumes that the device driver has gone into an infinite loop
|
||||
>>> and prints an error to syslog. A reboot is then required to
|
||||
>>> get the device working again.
|
||||
|
||||
STEP 2: MMIO Enabled
|
||||
-------------------
|
||||
|
@ -182,15 +185,14 @@ DMA), and then calls the mmio_enabled() callback on all affected
|
|||
device drivers.
|
||||
|
||||
This is the "early recovery" call. IOs are allowed again, but DMA is
|
||||
not (hrm... to be discussed, I prefer not), with some restrictions. This
|
||||
is NOT a callback for the driver to start operations again, only to
|
||||
peek/poke at the device, extract diagnostic information, if any, and
|
||||
eventually do things like trigger a device local reset or some such,
|
||||
but not restart operations. This is callback is made if all drivers on
|
||||
a segment agree that they can try to recover and if no automatic link reset
|
||||
was performed by the HW. If the platform can't just re-enable IOs without
|
||||
a slot reset or a link reset, it wont call this callback, and instead
|
||||
will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)
|
||||
not, with some restrictions. This is NOT a callback for the driver to
|
||||
start operations again, only to peek/poke at the device, extract diagnostic
|
||||
information, if any, and eventually do things like trigger a device local
|
||||
reset or some such, but not restart operations. This callback is made if
|
||||
all drivers on a segment agree that they can try to recover and if no automatic
|
||||
link reset was performed by the HW. If the platform can't just re-enable IOs
|
||||
without a slot reset or a link reset, it will not call this callback, and
|
||||
instead will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)
|
||||
|
||||
>>> The following is proposed; no platform implements this yet:
|
||||
>>> Proposal: All I/O's should be done _synchronously_ from within
|
||||
|
@ -228,9 +230,6 @@ proceeds to either STEP3 (Link Reset) or to STEP 5 (Resume Operations).
|
|||
If any driver returned PCI_ERS_RESULT_NEED_RESET, then the platform
|
||||
proceeds to STEP 4 (Slot Reset)
|
||||
|
||||
>>> The current powerpc implementation does not implement this callback.
|
||||
|
||||
|
||||
STEP 3: Link Reset
|
||||
------------------
|
||||
The platform resets the link, and then calls the link_reset() callback
|
||||
|
@ -253,16 +252,33 @@ The platform then proceeds to either STEP 4 (Slot Reset) or STEP 5
|
|||
|
||||
>>> The current powerpc implementation does not implement this callback.
|
||||
|
||||
|
||||
STEP 4: Slot Reset
|
||||
------------------
|
||||
The platform performs a soft or hard reset of the device, and then
|
||||
calls the slot_reset() callback.
|
||||
|
||||
A soft reset consists of asserting the adapter #RST line and then
|
||||
In response to a return value of PCI_ERS_RESULT_NEED_RESET, the
|
||||
the platform will peform a slot reset on the requesting PCI device(s).
|
||||
The actual steps taken by a platform to perform a slot reset
|
||||
will be platform-dependent. Upon completion of slot reset, the
|
||||
platform will call the device slot_reset() callback.
|
||||
|
||||
Powerpc platforms implement two levels of slot reset:
|
||||
soft reset(default) and fundamental(optional) reset.
|
||||
|
||||
Powerpc soft reset consists of asserting the adapter #RST line and then
|
||||
restoring the PCI BAR's and PCI configuration header to a state
|
||||
that is equivalent to what it would be after a fresh system
|
||||
power-on followed by power-on BIOS/system firmware initialization.
|
||||
Soft reset is also known as hot-reset.
|
||||
|
||||
Powerpc fundamental reset is supported by PCI Express cards only
|
||||
and results in device's state machines, hardware logic, port states and
|
||||
configuration registers to initialize to their default conditions.
|
||||
|
||||
For most PCI devices, a soft reset will be sufficient for recovery.
|
||||
Optional fundamental reset is provided to support a limited number
|
||||
of PCI Express PCI devices for which a soft reset is not sufficient
|
||||
for recovery.
|
||||
|
||||
If the platform supports PCI hotplug, then the reset might be
|
||||
performed by toggling the slot electrical power off/on.
|
||||
|
||||
|
@ -274,10 +290,12 @@ may result in hung devices, kernel panics, or silent data corruption.
|
|||
|
||||
This call gives drivers the chance to re-initialize the hardware
|
||||
(re-download firmware, etc.). At this point, the driver may assume
|
||||
that he card is in a fresh state and is fully functional. In
|
||||
particular, interrupt generation should work normally.
|
||||
that the card is in a fresh state and is fully functional. The slot
|
||||
is unfrozen and the driver has full access to PCI config space,
|
||||
memory mapped I/O space and DMA. Interrupts (Legacy, MSI, or MSI-X)
|
||||
will also be available.
|
||||
|
||||
Drivers should not yet restart normal I/O processing operations
|
||||
Drivers should not restart normal I/O processing operations
|
||||
at this point. If all device drivers report success on this
|
||||
callback, the platform will call resume() to complete the sequence,
|
||||
and let the driver restart normal I/O processing.
|
||||
|
@ -302,11 +320,21 @@ driver performs device init only from PCI function 0:
|
|||
- PCI_ERS_RESULT_DISCONNECT
|
||||
Same as above.
|
||||
|
||||
Drivers for PCI Express cards that require a fundamental reset must
|
||||
set the needs_freset bit in the pci_dev structure in their probe function.
|
||||
For example, the QLogic qla2xxx driver sets the needs_freset bit for certain
|
||||
PCI card types:
|
||||
|
||||
+ /* Set EEH reset type to fundamental if required by hba */
|
||||
+ if (IS_QLA24XX(ha) || IS_QLA25XX(ha) || IS_QLA81XX(ha))
|
||||
+ pdev->needs_freset = 1;
|
||||
+
|
||||
|
||||
Platform proceeds either to STEP 5 (Resume Operations) or STEP 6 (Permanent
|
||||
Failure).
|
||||
|
||||
>>> The current powerpc implementation does not currently try a
|
||||
>>> power-cycle reset if the driver returned PCI_ERS_RESULT_DISCONNECT.
|
||||
>>> The current powerpc implementation does not try a power-cycle
|
||||
>>> reset if the driver returned PCI_ERS_RESULT_DISCONNECT.
|
||||
>>> However, it probably should.
|
||||
|
||||
|
||||
|
@ -348,7 +376,7 @@ software errors.
|
|||
|
||||
Conclusion; General Remarks
|
||||
---------------------------
|
||||
The way those callbacks are called is platform policy. A platform with
|
||||
The way the callbacks are called is platform policy. A platform with
|
||||
no slot reset capability may want to just "ignore" drivers that can't
|
||||
recover (disconnect them) and try to let other cards on the same segment
|
||||
recover. Keep in mind that in most real life cases, though, there will
|
||||
|
@ -361,8 +389,8 @@ That is, the recovery API only requires that:
|
|||
|
||||
- There is no guarantee that interrupt delivery can proceed from any
|
||||
device on the segment starting from the error detection and until the
|
||||
resume callback is sent, at which point interrupts are expected to be
|
||||
fully operational.
|
||||
slot_reset callback is called, at which point interrupts are expected
|
||||
to be fully operational.
|
||||
|
||||
- There is no guarantee that interrupt delivery is stopped, that is,
|
||||
a driver that gets an interrupt after detecting an error, or that detects
|
||||
|
@ -381,16 +409,23 @@ anyway :)
|
|||
>>> Implementation details for the powerpc platform are discussed in
|
||||
>>> the file Documentation/powerpc/eeh-pci-error-recovery.txt
|
||||
|
||||
>>> As of this writing, there are six device drivers with patches
|
||||
>>> implementing error recovery. Not all of these patches are in
|
||||
>>> As of this writing, there is a growing list of device drivers with
|
||||
>>> patches implementing error recovery. Not all of these patches are in
|
||||
>>> mainline yet. These may be used as "examples":
|
||||
>>>
|
||||
>>> drivers/scsi/ipr.c
|
||||
>>> drivers/scsi/sym53cxx_2
|
||||
>>> drivers/scsi/ipr
|
||||
>>> drivers/scsi/sym53c8xx_2
|
||||
>>> drivers/scsi/qla2xxx
|
||||
>>> drivers/scsi/lpfc
|
||||
>>> drivers/next/bnx2.c
|
||||
>>> drivers/next/e100.c
|
||||
>>> drivers/net/e1000
|
||||
>>> drivers/net/e1000e
|
||||
>>> drivers/net/ixgb
|
||||
>>> drivers/net/ixgbe
|
||||
>>> drivers/net/cxgb3
|
||||
>>> drivers/net/s2io.c
|
||||
>>> drivers/net/qlge
|
||||
|
||||
The End
|
||||
-------
|
||||
|
|
Loading…
Reference in a new issue