[s]sd-config-list version 2 and retry count tuning [PSARC/2007/505 FastTrack timeout 09/11/2007]

Mark Carlson markcarl at sac.sfbay.sun.com
Tue Sep 4 07:56:57 PDT 2007


Template Version: @(#)sac_nextcase 1.64 07/13/07 SMI
This information is Copyright 2007 Sun Microsystems
1. Introduction
    1.1. Project/Component Working Name:
	 [s]sd-config-list version 2 and retry count tuning
    1.2. Name of Document Author/Supplier:
	 Author:  Larry Liu
    1.3  Date of This Document:
	04 September, 2007
4. Technical Description
Background
----------
Solaris 8/9 has a tunable, [s]sd_retry_count, which specifies
the number of times a disk operation is retried before
failure is returned. [s]sd_retry_count can be set by adding a
line such as the following to the /etc/system file:

    set sd:sd_retry_count=5

[s]sd_retry_count is a system-wide driver tunable in Solaris
8/9: once set, all [s]sd device nodes in the system use the
same retry count.

Some third-party multipathing software (e.g. Veritas DMP, EMC
PowerPath) depends on the [s]sd driver's error handling for
path failover operations. With the ssd default settings for
IO timeout (60 seconds) and retry count (3), it often takes
more than 3 minutes for the third-party multipathing
software to detect a path error and fail over. This prevents
some large customers from migrating to Solaris 10.
[s]sd_retry_count needs to be tuned to achieve a quicker
failover.
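
The delay described above can be estimated with a short
calculation. The sketch below is illustrative only: it assumes
that every attempt waits out the full IO timeout and that the
retry count is applied in addition to the initial attempt. The
function name is hypothetical, and the driver's real accounting
may differ.

```python
def worst_case_detection_seconds(io_timeout_s: int, retry_count: int) -> int:
    """Rough upper bound on the time before an IO failure is reported.

    Assumption: the initial attempt and every retry each wait out
    the full IO timeout before the failure is returned.
    """
    attempts = 1 + retry_count
    return io_timeout_s * attempts


# Default settings quoted in the text: 60 second timeout, 3 retries.
default_delay = worst_case_detection_seconds(60, 3)
# A lower retry count shortens the estimate proportionally.
tuned_delay = worst_case_detection_seconds(60, 1)
print(default_delay, tuned_delay)
```

Under these assumptions the default settings give an estimate
of 240 seconds, which is consistent with the "more than 3
minutes" observed in practice.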

[s]sd-config-list is a [s]sd(7D) driver property that can
be used to configure a set of disk behaviors on a per disk
type basis. The property is matched against the vendor ID and
product ID strings in the device's SCSI INQUIRY data. It was
first introduced in PSARC 1999/015 (delayed retries), with
an interface level of Partner Private. PSARC 2001/692 added
minimum throttle support to the ssd-config-list property:
the minimum throttle can be set per device type via the
ssd-config-list property in ssd.conf. Similarly, PSARC
2001/693 approved the use of ssd-config-list to
enable/disable disk sorting. Both of these cases were
approved as Project Private interfaces. PSARC 2002/294 (SCSI
LOGICAL UNIT RESET) extended the [s]sd-config-list property
to support SCSI LUN reset, with an interface level of Project
Private.

An example that enables LOGICAL UNIT RESET use is:
  [s]sd-config-list=
        "SUN     T4", "t4-data";
  t4-data=1,0x20000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1;

The first number in t4-data is the version of the
[s]sd-config-list. Currently only version 1 is supported.
The second number in t4-data is a bit mask of values to set.
Bits 0-17 are already defined and each bit identifies a
particular disk behavior. For example, bit 17 (0x20000 in the
example above) is defined for LOGICAL UNIT RESET support.
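
To make the layout of the data property concrete, the sketch
below decodes the t4-data example. It is not driver code. It
assumes that the numbers after the bit mask are value slots
mapped positionally to bits (slot N carries the value for bit
N), which is inferred from the example having 18 slots with the
last one set. The function name is hypothetical.

```python
def parse_config_data(text: str) -> dict:
    """Split a version 1 style data property into its parts.

    Assumption: after the version and the bit mask, slot N holds
    the value for bit N.
    """
    # int(tok, 0) accepts both decimal ("1") and hex ("0x20000").
    fields = [int(tok.strip(), 0) for tok in text.rstrip(";").split(",")]
    version, mask, values = fields[0], fields[1], fields[2:]
    # Keep only the slots whose bit is set in the mask.
    selected = {
        bit: values[bit] for bit in range(len(values)) if mask & (1 << bit)
    }
    return {"version": version, "mask": mask, "selected": selected}


t4 = parse_config_data("1,0x20000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1;")
print(t4)
```

For the example, only bit 17 is set in the mask, so only the
value in slot 17 (1, enabling LOGICAL UNIT RESET) is selected.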

Problem
-------
This fasttrack addresses two problems:
1. Solaris 10 and later releases do not currently support
tuning [s]sd_retry_count. This is a problem for customers who
want to migrate to Solaris 10 but still use third-party
multipathing software.

2. For historical reasons, the definitions of bit positions
in [s]sd-config-list differ across platforms: for
example, fabricate device id is defined as bit 3 on SPARC
but as bit 2 on x86/x64. This inconsistency is confusing and
error-prone for [s]sd-config-list users. Please refer to
Appendix A for details on the bit fields currently defined.

Proposed Solution
-----------------
The proposal is to introduce a new version (2) of
[s]sd-config-list property which has a unified definition of
bit fields across platforms. The syntax of [s]sd-config-list
version 2 is the same as that of [s]sd-config-list version 1.

Another part of the proposal is to add a new bit to the
[s]sd-config-list version 2 proposed above. This new bit
sets the retry count used by the [s]sd driver for disk
devices with a matching VID/PID combination.

Please refer to the Technical Details section for the specific
bit field definitions for [s]sd-config-list version 2.

The [s]sd driver will be enhanced to support both version 1
and version 2 of the [s]sd-config-list property. Version 1 is
kept unchanged for backward compatibility. For a given vendor
ID and product ID string, if both version 1 and version 2
entries are found, version 2 takes effect.
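
The precedence rule can be sketched as follows. This is an
illustration of the rule as stated, not the driver's
implementation. The entry layout (a dict with "version" and
"mask" keys) and the function name are hypothetical.

```python
def effective_entry(entries):
    """Pick the entry to apply for one VID/PID match.

    Entries with an unknown version are ignored. When both a
    version 1 and a version 2 entry are present, version 2 wins.
    """
    chosen = None
    for entry in entries:
        if entry["version"] not in (1, 2):
            continue
        if chosen is None or entry["version"] > chosen["version"]:
            chosen = entry
    return chosen


matches = [
    {"version": 1, "mask": 0x20000},  # version 1 entry
    {"version": 2, "mask": 0x200},    # version 2 entry for the same VID/PID
]
print(effective_entry(matches))
```

A configuration that contains only a version 1 entry keeps
working unchanged, because that entry is selected when no
version 2 entry is present.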

Interface name     Commitment         Comments
---------------------------------------------------------------------
IO timeout         Committed          DDI property on sd device node
retries bit in                        that specifies IO timeout retry
sd-config-list                        count for the attached disk.
version 2

Other bits in      Project Private    DDI property on sd device node
sd-config-list                        that specifies various behaviors
version 2                             for the attached disk.

IO timeout         Committed          DDI property on ssd device node
retries bit in                        that specifies IO timeout retry
ssd-config-list                       count for the attached disk.
version 2

Other bits in      Project Private    DDI property on ssd device node
ssd-config-list                       that specifies various behaviors
version 2                             for the attached disk.


Technical Details
-----------------
Bit field definition for [s]sd-config-list version 2:
_______________________________________
|Bit|sd(SPARC) ssd(SPARC) sd(x86/x64) |
|___|_________________________________|
| 0 | max throttle                    |
|___|_________________________________|
| 1 | min throttle                    |
|___|_________________________________|
| 2 | controller type                 |
|___|_________________________________|
| 3 | reservation release time        |
|___|_________________________________|
| 4 | disable disksort                |
|___|_________________________________|
| 5 | enable LUN reset                |
|___|_________________________________|
| 6 | not ready retries               |
|___|_________________________________|
| 7 | busy retries                    |
|___|_________________________________|
| 8 | reset retries                   |
|___|_________________________________|
| 9 | IO timeout retries              |
|___|_________________________________|
| 10| cache is non-volatile           |
|___|_________________________________|

Bits that exist only in version 1 (for example, fabricate
device id, disable caching, and TUR check) are obsolete in
version 2.
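
As an illustration of the unified layout, the sketch below
builds a version 2 data property that sets only the IO timeout
retries bit. It assumes the version 1 layout carries over
unchanged (version, bit mask, then one value slot per defined
bit), based on the statement that the syntax is the same. The
names V2_BITS and build_v2_data are hypothetical.

```python
# Bit positions from the version 2 table above.
V2_BITS = {
    "max throttle": 0,
    "min throttle": 1,
    "controller type": 2,
    "reservation release time": 3,
    "disable disksort": 4,
    "enable LUN reset": 5,
    "not ready retries": 6,
    "busy retries": 7,
    "reset retries": 8,
    "IO timeout retries": 9,
    "cache is non-volatile": 10,
}


def build_v2_data(settings: dict) -> str:
    """Return a data property string for the given name -> value settings.

    Assumption: one value slot per defined bit, in bit order.
    """
    mask = 0
    values = [0] * len(V2_BITS)
    for name, value in settings.items():
        bit = V2_BITS[name]
        mask |= 1 << bit
        values[bit] = value
    fields = ["2", hex(mask)] + [str(v) for v in values]
    return ",".join(fields) + ";"


retry_once = build_v2_data({"IO timeout retries": 1})
print(retry_once)
```

Because the bit positions are the same on every platform, the
same string would apply to sd and ssd on SPARC and to sd on
x86/x64.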

Release Binding
---------------
Micro release/patch binding is requested.

References
----------
PSARC 1999/015 delayed retries
PSARC 2002/294 SCSI LOGICAL UNIT RESET
PSARC 2001/692 Per-Disk-Device Minimum Throttle Setting
PSARC 2001/693 Per-Disk-Device Disabling of disksort
http://software.emc.com/images/software/products/software_az/expanded_images/emc_powerpath.gif

Appendix A: Bit field definitions for [s]sd-config-list
version 1.
________________________________________________________
|Bit|sd (SPARC)      |ssd (SPARC)     |sd/ssd (x86/x64) |
|___|________________|________________|_________________|
|   | max            | max            | max             |
| 0 | throttle       | throttle       | throttle        |
|___|________________|________________|_________________|
|   | controller     | not ready      | controller      |
| 1 | type           | retries        | type            |
|___|________________|________________|_________________|
|   | not ready      | busy           | fabricate       |
| 2 | retries        | retries        | device id       |
|___|________________|________________|_________________|
|   | fabricate      | fabricate      | disable         |
| 3 | device id      | device id      | caching         |
|___|________________|________________|_________________|
|   | disable        | disable        | play            |
| 4 | caching        | caching        | BCD             |
|___|________________|________________|_________________|
|   | busy           | controller     | read sub-       |
| 5 | retries        | type           | channel BCD     |
|___|________________|________________|_________________|
|   | play           | play           | read TOC        |
| 6 | BCD            | BCD            | TRK BCD         |
|___|________________|________________|_________________|
|   | read sub-      | read sub-      | read TOC        |
| 7 | channel BCD    | channel BCD    | ADDR BCD        |
|___|________________|________________|_________________|
|   | read TOC       | read TOC       | no READ_HDR     |
| 8 | TRK BCD        | TRK BCD        |                 |
|___|________________|________________|_________________|
|   | read TOC       | read TOC       | read CD XD4     |
| 9 | ADDR BCD       | ADDR BCD       |                 |
|___|________________|________________|_________________|
|   | no READ_HDR    | no READ_HDR    | not ready       |
| 10|                |                | retries         |
|___|________________|________________|_________________|
|   | read CD XD4    | read CD XD4    | busy retries    |
| 11|                |                |                 |
|___|________________|________________|_________________|
|   | reset retries  | reset retries  | reset retries   |
| 12|                |                |                 |
|___|________________|________________|_________________|
|   | reservation    | reservation    | reservation     |
| 13| release time   | release time   | release time    |
|___|________________|________________|_________________|
|   | TUR check      | TUR check      | TUR check       |
| 14|                |                |                 |
|___|________________|________________|_________________|
|   | min throttle   | min throttle   | min throttle    |
| 15|                |                |                 |
|___|________________|________________|_________________|
|   | disable        | disable        | disable         |
| 16| disksort       | disksort       | disksort        |
|___|________________|________________|_________________|
|   | enable LUN     | enable LUN     | enable LUN      |
| 17| reset          | reset          | reset           |
|___|________________|________________|_________________|
|   | cache is non-  | cache is non-  | cache is non-   |
| 18| volatile       | volatile       | volatile        |
|___|________________|________________|_________________|
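
The platform differences in the table above can be checked
mechanically. The sketch below maps a version 1 bit mask to the
unified version 2 positions by behavior name. It is only an
illustration derived from the two tables: the proposal does not
describe any automatic translation, and all identifiers are
hypothetical.

```python
# Bits 12-18 are identical on all three version 1 layouts.
_V1_TAIL = [
    "reset retries", "reservation release time", "TUR check",
    "min throttle", "disable disksort", "enable LUN reset",
    "cache is non-volatile",
]
_CD_BITS = [
    "play BCD", "read sub-channel BCD", "read TOC TRK BCD",
    "read TOC ADDR BCD", "no READ_HDR", "read CD XD4",
]

# Version 1 layouts, indexed by bit position (Appendix A).
V1_SD_SPARC = ["max throttle", "controller type", "not ready retries",
               "fabricate device id", "disable caching",
               "busy retries"] + _CD_BITS + _V1_TAIL
V1_SSD_SPARC = ["max throttle", "not ready retries", "busy retries",
                "fabricate device id", "disable caching",
                "controller type"] + _CD_BITS + _V1_TAIL
V1_X86 = ["max throttle", "controller type", "fabricate device id",
          "disable caching"] + _CD_BITS + ["not ready retries",
                                           "busy retries"] + _V1_TAIL

# Unified version 2 layout, indexed by bit position.
V2 = ["max throttle", "min throttle", "controller type",
      "reservation release time", "disable disksort", "enable LUN reset",
      "not ready retries", "busy retries", "reset retries",
      "IO timeout retries", "cache is non-volatile"]


def translate_mask(v1_layout, v1_mask):
    """Return (v2_mask, dropped), where dropped lists the set
    behaviors that have no version 2 equivalent."""
    v2_mask, dropped = 0, []
    for bit, name in enumerate(v1_layout):
        if v1_mask & (1 << bit):
            if name in V2:
                v2_mask |= 1 << V2.index(name)
            else:
                dropped.append(name)
    return v2_mask, dropped


# Bit 2 means something different on each version 1 layout.
print(translate_mask(V1_SD_SPARC, 1 << 2))
print(translate_mask(V1_SSD_SPARC, 1 << 2))
print(translate_mask(V1_X86, 1 << 2))
```

The same version 1 mask selects not ready retries on sd
(SPARC), busy retries on ssd (SPARC), and the obsolete
fabricate device id on x86/x64, which is the inconsistency
version 2 removes.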


6. Resources and Schedule
    6.4. Steering Committee requested information
   	6.4.1. Consolidation C-team Name:
		ON
    6.5. ARC review type: FastTrack
    6.6. ARC Exposure: open



