Re-syncing out-of-sync zones [PSARC/2007/691 FastTrack timeout 12/21/2007]

Isaac isaac at sun.com
Fri Dec 14 10:21:24 PST 2007


I propose that, if patching a zone fails, the zone should not be
detached automatically.

Thx,
Isaac

On Dec 14, 2007, at 12:47 PM, Edward Pilatowicz
<Edward.Pilatowicz at Sun.COM> wrote:

> this case seems to have a lot of overlap with:
>    PSARC/2007/621 zone update on attach
>    6480464 RFE: zoneadm attach should patch/update the zone to the
>            new hosts level
> has this case been discussed with Jerry?
>
> I see that the intended commitment level is "Uncommitted".  so what is
> the planned usage for this new flag?  If it's only going to be used
> internally by tools shipped from sun then it sounds like it might be ok
> (although you should still talk to jerry), but if we're expecting
> customers to use these new options then i'm not sure it's the right
> approach.  perhaps if patching of a zone fails the zone should be
> automatically detached, and then the patch re-application could be
> handled automatically by PSARC/2007/621 (which is a public and
> committed interface).
>
> ed
>
> On Fri, Dec 14, 2007 at 06:35:21AM -0800, Brian Utterback wrote:
>>
>> I am sponsoring this Fasttrack on behalf of Nagaraj Yedathore. The
>> exposure is open. Patch binding is requested. The case times out on
>> December 21.
>>
>> Template Version: @(#)sac_nextcase 1.64 07/13/07 SMI
>> This information is Copyright 2007 Sun Microsystems
>> 1. Introduction
>>    1.1. Project/Component Working Name:
>>     Re-syncing out-of-sync zones
>>    1.2. Name of Document Author/Supplier:
>>     Author:  Nagaraj Yedathore
>>    1.3  Date of This Document:
>>    14 December, 2007
>> 4. Technical Description
>> Summary
>> -------
>>
>> This proposed enhancement will allow the patching utility to add or
>> remove global patches to or from non-global zones that have become out
>> of step with the global zone following error conditions encountered
>> during global patching operations.
>>
>> Problem
>> -------
>>
>> The "pdo" binary, which provides the "patchadd" and "patchrm" user
>> commands, handles the application and removal of global patches by
>> first carrying out the operation on the global zone itself, and then
>> each of the non-global zones in turn.
>>
>> However, when a failure occurs in any of the non-global zones, pdo
>> tolerates it by reporting the error, but continues the operation in
>> any further zones.
>>
>> This leads to two situations in which the patch level in a non-global
>> zone can be out of step with the global zone:
>>
>> 1/ If a "patchadd" fails in one of the non-global zones, that zone
>>   will then be missing the patch which is present in the global zone,
>>   and will be behind the global patch level.
>>
>> 2/ If a "patchrm" fails in one of the non-global zones, that zone will
>>   then retain the patch which has been removed from the global zone,
>>   and will be ahead of the global patch level.
>>
>> The problem then is that when "patchadd" or "patchrm" are run again,
>> "pdo" performs dependency checks on the requested patching operation
>> before starting the operation in the non-global zones, but these
>> checks only account for the global zone.  Because of an implicit
>> assumption that the patch levels in the non-global zones will always
>> match those in the global zone, "pdo" then incorrectly assumes that
>> the requested operation is unnecessary, and exits without taking any
>> action.
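
The failure mode described above can be illustrated with a toy model. This
is only a sketch: the `Zone` class and this `patchadd` function are
hypothetical stand-ins, not pdo's actual code, and the model tracks nothing
except the set of installed patch IDs in each zone.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    name: str
    patches: set = field(default_factory=set)
    fail_next: bool = False  # simulate an error during the next operation


def patchadd(global_zone, non_global_zones, patch_id):
    """Toy model of the described behaviour; returns the zones patched."""
    # The dependency check only looks at the global zone.
    if patch_id in global_zone.patches:
        return []  # judged unnecessary: exit without touching any zone
    global_zone.patches.add(patch_id)
    patched = [global_zone.name]
    for zone in non_global_zones:
        if zone.fail_next:
            zone.fail_next = False  # error is reported, other zones continue
            continue
        zone.patches.add(patch_id)
        patched.append(zone.name)
    return patched


g = Zone("global")
z1 = Zone("zone1", fail_next=True)
z2 = Zone("zone2")
first = patchadd(g, [z1, z2], "123456-78")   # zone1 fails, zone2 succeeds
second = patchadd(g, [z1, z2], "123456-78")  # no-op: global already has it
```

After the second call, zone1 is still missing the patch, because the
global-only check ends the run before any non-global zone is visited.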
>>
>> In the case of global patches, "pdo" cannot be run individually in
>> non-global zones, so the only way for this to be corrected is to
>> either reverse and repeat the patching operation, or to uninstall and
>> then reinstall the non-global zone.  Neither of these options is
>> generally attractive, but the problem is particularly difficult when
>> the patch in question is the Kernel update patch.
>>
>> Solution
>> --------
>>
>> The "pdo" implementations of "patchadd" and "patchrm" support an
>> undocumented feature which allows the patching of particular named
>> zones via a "zonelist" argument.  For example, the following command
>> line would be used to add the patch "123456-78" to only zones "zone1"
>> and "zone2":
>>
>> # patchadd -O "zonelist=zone1 zone2" 123456-78
>>
>> However, in the case of global patches, "pdo" detects that a patching
>> operation is being attempted non-globally, and displays an error
>> message before exiting without allowing the operation.
>>
>> The proposal is therefore to add a further option which can be
>> specified alongside the "zonelist" in order to indicate that the
>> dependency checks in the global zone should be ignored, because an
>> attempt is being made to re-synchronize a non-global zone whose patch
>> state must not be assumed to match that of the global zone.  In this
>> case, "pdo" will start the patching operation in the indicated
>> non-global zones, where dependency checking is then done on a
>> zone-specific basis before any further action is taken.
>>
>> The new proposed argument is "-O retry", and would be added to the
>> existing code which parses the "zonelist" argument such that it could
>> be specified either by itself, or with a "," separator as shown:
>>
>> # patchadd -O retry -O "zonelist=zone1 zone2" 123456-78
>>
>> . . . or . . .
>>
>> # patchadd -O "zonelist=zone1 zone2,retry" 123456-78
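
The two spellings shown above should parse to the same result. A minimal
sketch of such a parser follows; `parse_pdo_options` is a hypothetical
helper written for illustration and is not taken from the pdo source. It
assumes the values of the repeated "-O" flags arrive as a list of strings,
and that arguments it does not recognise can simply be skipped.

```python
def parse_pdo_options(o_values):
    """Return (retry, zonelist) from the values passed to -O flags.

    zonelist is None when no "zonelist=" argument was given.
    """
    retry = False
    zonelist = None
    for value in o_values:
        # A single -O value may carry several ","-separated arguments.
        for part in value.split(","):
            part = part.strip()
            if part == "retry":
                retry = True
            elif part.startswith("zonelist="):
                # Zone names inside the list are separated by whitespace.
                zonelist = part[len("zonelist="):].split()
            # Assumption: any other -O argument is ignored by this sketch.
    return retry, zonelist


separate = parse_pdo_options(["retry", "zonelist=zone1 zone2"])
combined = parse_pdo_options(["zonelist=zone1 zone2,retry"])
```

Both calls produce the retry flag set together with the zones "zone1" and
"zone2".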
>>
>> In addition, since it may not always be known which zones are out of
>> step with the global zone, it is proposed that the "retry" argument
>> will automatically imply a "zonelist" consisting of all the non-global
>> zones on the system if no other "zonelist" argument is explicitly
>> specified.  For example:
>>
>> # patchadd -O retry 123456-78
>>
>> This will exploit the fact that dependency checks are run in the
>> non-global zones before the patching operations are carried out, so
>> by requesting the operation in every zone, only those which are found
>> to be out of step will be updated.
>>
>> Finally, because of a peculiarity in the way that "pdo" handles
>> information about the status of requested patch operations, the
>> global zone must specifically be barred from the "zonelist" when the
>> "retry" option is used, otherwise the effect of the "retry" option
>> will be negated.
>>
>> To summarize, the "-O retry" option will:
>>
>> 1/ Apply to both "patchadd" and "patchrm" commands.
>>
>> 2/ Prevent dependency checking in the global zone, allowing the
>>   patching operations to be passed to the non-global zones where
>>   specific dependency checks are done anyway.
>>
>> 3/ Allow the patching operations to be run on any non-global zones
>>   specified via the "zonelist" argument, or all non-global zones on
>>   the system by default.
>>
>> 4/ Prevent the global zone from being included in a "zonelist" by
>>   displaying an error message and exiting.
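
The four points above can be sketched for the "patchadd" case as follows.
This is an illustration under stated assumptions, not the proposed
implementation: `resolve_retry_zonelist` and `retry_patchadd` are
hypothetical names, each zone's state is reduced to a set of patch IDs, and
the per-zone dependency check is reduced to "is the patch already
installed". The only detail taken from Solaris itself is that the global
zone is named "global".

```python
GLOBAL_ZONE = "global"


def resolve_retry_zonelist(zonelist, non_global_zones):
    """Pick the zones a retry run should visit."""
    if zonelist is None:
        # No explicit zonelist: default to every non-global zone.
        return list(non_global_zones)
    if GLOBAL_ZONE in zonelist:
        # The global zone is barred when "retry" is in effect.
        raise ValueError("global zone may not appear in zonelist with retry")
    return list(zonelist)


def retry_patchadd(patch_id, zone_patches, zonelist=None):
    """zone_patches maps zone name -> set of installed patch IDs."""
    non_global = [z for z in zone_patches if z != GLOBAL_ZONE]
    targets = resolve_retry_zonelist(zonelist, non_global)
    updated = []
    for zone in targets:
        # The global dependency check is skipped; each zone checks itself.
        if patch_id in zone_patches[zone]:
            continue  # already in step, nothing to do
        zone_patches[zone].add(patch_id)
        updated.append(zone)
    return updated


zones = {
    "global": {"123456-78"},
    "zone1": set(),           # fell behind after an earlier failure
    "zone2": {"123456-78"},
}
updated_first = retry_patchadd("123456-78", zones)
updated_again = retry_patchadd("123456-78", zones)
```

Only zone1 is touched on the first run, and a repeated run changes nothing,
which is the property that makes defaulting to every non-global zone safe.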
>>
>>
>> Interfaces
>> ----------
>>
>> Interface               Stability     Description
>> =========               =========     ===========
>>
>> option -O retry         Uncommitted   Allow "patchadd" and "patchrm"
>>                                      to carry out patching operations
>>                                      on global patches in separate
>>                                      non-global zones.
>>
>>
>> References
>> ----------
>>
>> RFE 6540979: "patchadd RFE to incrementally patch non global zones
>>              that are out of sync with the global zone."
>>
>>
>> 6. Resources and Schedule
>>    6.4. Steering Committee requested information
>>       6.4.1. Consolidation C-team Name:
>>        admininstall
>>    6.5. ARC review type: FastTrack
>>    6.6. ARC Exposure: open


