[caiman-discuss] Adding the ability to restart DC from a checkpoint
Jean McCormack
Jean.McCormack at Sun.COM
Fri Jan 25 09:12:04 PST 2008
In the DC meeting yesterday we discussed the future user experience for
the Distro Constructor. The first thing I'm looking at is the ability to
restart the DC build at different checkpoints or steps in the process.
Three ways of specifying the restart were considered:
1) The user would edit the manifest file to specify the point at which
   the build should start
2) A command line option
3) An interactive option for the command
After I consulted with Frank Ludolph, #2 (command line option) was
decided upon. His suggestion was this:
    dist_const -resume [step]

"dist_const -resume" would resume the build from the step that failed in
the previous build.
"dist_const -resume step" would resume the build from the step specified.
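
To make the two forms concrete, here is a rough sketch of how the option
might be parsed. It assumes Python's argparse and that steps are
identified by integers; the RESUME_FAILED_STEP sentinel and the
parse_args helper are names made up for illustration, not anything that
exists in the DC today.

```python
import argparse

# Hypothetical sentinel: -resume was given with no step, so the build
# should pick up at whichever step failed in the previous build.
# Assumes real step numbers are never negative.
RESUME_FAILED_STEP = -1


def parse_args(argv):
    """Parse a dist_const command line that may carry -resume [step]."""
    parser = argparse.ArgumentParser(prog="dist_const")
    parser.add_argument(
        "-resume",
        nargs="?",                 # the step is optional
        type=int,                  # assumption: steps are integers
        const=RESUME_FAILED_STEP,  # used when -resume has no step
        default=None,              # used when -resume is absent
        metavar="step",
        help="resume the previous build, optionally from a given step",
    )
    return parser.parse_args(argv)
```

With this, parse_args(["-resume"]).resume is the sentinel,
parse_args(["-resume", "3"]).resume is 3, and a plain invocation leaves
it as None.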
Some technical thoughts behind this new option:
- In order to keep the build from having issues because the user changes
  the manifest between the two runs, we would not have them specify a
  new manifest file.
- The build does need to have the manifest information somehow, so my
  thought was that during a build we would copy the current manifest
  file to .step<step number>. When the step completes successfully this
  file would be deleted. A leftover file would then serve as a marker
  for the -resume case as to where to restart, and would contain all the
  information for the restarted build.
- dist_const -resume step would check that the step specified is <= the
  failed step. Restarting at a later step (failed step + n) is not
  allowed.
- We could do some checking to make sure that the user hasn't modified
  .step<number>, which has the potential to cause havoc in the build.
  Depending upon where you were in the build process, some modifications
  would be OK, others not. I'm not sure the extra complication is worth
  it. How do others feel about this?
- The messaging coming from the DC would be worded such that the user
  would know what step failed in the process. That's the next step in
  this work.
- The .step<number> files would be cleaned up at the start of every
  build and the end of every successful build.
- dist_const -resume doesn't make sense after a complete successful
  build, but dist_const -resume step does. If the user has a build that
  completes successfully but doesn't work, they could rerun the build
  from any step they think is appropriate.
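
The marker-file idea in the points above could be sketched roughly as
follows. This is only an illustration under some assumptions: steps are
integers, the markers live in a single working directory, and every
function name here is hypothetical. It does not address where the
manifest comes from when resuming after a fully successful build, since
the markers have been cleaned up by then.

```python
import os
import re
import shutil

# Marker files are named .step<number>, as proposed above.
MARKER_RE = re.compile(r"^\.step(\d+)$")


def begin_step(work_dir, manifest_path, step):
    """Copy the manifest to .step<step> before the step runs."""
    marker = os.path.join(work_dir, ".step%d" % step)
    shutil.copyfile(manifest_path, marker)
    return marker


def finish_step(marker):
    """Delete the marker once its step has completed successfully."""
    os.remove(marker)


def find_failed_step(work_dir):
    """Return (step, marker_path) for the leftover marker, or None."""
    found = []
    for name in os.listdir(work_dir):
        match = MARKER_RE.match(name)
        if match:
            found.append((int(match.group(1)), os.path.join(work_dir, name)))
    # Only one marker should exist; take the lowest step if there are more.
    return min(found) if found else None


def clean_markers(work_dir):
    """Remove all markers (start of a build, end of a successful build)."""
    for name in os.listdir(work_dir):
        if MARKER_RE.match(name):
            os.remove(os.path.join(work_dir, name))


def resolve_resume_step(requested, failed_step):
    """Pick the step to restart at.

    requested is None for a bare -resume; failed_step is None when the
    previous build left no marker behind.
    """
    if requested is None:
        if failed_step is None:
            raise ValueError("-resume given but no failed step was found")
        return failed_step
    if failed_step is not None and requested > failed_step:
        raise ValueError("cannot resume past failed step %d" % failed_step)
    return requested
```

A resumed build would call find_failed_step() to locate the marker, read
the manifest from that file instead of taking one on the command line,
and pass the step through resolve_resume_step() before starting work.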
Any comments?
Jean