Add sparse file support to cpio [PSARC/2008/727 Self Review]

Don Cragun don.cragun at sun.com
Fri Nov 21 15:39:42 PST 2008


I'm sponsoring this case for Cynthia Eastham.

Since this case follows the same general practices used when sparse
file support was added to the pax archiving utility, I'm marking this
case as close approved automatic.  If any members believe this needs to
be promoted to a fast track, let me know.

This case seeks a patch binding.

 - Don

Template Version: @(#)sac_nextcase %I% %G% SMI
This information is Copyright 2008 Sun Microsystems
1. Introduction
    1.1. Project/Component Working Name:
	 Add sparse file support to cpio
    1.2. Name of Document Author/Supplier:
	 Author:  Cynthia Eastham
    1.3. Date of This Document:
	21 November, 2008
4. Technical Description
	4.1 Details

	PSARC case 2006/331 (Add holey file support to pax) created
	extended headers to maintain information about sparse files in
	some archive formats written by pax.  Holes are not expunged
	from the file in the archive, but are recreated when the file
	is extracted from the archive.

	This case adds similar sparse file support to the cpio utility.

	In pass mode (cpio -p), sparse files will be recreated at the
	destination with the same holes that were present in the source
	file, as long as the source file system supports reporting
	holes (as described by PSARC case 2004/770) and the destination
	file is seekable.  Otherwise, holes in sparse files will be
	filled with '\0' bytes in corresponding destination files as
	they are now.
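
	The pass mode behavior described above can be sketched as
	follows.  This is an illustration only, not the cpio
	implementation: it is written in Python, the function name
	copy_preserving_holes is hypothetical, and it assumes a
	platform where os.SEEK_DATA and os.SEEK_HOLE are available.
	It does not query pathconf(2); on a file system that does not
	report holes, the whole file is typically reported as a single
	data region and is copied in full.

```python
import errno
import os


def copy_preserving_holes(src_path, dst_path, chunk=1 << 16):
    """Copy src to dst, seeking over holes instead of writing zero bytes."""
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        size = os.fstat(src).st_size
        offset = 0
        while offset < size:
            try:
                data = os.lseek(src, offset, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:  # only a trailing hole remains
                    break
                raise
            hole = os.lseek(src, data, os.SEEK_HOLE)
            # Copy the data region [data, hole) to the same offsets in dst.
            os.lseek(src, data, os.SEEK_SET)
            os.lseek(dst, data, os.SEEK_SET)
            remaining = hole - data
            while remaining > 0:
                buf = os.read(src, min(chunk, remaining))
                if not buf:
                    break
                written = 0
                while written < len(buf):
                    written += os.write(dst, buf[written:])
                remaining -= len(buf)
            offset = hole
        # Set the final length so that a trailing hole is preserved.
        os.ftruncate(dst, size)
    finally:
        os.close(src)
        os.close(dst)
```

	Regions that are never written in the destination are left
	as holes by the file system, which is why the destination
	must be seekable.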

	In copy out mode (-o) the following new option arguments to the
	cpio -H option will be added to provide sparse file support:
		ascii_sparse	- assumes -c is specified.  Only available
				in copy out (-o) mode.
		odc_sparse	- assumes -H odc is specified.  Only available
				in copy out (-o) mode.

	When sparse files are archived using one of the above archive
	formats, and the underlying file system supports the detection
	of holes as reported by pathconf(2), they will be compressed
	(holes will not be stored in the archive), and hole and data
	offsets as returned by lseek(2) will be saved at the start of
	the file data.  Otherwise, sparse files will be archived as
	they are now.

	When a compressed sparse file is extracted from an archive
	(cpio -i), if the destination file is seekable, holes will be
	restored using the saved hole and data offsets.  Otherwise
	holes in the source file will be filled with '\0' bytes in the
	destination file.

	The following will apply when either '-H ascii_sparse' or
	'-H odc_sparse' is specified with -o: 
		- The c_mode field in the archive header will
		  indicate that the file is a sparse file. In the old
		  stat structure, the mode field is an unsigned short
		  (16 bit) field.  To avoid conflicts with other file
		  types, a high order bit (17) in the c_mode field of
		  the header will be set.
		- The file size field of the header will be the size of
		  the compressed sparse file (i.e., the size of the
		  prepended string described below plus the size of the
		  file contents after removing the holes).
		- A string of the following format will be prepended to
		  the compressed file data:
			"%lu %llu%s", prepended_info_size,
				expanded_file_size, data/hole_offsets
		where data/hole_offsets contains 2 or more entries of the
		following format:
			" %llu %llu", data_offset, hole_offset

		The following is an example of the string that would be
		prepended to the compressed file:
	  "64 16777223 0 8192 65536 73728 1048576 1056768 16777216 16777223"
		for a file that is (after expansion) 16,777,223 bytes
		long with file data starting at offset 0 followed by a
		hole starting at offset 8,192, more data starting at
		offset 65,536 followed by another hole starting at
		offset 73,728, ..., followed by data starting at
		offset 16,777,216 and ending at offset 16,777,223 (the
		file size).  The compressed file is all of the data in
		the file, except that the holes in the file have been
		removed from the archive.
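
	The format of the prepended string can be illustrated with a
	short sketch.  It is written in Python rather than with the
	printf-style formats shown above, and the function names
	build_sparse_prefix and compressed_size are hypothetical.  It
	assumes that prepended_info_size is the total length of the
	prepended string, including the digits of the size field
	itself, which is consistent with the example string (64
	characters long, starting with "64").

```python
def build_sparse_prefix(expanded_size, segments):
    """Return the string prepended to the compressed file data.

    segments is a list of (data_offset, hole_offset) pairs.
    """
    rest = " %d" % expanded_size
    rest += "".join(" %d %d" % pair for pair in segments)
    # The size field counts its own digits, so grow the guess until the
    # total length is self-consistent.
    total = len(rest) + 1
    while len(str(total)) + len(rest) != total:
        total = len(str(total)) + len(rest)
    return "%d%s" % (total, rest)


def compressed_size(prefix, segments):
    """File size stored in the header: prefix plus the bytes outside holes."""
    return len(prefix) + sum(hole - data for data, hole in segments)


segments = [(0, 8192), (65536, 73728),
            (1048576, 1056768), (16777216, 16777223)]
prefix = build_sparse_prefix(16777223, segments)
print(prefix)
print(compressed_size(prefix, segments))
```

	For the offsets in the example, this reproduces the 64
	character string shown above, and the file size field would
	be 64 + 8,192 + 8,192 + 8,192 + 7 = 24,647 bytes.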

	When the sparse file bit in the c_mode field is set, cpio will
	detect the sparse file upon file extraction, and use the
	prepended sparse file information to restore the holes in the
	file if the destination file is seekable.  If the destination
	file is not
	seekable, the sparse file information will be used to fill the
	holes with '\0' bytes.  Archivers that do not recognize the
	sparse file mode bit will restore the compressed file and its
	prepended data as a regular file.
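
	The two extraction paths can be sketched as follows.  This is
	an illustration under stated assumptions, not the cpio code:
	it is written in Python, all function names are hypothetical,
	the compressed data is assumed to follow the prepended string
	immediately with no separator, and the check of the sparse
	file bit in c_mode is not shown.

```python
def parse_sparse_prefix(payload):
    """Split payload (bytes) into (expanded_size, segments, data)."""
    size_field, _, _ = payload.partition(b" ")
    prefix_len = int(size_field)
    fields = [int(tok) for tok in payload[:prefix_len].split()]
    expanded_size = fields[1]
    offsets = fields[2:]
    segments = list(zip(offsets[0::2], offsets[1::2]))
    return expanded_size, segments, payload[prefix_len:]


def expand_to_file(payload, f):
    """Seekable destination: write data at its offsets, leave holes unwritten."""
    expanded_size, segments, data = parse_sparse_prefix(payload)
    pos = 0
    for start, end in segments:
        f.seek(start)
        f.write(data[pos:pos + (end - start)])
        pos += end - start
    # Extend to the expanded size so that a trailing hole is restored.
    f.truncate(expanded_size)


def expand_to_bytes(payload):
    """Non-seekable destination: fill each hole with '\\0' bytes."""
    expanded_size, segments, data = parse_sparse_prefix(payload)
    out = bytearray()
    pos = 0
    for start, end in segments:
        out += b"\0" * (start - len(out))
        out += data[pos:pos + (end - start)]
        pos += end - start
    out += b"\0" * (expanded_size - len(out))
    return bytes(out)
```

	Both paths produce the same file contents; they differ only
	in whether the holes occupy space at the destination.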

	4.3 Exported Interfaces
            ______________________________________
            |     Interface       |Classification|
            |_____________________|______________|
            |/usr/bin/cpio        |   Committed  |
            |_____________________|______________|

5. Reference Documents:
    5.1 CR 4480319 cpio does not properly copy files that are
	sparse (have holes in them)
    5.2 PSARC/2004/770: Holey file knowledge
    5.3 PSARC/2006/361: Add holey file support to pax
    5.4 PSARC/2008/727/materials/cpio.1: Updated cpio.1 man page

6. Resources and Schedule
    6.4. Steering Committee requested information
   	6.4.1. Consolidation C-team Name:
		ON
    6.5. ARC review type: Automatic
    6.6. ARC Exposure: open



