[zfs-discuss] Need Help Invalidating Uberblock
Nathan Hand
nhand at manu.com.au
Mon Dec 15 14:23:37 PST 2008
I've had some success.
I started with the ZFS on-disk format PDF.
http://opensolaris.org/os/community/zfs/docs/ondiskformat0822.pdf
The uberblocks all have magic value 0x00bab10c. I used od -x to find that value in the vdev; od prints little-endian 16-bit words, so the magic shows up as "b10c 00ba".
root@opensolaris:~# od -A x -x /mnt/zpool.zones | grep "b10c 00ba"
020000 b10c 00ba 0000 0000 0004 0000 0000 0000
020400 b10c 00ba 0000 0000 0004 0000 0000 0000
020800 b10c 00ba 0000 0000 0004 0000 0000 0000
020c00 b10c 00ba 0000 0000 0004 0000 0000 0000
021000 b10c 00ba 0000 0000 0004 0000 0000 0000
021400 b10c 00ba 0000 0000 0004 0000 0000 0000
021800 b10c 00ba 0000 0000 0004 0000 0000 0000
021c00 b10c 00ba 0000 0000 0004 0000 0000 0000
022000 b10c 00ba 0000 0000 0004 0000 0000 0000
022400 b10c 00ba 0000 0000 0004 0000 0000 0000
...
So the uberblock array begins 128kB (0x20000) into the vdev and there's an uberblock every 1kB.
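The same search can be sketched in Python. This is a minimal sketch, assuming a little-endian pool and the 1kB slot spacing seen in the od output; the function names are illustrative and not part of any ZFS tool. It also shows why the magic 0x00bab10c appears as "b10c 00ba" under od -x.

```python
import struct

UBERBLOCK_MAGIC = 0x00BAB10C                   # from the on-disk format document
MAGIC_LE = struct.pack("<Q", UBERBLOCK_MAGIC)  # bytes 0c b1 ba 00 00 00 00 00


def od_x_words(chunk):
    """Render bytes the way `od -x` does on a little-endian host:
    as 16-bit words, so the two bytes of each pair appear swapped."""
    words = struct.unpack("<%dH" % (len(chunk) // 2), chunk)
    return " ".join("%04x" % w for w in words)


def find_uberblocks(image, start=0x20000, end=0x40000, stride=1024):
    """Return offsets in [start, end) on a `stride` boundary that begin
    with the uberblock magic. Defaults assume label 0 and 1kB slots."""
    return [off for off in range(start, min(end, len(image)), stride)
            if image[off:off + 8] == MAGIC_LE]


if __name__ == "__main__":
    # Synthetic 256kB "label": only two slots carry the magic.
    image = bytearray(0x40000)
    for off in (0x20000, 0x27800):
        image[off:off + 8] = MAGIC_LE
    print(od_x_words(MAGIC_LE))
    print([hex(o) for o in find_uberblocks(bytes(image))])
```

On a real vdev file the image would be read from disk (for example the first 256kB only) rather than built in memory.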
To identify the active uberblock I used zdb.
root@kestrel:/opt$ zdb -U -uuuv zones
Uberblock
magic = 0000000000bab10c
version = 4
txg = 1504158 (= 0x16F39E)
guid_sum = 10365405068077835008 = (0x8FD950FDBBD02300)
timestamp = 1229142108 UTC = Sat Dec 13 15:21:48 2008 = (0x4943385C)
rootbp = [L0 DMU objset] 400L/200P DVA[0]=<0:52e3edc00:200> DVA[1]=<0:6f9c1d600:200> DVA[2]=<0:16e280400:200> fletcher4 lzjb LE contiguous birth=1504158 fill=172 cksum=b0a5275f3:474e0ed6469:e993ed9bee4d:205661fa1d4016
I spy those hex values in the uberblock starting at offset 0x27800.
027800 b10c 00ba 0000 0000 0004 0000 0000 0000
027810 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
027820 385c 4943 0000 0000 0001 0000 0000 0000
027830 1f6e 0297 0000 0000 0001 0000 0000 0000
027840 e0eb 037c 0000 0000 0001 0000 0000 0000
027850 1402 00b7 0000 0000 0001 0000 0703 800b
027860 0000 0000 0000 0000 0000 0000 0000 0000
027870 0000 0000 0000 0000 f39e 0016 0000 0000
027880 00ac 0000 0000 0000 75f3 0a52 000b 0000
027890 6469 e0ed 0474 0000 ee4d ed9b e993 0000
0278a0 4016 fa1d 5661 0020 0000 0000 0000 0000
0278b0 0000 0000 0000 0000 0000 0000 0000 0000
Breaking it down
* the first 8 bytes are the magic uberblock number (b10c 00ba 0000 0000)
* the second 8 bytes are the version number (0004 0000 0000 0000)
* the third 8 bytes are the transaction group, a.k.a. txg (f39e 0016 0000 0000)
* the fourth 8 bytes are the guid sum (2300 bbd0 50fd 8fd9)
* the fifth 8 bytes are the timestamp (385c 4943 0000 0000)
The remainder of the bytes are the "blkptr" structure and I'll ignore them.
Those values match the active uberblock exactly, so I know this is the on-disk location of the first active uberblock.
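The breakdown above can be checked mechanically. A minimal sketch, assuming the five leading fields are little-endian 64-bit integers as the dump suggests; the helper names are illustrative. It undoes the od -x word swapping on the first three rows of the dump and unpacks the fields.

```python
import struct

# First three rows of the od -x dump at 0x27800, copied from above.
DUMP = """\
b10c 00ba 0000 0000 0004 0000 0000 0000
f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
385c 4943 0000 0000 0001 0000 0000 0000
"""


def od_words_to_bytes(text):
    """Undo od -x: each 4-digit group is a little-endian 16-bit word."""
    words = [int(w, 16) for w in text.split()]
    return struct.pack("<%dH" % len(words), *words)


def decode_uberblock_header(raw):
    """Unpack the five leading 64-bit little-endian fields."""
    magic, version, txg, guid_sum, timestamp = struct.unpack_from("<5Q", raw, 0)
    return {"magic": magic, "version": version, "txg": txg,
            "guid_sum": guid_sum, "timestamp": timestamp}


header = decode_uberblock_header(od_words_to_bytes(DUMP))

if __name__ == "__main__":
    for name, value in header.items():
        print("%-9s = %d (0x%x)" % (name, value, value))
```

The decoded txg (1504158) and timestamp (1229142108) are the values zdb reported for the active uberblock.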
Scanning further I find an exact duplicate 256kB later in the device.
067800 b10c 00ba 0000 0000 0004 0000 0000 0000
067810 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
067820 385c 4943 0000 0000 0001 0000 0000 0000
067830 1f6e 0297 0000 0000 0001 0000 0000 0000
067840 e0eb 037c 0000 0000 0001 0000 0000 0000
067850 1402 00b7 0000 0000 0001 0000 0703 800b
067860 0000 0000 0000 0000 0000 0000 0000 0000
067870 0000 0000 0000 0000 f39e 0016 0000 0000
067880 00ac 0000 0000 0000 75f3 0a52 000b 0000
067890 6469 e0ed 0474 0000 ee4d ed9b e993 0000
0678a0 4016 fa1d 5661 0020 0000 0000 0000 0000
0678b0 0000 0000 0000 0000 0000 0000 0000 0000
I know ZFS keeps four copies of the vdev label; two at the front and two at the back, each 256kB in size.
root@opensolaris:~# ls -l /mnt/zpool.zones
-rw-r--r-- 1 root root 42949672960 Dec 15 04:49 /mnt/zpool.zones
That's 42949672960 bytes = 0xA00000000 = 41943040kB. If I subtract 512kB (giving skip=41942528) I should see the third and fourth labels.
root@opensolaris:~# dd if=/mnt/zpool.zones bs=1k skip=41942528 | od -A x -x | grep "385c 4943 0000 0000"
027820 385c 4943 0000 0000 0001 0000 0000 0000
512+0 records in
512+0 records out
524288 bytes (524 kB) copied, 0.0577013 s, 9.1 MB/s
root@opensolaris:~#
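Oddly enough I see the third uberblock at 0x27800 (offsets here are relative to the start of the dd stream, i.e. to the start of the third label), but the fourth uberblock, which should appear at 0x67800, is missing. Perhaps corrupted?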
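The label arithmetic can be written down as a short sketch. It assumes the device size is a multiple of 256kB, which holds for this file; the function names are illustrative. It also maps an offset printed by od in a dd stream back to an absolute file offset.

```python
LABEL_SIZE = 256 * 1024   # each label is 256kB


def label_offsets(device_size):
    """Byte offsets of the four labels: two at the front, two at the back.
    Assumes device_size is a multiple of 256kB, as it is for this file."""
    return [0,
            LABEL_SIZE,
            device_size - 2 * LABEL_SIZE,
            device_size - LABEL_SIZE]


def absolute_offset(skip_kb, stream_offset):
    """Map an od offset in a `dd bs=1k skip=N` stream to a file offset."""
    return skip_kb * 1024 + stream_offset


if __name__ == "__main__":
    size = 42949672960
    print([hex(o) for o in label_offsets(size)])
    # The grep hit was reported at 0x27820 in the stream read with skip=41942528.
    hit = absolute_offset(41942528, 0x27820)
    print(hex(hit), hit - label_offsets(size)[2] == 0x27820)
```

The hit at stream offset 0x27820 therefore sits 0x27820 bytes into the third label, the same position as in the first label.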
No matter. I now work out the exact offsets to the three valid uberblocks and confirm I'm looking at the right uberblocks.
root@opensolaris:~# dd if=/mnt/zpool.zones bs=1k skip=158 | od -A x -x | head -3
000000 b10c 00ba 0000 0000 0004 0000 0000 0000
000010 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
000020 385c 4943 0000 0000 0001 0000 0000 0000
root@opensolaris:~# dd if=/mnt/zpool.zones bs=1k skip=414 | od -A x -x | head -3
000000 b10c 00ba 0000 0000 0004 0000 0000 0000
000010 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
000020 385c 4943 0000 0000 0001 0000 0000 0000
root@opensolaris:~# dd if=/mnt/zpool.zones bs=1k skip=41942686 | od -A x -x | head -3
000000 b10c 00ba 0000 0000 0004 0000 0000 0000
000010 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
000020 385c 4943 0000 0000 0001 0000 0000 0000
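The three skip values follow from the label offsets plus the position of the slot inside the uberblock array. The sketch below rests on one assumption that the post does not state: that ZFS of this vintage writes each uberblock to slot (txg mod 128) when slots are 1kB. That assumption is consistent with this dump, since 1504158 mod 128 = 30 and 0x20000 + 30 * 1024 = 0x27800. The names are illustrative.

```python
LABEL_SIZE = 256 * 1024        # one label
UB_ARRAY_OFFSET = 128 * 1024   # uberblock array starts 128kB into a label
UB_SLOT_SIZE = 1024            # one uberblock per 1kB, as observed above
UB_SLOTS = 128                 # 128kB array / 1kB slots


def uberblock_skips(device_size, txg):
    """dd `bs=1k skip=` values for the slot holding `txg` in each label.
    Assumption (not from the post): slot index = txg mod number of slots."""
    slot = txg % UB_SLOTS
    labels = [0, LABEL_SIZE,
              device_size - 2 * LABEL_SIZE, device_size - LABEL_SIZE]
    in_label = UB_ARRAY_OFFSET + slot * UB_SLOT_SIZE
    return [(base + in_label) // 1024 for base in labels]


if __name__ == "__main__":
    print(hex(UB_ARRAY_OFFSET + (1504158 % UB_SLOTS) * UB_SLOT_SIZE))
    print(uberblock_skips(42949672960, 1504158))
```

The first three results are the skip values used above (158, 414, 41942686). The fourth is where the copy in the last label would be expected, which is the one the grep did not find.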
They all have the same timestamp. I'm looking at the correct uberblocks. Now I intentionally harm them.
root@opensolaris:/mnt# dd if=/dev/zero of=/mnt/zpool.zones bs=1k seek=158 count=1 conv=notrunc
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.000315229 s, 3.2 MB/s
root@opensolaris:/mnt# dd if=/dev/zero of=/mnt/zpool.zones bs=1k seek=414 count=1 conv=notrunc
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 3.5e-08 s, 29.3 GB/s
root@opensolaris:/mnt# dd if=/dev/zero of=/mnt/zpool.zones bs=1k seek=41942686 count=1 conv=notrunc
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.00192728 s, 531 kB/s
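The conv=notrunc flag matters here: without it dd would truncate the output file at the end of the written block. A minimal Python sketch of the same in-place overwrite, exercised on a scratch file only; zero_block and demo are illustrative names.

```python
import os
import tempfile


def zero_block(path, block_kb, size=1024):
    """Overwrite `size` bytes at block_kb * 1024 with zeros, in place.
    Opening with "r+b" keeps the rest of the file, like conv=notrunc."""
    with open(path, "r+b") as f:
        f.seek(block_kb * 1024)
        f.write(b"\0" * size)


def demo():
    """Exercise zero_block on a scratch file, never on a real vdev."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"\xff" * 4096)
        zero_block(path, 1)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


if __name__ == "__main__":
    data = demo()
    print(len(data), data[1024:2048] == b"\0" * 1024)
```

Only the second 1kB block of the scratch file is zeroed; the file length and the surrounding bytes are unchanged.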
And... fingers crossed...
root@opensolaris:/mnt# zpool import -d /mnt -f zones
root@opensolaris:/mnt#
Huzzah, the import worked.
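A plausible reading of why this works: at import ZFS picks, among the uberblocks that pass validation, the one with the highest txg, so zeroing the newest slot makes it fall back to an earlier one. The sketch below models only that selection on synthetic data. It checks the magic but does not verify the checksum that real ZFS also requires, and all names are illustrative.

```python
import struct

MAGIC = 0x00BAB10C


def make_slot(txg, timestamp=0, version=4, guid_sum=0):
    """Build a synthetic 1kB uberblock slot (header fields only)."""
    header = struct.pack("<5Q", MAGIC, version, txg, guid_sum, timestamp)
    return header + b"\0" * (1024 - len(header))


def pick_active(slots):
    """Return (index, txg) of the highest-txg slot with a valid magic,
    or None. Real ZFS also verifies each uberblock's checksum."""
    best = None
    for i, raw in enumerate(slots):
        magic, _version, txg = struct.unpack_from("<3Q", raw, 0)
        if magic == MAGIC and (best is None or txg > best[1]):
            best = (i, txg)
    return best


if __name__ == "__main__":
    slots = [make_slot(1504156), make_slot(1504157), make_slot(1504158)]
    print(pick_active(slots))          # newest slot wins
    slots[2] = b"\0" * 1024            # zero it, as dd did
    print(pick_active(slots))          # falls back to the previous txg
```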
root@opensolaris:/mnt# zpool status
pool: zones
state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
pool will no longer be accessible on older software versions.
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
zones ONLINE 0 0 0
/mnt/zpool.zones ONLINE 0 0 0
errors: No known data errors
And my filesystems are back.
root@opensolaris:/mnt# zfs list
NAME USED AVAIL REFER MOUNTPOINT
zones 23.7G 15.5G 27K /zones
zones/appserver 1.69G 15.5G 5.55G /zones/appserver
zones/base 847M 15.5G 4.20G /zones/base
zones/centos 1.35G 15.5G 1.34G /zones/centos
zones/cgiserver 2.43G 15.5G 6.24G /zones/cgiserver
zones/ds1 5.47G 15.5G 3.91G /zones/ds1
zones/ds2 616M 15.5G 3.88G /zones/ds2
zones/webserver 11.3G 15.5G 15.1G /zones/webserver
Initial inspection of the filesystems is promising. I can read from files, there are no panics, and everything seems to be intact.
I hope this helps other people recover corrupted zpools, until such time as there are tools to automate this process.