[zfs-discuss] dedup question
Victor Latushkin
Victor.Latushkin at Sun.COM
Mon Nov 2 09:07:36 PST 2009
Enda O'Connor wrote:
> It works at a pool-wide level, with the ability to exclude at the
> dataset level. Or the converse: if dedup is set to off at the top-level
> dataset, you can then set lower-level datasets to on, i.e. one can
> include and exclude depending on the datasets' contents.
>
> So largefile will get deduped in the example below.
And you can use 'zdb -S' (which is a lot better now than it was
before dedup) to see how much benefit there would be, without even
turning dedup on:
bash-3.2# zdb -S rpool
Simulated DDT histogram:

bucket             allocated                        referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1     625K    9.9G   7.90G   7.90G     625K    9.9G   7.90G   7.90G
     2     9.8K    184M    132M    132M    20.7K    386M    277M    277M
     4    1.21K   16.6M   10.8M   10.8M    5.71K   76.9M   48.6M   48.6M
     8      395    764K    745K    745K    3.75K   6.90M   6.69M   6.69M
    16      125   2.71M    888K    888K    2.60K   54.2M   17.9M   17.9M
    32       56   2.10M    750K    750K    2.33K   85.6M   29.8M   29.8M
    64        9   22.0K   22.0K   22.0K      778   2.04M   2.04M   2.04M
   128        4   6.00K   6.00K   6.00K      594    853K    853K    853K
   256        2      8K      8K      8K      711   2.78M   2.78M   2.78M
   512        2   4.50K   4.50K   4.50K    1.47K   3.52M   3.52M   3.52M
    8K        1    128K    128K    128K    15.9K   1.99G   1.99G   1.99G
   16K        2      8K      8K      8K    50.7K    203M    203M    203M
 Total     637K   10.1G   8.04G   8.04G     730K   12.7G   10.5G   10.5G

dedup = 1.30, compress = 1.22, copies = 1.00, dedup * compress / copies = 1.58
bash-3.2#
Be careful - it can eat lots of RAM!
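
For anyone wondering where the summary line comes from, here is a rough sketch that recomputes it from the Total row of the output above. The formulas are my assumption about how the ratios are derived (referenced DSIZE over allocated DSIZE for dedup, referenced LSIZE over referenced PSIZE for compress, referenced DSIZE over referenced PSIZE for copies), and parse_size is a hypothetical helper, not part of any ZFS tool. Because the table prints rounded sizes, the results can differ from the zdb figures by a hundredth or so.

```python
# Binary unit suffixes as printed in the histogram (assumed to be powers of 1024).
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def parse_size(text):
    """Convert a size such as '8.04G' or '745K' to a number of bytes.

    Hypothetical helper for this sketch only.
    """
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


# Total row copied from the output above:
# label, then allocated blocks/LSIZE/PSIZE/DSIZE, then referenced blocks/LSIZE/PSIZE/DSIZE
total = "Total 637K 10.1G 8.04G 8.04G 730K 12.7G 10.5G 10.5G".split()

alloc_dsize = parse_size(total[4])
ref_lsize, ref_psize, ref_dsize = (parse_size(x) for x in total[6:9])

# Assumed definitions of the three ratios (see the note above).
dedup = ref_dsize / alloc_dsize
compress = ref_lsize / ref_psize
copies = ref_dsize / ref_psize
combined = dedup * compress / copies

print(f"dedup = {dedup:.2f}, compress = {compress:.2f}, "
      f"copies = {copies:.2f}, combined = {combined:.2f}")
```

Read this way, the combined figure is simply the logical data referenced divided by the disk space actually allocated.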
Many thanks to Jeff and all the team!
Regards,
Victor
> Enda
>
> Breandan Dezendorf wrote:
>> Does dedup work at the pool level or the filesystem/dataset level?
>> For example, if I were to do this:
>>
>> bash-3.2$ mkfile 100m /tmp/largefile
>> bash-3.2$ zfs set dedup=off tank
>> bash-3.2$ zfs set dedup=on tank/dir1
>> bash-3.2$ zfs set dedup=on tank/dir2
>> bash-3.2$ zfs set dedup=on tank/dir3
>> bash-3.2$ cp /tmp/largefile /tank/dir1/largefile
>> bash-3.2$ cp /tmp/largefile /tank/dir2/largefile
>> bash-3.2$ cp /tmp/largefile /tank/dir3/largefile
>>
>> Would largefile get dedup'ed? Would I need to set dedup on for the
>> pool, and then disable where it isn't wanted/needed?
>>
>> Also, will we need to move our data around (send/recv or whatever your
>> preferred method is) to take advantage of dedup? I was hoping the
>> blockpointer rewrite code would allow an admin to simply turn on dedup
>> and let ZFS process the pool, eliminating excess redundancy as it
>> went.
>>
>
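
The include/exclude behaviour Enda describes can be pictured as ordinary property inheritance: a dataset uses its own dedup setting if it has one, otherwise the nearest ancestor's. The sketch below is only a toy model of that lookup under the assumption that an unset property falls back to "off"; effective_dedup and the settings dict are made up for illustration and are not a ZFS API.

```python
def effective_dedup(dataset, local_settings, default="off"):
    """Return the dedup value a dataset would use in this toy model.

    Walks from the dataset up towards the pool root; the nearest
    locally set value wins, otherwise the default applies.
    """
    name = dataset
    while True:
        if name in local_settings:
            return local_settings[name]
        if "/" not in name:
            # Reached the top-level dataset without finding a local value.
            return default
        # Step up to the parent dataset.
        name = name.rsplit("/", 1)[0]


# Local settings from Breandan's example.
settings = {
    "tank": "off",
    "tank/dir1": "on",
    "tank/dir2": "on",
    "tank/dir3": "on",
}

# tank/dir1/child and tank/other are hypothetical datasets added to
# show inheritance in both directions.
for ds in ("tank", "tank/dir1", "tank/dir1/child", "tank/other"):
    print(ds, effective_dedup(ds, settings))
```

In this model the three dirN datasets resolve to "on" while tank itself and any other child stay "off", which matches the "set off at the top, on below" case.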