
As storage size is becoming a consideration again, I have started thinking about dedup a bit.

Looking up some of the information here:

http://sun.systemnews.com/articles/146/4/OpenStorage/22961

It gives some decent numbers about dedup table (DDT) size, and explains which operations are affected by the DDT.

Given the x4540 configuration, we get roughly 30 TB of storage, of which we would probably want at most about 27 TB in use.

This suggests a DDT that is larger than the total memory of the system, let alone the portion that could be allocated to the DDT.
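
A back-of-envelope sketch of that estimate (the ~320 bytes per in-core DDT entry and the 128 KiB average block size are assumed, commonly cited figures, not measurements from our pools):

```python
# Rough DDT sizing; the per-entry size and average block size below
# are assumptions, not measurements from our pools.
DATA_BYTES = 27 * 2**40          # ~27 TiB of data we expect to keep in use
AVG_BLOCK_SIZE = 128 * 2**10     # assume the default 128 KiB recordsize
DDT_ENTRY_BYTES = 320            # commonly cited in-core size per unique block

unique_blocks = DATA_BYTES // AVG_BLOCK_SIZE
ddt_bytes = unique_blocks * DDT_ENTRY_BYTES

print(f"unique blocks : {unique_blocks:,}")
print(f"DDT size      : {ddt_bytes / 2**30:.1f} GiB")
# ~226 million blocks -> roughly 67 GiB of DDT, on the order of (or beyond)
# the machine's total RAM, before the ARC gets to cache any actual data.
```

Smaller average block sizes make this considerably worse, since the entry count scales with the number of unique blocks, not the amount of data.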

While this should mostly be irrelevant, it will affect a number of operations that are either common or beyond our control.

For example, deleting large deduped objects would be painfully slow, as it requires many DDT updates, which are small random I/Os and would not perform well on our raidz2 pools.

I would like to compare deletion of large objects with an L2ARC in place, without one, and with dedup disabled.
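
A minimal sketch of such a timing run, assuming a test dataset mounted at /tank/dedup-test (the path and object size are placeholders); the same script would be run against a deduped dataset with an L2ARC device attached, the same dataset without one, and a dedup=off dataset:

```python
#!/usr/bin/env python3
# Rough timing harness for "how long does deleting a large deduped object take?"
# Intended to be run once per configuration: dedup=on with L2ARC, dedup=on
# without L2ARC, dedup=off. Path and size are placeholders, not our pool layout.
import os
import time

PATH = "/tank/dedup-test/bigobject"   # assumed test dataset mount point
SIZE_GIB = 100                        # size of the test object
CHUNK = 1 << 20                       # write in 1 MiB chunks

# Write unique (random) data so that every block gets its own DDT entry;
# repetitive data would collapse onto a handful of entries and not
# exercise the DDT on delete.
with open(PATH, "wb") as f:
    for _ in range(SIZE_GIB * 1024):
        f.write(os.urandom(CHUNK))
    f.flush()
    os.fsync(f.fileno())

# Time the unlink plus a sync: much of the block-freeing (and DDT update)
# work happens when the transaction group syncs, so timing the unlink
# alone can look deceptively fast.
start = time.time()
os.remove(PATH)
os.sync()
print(f"delete + sync took {time.time() - start:.1f} s")
```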

A number of people have seen systems essentially hang during these operations, due to the time they take and the I/O queues filling up and not servicing other requests.
I am curious whether we would run into that (I think it should be improved around build 132 or 133) or not, given the large amount of memory available.