I have been trying compaction on some existing volumes that seem like good candidates: multiple TB in size, millions of small files, that sort of thing. I know this will take time to process, but does anyone have experience with this and can give some estimates? Even wild ones are better than nothing.
One of these volumes has an average file size of around 700 bytes and about 4 TB of data space used. Taking a conservative estimate of packing 2 files into 1 block, my aggregate-level savings should be at least 50% of the data set size. Ideally we should pack 5 of these files into a block, but I am working from the conservative estimate to keep the math simple. Judging by the rate at which space savings are showing up on my aggregate, we are getting back up to 500 MB / hour. At that rate, reclaiming roughly 2 TB would take about 4,000 hours, so this looks like it will take many months to finish. I am curious whether this is a multi-stage process that may speed up at some point, or whether this is similar to what others have seen.
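For anyone who wants to plug in their own numbers, here is a rough sketch of the estimate above. It assumes a 4 KiB block size, that each small file occupies one full block before compaction, and that the reclaim rate stays constant at the observed peak. The function names are made up for illustration, and it ignores metadata and snapshots, so treat the result as a ballpark only.

```python
BLOCK_SIZE = 4096  # assumption: 4 KiB blocks

TB = 10**12
MB = 10**6


def files_per_block(avg_file_bytes, block_size=BLOCK_SIZE):
    """How many whole files of this size fit into one block."""
    return block_size // avg_file_bytes


def savings_fraction(files_packed):
    """Fraction of space saved if N files that each used a full block now share one."""
    return 1 - 1 / files_packed


def hours_to_reclaim(used_bytes, fraction, rate_bytes_per_hour):
    """Hours needed to reclaim the expected savings at a constant rate."""
    return used_bytes * fraction / rate_bytes_per_hour


used = 4 * TB    # data space used on the volume
rate = 500 * MB  # observed peak space reclaimed per hour

# Conservative case: 2 files per block, so 50% savings.
conservative_hours = hours_to_reclaim(used, savings_fraction(2), rate)

# Ideal case: as many 700-byte files as fit in one block.
ideal_files = files_per_block(700)
ideal_hours = hours_to_reclaim(used, savings_fraction(ideal_files), rate)

print(f"conservative: {conservative_hours:.0f} h (~{conservative_hours / 24:.0f} days)")
print(f"ideal ({ideal_files} files/block): {ideal_hours:.0f} h (~{ideal_hours / 24:.0f} days)")
```

With my numbers the conservative case works out to about 4,000 hours, or roughly five and a half months. The ideal case has more space to reclaim, so at the same rate it would take even longer.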
I did some testing with small files in a non-prod environment, and when creating new files this worked as one would expect. So for new volumes this looks like a big win, but I am not sure it is even worth running on existing volumes if it is going to take months to process each one. I have many multi-TB volumes with small archived image files where this could really save us a lot of space.
Since compaction takes place at the aggregate level, how does this work with SnapMirror? Is it going to mirror logical data to the target, where it will need to be compacted as it is ingested? If a source volume has compaction enabled, and the target aggregate also has compaction enabled, does SnapMirror enable compaction on the target volume? Or if the source volume does not have compaction enabled, but the target aggregate and volume do, will the target get compaction disabled because it is disabled on the source volume?
-Rowl