We have three Server 2012 DFS/DFS-R servers at separate sites storing roaming profiles and redirected folders. We recently decided to enable data deduplication to take advantage of the storage savings. The third server was basically an OS reinstall with the data pre-seeded before we enabled replication on it, so there was no jet database available on it. Before we enabled data deduplication on the volumes, given our current amount of data (only around 2 TB, nothing earth shattering), it would usually take 2-3 days tops for it to spit out all of the 4104s.
With deduplication enabled before starting the replication process, and running with basically out-of-the-box settings (file age set at 0 days, with Enable background optimization turned on), it took roughly three times as long to get the same 4104s on the same set of data.
Now, all this being said, one of the servers recently blue-screened, which caused an improper shutdown of the volumes and spit out the 2212s as expected. Unfortunately, one of the volumes did not recover with a 2218, so off we go on another rebuild. The jet databases are still taking very noticeably longer to rebuild. I recently changed the deduplication file age setting from 0 to 5 days and told it to run only at 1-hour intervals to see if that would help speed up the process, but it doesn't appear to have made any difference.
I guess my question is: is this considered normal behavior, given that the data set was deduplicated prior to the rebuild? What would be the recommended way of getting back to 1-3 days of recovery time versus 7-10 days? Disabling deduplication on the volume before allowing the jet database to rebuild? Excluding the prestage folder, or other DFS-R related folders, from deduplication? Moving to Server 2012 R2 for its quicker jet database copy is not an option for the time being.
Any help would be appreciated, because if deduplication is going to keep increasing the time it takes to rebuild the jet database the way it is now, I don't think saving 40-60% of storage is worth it in our environment.
Gary Adkins