First, some background:
- Two DFS servers running Windows Storage Server 2008 with SP2.
- Both servers are in a replication group that replicates a folder in the namespace to both servers.
- The servers are in different AD sites connected by a fast WAN connection.
- The replication schedule is set to 24/7.
The problem is that the replication backlog never catches up. It is currently at 1,900,000+ files, but only from SERVER1 -> SERVER2; replication from SERVER2 -> SERVER1 works with no backlog. The majority of file changes are made on SERVER1. After about a week of troubleshooting why the backlog keeps growing, I've found the following:
- The file hashes on SERVER1 differ from the file hashes on SERVER2 when checked with the DFSRDIAG FILEHASH command. The hashes differ even for files with the same modified date, NTFS permissions, and size.
- If I copy files from SERVER1 to SERVER2 with robocopy, the hashes match.
- Files I added as recently as 3/31/2014, with no changes made since, have different hashes on the two servers, which causes DFS to try to replicate them even though it doesn't need to.
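For anyone who wants to repeat a similar comparison in bulk, here is a minimal sketch of a content-only check between the two copies of the replicated folder. It hashes file data with SHA-256, which is not the value that DFSRDIAG FILEHASH reports, so it can only show whether the file contents themselves differ. The function names and the example paths are hypothetical, not part of my actual setup.

```python
import hashlib
import sys
from pathlib import Path


def content_hash(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file's data, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(root_a, root_b):
    """Compare every file under root_a with the same relative path under root_b.

    Returns a dict with two sorted lists of relative paths:
    'different' (contents differ) and 'missing' (not present under root_b).
    """
    root_a, root_b = Path(root_a), Path(root_b)
    result = {"different": [], "missing": []}
    for file_a in sorted(root_a.rglob("*")):
        if not file_a.is_file():
            continue
        relative = file_a.relative_to(root_a)
        file_b = root_b / relative
        if not file_b.is_file():
            result["missing"].append(relative.as_posix())
        elif content_hash(file_a) != content_hash(file_b):
            result["different"].append(relative.as_posix())
    return result


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python compare.py \\SERVER1\share\folder \\SERVER2\share\folder
    report = compare_trees(sys.argv[1], sys.argv[2])
    for kind, names in report.items():
        for name in names:
            print(f"{kind}: {name}")
```

If this reports no differences while DFSRDIAG FILEHASH still disagrees, that would suggest the mismatch comes from something other than the file data itself.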
It seems like something is causing the file hashes to change on every single file in the replicated folder on one of the servers. I've checked the antivirus (Trend Micro): it is supported with DFS and isn't even configured to run any scheduled scans. I'd like to reseed SERVER2 via robocopy so the hashes all match again, but I'm afraid the same problem will just happen again.
Does anyone have any idea why the file hashes would be changing even though the file timestamp, permissions, size, etc. are the same?