Hi there,
Yesterday we set up a new DFS-R server (Windows Server 2012 R2, fully patched) and configured the replication groups (RGs) we needed. After a while we saw that DFS-R was not replicating some of the RGs because of event 4004 (error 9098, a tombstone content....)
We followed the post "2K8 DFSR Replication not working" and set IsPrimary on all the RGs. Everything went fine, and the smaller RGs began to show as fully replicated within a few minutes (event 4104).
At 0:37:23, event 4104 was logged on the new server for the biggest RG (Clients).
This morning the users had lost most of their files. After looking for them, we found that on the Clients RG DFS-R had detected conflicts and moved the files to ConflictAndDeleted. Because the ConflictAndDeleted quota was smaller than the total size of the files moved, some files were purged and we had to restore them from backup.
On the primary (production) server we found many log entries showing that, although it was the primary (there was no typo when setting the primary), it decided to move the files out because the secondary supposedly had a better version of each file. The files do not exist on the secondary server, so we think they were only referenced in its DFS-R database.
Here is what the debug log shows for one of the moved files:
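In hindsight, a quick size check before enabling replication would have told us whether the quota could hold the whole folder in the worst case. The following is only a minimal sketch of that check: the function names, the path and the quota value in the usage note are hypothetical, and it compares raw file sizes only, which is not necessarily how DFS-R itself accounts for the quota.

```python
import os

def tree_size_bytes(root):
    """Sum the sizes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                # Skip files that vanish or cannot be read during the scan.
                pass
    return total

def fits_in_quota(total_bytes, quota_mb):
    """True if total_bytes fits in a quota expressed in megabytes."""
    return total_bytes <= quota_mb * 1024 * 1024
```

For example, `fits_in_quota(tree_size_bytes(r"D:\Clients"), 4096)` would check a hypothetical replicated folder against a hypothetical 4096 MB quota.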
20140218 03:50:39.154 1684 MEET 4265 Meet::ProcessUid Uid related found uidRelatedGvsn:{F48E1426-B808-48DF-A34B-FCEB3E1859A6}-v1623883 updateName:FILENAME.sql uid:{F48E1426-B808-48DF-A34B-FCEB3E1859A6}-v1623883 gvsn:{59754836-2D30-450E-81CB-FF0CFD984951}-v1013776
connId:{8B82B4F8-055D-49BB-A501-8B435A5D8638} csName:Clients
20140218 03:50:39.154 1684 MEET 6337 Meet::LocalDominates Remote version dominates localgvsn:{F48E1426-B808-48DF-A34B-FCEB3E1859A6}-v1623883 updateName:FILENAME.sql uid:{F48E1426-B808-48DF-A34B-FCEB3E1859A6}-v1623883 gvsn:{59754836-2D30-450E-81CB-FF0CFD984951}-v1013776
connId:{8B82B4F8-055D-49BB-A501-8B435A5D8638} csName:Clients
20140218 03:50:39.154 1684 MEET 5481 Meet::MoveOut Moving contents and children out of replica. newName:FILENAME-{F48E1426-B808-48DF-A34B-FCEB3E1859A6}-v1623883.sql updateName:FILENAME.sql uid:{F48E1426-B808-48DF-A34B-FCEB3E1859A6}-v1623883
gvsn:{59754836-2D30-450E-81CB-FF0CFD984951}-v1013776 connId:{8B82B4F8-055D-49BB-A501-8B435A5D8638} csName:Clients record:
+ fid 0x1000000064EAB
+ usn 0x3ca393c8
+ uidVisible 1
+ filtered 0
+ journalWrapped 0
+ slowRecoverCheck 0
+ pendingTombstone 0
+ internalUpdate 0
+ dirtyShutdownMismatch 0
+ meetInstallUpdate 0
+ meetReanimated 0
+ recUpdateTime 20140217 15:38:11.906 GMT
+ present 1
+ nameConflict 0
+ attributes 0x20
+ ghostedHeader 0
+ data 0
+ gvsn {F48E1426-B808-48DF-A34B-FCEB3E1859A6}-v1623883
+ uid {F48E1426-B808-48DF-A34B-FCEB3E1859A6}-v1623883
+ parent {F48E1426-B808-48DF-A34B-FCEB3E1859A6}-v1623863
+ fence Initial Primary (2)
+ clockDecrementedInDirtyShutdown 0
+ clock 20121126 20:29:47.985 GMT (0x1cdcc14c75e0a85)
+ createTime 20111028 14:16:42.703 GMT
+ csId {265B5CE8-584B-4D45-992C-637FA56D6F20}
+ hash 16B9EA7E-A8EFDF3E-C1C68CA9-68C7CCB0
+ similarity 00000000-00000000-00000000-00000000
+ name FILENAME.sql
+
20140218 03:50:39.154 1684 MEET 5657 Meet::MoveOut Moving to conflict/deleted:0x3000000000257 updateName:FILENAME.sql uid:{F48E1426-B808-48DF-A34B-FCEB3E1859A6}-v1623883 gvsn:{59754836-2D30-450E-81CB-FF0CFD984951}-v1013776 connId:{8B82B4F8-055D-49BB-A501-8B435A5D8638}
csName:Clients
As you can see, the fence is "Initial Primary" (MSDN - Fence), so even if the remote has a newer version, the primary's copy should overwrite it, shouldn't it?
We have resolved the immediate crisis, but we will need to enable DFS-R again and we can't afford to hit the same problem a second time.
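To get the full list of affected files we scanned the debug log for the same "Moving to conflict/deleted" entries. This is a rough sketch of that scan, not an official tool: the regular expression is inferred only from the excerpt above, it assumes each log entry sits on a single line in the actual log file (the excerpt is wrapped by the forum), and the names `MOVE_OUT` and `moved_files` are made up for illustration.

```python
import re

# Pattern inferred from the Meet::MoveOut entry in the excerpt above.
MOVE_OUT = re.compile(
    r"^(?P<date>\d{8}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"Meet::MoveOut Moving to conflict/deleted:\S+ "
    r"updateName:(?P<name>.+?) uid:"
)

def moved_files(lines):
    """Yield (date, time, file name) for each move to conflict/deleted."""
    for line in lines:
        match = MOVE_OUT.search(line)
        if match:
            yield match.group("date"), match.group("time"), match.group("name")
```

It can be fed any iterable of lines, for example an open file object for a decompressed copy of the log.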
Thanks in advance
Sergi