Tuesday, 16 December 2014

Understanding DFS replication processes and what to do if it stops working

First of all, a quick introduction to Microsoft's DFSR.  DFS Replication (DFSR) was introduced in Windows Server 2003 R2 and is the mechanism used to replicate files between servers.  This is especially useful when you are using DFS namespaces to publish file shares, as you want every possible target in DFS to hold the same files.  If you're now lost, you should probably go and read up on DFS concepts on Microsoft's sites.

Now on to the nuts and bolts of how the DFS Replication service works.

The DFS Replication service maintains a database of filenames, paths and hashes in the System Volume Information\DFSR folder.  It also holds a copy of the database in memory while it is running.  There is only a single database *per drive letter*.  Do not mess with this!

DfsrPrivate is a symlink in each folder configured for replication which points to the DFSR database.
When the first member is added to a replication group, it is designated primary and builds its database of file hashes.
During the build process it is marked as primary (check with dfsradmin membership list /RgName:xxxxx /Attr:MemName,RfName,IsPrimary, where xxxxx is your replication group name; use the full path including the FQDN if present).
Once the database build is complete, the primary flag is cleared and Event 4112 is logged in the event log.  The replication state also changes to 4 (Normal).

e.g. (where dfsns is the namespace)

D:\>dfsradmin membership list /RgName:domain.local\dfsns\RepGroup1 /Attr:MemName,RfName,IsPrimary
MemName  RfName   IsPrimary
D:\>Wmic /namespace:\\root\microsoftdfs path dfsrreplicatedfolderinfo get replicationgroupname,replicatedfoldername,state | find /I "FOLDER1"
FOLDER1      domain.local\dfsns\RepGroup1   0

D:\>dfsradmin membership list /RgName:domain.local\dfsns\RepGroup1 /Attr:MemName,RfName,IsPrimary
MemName  RfName   IsPrimary
D:\>Wmic /namespace:\\root\microsoftdfs path dfsrreplicatedfolderinfo get replicationgroupname,replicatedfoldername,state | find /I "FOLDER1"
FOLDER1  domain.local\dfsns\RepGroup1   4
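
The state column in that WMIC output is just a number, so here is a rough Python sketch for turning it into something readable.  The names are the ones documented for the State property of the DfsrReplicatedFolderInfo WMI class; the helper name, and the assumption that the folder is the first token on the line and the state the last, are mine rather than part of any Microsoft tool.

```python
# Names for the State property of the DfsrReplicatedFolderInfo WMI class.
DFSR_STATES = {
    0: "Uninitialized",
    1: "Initialized",
    2: "Initial Sync",
    3: "Auto Recovery",
    4: "Normal",
    5: "In Error",
}


def describe_state(wmic_line):
    """Return (folder, state name) for one line of the filtered WMIC output.

    Assumes the folder name is the first whitespace-separated token and the
    numeric state is the last one, as in the example output above.
    """
    tokens = wmic_line.split()
    folder, code = tokens[0], int(tokens[-1])
    return folder, DFSR_STATES.get(code, "Unknown (%d)" % code)


if __name__ == "__main__":
    print(describe_state(r"FOLDER1  domain.local\dfsns\RepGroup1   4"))
```

Anything other than 4 on the primary means the initial build is probably still in progress, so hold off adding the next member.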

If another server is introduced to the replication group while this process is happening, bad things ™ happen, so let it complete!  This can take several hours on very large servers.

The database is built with a fence value assigned against each file.  This is used to resolve conflicts (e.g. the same file being found in the same place on another server).

All files on the primary member are assigned the “Initial Primary” fence value.  This guarantees that this server is considered the authoritative server during initial replication.  If another server is introduced before the initial database build is complete, that second server considers itself primary too, so it assigns the same fence value.  Hence the conflicts and the bad things ™.

Once the database has been built, a second member can be introduced.  This will start building its database and assign the Initial Sync fence value to all files it finds (assuming there are existing files).  The second server will then compare the database with the first server and start copying over any missing or different files.

•    If a file doesn’t exist on the secondary member, it is simply copied over and the process moves on to the next file.
•    If a file already exists, the fence values are compared for conflict resolution.
   o    The higher fence value wins and overwrites the lower.  RDC (Remote Differential Compression) is used to check the files for differences (comparing the file hashes) and changed blocks are copied to the second server if required (usually nothing is copied because the files are already the same, e.g. pre-seeded).
   o    If they have the same fence value, the bad things ™ now occur as there is a conflict.  Conflict resolution is invoked and uses the creation time and last modified time to decide which file should win.  Conflicting files are moved to the DfsrPrivate\ConflictAndDeleted folder.
This means that the live data on the primary server is moved out of the live shares.  It will stay in the ConflictAndDeleted folder until the folder exceeds its quota, at which point it will be purged.
This isn’t necessarily the end of the world because the file is still on the second server and *should* eventually be replicated back, but if a user looks for their file before this happens, they will not see it.
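
The comparison described above can be sketched in a few lines of Python.  This is a simplified model of the behaviour as I've described it, not DFSR's actual implementation: the record fields and function name are made up for illustration, and the tie-break is reduced to "most recently modified wins", which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    server: str      # member holding this copy (illustrative field)
    fence: int       # fence value stored against the file
    modified: float  # last-modified time, seconds since the epoch


def resolve(a, b):
    """Return (winner, loser, conflict) for two records of the same path."""
    if a.fence != b.fence:
        # Different fences: the higher value wins outright, no conflict.
        winner, loser = (a, b) if a.fence > b.fence else (b, a)
        return winner, loser, False
    # Equal fences: a genuine conflict.  Assumption for this sketch: the most
    # recently modified copy wins and the other is moved to the conflict folder.
    winner, loser = (a, b) if a.modified >= b.modified else (b, a)
    return winner, loser, True


if __name__ == "__main__":
    primary = FileRecord("server1", fence=2, modified=100.0)
    too_early = FileRecord("server2", fence=2, modified=200.0)
    winner, loser, conflict = resolve(primary, too_early)
    print(winner.server, loser.server, conflict)
```

With both members on the same fence, the copy on the primary can end up as the loser, which is exactly the situation where live data disappears from the share.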

More information on the sequencing can be found on the MS page here:

As indicated below, the Initial Primary fence (2) is higher than the Initial Sync fence (1); these are the two values assigned during the initial setup of replication.

Value  Name             Description
0      Unfence          This file or folder will lose all conflicts.
1      Initial Sync     Initial fence value for non-primary member.
2      Initial Primary  Initial fence value for primary member.
3      Default          Default fencing value.
4      Fence            Fence with current time stamp.
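
To make the ordering concrete, here is the table above as a small Python enum.  The class and function names are purely illustrative; only the numbers and names come from the table.

```python
from enum import IntEnum


class Fence(IntEnum):
    """Fence values from the table above."""
    UNFENCE = 0
    INITIAL_SYNC = 1
    INITIAL_PRIMARY = 2
    DEFAULT = 3
    FENCE = 4


def initial_fence(is_primary):
    """Fence given to existing files when a member first builds its database."""
    return Fence.INITIAL_PRIMARY if is_primary else Fence.INITIAL_SYNC


if __name__ == "__main__":
    first = initial_fence(is_primary=True)
    second = initial_fence(is_primary=False)
    # The primary's copy outranks the copy on a member added afterwards.
    print(first.name, ">", second.name, first > second)
    # A member added too early also believes it is primary, so the fences tie.
    print(initial_fence(True) == initial_fence(True))
```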

How to hide unknown devices in Dell OpenManage Essentials (or IT Assistant)

When discovering a large number of devices, usually you'll pick up a few non-Dell bits of hardware, such as Virtual Machines or network switches.

Luckily Dell have an option to avoid cluttering up OME with these, although of course they have put it in a weird place.

Instead of being with all the main configuration options under preferences, it's actually under the discovery schedule settings in Manage / Discovery and Inventory / Discovery Schedule

Wednesday, 24 September 2014

Force removal of vmkernel ports

Trying to unbind a vmkernel port from an iSCSI adapter that is currently in use fails with the message
"Unable to unbind iscsi port."

Use the --force true parameter to force removal:
esxcli iscsi networkportal remove -A vmhbaxx -n vmkx --force true
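
If you need to do this for more than one port, a small Python helper can assemble the command line for you.  A minimal sketch: the function name and the vmhba33 / vmk1 values are hypothetical placeholders, and it only builds the argument list rather than running anything.

```python
def unbind_command(adapter, vmk, force=True):
    """Build the esxcli argument list for unbinding a vmkernel port."""
    cmd = ["esxcli", "iscsi", "networkportal", "remove", "-A", adapter, "-n", vmk]
    if force:
        # Needed when the adapter is currently in use.
        cmd += ["--force", "true"]
    return cmd


if __name__ == "__main__":
    # vmhba33 and vmk1 are placeholders, substitute your own adapter and port.
    print(" ".join(unbind_command("vmhba33", "vmk1")))
```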

Equallogic setup.pl script not working with vCLI 5.5

Dell supply (with the VMware Multipath module) a Perl setup script for their Equallogic arrays to aid configuring VMware host networking (distributed switches are not recommended for storage).

This script relies on both Perl and the VMware command line interface (vCLI) to connect to hosts.

The vCLI version usually needs to match the host ESXi version you're trying to connect to, but the latest v5.5 vCLI appears to be missing some modules.

The usual way to install them would be to use ppm or cpan.  However, ppm as implemented in the vCLI doesn't appear to list all modules, and if you've installed vCLI to the default path, dmake.exe will have issues with the spaces in the path when installing modules with cpan.  The answer is to install vCLI to a short directory name off the root (e.g. c:\vCLI).

Once you're able to install modules, from the vCLI run the following commands:
install CPAN
reload cpan
install SOAP::Lite
install MIME::Parser
install Data::UUID

This should install all the required components for the setup script. 

Unfortunately the UUID module isn't installed into the default include (@INC) path, so you need to specify it manually.

You can do this either by setting an environment variable or, my preferred option, by specifying the path in the perl command that runs the script.
First locate the module in your installation:
dir c:\UUID.pm /S

Then run perl with the -I (capital i) parameter: 
perl -I C:\vCLI\Perl\site\lib\Data setup.pl

See this page for more information on non-default @INC paths: http://perlmaven.com/how-to-change-inc-to-find-perl-modules-in-non-standard-locations
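
The two manual steps above (find UUID.pm, then pass its folder to perl with -I) can be scripted.  Here is a rough Python sketch, assuming the module lives somewhere under the vCLI install directory; the function names are my own and nothing here is part of vCLI or the Dell script.

```python
import os


def find_module_dir(root, filename="UUID.pm"):
    """Walk root and return the first directory containing filename, or None."""
    wanted = filename.lower()
    for dirpath, _dirnames, filenames in os.walk(root):
        # Compare case-insensitively, as Windows file names are.
        if wanted in (name.lower() for name in filenames):
            return dirpath
    return None


def perl_command(include_dir, script="setup.pl"):
    """Build the perl argument list with an extra include directory."""
    return ["perl", "-I", include_dir, script]


if __name__ == "__main__":
    # c:\vCLI is the short install path suggested earlier in this post.
    module_dir = find_module_dir(r"c:\vCLI")
    if module_dir is None:
        print("UUID.pm not found")
    else:
        print(" ".join(perl_command(module_dir)))
```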

Wednesday, 12 March 2014

Changing network settings in Linux

Change IP
ifconfig eth0 <ip-address> netmask <netmask> up
or edit configs:
Debian / Ubuntu: /etc/network/interfaces
Red Hat: /etc/sysconfig/network-scripts/ifcfg-eth0 or run the command /usr/sbin/netconfig

Add a new IP
ifconfig <interface>:<alias> <ip-address> netmask <netmask>

Set default gateway
route add default gw <gateway-ip>

Change dns servers
edit /etc/resolv.conf with:
nameserver <dns-server-ip> #google public dns
nameserver <dns-server-ip> #opendns
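
It's easy to fat-finger an address or netmask and knock a remote box off the network, so here is a rough Python sketch that checks the values before building the ifconfig and route commands from this post.  The function names and example addresses are made up; it uses the standard ipaddress module and only returns the command strings, it doesn't run them.

```python
import ipaddress


def ifconfig_command(interface, address, netmask):
    """Build the ifconfig command after validating address and netmask."""
    # IPv4Interface raises ValueError on a malformed address or netmask.
    iface = ipaddress.IPv4Interface("%s/%s" % (address, netmask))
    return "ifconfig %s %s netmask %s up" % (interface, iface.ip, iface.netmask)


def gateway_command(address, netmask, gateway):
    """Build the route command, refusing a gateway outside the local subnet."""
    network = ipaddress.IPv4Interface("%s/%s" % (address, netmask)).network
    if ipaddress.IPv4Address(gateway) not in network:
        raise ValueError("gateway %s is not on %s" % (gateway, network))
    return "route add default gw %s" % gateway


if __name__ == "__main__":
    # Example values only, substitute your own.
    print(ifconfig_command("eth0", "192.168.1.10", "255.255.255.0"))
    print(gateway_command("192.168.1.10", "255.255.255.0", "192.168.1.1"))
```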