Compare two directories in Linux using rsync

TL;DR: Use this script to compare two complex directories after a backup, restore or upgrade.

The best way to upgrade HCL Domino 12 to HCL Domino 14 on Linux is to create a new HCL Domino server from the latest internal CentOS Stream template, which is customized and hardened. Once it is created, copy the Domino files from the previous server to the new server using SCP: “scp -r root@172.22.10.234:/local/notesdata/* /local/notesdata/”. All you then need to do is change the ownership of the files and run the HCL Domino 14 installer again. The last step is to change the IP address to that of the previous server, and HCL Domino is up and running on a new CentOS server on a new Proxmox server. A blog post expanding on this simplified summary should follow.

The biggest question, though, is: did all the files copy across? How can you be sure?

Here is a handy method to compare a local and a remote directory. It shows which files exist in one directory but not in the other, and which files exist in both but differ in size or timestamp.

This task is more complex than you might think because of the many options available. Here is an option that works for us, but your mileage may vary (YMMV). The script was created by “ndemou” and can be found here: https://unix.stackexchange.com/questions/57305/rsync-compare-directories

In essence, you create the shell script in a directory of your choice and make it executable.
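As a minimal sketch of that step, assuming ~/bin as the directory of choice and filecomp.sh as the file name (the name used in our call further down); both are only illustrative choices and `install_dir` is a hypothetical variable:

```shell
#!/bin/bash
# Assumption: ~/bin is the "directory of choice"; any directory works.
install_dir="${INSTALL_DIR:-$HOME/bin}"
mkdir -p "$install_dir"

# Create the file if it does not exist yet, then paste the script into it
# with an editor of your choice.
touch "$install_dir/filecomp.sh"

# Make the script executable.
chmod +x "$install_dir/filecomp.sh"
```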

The script is called with: diff-dirs Left_Dir Right_Dir [options] (diff-dirs is the original name; we saved it as filecomp.sh)

#!/bin/bash
# Compare two directories using rsync and print the differences
# CAUTION: options MUST appear after the directories
#
# SYNTAX
#---------
# diff-dirs Left_Dir Right_Dir [options]
#
# EXAMPLE OF OUTPUT
#------------------
# L             file-only-in-Left-dir
# R             file-only-in-right-dir
# X >f.st...... file-with-dif-size-and-time
# X .f...p..... file-with-dif-perms
#
# L / R mean that the file/dir appears only in the `L`eft or `R`ight dir.
#
# X     means that a file appears on both sides but is not the same (in which
#       case the next 11 characters give you more info). In most cases knowing
#       that s, t/T and p depict differences in Size, Time and Permissions
#       is enough, but `man rsync` has more info
#       (look at the --itemize-changes option)
#
# OPTIONS
#---------
# All options are passed to rsync. Here are the most useful for the purpose
# of directory comparisons:
#
# -c will force comparison of file contents (otherwise only
#    time & size is compared which is much faster)
#
# -p/-o/-g will force comparison of permissions/owner/group

if [[ -z $2 ]] ; then
    echo "USAGE: $0 dir1 dir2 [optional rsync arguments]"
    exit 1
fi

set -e

LEFT_DIR=$1; shift
RIGHT_DIR=$1; shift
# Keep the remaining arguments as an array so that quoted rsync options
# (for example an --exclude pattern containing spaces) survive intact
OPTIONS=("$@")

# Files that don't exist in Right_Dir
rsync "${OPTIONS[@]}" -rin --ignore-existing "$LEFT_DIR"/ "$RIGHT_DIR"/ | sed -e 's/^[^ ]* /L             /'
# Files that don't exist in Left_Dir
rsync "${OPTIONS[@]}" -rin --ignore-existing "$RIGHT_DIR"/ "$LEFT_DIR"/ | sed -e 's/^[^ ]* /R             /'
# Files that exist in both dirs but have differences
rsync "${OPTIONS[@]}" -rin --existing "$LEFT_DIR"/ "$RIGHT_DIR"/ | sed -e 's/^/X /'

Here is how we call the script:

./filecomp.sh /local/notesdata/ root@172.22.10.232:/local/notesdata/ > /mnt/SMBshare/primetoprimeoldfullcompare.txt

You will need to enter the SSH password three times (once for each rsync run) or use SSH keys. See: https://superuser.com/questions/1605215/how-to-specify-password-in-ssh-command

Here is an extract of the output after running the script.

Files that exist only on the left side, i.e. the local machine. In our case that is the new HCL Domino server.

L IDB05289.DTF
L IDB08585.DTF
L IDB15501.DTF
L IDB19442.DTF
L IDB44535.DTF
L IDB46512.DTF
L IDB54971.DTF
L IDB59036.DTF
L IDB89341.DTF
L admincentral.nsf
L admincentral.ntf
L autoupdate.ntf

Files that exist only on the right side, i.e. the remote (old) HCL Domino server.

R domino/workspace/.config/org.eclipse.osgi/117/0/
R domino/workspace/.config/org.eclipse.osgi/117/0/.cp/
R domino/workspace/.config/org.eclipse.osgi/117/0/.cp/lib/
R domino/workspace/.config/org.eclipse.osgi/117/0/.cp/lib/commons-codec-1.3-minus-mp.jar
R domino/workspace/.config/org.eclipse.osgi/117/0/.cp/lib/commons-fileupload-1.4.jar
R domino/workspace/.config/org.eclipse.osgi/117/0/.cp/lib/commons-io-2.2.jar

Files that exist on both sides but differ in size or time.

X <f.sT...... AgentRunner.nsf
X <f.sT...... DAOSsnap.ntf
X <f..T...... DCT.nsf
X <f..T...... DDMRepCach.dat
X <f..T...... DOMCFG.nsf
X <f..T...... DomShrct.sh
X <f.sT...... Forms9_x.ntf
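To make the X lines easier to read, here is a small sketch that translates the 11-character string into words. According to the rsync man page (--itemize-changes), the string has the form YXcstpoguax: the first two characters give the update type and the file type, followed by one position each for checksum, size, time, permissions, owner and group. The function name `describe_itemize` is hypothetical and not part of ndemou's script, and it only decodes the positions most relevant for a directory comparison.

```shell
#!/bin/bash
# describe_itemize: hypothetical helper, not part of the original script.
# Takes one rsync --itemize-changes string (format YXcstpoguax) and prints
# a comma-separated list of the attributes that differ.
describe_itemize() {
    local flags=$1
    local out=()
    [[ ${flags:2:1} == c ]]    && out+=("checksum")
    [[ ${flags:3:1} == s ]]    && out+=("size")
    [[ ${flags:4:1} == [tT] ]] && out+=("time")
    [[ ${flags:5:1} == p ]]    && out+=("permissions")
    [[ ${flags:6:1} == o ]]    && out+=("owner")
    [[ ${flags:7:1} == g ]]    && out+=("group")
    local IFS=,
    echo "${out[*]:-none}"
}

describe_itemize '<f.sT......'   # prints: size,time
describe_itemize '.f...p.....'   # prints: permissions
```

Positions 9 to 11 (reserved or access time, ACL, extended attributes) are left out on purpose; `man rsync` remains the reference for the full format.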

You have to interpret this output with your HCL Domino mindset. Some indexes are not needed, and you could have deleted the *.log.gz files used by Daniel’s script. Also, because it is best to run the Domino 14 upgrade afterwards, there may be additional files on the left side. In addition, the script compares only one pair of directories per run, and at most one of the two can be remote, because rsync does not work between two remote hosts.
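Before reading a long report line by line, it can help to see how many entries fall into each group. The following is a minimal sketch, assuming the report was produced by the script above, so every line starts with L, R or X followed by a space. The function name `summarize_report` is hypothetical.

```shell
#!/bin/bash
# summarize_report: hypothetical helper that counts report lines by marker.
# Argument: path to a report file written by the comparison script.
summarize_report() {
    awk '
        $1 == "L" { left++ }
        $1 == "R" { right++ }
        $1 == "X" { differ++ }
        END {
            printf "only on left:  %d\n", left
            printf "only on right: %d\n", right
            printf "different:     %d\n", differ
        }
    ' "$1"
}
```

In our case the call would be summarize_report /mnt/SMBshare/primetoprimeoldfullcompare.txt. To list only one group, a plain grep on the marker is enough, for example grep '^X ' on the same file.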

To find the largest files use this command:

du -a /local/notesdata/ | sort -n -r | head -n 20 
Or
find /local/notesdata/ -type f -printf '%s %p\n' | sort -nr | head -10
