Comparing directory trees with diff or rsync

When you need to find out whether (and how) the contents of two directories differ, there are two common ways to use diff.

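diff -Naur DIR1 DIR2 shows all differences between the two directories as a unified diff (-N treats a file missing from one directory as empty, -a treats every file as text, -u selects the unified format, and -r recurses into subdirectories):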

$ diff -Naur website website-new
diff -Naur website/index.shtml website-new/index.shtml
--- website/index.shtml        2008-05-22 20:16:12.000000000 -0400
+++ website-new/index.shtml    2008-06-04 12:10:50.000000000 -0400
@@ -14,6 +14,7 @@
 <!-- page body -->
 <div id="body">

+<p>Welcome!</p>

 <p>
   <b>About:</b> This subject is aimed at students with little or no
diff -Naur website/style.css website-new/style.css
--- website/style.css  2008-04-11 01:25:12.000000000 -0400
+++ website-new/style.css      2008-06-04 12:11:01.000000000 -0400
@@ -24,7 +24,7 @@
     color: white; text-decoration: none; font-weight: bold; padding: 0 0.25em;
 }

-div#body { padding: 0.1em 55px 2em 55px; font-size: small }
+div#body { padding: 0.1em 55px 2em 55px; font-size: medium }

 dd { margin-bottom: 1em }

On the other hand, if you just want to get a quick sense of the differences, diff -qr DIR1 DIR2 merely names the differing files:

$ diff -qr website website-new
Files website/index.shtml and website-new/index.shtml differ
Files website/style.css and website-new/style.css differ
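diff -qr also reports files that exist in only one of the two trees, and its exit status can be used in scripts (0 when nothing differs, 1 when something differs, 2 on trouble). The following sketch shows both; the scratch directories and file contents are stand-ins invented for the example.

```shell
# Hypothetical scratch copies standing in for website/ and website-new/.
work=$(mktemp -d)
mkdir "$work/website" "$work/website-new"
printf 'font-size: small\n'  > "$work/website/style.css"
printf 'font-size: medium\n' > "$work/website-new/style.css"
printf 'schedule\n' > "$work/website-new/schedule.shtml"   # only in the new tree

# Capture the brief report; the assignment keeps diff's exit status.
report=$(cd "$work" && diff -qr website website-new)
status=$?

printf '%s\n' "$report"
case $status in
    0) echo "trees are identical" ;;
    1) echo "trees differ" ;;
    *) echo "diff ran into trouble" ;;
esac
rm -rf "$work"
```

The report should contain one "Only in" line for schedule.shtml and one "Files ... differ" line for style.css.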

rsync can do something similar, and it works even when the two directories are not on the same host. rsync -rvnc --delete DIR1/ remotehost:path/to/DIR2/ will tell you which files rsync would update or delete on the remote host. The trailing slash on DIR1/ is important here: without it, rsync compares against a DIR1 subdirectory inside DIR2 rather than against DIR2 itself. (The -n option makes rsync do a "dry run", meaning it makes no changes on the remote host.)

$ rsync -rvnc --delete website/ laptop:projects/website/
deleting schedule.shtml
style.css

The -c option is used because we're paranoid: it forces rsync to compute and compare checksums for each file to verify that the contents are the same. Without it, rsync assumes that two files are the same if they have the same size and modification time. That quick check sometimes gives false positives, listing files whose contents are identical, if timestamps were not preserved when you made the copy. If that is acceptable, you can omit -c to get a speedup.
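The quick check can also err in the other direction: a file whose contents changed but whose size and timestamp happen to match is skipped unless -c is given. Here is a rough sketch of that case using two local scratch directories (rsync accepts local paths as well as remote ones). The directory names and file contents are made up for illustration, and the exact wording of rsync's output may vary between versions.

```shell
# Hypothetical local scratch trees standing in for the two hosts.
work=$(mktemp -d)
mkdir "$work/website" "$work/copy"
printf 'small\n' > "$work/website/style.css"
printf 'large\n' > "$work/copy/style.css"                   # same size, different content
touch -r "$work/website/style.css" "$work/copy/style.css"   # same timestamp
printf 'old page\n' > "$work/copy/schedule.shtml"           # only in the copy

if command -v rsync >/dev/null 2>&1; then
    # Quick check: size and mtime match, so style.css is not listed.
    quick=$(rsync -rvn --delete "$work/website/" "$work/copy/")
    # Checksum comparison: the differing content is detected.
    checked=$(rsync -rvnc --delete "$work/website/" "$work/copy/")
    printf '%s\n' "--- without -c ---" "$quick" "--- with -c ---" "$checked"
else
    echo "rsync is not installed; skipping"
fi
rm -rf "$work"
```

Both runs should report schedule.shtml as a deletion, but only the -c run should list style.css.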

7 comments:

  1. That's actually pretty handy!

  2. Very handy rsync script. Thanks for sharing.

  3. This comment has been removed by the author.

  4. Thanks. I was getting different results using diff and rsync, and your post clarified it for me (--checksum/-c is what you want to use in those cases).

  5. Hi,
    I am looking for an rsync script that lists the files that exist on Server1 but not on Server2, and vice versa, and also the files whose contents differ.

    Can anyone help me?

  6. Wow, thank you so much for posting this!
