iTunes Library Backup, rsync & cover art

A while ago, I wrote about synchronizing iTunes libraries between a couple of machines using rsync while allowing for multiple locations.

I continue to use that system and it works great. Since I wrote that, iTunes has added significantly improved cover art handling. In fact, the whole cover flow thing has totally changed the way I interact with my library. Much more visual now. Has that feel of flipping through a stack of CDs to discover that hidden gem I haven’t listened to in a while.

With over 16,000 songs across several thousand CDs’ worth of music, I have a huge number of albums that don’t have cover art and don’t automatically resolve art from the music store. For automatic downloading of cover art, I have been using SonicSwap Boink. For the 30% of the time it mismatches cover art, I do a Google Images search to find the cover art.

Now, I have been using this command (paths modified, obviously) to back up my music library, including all metadata:

rsync -a -v --progress --block-size=220000 mastermachine:/Volumes/Music/ /Volumes/Music/

And I noticed that the synchronization process was taking a really long time recently! Much longer than would be expected given that I haven’t been ripping the rest of the CD collection recently.

It turns out that any track with updated cover art was having at least 220,000 bytes copied between machines. Given that the actual cover art is typically around 20,000 bytes and I couldn’t imagine an additional 200k of ID3 tags associated with the cover art, that order of magnitude difference in bytes changed vs. bytes transferred is a hell of a penalty!

The problem is the block size. It is the scanning window size for rsync and also appears to set the minimum number of bytes transferred to describe a change in a large file.
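As a rough back-of-the-envelope check, here is a minimal sketch of that penalty in shell arithmetic. It assumes a simplified model, not rsync's actual delta algorithm: an in-place change of a given size forces every block it touches to be resent whole. The function name, the 20,000-byte change, and the offsets are illustrative assumptions.

```shell
#!/bin/sh
# Simplified model (an assumption, not rsync's real delta algorithm):
# every block touched by a changed byte range is resent in full.
# Usage: bytes_resent BLOCK_SIZE CHANGE_SIZE [OFFSET]
bytes_resent() {
    block=$1
    change=$2
    offset=${3:-0}
    first=$(( offset / block ))                  # index of the first touched block
    last=$(( (offset + change - 1) / block ))    # index of the last touched block
    echo $(( (last - first + 1) * block ))
}

big=$(bytes_resent 220000 20000)
small=$(bytes_resent 15000 20000)
echo "block-size=220000: about $big bytes resent"
echo "block-size=15000:  about $small bytes resent"
```

Under this model, a 20,000-byte change at the start of a file costs 220,000 bytes at the large block size and 30,000 bytes at the small one, which is in line with the slowdown I was seeing.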

I cranked it down to 15000…

rsync -a -v --progress --block-size=15000 mastermachine:/Volumes/Music/ /Volumes/Music/

…and the backups now go much, much faster. Of course, it’ll slow down the transfer of newly added music, but not by that much.

Update #1:

Denis suggested that turning on compression would fix the issue. To be specific, he suggested turning on compression in SSH and using rsync over SSH.
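For reference, a minimal sketch of that setup: rsync run over SSH with SSH-level compression switched on. The `-e` option tells rsync which remote shell to use, and `ssh -C` requests compression for that connection. The function name is made up for illustration, the host and paths are the same placeholders as in the commands above, and the block size option is simply omitted here.

```shell
#!/bin/sh
# Hypothetical wrapper; host name and paths are the placeholders used in this post.
sync_music_compressed() {
    # -e "ssh -C": run rsync over SSH with SSH compression enabled
    rsync -a -v --progress -e "ssh -C" \
        mastermachine:/Volumes/Music/ /Volumes/Music/
}

# Call it explicitly when ready to sync:
# sync_music_compressed
```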

It only helps for the transmission of the iTunes database file(s). The media files, including the cover art, don’t compress by any noticeable amount when transmitted.
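A quick way to see the effect for yourself, sketched with stand-in data rather than real files: random bytes play the part of already-compressed media (an assumption; real audio files differ in detail but should behave similarly), and a repeated line of text plays the part of highly compressible data. The 200,000-byte size and the variable names are arbitrary.

```shell
#!/bin/sh
size=200000

# Stand-in for already-compressed media: random bytes have no redundancy left.
media_gz=$(head -c "$size" /dev/urandom | gzip -c | wc -c | tr -d ' ')

# Stand-in for compressible data: the same short line repeated.
text_gz=$(yes 'artist=Example;album=Example' | head -c "$size" | gzip -c | wc -c | tr -d ' ')

echo "random bytes:    $size -> $media_gz bytes after gzip"
echo "repetitive text: $size -> $text_gz bytes after gzip"
```

The random data should come out at roughly its original size, while the repetitive text shrinks to a small fraction of it.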

Pushing compression down from rsync into ssh may have the additional advantage of also compressing the rsync administrative noise, but larger block sizes are vastly more efficient when pushing media around with rsync!

So, no, compression won’t really help this problem for anything but the iTunes database files. Given their size — many many times the size of the average audio media file — I have been using compression and simply eating the slight CPU overhead related to fruitlessly compressing the already compressed media.



5 Responses to “iTunes Library Backup, rsync & cover art”

  1. Denis says:

    Hi

    I never had such a problem with the blocksize since I’m using compression with ssh. Simply turn it on with “Compression yes” in /etc/ssh_config. And I assume everybody uses rsync over ssh!

    Denis

  2. Robert Nicholson says:

    And so the only reason why rsync doesn’t crash is because you’re not using extended attributes right?

    I mean whenever I rsync b/w two machines I will eventually see

    Invalid checksum length 1936852992
    rsync error: protocol incompatibility (code 2) at /SourceCache/rsync/rsync-24/rsync/sender.c(59)
    rsync: writefd_unbuffered failed to write 103 bytes: phase “unknown” [generator]: Broken pipe (32)
    rsync error: error in rsync protocol data stream (code 12) at /SourceCache/rsync/rsync-24/rsync/io.c(909)

  3. Robert Nicholson says:

    Actually it looks like the above problems only occur when you use local mode, so as long as you don’t do that, rsync works relatively reliably.

  4. Marc says:

    I am receiving the same Invalid checksum length, and I was a bit confused by the local mode comment. I have a shared drive on my MacBook Pro where I want to back up data on the MacBook Pro. I seem to get the invalid checksum length every time I attempt a backup. I am not using SSH for rsh, this is simply a samba share.

    Any ideas?

  5. bbum says:

    It should be noted that the Genius database file is both very large and encrypted. Thus, it is sync’d across nearly in whole every time. As such and because of the increased frequency of video files since writing this, I have bumped the block size back up significantly.
