A while ago, I wrote about synchronizing iTunes libraries between a couple of machines using rsync while allowing for multiple locations.
I continue to use that system and it works great. Since I wrote that, iTunes has added significantly improved cover art handling. In fact, the whole cover flow thing has totally changed the way I interact with my library. Much more visual now. Has that feel of flipping through a stack of CDs to discover that hidden gem I haven’t listened to in a while.
With over 16,000 songs across several thousand CDs’ worth of music, I have a huge number of albums that don’t have cover art and don’t automatically resolve art from the music store. For automatic downloading of cover art, I have been using SonicSwap Boink. For the 30% of the time it mismatches cover art, I do a Google Images search to find the cover art.
Now, I have been using this command (paths modified, obviously) to back up my music library, including all metadata:
rsync -a -v --progress --block-size=220000 mastermachine:/Volumes/Music/ /Volumes/Music/
And I noticed that the synchronization process was taking a really long time recently! Much longer than would be expected given that I haven’t been ripping the rest of the CD collection recently.
It turns out that any track with updated cover art was having at least 220,000 bytes copied between machines. Given that the actual cover art is typically around 20,000 bytes and I couldn’t imagine an additional 200k of ID3 tags associated with the cover art, that order of magnitude difference in bytes changed vs. bytes transferred is a hell of a penalty!
The problem is the block-size. It is the scanning window size for rsync, and it also appears to set the minimum number of bytes transferred to describe a change in a large file.
I cranked it down to 15000…
rsync -a -v --progress --block-size=15000 mastermachine:/Volumes/Music/ /Volumes/Music/
…and the backups now go much, much faster. Of course, it’ll slow down the transfer of newly added music, but not by that much.
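To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes, going by what I observed, that every change costs at least a whole number of blocks. The function name min_bytes_sent is made up for illustration, and this is a simplified model, not rsync’s actual delta algorithm.

```shell
# Rough lower bound on bytes sent for one changed region, ASSUMING a change
# always costs whole blocks. A simplified model, not rsync's real algorithm.
min_bytes_sent() {
    changed=$1   # bytes actually changed (e.g. new cover art)
    block=$2     # value passed to --block-size
    blocks=$(( (changed + block - 1) / block ))   # round up to whole blocks
    echo $(( blocks * block ))
}

min_bytes_sent 20000 220000   # old setting: one 220000-byte block
min_bytes_sent 20000 15000    # new setting: two 15000-byte blocks
```

Under that assumption, a 20,000 byte cover art change costs at least 220,000 bytes with the old setting and at least 30,000 with the new one.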
Denis suggested that turning on compression would fix the issue. To be specific, he suggested turning on compression in SSH and using rsync over SSH.
Compression only helps for the transmission of the iTunes database file(s). The media files, including the cover art, don’t compress by any noticeable amount when transmitted.
Pushing compression down from rsync into ssh may have the additional advantage of also compressing the rsync administrative noise, but larger block sizes are vastly more efficient when pushing media around with rsync!
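A quick way to see the difference, as a rough sketch: run gzip over some highly repetitive bytes and over some random bytes. The assumptions here are mine: /dev/zero stands in for very compressible data, /dev/urandom stands in for already-compressed audio and JPEG cover art, and compressed_size is a made-up helper name.

```shell
# Compare how well gzip shrinks repetitive bytes vs. high-entropy bytes.
# ASSUMPTION: /dev/urandom is only a stand-in for already-compressed media.
compressed_size() {
    # Read stdin and print the gzip-compressed size in bytes.
    gzip -c | wc -c | tr -d ' '
}

repetitive_size=$(head -c 100000 /dev/zero | compressed_size)
media_like_size=$(head -c 100000 /dev/urandom | compressed_size)

echo "100000 repetitive bytes compress to ${repetitive_size} bytes"
echo "100000 random bytes compress to ${media_like_size} bytes"
```

The repetitive input should collapse to almost nothing, while the random input should come out about the same size it went in.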
So, no, compression won’t really help this problem for anything but the iTunes database files. Given their size — many many times the size of the average audio media file — I have been using compression and simply eating the slight CPU overhead related to fruitlessly compressing the already compressed media.
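For reference, here is a minimal sketch of what running rsync over a compressed ssh connection might look like, using the same placeholder host and paths as above. The sync_music wrapper and the RSYNC variable are hypothetical names added for illustration. The flags themselves (-e for rsync, -C for ssh) are standard.

```shell
# Hypothetical wrapper: rsync over ssh with ssh-level compression (-C).
# Set RSYNC="echo rsync" to preview the command without touching the network.
RSYNC=${RSYNC:-rsync}

sync_music() {
    # $1 = source, $2 = destination
    $RSYNC -a -v --progress --block-size=15000 -e "ssh -C" "$1" "$2"
}

# Example (placeholder host and paths):
#   sync_music mastermachine:/Volumes/Music/ /Volumes/Music/
```

Passing -z to rsync instead would have rsync compress the file data itself, leaving ssh untouched.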