pdfcat — concatenate PDF documents

Today, I was doing my taxes and I ended up with a directory full of JPG images that were scans of documents and a handful of PDFs. I wanted a single document that I could print and/or send to my accountant.

You’d think that an OS that uses PDF — a multipage document format — would make this easy. I assumed it would. My first thought was to open all the JPG images in Preview and save as PDF (single page documents with encapsulated bitmaps).

OK. Great. Got a bunch of PDF documents. Nothing struck me as capable of saving a multipage PDF from said documents. Nor did a search of MacUpdate or Google reveal solutions beyond some random Java app, a TeX based hack that involved installing a huge pile of stuff and re-learning LaTeX, and a handful of apps that might have been able to solve the problem. Maybe.

In the end, I decided it would be easier just to write some code to do the concatenation. It was easy. I grabbed the pdf2png hack from a while back and modified it to scan a bunch of PDF documents, looking for the maximum page dimensions, and then render all pages of all documents into a new document with those maximum dimensions.
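For the curious, the two-pass idea looks roughly like the sketch below. This is not the actual pdfcat source: it works on plain (width, height) tuples instead of real PDF pages, the names max_page_size and layout_pages are made up for illustration, and centering each page on the larger canvas is an assumption (the real script may place pages differently).

```python
from typing import List, Tuple

Size = Tuple[float, float]  # (width, height) in PDF points


def max_page_size(documents: List[List[Size]]) -> Size:
    """Pass 1: find the largest width and the largest height over all pages."""
    widths = [w for pages in documents for (w, _h) in pages]
    heights = [h for pages in documents for (_w, h) in pages]
    if not widths:
        raise ValueError("no pages to concatenate")
    return max(widths), max(heights)


def layout_pages(documents: List[List[Size]]):
    """Pass 2: decide where each page lands on a canvas of the maximum size.

    Returns the canvas size and one (doc_index, page_index, x, y) entry per
    page, in input order. Centering is an assumption of this sketch.
    """
    canvas_w, canvas_h = max_page_size(documents)
    placements = []
    for doc_index, pages in enumerate(documents):
        for page_index, (w, h) in enumerate(pages):
            x = (canvas_w - w) / 2  # horizontal offset to center the page
            y = (canvas_h - h) / 2  # vertical offset to center the page
            placements.append((doc_index, page_index, x, y))
    return (canvas_w, canvas_h), placements
```

Note that the maximum width and the maximum height can come from different pages, so the output canvas may be larger than any single input page.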

The end result is pdfcat.

To use:

[albbum:/tmp/pinball] bbum% pdfcat part* whole.pdf
Processing 3 input files
Output whole.pdf
Size: 612 x 792
Processing part1.pdf
Processing part2.pdf
Processing part3.pdf
Wrote 170 pages
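The command line above treats the last argument as the output file and everything before it as input. A minimal sketch of that split, assuming nothing about how the real script parses its arguments (split_arguments is a hypothetical helper name):

```python
from typing import List, Tuple


def split_arguments(args: List[str]) -> Tuple[List[str], str]:
    """Split command-line arguments into (input paths, output path).

    The final argument is taken to be the output file; all earlier
    arguments are inputs, kept in the order the shell supplied them.
    """
    if len(args) < 2:
        raise SystemExit("usage: pdfcat input.pdf [input.pdf ...] output.pdf")
    *inputs, output = args
    return inputs, output
```

Since the shell expands part* before the script ever runs, the page order follows the shell's sort order, so part10.pdf would typically land before part2.pdf.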

The script is a totally stupid, one-off hack. It served my current purpose, which may be entirely coincidental. If you find it useful, great! If you improve it, please let me know!

(Yes, I found it odd that I couldn’t make it through filing taxes without writing some code. Not that odd, though.)

Update: Heh. Yeah. I could have used Automator. Of course, it took me a while to figure out that I had to use a “New Folder” action to actually cause the resulting PDF to be saved somewhere. And it doesn’t seem to go in any kind of a sort order (oh, wait, there is a “sort finder items” action). Actually, Automator would have been the quick-and-dirty way to go. Wait, my script is quick and dirty too. Sigh. If all you have is a text editor, all the world’s problems can be addressed with code…

End result: I like my command line solution better, but I’ll send my mom an Automator workflow if she ever needs to solve a similar problem (and she likely will, knowing her).



8 Responses to “pdfcat — concatenate PDF documents”

  1. Ashley Clark says:

    I had to concatenate some PDFs a while back and used Automator to make a workflow action that took the selected Finder items and passed them through the Combine PDF Pages action. I saved it as a Finder Workflow action and use it from the Finder’s context menu.

    I’m still surprised that this isn’t in Preview though. Maybe I’ll file a bug, I forgot to do that last time…

  2. Zachery Bir says:

    When I’ve needed this recently, I just created a Pages document with one image per page. Print the resulting document as a PDF.

  3. Haris Skiadas says:

    I use PDFLab (http://www.iconus.ch/fabien/pdflab/) for such things, I find it works wonders. YMMV of course.

    In general, for the main pdf/image preview application, I find Preview.app lacking in a dozen different ways.

  4. Vincent Noel says:

    There is also Combine PDFs which works very well.

  5. Charles says:

    I do this all the time in Adobe Acrobat Pro 7; it’s very convenient for assembling multiple scans into PDFs. I love Acrobat because it has a PDF Optimizer routine that does an incredible job reducing file size, and also has wonderful gadgets to straighten crooked scans and remove stray pixels in backgrounds. I scan all text documents as 300dpi 1-bit, then assemble in Acrobat and run PDF Optimizer; the resulting files are less than 10% of the size of the original scans. I use Acrobat more than any other single application except Safari.
    Of course Acrobat Pro costs money, but I think you get what you pay for.

  6. Will says:

    Last time I needed to do this I used the tool pdftk;
    http://www.accesspdf.com/pdftk/
    Maybe overkill but I’m sure I’ll find a use for its other features some day…

  7. blog.seriot.ch » Blog Archive » Concaténer des fichiers PDF says:

    […] le script Python pdfcat, chez bbum […]

  8. Jeremy W. Sherman says:

    For small jobs, the easiest method I’ve found is:
    * Open all PDFs in Preview.
    * Ensure the Sidebar is shown.
    * Select PDF page thumbnails in one Sidebar.
    * Click and drag the selected thumbnails to another Sidebar.
    * The selected pages are copied to the drop location within the target Sidebar’s associated document.

    This works at least as of Mac OS X 10.5.8; I haven’t tried it with any earlier versions.
