pdfcat — concatenate PDF documents
Today, I was doing my taxes and I ended up with a directory full of JPG images that were scans of documents and a handful of PDFs. I wanted a single document that I could print and/or send to my accountant.
You’d think that an OS that uses PDF — a multipage document format — would make this easy. I assumed it would. My first thought was to open all the JPG images in Preview and save as PDF (single page documents with encapsulated bitmaps).
OK. Great. Got a bunch of PDF documents. Nothing struck me as capable of saving a multipage PDF from said documents. Nor did a search of MacUpdate or Google reveal solutions beyond some random Java app, a TeX-based hack that involved installing a huge pile of stuff and re-learning LaTeX, and a handful of apps that might have been able to solve the problem. Maybe.
In the end, I decided it would be easier just to write some code to do the concatenation. It was easy. I grabbed the pdf2png hack from a while back and modified it to scan a bunch of PDF documents, find the maximum page dimensions, and then render every page of every document into a new document whose pages all have those maximum dimensions.
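For anyone curious about the shape of that two-pass approach, here is a minimal Python sketch of the layout logic only. It is not the pdfcat source: it does no PDF reading or drawing, it models each document as a plain list of (width, height) page sizes in points, and the names `max_page_size` and `layout_pages` are made up for illustration. Centering smaller pages on the larger canvas is also an assumption; the post does not say where pdfcat puts them.

```python
from typing import List, Tuple

Size = Tuple[float, float]                  # (width, height) in PDF points
Rect = Tuple[float, float, float, float]    # (x, y, width, height)


def max_page_size(documents: List[List[Size]]) -> Size:
    """First pass: find the largest width and largest height over all pages."""
    pages = [page for doc in documents for page in doc]
    if not pages:
        raise ValueError("no pages to concatenate")
    return max(w for w, _ in pages), max(h for _, h in pages)


def layout_pages(documents: List[List[Size]]) -> Tuple[Size, List[Rect]]:
    """Second pass: place every page, in input order, on a max-size canvas.

    Returns the canvas size plus one rectangle per output page.
    Centering is an assumption made for this sketch.
    """
    canvas_w, canvas_h = max_page_size(documents)
    placements = []
    for doc in documents:
        for w, h in doc:
            x = (canvas_w - w) / 2
            y = (canvas_h - h) / 2
            placements.append((x, y, w, h))
    return (canvas_w, canvas_h), placements


if __name__ == "__main__":
    # Hypothetical input: two letter-size pages and one smaller scan.
    docs = [[(612.0, 792.0), (612.0, 792.0)], [(500.0, 700.0)]]
    canvas, rects = layout_pages(docs)
    print("Size: %g x %g" % canvas)
    print("Wrote %d pages" % len(rects))
```

A real tool would replace the size lists with page boxes read from the input files and draw each page into its rectangle with whatever PDF library is at hand.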
The end result is pdfcat.
To use:
[albbum:/tmp/pinball] bbum% pdfcat part* whole.pdf
Processing 3 input files
Output whole.pdf
Size: 612 x 792
Processing part1.pdf
Processing part2.pdf
Processing part3.pdf
Wrote 170 pages
The script is a totally stupid, one-off hack. It served my current purpose, which may be entirely coincidental. If you find it useful, great! If you improve it, please let me know!
(Yes, I found it odd that I couldn’t make it through filing taxes without writing some code. Not that odd, though.)
Update: Heh. Yeah. I could have used Automator. Of course, it took me a while to figure out that I had to use a “New Folder” action to actually cause the resulting PDF to be saved somewhere. And it doesn’t seem to go in any kind of a sort order (oh, wait, there is a “sort finder items” action). Actually, Automator would have been the quick-and-dirty way to go. Wait, my script is quick and dirty too. Sigh. If all you have is a text editor, all the world’s problems can be addressed with code…
End result: I like my command line solution better, but I’ll send my mom an Automator workflow if she ever needs to solve a similar problem (and she likely will, knowing her).


April 12th, 2006 at 12:40 am
I had to concatenate some PDFs a while back and used Automator to make a workflow action that took the selected Finder items and passed them through the Combine PDF Pages action. I saved it as a Finder Workflow action and use it from the Finder’s context menu.
I’m still surprised that this isn’t in Preview, though. Maybe I’ll file a bug; I forgot to do that last time…
April 12th, 2006 at 6:31 am
When I’ve needed this recently, I just created a Pages document with one image per page. Print the resulting document as a PDF.
April 12th, 2006 at 7:33 am
I use PDFLab (http://www.iconus.ch/fabien/pdflab/) for such things; I find it works wonders. YMMV, of course.
In general, as the main PDF/image preview application, I find Preview.app lacking in a dozen different ways.
April 13th, 2006 at 9:05 am
There is also Combine PDFs, which works very well.
April 13th, 2006 at 4:16 pm
I do this all the time in Adobe Acrobat Pro 7; it’s very convenient for assembling multiple scans into PDFs. I love Acrobat because it has a PDF Optimizer routine that does an incredible job of reducing file size, and it also has wonderful gadgets to straighten crooked scans and remove stray pixels in backgrounds. I scan all text documents as 300 dpi 1-bit, then assemble them in Acrobat and run PDF Optimizer. The resulting files are less than 10% of the size of the original scans. I use Acrobat more than any other single application except Safari.
Of course Acrobat Pro costs money, but I think you get what you pay for.
April 20th, 2006 at 7:27 am
Last time I needed to do this, I used the tool pdftk:
http://www.accesspdf.com/pdftk/
Maybe overkill but I’m sure I’ll find a use for its other features some day…
April 26th, 2006 at 12:02 pm
[...] the Python script pdfcat, over at bbum’s [...]