Signaturize – or how to get on with PDF::API2

Signaturize is a Perl script that processes a PDF file. The output is a new PDF file that contains the pages of the original, arranged so that the printed sheets can be folded into signatures (booklets, or ‘katernen’ in Dutch) as used in bookbinding.
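
To make the idea of a signature concrete, here is a minimal sketch of the page arithmetic. It assumes the simplest case only: sheets folded once and nested inside each other, so each sheet carries 4 pages. The helper name signature_layout is hypothetical and is not taken from signaturize, which may use a different layout.

```perl
use strict;
use warnings;

# Hypothetical helper (not from signaturize): for one signature made of
# $sheets sheets folded once, return the page numbers printed on each
# sheet side as [left, right] pairs, outermost sheet first.
sub signature_layout {
    my ($sheets, $first_page) = @_;
    $first_page //= 1;
    my $pages = 4 * $sheets;    # every folded sheet carries 4 pages
    my @sides;
    for my $s (0 .. $sheets - 1) {    # 0 is the outermost sheet
        my $low  = $first_page + 2 * $s;                 # lowest page on this sheet
        my $high = $first_page + $pages - 1 - 2 * $s;    # highest page on this sheet
        push @sides, [ $high,    $low      ];    # front side: left, right
        push @sides, [ $low + 1, $high - 1 ];    # back side:  left, right
    }
    return @sides;
}

# An 8-page signature (2 sheets) prints: 8 1 / 2 7 / 6 3 / 4 5
print join(' ', @$_), "\n" for signature_layout(2);
```

Each pair is one side of a sheet; after printing, the sheets are stacked, folded down the middle and the pages read in order.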

I wrote a similar program before, in C, making calls to the ‘pstools’ command-line toolset. That was neither very flexible nor portable. The only alternative program on the web is Quantum Elephant’s Bookbinder. I couldn’t get it to work on some PDFs, but it has the great advantage that it’s relatively easy to install and has a GUI (how many bookbinders will know how to install Perl and run a command-line script?).

CPAN offers a couple of Perl libraries for working with PDF. Signaturize uses PDF::API2. The documentation for this library is very minimal, so I hope that making signaturize public will help anyone who is going to use PDF::API2, perhaps even more than people with a passion for bookbinding ;).

The PDF::API2 API is very simple, but one has to know the magic sequence of commands to do the trick…

First, create a new PDF object and populate it with some pages. The page size must be defined for each page separately.

use PDF::API2;

my $outPdf = PDF::API2->new;
my @outPage;
for (my $nr = 1; $nr <= $outPages; $nr++) {
    $outPage[$nr] = $outPdf->page;    # append a new, empty page
    $outPage[$nr]->mediabox($outXMediaSize, $outYMediaSize);    # page size in points
}

Now you can start to work. For example, drawing a dashed line on one of the pages:

my $line = $outPage[$nr]->gfx;
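
The listing above only fetches the graphics object. The drawing calls themselves might look like the following sketch, which draws a dashed fold line down the middle of an A4 page. The page size, line position, dash lengths and output file name are all assumptions made for the example, not values from signaturize.

```perl
use strict;
use warnings;
use PDF::API2;

my $pdf  = PDF::API2->new;
my $page = $pdf->page;
$page->mediabox(595, 842);    # assumed page size: A4 in points

my $line = $page->gfx;
$line->save;                  # keep the dash settings local to this drawing
$line->strokecolor('black');
$line->linewidth(0.5);
$line->linedash(6, 3);        # 6 pt dash followed by a 3 pt gap
$line->move(297.5, 0);        # start at the bottom, halfway across the page
$line->line(297.5, 842);      # path to the top of the page
$line->stroke;                # actually paint the path
$line->restore;

$pdf->saveas('dashed-line.pdf');    # hypothetical output file name
```

Nothing is drawn until stroke is called; move and line only build the path.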

Or you can write some text:

my %font = (
    Helvetica => {
        Bold   => $outPdf->corefont( 'Helvetica-Bold',    -encoding => 'latin1' ),
        Roman  => $outPdf->corefont( 'Helvetica',         -encoding => 'latin1' ),
        Italic => $outPdf->corefont( 'Helvetica-Oblique', -encoding => 'latin1' ),
    },
);
my $text = $outPage[$nr]->text;    # text object for this page
# 'cm' is assumed to be a constant defined elsewhere in the script (not shown)
$text->font($font{'Helvetica'}{'Roman'}, 0.2/cm );
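
Selecting a font does not put anything on the page yet. A complete, minimal sketch of writing one line of text could look like this. The definition of the cm constant, the page size, the text position, the string and the file name are assumptions for the example.

```perl
use strict;
use warnings;
use PDF::API2;

# Assumed definition: centimetres per point, so that N/cm yields points.
use constant cm => 2.54 / 72;

my $pdf  = PDF::API2->new;
my $page = $pdf->page;
$page->mediabox(21/cm, 29.7/cm);    # A4, converted to points

my $font = $pdf->corefont('Helvetica', -encoding => 'latin1');

my $text = $page->text;
$text->font($font, 0.4/cm);         # font and size must be set before writing
$text->translate(2/cm, 27/cm);      # position of the text baseline start
$text->text('Signature 1');         # hypothetical string

$pdf->saveas('text-example.pdf');   # hypothetical output file name
```

Coordinates are measured from the bottom-left corner of the page, so 27 cm up lands near the top of an A4 sheet.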

Or you can copy the contents of some page from another PDF:

my $insertedpageXobject = $outPdf->importPageIntoForm($inPdf, $nr); # fetch the page as an XObject
my $insertedpageGfx = $outPage[$outPageNr]->gfx; # create a new graphics object
$insertedpageGfx->save; # otherwise the previous transformation is added
$insertedpageGfx->transform(
    -translate => [$x,$y],
    -rotate => $rot,
    -scale => [$scale,$scale],
); # define a transformation on the graphics object
$insertedpageGfx->formimage($insertedpageXobject, 0, 0, 1); # drop the page XObject into the graphics object
$insertedpageGfx->restore; # so the transformation does not affect later drawing

There’s another great hands-on tutorial here, where you’ll learn how to write body text and do more graphical stuff.

What took me a long time was understanding the behaviour of the gfx objects and text objects. They share a common set of methods that is part of PDF::API2::Content. My impression is that the gfx object is attached to the page and that each page has only one gfx object, so calling the gfx method multiple times on a single page doesn’t make sense. The behaviour of the save and restore methods is not intuitive at all, but I learnt to always call ‘save’ before starting a drawing operation and ‘restore’ after finishing it. If you don’t, you’ll see that transformations are carried over to subsequent drawing operations, or that weird ‘restore’ errors are displayed when reading the PDF.
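
The save/restore habit can be sketched as follows. Each rectangle gets its own translation and rotation, wrapped in a save/restore pair so that the second placement starts from the untransformed page again. The page size, positions, angles and file name are made up for the example.

```perl
use strict;
use warnings;
use PDF::API2;

my $pdf  = PDF::API2->new;
my $page = $pdf->page;
$page->mediabox(595, 842);    # assumed page size: A4 in points

my $gfx = $page->gfx;

# Hypothetical placements: [x, y, rotation in degrees]
for my $placement ([100, 600, 0], [300, 600, 30]) {
    my ($x, $y, $rot) = @$placement;

    $gfx->save;               # remember the current graphics state
    $gfx->transform(
        -translate => [$x, $y],
        -rotate    => $rot,
    );
    $gfx->rect(0, 0, 100, 50);    # drawn in the transformed coordinates
    $gfx->stroke;
    $gfx->restore;            # back to the state before this placement
}

$pdf->saveas('save-restore.pdf');    # hypothetical output file name
```

Without the save/restore pair, the second transform would be applied on top of the first one, so the second rectangle would end up in a different place than its own coordinates suggest.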
