Zend_Pdf Wrapper and Sample Code

If you were shopping around for a free, open-source PDF creation library for PHP about 4-5 years ago, chances are you would have discovered - and probably chosen - FPDF, which is a truly awesome library. And then you probably outgrew FPDF because you needed UTF-8 support so you moved on to TCPDF, which is even more awesome!

And then, as Zend Framework matured and started being used by a growing number of PHP developers, you probably noticed that it came with its own PDF creation class and thought, “Yay! Now I can ditch my other PDF creation library and just have one framework in my projects that does the lot!”.

And then you probably fired up your favourite IDE and attempted to actually use Zend_Pdf. This experience - especially for ex-FPDF/TCPDF users - typically results in feelings of frustration and disappointment, followed rather quickly by a wave of anger and a tirade of, “WTF? I can’t write text and have it auto-wrap onto the next line? You’re f*^%ing kidding me, right?”.

No, they’re not kidding you - it is not possible to write a paragraph of wrapped text with Zend_Pdf out of the box. But if you Google the problem you’ll undoubtedly find numerous solutions on the Zend Framework forums, stackoverflow.com, etc, within a matter of minutes, so this hurdle, in and of itself, isn’t a biggy.

However, as you move on with Zend_Pdf you will find there are numerous other challenges in working with this class. But that’s not because the class is bad. The class is simply working at a lower level than what you probably expected and/or have experienced with other libraries.

Take rendering of paragraphs, for example. It turns out that rendering a paragraph isn’t just about wrapping the text. If you want a nice, clean Paragraph($text) method then the class implementing that method needs to know all sorts of things, like where the "cursor" currently is, what the page margins are, how to add a new page if the rendered text won’t fit onto the current page, what the current font, font size and text colour are, etc. So the more you think about it, the more you come to realise why Zend_Pdf draws the line where it does (no pun intended).

Zend_Pdf provides the primitive functions for dealing with the fundamental entities within a PDF document. It should be thought of in terms of providing one layer - and a fairly low layer at that - of a multi-layer PDF generation stack. So on that basis it would be bad design for Zend_Pdf to start concerning itself with higher level considerations like page sizes, etc. In short, I’d say the developers did an excellent job at deciding what the class should and shouldn’t do.
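To make the wrapping part of that concrete, here is a minimal sketch of a greedy word wrap. It is not the code from my wrapper class: the function name wrapText() and the $measure callback are made up for illustration. In a real document the callback would work out the string width from the current font's glyph widths and font size; here a fake fixed-width measure stands in so the snippet runs on its own.

```php
<?php
/**
 * Greedy word wrap: split $text into lines no wider than $maxWidth.
 * $measure is a callable returning the rendered width of a string,
 * in the same unit as $maxWidth (hypothetical hook, not a Zend_Pdf API).
 */
function wrapText( $text, $maxWidth, $measure )
{
    $lines   = array();
    $current = '';

    // Collapse runs of whitespace and walk the text word by word.
    foreach ( preg_split( '/\s+/', trim( $text ), -1, PREG_SPLIT_NO_EMPTY ) as $word )
    {
        $candidate = ( $current === '' ) ? $word : $current . ' ' . $word;

        if ( $current !== '' && call_user_func( $measure, $candidate ) > $maxWidth )
        {
            // The word does not fit: close the current line and start a new one.
            $lines[] = $current;
            $current = $word;
        }
        else
        {
            $current = $candidate;
        }
    }

    if ( $current !== '' )
    {
        $lines[] = $current;
    }

    return $lines;
}

// Stand-in measure for demonstration only: pretends every character is 5 points wide.
$fakeMeasure = function ( $s ) { return strlen( $s ) * 5; };

$lines = wrapText( 'The quick brown fox jumps over the lazy dog', 60, $fakeMeasure );
```

Note that a single word wider than $maxWidth is simply left on a line of its own. And as the paragraph above says, this is only the easy part: a real Paragraph() method still has to track the cursor, the margins and page breaks.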

But where does that leave you, the ambitious young PHP hacker with a shiny new library and a deadline?

I think it leaves you in the same place I was: in need of a wrapper class that provides a set of higher level functions that do "useful" stuff like writing paragraphs, adding pages, etc, which you can then call from within your application in much the same way you were probably using FPDF or TCPDF.

The good news is that after several months of stubbornly working through the challenges of Zend_Pdf myself, I have (courtesy of a million other blogs, Q&A sites, etc) not only discovered solutions to most of the common challenges, but have written a wrapper class that I am happy to share with the world.

The rest of this article summarises the challenges I hit and the solutions I devised for overcoming those challenges.

Location, Location, Location

As it happens, the first challenge I encountered with Zend_Pdf wasn’t actually the wrapping of the text but simply figuring out the co-ordinates of where to place the text. That’s because, unlike the other PDF libraries I’ve used, Zend_Pdf places the origin of the co-ordinate system at the bottom-left corner of the page and uses "points" as its unit of measurement. This matches the native PDF co-ordinate system, in which the default unit is the point (1/72 of an inch) and the origin sits at the bottom-left of the page.

Given that my preference is to work in millimetres, I simply implemented a couple of conversion functions that do the mundane work of converting to and from points.

// 1 point = 1/72 inch and 1 inch = 25.4 mm.
private function pointsToMm( $points )
{
    return $points / 72 * 25.4;
}

// Inverse of pointsToMm(): millimetres to points.
private function mmToPoints( $mm )
{
    return $mm / 25.4 * 72;
}

It is also my preference to have the origin of the co-ordinate system at the top left, as I have found that this makes it easier to write code that produces A4 and Letter versions of the same documents. To enable that I have simply flipped the Y co-ordinate in all calls to native Zend_Pdf methods that require co-ordinates. This really comes down to personal preference, I suppose, and you can make up your own mind about this, but the point I’m trying to make with both the measurement units and the origin of the co-ordinate system is that if an API doesn’t work quite the way you want it to then change it! It’s not re-inventing the wheel, it’s simply creating a different sized wheel. (Or, to look at it another way, it’s creating a bridge between the model you have in your head and the model that Zend_Pdf expects you to work with).
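The Y flip itself is tiny. Here is a stand-alone sketch of the idea, written as plain functions rather than as the private methods of the wrapper class; the name topDownMmToPdfY() and the two constants are made up for this example.

```php
<?php
// Paper heights in millimetres (A4 and US Letter).
define( 'A4_HEIGHT_MM', 297.0 );
define( 'LETTER_HEIGHT_MM', 279.4 );

function mmToPoints( $mm )
{
    return $mm / 25.4 * 72;
}

/**
 * Convert a Y position measured down from the top edge (in mm)
 * into a PDF Y co-ordinate measured up from the bottom edge (in points).
 */
function topDownMmToPdfY( $y_mm, $paperHeight_mm )
{
    return mmToPoints( $paperHeight_mm - $y_mm );
}

// The same "20 mm below the top edge" on each paper size.
$a4Y     = topDownMmToPdfY( 20, A4_HEIGHT_MM );      // 277 mm up from the bottom
$letterY = topDownMmToPdfY( 20, LETTER_HEIGHT_MM );  // 259.4 mm up from the bottom
```

Because the layout code only ever says "20 mm from the top", switching between A4 and Letter comes down to changing the paper height.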

Stylin’ it

As mentioned earlier, as soon as you start working with paragraphs you’ll realise that there are a lot of considerations, such as fonts, font sizes, font colours, etc, so you’ll need a system for keeping track of the current settings and switching between settings easily.

It is also worth noting that things like the current font, font size, etc, are specified on a per-page basis. So each time you add a new page you will need to specify what font, font size, etc, you are using.

My wrapper class demonstrates a very rudimentary approach to keeping track of these settings. But to be honest, although this system works fine for fairly simple documents, you will soon find that it becomes quite unwieldy when you have very rich documents with loads of different styles used throughout. At that point you’ll probably want to implement a more sophisticated approach to managing styles. The save/restore graphics state methods (saveGS and restoreGS) of the Zend_Pdf_Page class might be helpful in this regard.
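To give a flavour of what "keeping track" can look like, here is a rough sketch. It is not the code from the wrapper class: StyleTracker and RecordingPage are hypothetical names, and the fake page exists only so the snippet runs without Zend Framework. With the real library you would pass in a Zend_Pdf_Page, a font resource and a colour object instead of the strings used here.

```php
<?php
/**
 * Remembers named text styles and re-applies the active one to any page.
 * The page only needs setFont() and setFillColor() methods.
 */
class StyleTracker
{
    private $styles  = array();
    private $current = null;

    public function add( $name, $font, $size, $colour )
    {
        $this->styles[$name] = array( 'font' => $font, 'size' => $size, 'colour' => $colour );
    }

    // Make $name the active style and apply it to $page straight away.
    public function select( $name, $page )
    {
        if ( !isset( $this->styles[$name] ) )
        {
            throw new InvalidArgumentException( "Unknown style: $name" );
        }
        $this->current = $name;
        $this->applyTo( $page );
    }

    // Call this for every newly added page, since settings do not carry over.
    public function applyTo( $page )
    {
        if ( $this->current === null )
        {
            return;
        }
        $style = $this->styles[$this->current];
        $page->setFont( $style['font'], $style['size'] );
        $page->setFillColor( $style['colour'] );
    }
}

// Stand-in page that just records calls (demonstration only).
class RecordingPage
{
    public $calls = array();
    public function setFont( $font, $size ) { $this->calls[] = "setFont($font, $size)"; }
    public function setFillColor( $colour ) { $this->calls[] = "setFillColor($colour)"; }
}

$styles = new StyleTracker();
$styles->add( 'body', 'Helvetica', 10, 'black' );
$styles->add( 'heading', 'Helvetica-Bold', 16, 'navy' );

$page1 = new RecordingPage();
$styles->select( 'heading', $page1 );

$page2 = new RecordingPage();   // a "new page" starts with no font set
$styles->applyTo( $page2 );     // so the active style is re-applied
```

The important bit is the applyTo() call whenever a page is added. Forget it and the new page has no font at all.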

RGB Colours Gotcha

Although there’s a comment in the code, this one caught me out so badly that I just have to re-iterate it here.

Unlike probably every other API you have ever used that requires three numbers to specify an RGB colour, Zend_Pdf_Color_Rgb() expects those numbers to have values between 0 and 1, NOT values between 0 and 255. And before you ask, yes, I did RTFM. But it wasn’t documented - I eventually discovered this important fact by reading the source for Zend_Pdf_Color_Rgb(). Probably one of the most basic advantages of open source I suppose - being able to read it!
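If, like me, you think in 0-255 values, a small helper saves you from getting caught out. This is just a sketch, and rgb255ToUnit() is a name I have made up for it.

```php
<?php
/**
 * Scale conventional 0-255 RGB components to the 0-1 range.
 * Returns array( r, g, b ) with each value between 0 and 1.
 */
function rgb255ToUnit( $r, $g, $b )
{
    return array( $r / 255, $g / 255, $b / 255 );
}

// A mid orange: 255, 128, 0 in the usual notation.
list( $r, $g, $b ) = rgb255ToUnit( 255, 128, 0 );

// With Zend Framework loaded, the scaled values would then be passed on as:
// $colour = new Zend_Pdf_Color_Rgb( $r, $g, $b );
```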

0.104719755 Radians of Separation

And while we’re on the topic of important things that should be mentioned in the API docs that aren’t, please be informed that although your brain probably works in degrees when measuring angles, the Zend_Pdf_Page rotate() method (which is implemented in Zend_Pdf_Canvas_Abstract) expects the angle of rotation to be supplied in radians, not degrees. And before you ask, yes, I have lodged a ticket requesting that this tiny yet crucial bit of information be added to the PHPDoc but, alas, it has not yet been added.

(Update: I just noticed there is a comment in the sample code for the drawCircle() method that appears on the Zend Pdf Drawing page of the Zend Manual that DOES mention using radians, although it is still not mentioned in the manual for the rotate() function or in the PHPDoc for either function.)
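Thankfully PHP has a built-in function for the conversion, so you can keep thinking in degrees. A quick sketch, using the 6 degrees from the heading above:

```php
<?php
$degrees = 6;
$radians = deg2rad( $degrees );   // same as $degrees * M_PI / 180

// With a Zend_Pdf_Page in $page, rotating about the point ( $x, $y ) would then be:
// $page->rotate( $x, $y, $radians );
```

The $page, $x and $y variables in the comment are placeholders for whatever page and pivot point you are working with.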

Images

Trying to render images directly with Zend_Pdf is highly frustrating yet hysterically funny at the same time. The reason is that you have to provide the drawImage() function with the co-ordinates of two opposite corners of the area in which you want the image placed (four values in all). Trying to figure out those co-ordinates using points and with the origin in the lower left hand corner is a process that you’re going to fail at on your first few attempts, and probably with some highly amusing results. You wanted it upside down and really, really stretched out, didn’t you? Probably not, so use my image() method instead, which has the added benefit of caching the image files so that if you have a logo or some other image that is included multiple times within the document, you’ll only end up with one copy of the actual graphic embedded within the PDF file.

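/**
 * Draw an image with its top-left corner at ( $x_mm, $y_mm ), measured from
 * the top-left of the page. Returns the rendered height in mm.
 */
private function image( $filename, $x_mm, $y_mm, $w_mm = 0 )
{
    $size = getimagesize( $filename );
    if ( $size === false || $size[0] == 0 )
    {
        throw new InvalidArgumentException( "Cannot read image size: $filename" );
    }
    $width = $size[0];
    $height = $size[1];

    // No width given: treat one pixel as one point (i.e. assume 72 dpi).
    if ( $w_mm == 0 )
    {
        $w_mm = $this->pointsToMm( $width );
    }

    // Preserve the aspect ratio.
    $h_mm = $height / $width * $w_mm;

    // Convert to points and flip Y so the origin is the top-left of the page.
    $x1 = $this->mmToPoints( $x_mm );
    $x2 = $this->mmToPoints( $x_mm + $w_mm );
    $y1 = $this->mmToPoints( $this->paperHeight - $y_mm - $h_mm );
    $y2 = $this->mmToPoints( $this->paperHeight - $y_mm );

    // Load each file once so repeated images are embedded only once.
    if ( !isset( $this->imageCache[$filename] ))
    {
        $this->imageCache[$filename] = Zend_Pdf_Image::imageWithPath( $filename );
    }

    // ( $x1, $y1 ) is the bottom-left corner and ( $x2, $y2 ) the top-right.
    $this->zpdf->pages[$this->currentPage]->drawImage( $this->imageCache[$filename], $x1, $y1, $x2, $y2 );

    return $h_mm;
}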

A Note About Hyperlinks

Unfortunately, the only option available within Zend_Pdf for creating a clickable link requires that you specify an area of the page that will be clickable. I do not believe this is a limitation of the PDF format, because other libraries have the ability to render chunks of text, for example, as clickable links.

The main thing I don’t like about this approach is that when you are viewing the document on screen (which is obviously the only time you can actually click any links!) you will notice a black border outlining the area that has been made clickable. The good news is that this black border doesn’t appear when you print the document out (which makes sense) - it is only rendered by the document viewing application to identify the clickable area. The bad news is that it seems to be impossible to remove this border or even change the colour of it.

I spent a long time reading PDF specs to try and figure out how to implement clickable text but didn’t really get anywhere. Hopefully somebody smarter than myself will be able to make this improvement to Zend_Pdf at some point in the future. (And no, I haven’t raised a ticket for this yet).
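For what it’s worth, working out the clickable area follows the same pattern as the image() method. The sketch below only computes the rectangle; linkRectPoints() is a made-up name, and the Zend_Pdf calls in the trailing comment reflect my understanding of the annotation API rather than code lifted from the wrapper class, so check them against the manual for your version.

```php
<?php
function mmToPoints( $mm )
{
    return $mm / 25.4 * 72;
}

/**
 * Turn a top-left based rectangle in mm into the bottom-left based
 * rectangle in points that a clickable area needs.
 * Returns array( x1, y1, x2, y2 ).
 */
function linkRectPoints( $x_mm, $y_mm, $w_mm, $h_mm, $paperHeight_mm )
{
    return array(
        mmToPoints( $x_mm ),
        mmToPoints( $paperHeight_mm - $y_mm - $h_mm ),   // bottom edge
        mmToPoints( $x_mm + $w_mm ),
        mmToPoints( $paperHeight_mm - $y_mm ),           // top edge
    );
}

// A 50 mm x 5 mm area, 20 mm in and 30 mm down on an A4 page (297 mm high).
list( $x1, $y1, $x2, $y2 ) = linkRectPoints( 20, 30, 50, 5, 297 );

// Assumed usage with Zend Framework loaded and a Zend_Pdf_Page in $page:
// $target = Zend_Pdf_Action_URI::create( 'http://example.com/' );
// $page->attachAnnotation( Zend_Pdf_Annotation_Link::create( $x1, $y1, $x2, $y2, $target ) );
```

To make a piece of text "clickable" you would measure the text, draw it, and then lay a rectangle like this one over the top of it.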

Table of Contents

Although the API for the Table of Contents was a little tricky to work out initially, I got there in the end and was able to devise a nice little strategy for automatically generating this just prior to calling the render() function.

Final Thoughts

I have other methods in my "production" wrapper class that do things like fit an image to a pre-defined area, fit text within predefined areas, etc, but in the interests of clarity and simplicity I decided not to include those methods in the published wrapper class. If you have a need for a particular method then please feel free to add a comment below and if I have something that works I will add it to the published code.

Over a period of time I went from loathing Zend_Pdf to loving it. The fact that existing PDF documents can be imported and then modified is super cool and provides a great option in some cases. Hopefully others will come to appreciate the benefits of Zend_Pdf and, of course, the rest of the Zend Framework and will share their experiences and expertise as well.

And in case you missed it, the code for the Wrap_Pdf class is available on GitHub.

Yet Another Programming Blog

Where James Gordon rambles about PHP, Laravel and web development in general.
