At Packnet, one of our systems has to generate invoices that are sent to customers on a monthly basis. We decided to automate this and generate the invoices in PDF format, as it is regarded as a standard format for invoicing in the industry. The system was written mainly in Perl, so we decided to use a Perl library from CPAN to generate these invoices. After considering a couple of modules, we found that PDF::API2 is one of the most popular and feature-rich modules available for this purpose. But we also found that it has little documentation, and many users have commented on how hard it is to understand the various options and to work out how to use them. As an indicator of this, various authors have written wrapper modules that simplify the options available in PDF::API2.
One of them, ‘PDF::API2::Simple’, looked pretty simple, as the name suggests. So we tried that one first and generated some sample PDFs. All looked great until we did some extended testing with a sample set of the real data: as soon as the invoices started growing to multiple pages, the size of the PDF rocketed! We had a PDF file with 40 pages and the size ran up to 2MB. We used some PDF analysers and found that there were far more objects in it than there should have been. We got in touch with Red Tree Systems, who created this module. They were thrilled to know that we were trialling their module for a production environment; however, they were not sure why the size increased so dramatically as the invoices grew to multiple pages. They directed us to the big man himself, Alfred Reibenschuh, who wrote PDF::API2. He was kind enough to look at the module and confirmed what we had suspected: the wrapper module was not optimizing its resources, and font objects were not being re-used. We passed this information back to Red Tree Systems. Then we decided to do the work the hard way, by learning and using PDF::API2 directly.
It took some time to get to grips with the functions and parameters of PDF::API2, but it worked like a dream. We tested it by creating PDF files with thousands of pages, and file size was no longer a problem.
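To illustrate the font re-use point, here is a minimal sketch of how a multi-page document can share a single font object in PDF::API2. This is not our production invoice code: the subroutine name `build_invoice_pdf`, the page text, and the coordinates are made up for the example. The idea is simply that the font is created once, outside the page loop, and the same object is passed to every page.

```perl
use strict;
use warnings;
use PDF::API2;

# Hypothetical helper: builds a PDF with $page_count pages and returns
# its contents as a string. The font object is created once and shared.
sub build_invoice_pdf {
    my ($page_count) = @_;

    my $pdf  = PDF::API2->new();
    my $font = $pdf->corefont('Helvetica');   # created once, outside the loop

    for my $n (1 .. $page_count) {
        my $page = $pdf->page();              # append a new page
        $page->mediabox('A4');

        my $text = $page->text();             # text content for this page
        $text->font($font, 12);               # re-use the same font object
        $text->translate(50, 780);            # position near the top left
        $text->text("Invoice page $n of $page_count");
    }

    return $pdf->stringify();                 # PDF contents as a string
}
```

Calling `$pdf->corefont(...)` inside the loop instead would ask for a new font on every page, which is the kind of pattern that can inflate the object count as a document grows.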
NB: Red Tree Systems appear to have since fixed the optimization issues and released a new version of PDF::API2::Simple, although we haven’t tested it ourselves. You can see that they have credited us in the ‘Thanks’ section of their CPAN page (Pradeep N Menon @ Packnet Ltd) for pointing out the optimization issues and helping to resolve them.