PDFs For All

Let’s say you have a really great form that your employees fill out every time they do something for a customer. In this example, we’ll say it’s an order for widgets. Your company has made and remade the Widget Order Form several times over the years, and it’s looking just right now. Great job!

Now let’s take this form to the next level and digitize it: your employees fill it out on the computer, and then your perfectly completed, form-meets-function document gets emailed to them, just like the paper version but without all the paper. This is really cool! Saving money, saving paper, and allowing your staff to type instead of write makes the whole process cleaner and faster. Also, with your staff entering each order detail into a database, you can start to collect a lot of information about your customer base: which customers order which products, which zip codes place the most re-orders, which products are re-ordered most frequently, and which advertising channel brought in a new customer. How cool! With your employees entering this data every time there’s an order, the work does itself.

But what happens when you want to auto-generate these documents for customers? You’re going to use PDF. Why? Because everyone uses PDF!

PDF is one of the web’s oldest formats and has many uses. For the most part, PDFs hold static content that is generated in the outside world, such as a scanned document or a file exported by other software. Frequently, PDFs are secured documents that can be easily sent to a user as a simple email attachment without the worry that the end user’s operating system is not compatible. One of the reasons why PDF is used for documents is that Adobe (the company that created PDF) made its PDF reader program free. This cut down on competing file types, since there was no reason to pay for a program when you could read PDFs for free. Adobe even decided to take one more step and publish PDF as an open standard, which not only allowed people to use it for free but also allowed programmers to create applications and extensions that work with PDF for free. This ultimately made PDF the de facto standard for fixed-layout documents on the computer.

Unfortunately, there’s a catch to PDF: most of the uses that I described above are for visual programs, like your word processor or vector and raster editors, where you can manually set the position of your content and then export it in PDF format. This is fine if you are creating an image or a document that you do not want readers to alter, or if you want to control who views it via password protection.

The issue is that PDF is inherently a visual format, and it does not play well with automated data, meaning data that gets pulled out of a database and prepopulated onto a page. For one, PDF in this environment lacks style control. Unless you want your data pulled from the database and just thrown onto the PDF one item after another, it has to be styled and themed, just like any other document. Unfortunately, PDF does not support CSS natively, and on the web, HTML-to-PDF libraries tend to have limited CSS support. Without being able to control the layout of the PDF with CSS, documents end up with unexpected styles and layouts.

This is because CSS is not natively supported by PDF; instead, CSS handling is usually coded into the HTML-to-PDF libraries. This means someone who worked on the library had to sit down and code the individual CSS rules into it. Of course, this means that the PDF library is not going to work with CSS perfectly, and for the most part is only going to work with CSS2. The lack of CSS3 support creates multiple issues when styling PDFs, since the CSS handling is buggy and lacks the modern additions. Many themes will not be able to render their styles in CSS2, so your PDF won’t look anything like your webpages until you add styles manually!

With newer, web-based applications, there are much better ways of handling forms and data than using PDF. For the most part, PDF is meant to be used for pages to print, not as a web-based platform. Usually a PDF has been created by an outside system and then uploaded to the web for distribution. When a PDF is auto-generated, the inputs of the form are not optimized for a PDF page, so the formatting will be thrown off, which is a big issue. A better solution is to build the application to output straight to an HTML print page and then convert that HTML print page to a PDF, if the page has to be in PDF. An even better solution is to just use the HTML page and skip PDF altogether when possible.
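To make the "HTML print page first" idea a bit more concrete, here is a minimal sketch in Python using only the standard library. The order record, its field names, and the `render_order_page` function are all hypothetical and made up for illustration. The stylesheet sticks to basic CSS2-era rules, on the assumption that those are the ones most likely to survive a later HTML-to-PDF conversion.

```python
from html import escape

# Hypothetical order record; in a real application this would come
# from the database. The field names are illustrative assumptions.
order = {
    "customer": "Acme & Sons",
    "items": [("Widget", 3), ("Large <Widget>", 1)],
}

# Print-oriented HTML template. Literal CSS braces are doubled because
# the template is filled in with str.format().
PAGE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Widget Order Form</title>
<style>
  @media print {{ .no-print {{ display: none; }} }}
  table {{ border-collapse: collapse; width: 100%; }}
  td, th {{ border: 1px solid #000; padding: 4px; }}
</style>
</head>
<body>
<h1>Widget Order Form</h1>
<p>Customer: {customer}</p>
<table>
<tr><th>Product</th><th>Quantity</th></tr>
{rows}
</table>
</body>
</html>
"""


def render_order_page(order):
    """Return a complete HTML print page for one order."""
    # Escape every value pulled from the data so it cannot break the markup.
    rows = "\n".join(
        "<tr><td>{}</td><td>{}</td></tr>".format(escape(name), int(qty))
        for name, qty in order["items"]
    )
    return PAGE.format(customer=escape(order["customer"]), rows=rows)


html_page = render_order_page(order)
```

The resulting string can be served as a normal web page and printed from the browser. If a PDF really is required, the same page can be handed to whatever HTML-to-PDF converter you already use, so the layout only has to be designed once, in HTML.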