I didn’t receive a response from Citimortgage about their ghastly PDF files, but on my next visit I was again able to view my statements in Evince, the GNOME PDF viewer. (Although they were still obnoxiously large files for the amount of data represented.)

But then on my next visit after that, the following month’s statement was again not viewable. Come on, Citimortgage, this shouldn’t be that difficult.

However, I had already invested some time in learning a couple of things about free software programs for working with PDF files in GNU/Linux, hoping to either shrink the files down or convert them into another format with less storage overhead. I’m a small man with small ambitions, and it had become a mission to not waste so much space on these records.

(You’d think it would be worthwhile for Citimortgage to deal with the situation, as it wastes a lot more space and bandwidth for them spread out over all their customers, but I guess not. Maybe that’s why they only keep the last three months’ worth of statements around, as opposed to the years of history you can get at other places.)

ImageMagick

I first found ImageMagick, which is a GPL-compatible command line graphics program that is loaded with goodies. Easy install with:

sudo apt-get install imagemagick

And then it’s easy to convert a PDF file to PNG or JPG:

convert some.pdf some.png    #(or some.jpg)

The image quality was poor, however. I discovered that ImageMagick is using Ghostscript (gs) for PDF conversions, and found a nice example command to run it and get a higher resolution. I don’t know if gs was already on my system or if it was included in the ImageMagick install, but it was ready and willing, and I was able to come up with:

Ghostscript

gs -q -sDEVICE=pngmono -dBATCH -dNOPAUSE -dFirstPage=1 -dLastPage=1 -r300 -sOutputFile=test.png test.pdf

pngmono is an option I found by using gs -h to list available devices.

Setting the resolution to 300 dpi (-r300) produced a relatively small (< 100KB) PNG file that prints reasonably well. Nifty. And Ghostscript is licensed under the GNU GPL, which is of course the best free software license.
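For comparison, the same resolution idea can be applied to the ImageMagick command from earlier. This is only a sketch: the function name pdf_to_png_hires and the file names are made up for illustration, and it assumes the classic convert command is installed (newer ImageMagick releases may prefer magick). The -density option is placed before the input file so that it takes effect when the PDF is rasterized.

```shell
#!/bin/sh
# Sketch: rasterize a PDF with ImageMagick at a chosen resolution.
# pdf_to_png_hires is a hypothetical helper name, not part of ImageMagick.
pdf_to_png_hires() {
    input=$1
    output=$2
    dpi=${3:-300}   # default to 300 dpi, matching the gs example
    # -density comes before the input so it applies while reading the PDF
    convert -density "$dpi" "$input" "$output"
}

# Only run when called with arguments, e.g.: sh hires.sh some.pdf some.png
if [ "$#" -ge 2 ]; then
    pdf_to_png_hires "$@"
fi
```

I haven't compared the output against the pngmono result, so the file size may well come out larger.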

Another benefit in this situation is that gs splits things out by page. The second page of the statement is always the same thing, so I don’t have to bother saving that more than once a year. One of the things ImageMagick does for you is split multipage PDFs into numbered image files. I’m not sure what options are available for Ghostscript on its own; my example has you running it once for each page, so you might need to do some scripting to make things more convenient when running it directly. I didn’t experiment much, having a very narrow objective.
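For what it's worth, Ghostscript's -sOutputFile accepts a printf-style %d in the file name, which it replaces with the page number, so a single run can write one image per page. Here is a minimal sketch under that assumption. The function name pdf_to_pngs and the file names are hypothetical, and I have dropped -dFirstPage and -dLastPage so that every page is converted.

```shell
#!/bin/sh
# Sketch: convert every page of a PDF to its own monochrome PNG in one gs run.
# pdf_to_pngs is a hypothetical helper name, not part of Ghostscript.
pdf_to_pngs() {
    input=$1
    prefix=$2
    # gs expands %03d to the page number, giving prefix001.png, prefix002.png, ...
    gs -q -sDEVICE=pngmono -dBATCH -dNOPAUSE -r300 \
       -sOutputFile="${prefix}%03d.png" "$input"
}

# Only run when called with arguments, e.g.: sh pages.sh test.pdf test-page-
if [ "$#" -ge 2 ]; then
    pdf_to_pngs "$1" "$2"
fi
```

With the unchanging second page, you could then simply delete the second file after each run.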

Citimortgage Tomfoolery

I ran the older Citimortgage statements that were only 100KB through this command with no complaint, although it didn’t save much on file size. Running the newer, larger files through Ghostscript results in a message like this:

   **** Warning: File has a corrupted %%EOF marker, or garbage after %%EOF.
   **** Warning: stream Length incorrect.

   **** This file had errors that were repaired or ignored.
   **** The file was produced by:
   **** >>>> Xenos D2eVision v2 <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.

Clearly, something went wrong somewhere along the line in the Citimortgage PDF manufacturing department.

In Conclusion

I don’t know if it was worth spending the effort on this quest, but I was happy to add two more tools to my free software toolbox. Of course, I suppose now you’ll tell me that the latest version of Evince will save PDFs as image files.
