PCLinuxOS Magazine

PDF Part Two: Editing The Universal Document

by Paul Arnote (parnote)

Last month we focused on the various ways that you can create a PDF document with ease. This month, we'll take a look at how you can edit a PDF file.

Rewind: Two More Ways To Create PDF Files

Before we get too far into how to edit PDF files, I need to talk about two more ways to create them. A post by PCLinuxOS forum member dm+ showed me a "new" (to me, anyway) way to create a PDF file, which then reminded me of yet another way that I had tried several years ago, but couldn't get to work satisfactorily or reliably.

Granted, when I first tried the "second" way alluded to above (and couldn't get it to work), I was a pretty green Linux noob. So, I abandoned it and pretty much forgot about it until I read dm+'s forum post. This time, with considerably more Linux "experience" under my belt, the "second" method worked, and works easily and reliably. So, it's quite likely (as in, highly probable) that I didn't have the command line parameters set appropriately when I made my first attempt.

Both of these "new" methods of creating a PDF file are linked to using LibreOffice. If you have LibreOffice installed, then both of these methods are available for you to use.

In the first method, dm+ points out that you can create a PDF file by running the following LibreOffice command at a command prompt to convert any file "understood" by LO to a PDF file.

First, however, you will need to know which version of LO you have installed. To do that, open a terminal window, type "libre" (without the quotes) at a command line prompt, and hit the "Tab" key. Note the version of "libreoffice" that shows up. In his example, he uses "libreoffice7.3" as his version of LO. On my home laptop, I'm still using LO 6.0, as indicated by the "libreoffice6.0" that shows up on the command line. On my "travel" laptop, I have "libreoffice6.1" installed. Don't judge ... I just don't have a need to "update" LO very often. I feel safe (enough) in the "if-it-ain't-broke-don't-fix-it" approach. So, just replace "libreofficeX.X" in the command below with the version of LO that is installed on your computer. Also, keep in mind that whenever you DO update LO, you will have to slightly alter the command to reflect the newer version of LO that you have installed.

So here's the command from dm+:

libreofficeX.X --headless --invisible --convert-to pdf <input-file(s)>

This entire conversion can happen from the command line, without ever opening up LO to perform the conversion ... at least visibly on your computer's screen. The "--headless" parameter tells LO to run without a GUI, and the "--invisible" parameter tells LO to run without displaying the LO logo or application "startup" page. The "--convert-to pdf" parameter directs LO to convert the input file(s) to a PDF document. Yes, you can list multiple input documents. The command will process through the list of input files in the order presented, converting each file listed to a PDF file.

BUT, because the command has the version of LO that you have installed in it, the command wouldn't necessarily be the best command to put into a bash script. Every time you updated LO, you would also have to update the command in the bash scripts. Actually, I'm sure there is a workaround to this issue, but I'm not aware of it. Just because I'm not aware of it doesn't mean it doesn't exist.
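One possible workaround is to let the script look up the versioned command itself. What follows is only a minimal sketch, assuming the command is on your PATH and is named in the "libreofficeX.X" style described above, and that your sort command supports the -V (version sort) option, as GNU sort does. The function names find_lo and to_pdf are made up for this illustration.

```shell
# Print the highest-versioned "libreofficeX.X" command found on PATH.
# Runs in a subshell (note the parentheses) so the IFS change stays local.
find_lo() (
    IFS=:
    for dir in $PATH; do
        for candidate in "$dir"/libreoffice[0-9]*; do
            if [ -x "$candidate" ]; then
                basename "$candidate"
            fi
        done
    done | sort -V | tail -n 1
)

# Convert every file given as an argument to PDF with whichever
# version of LO was found (hypothetical helper, not part of LO).
to_pdf() {
    lo=$(find_lo)
    if [ -z "$lo" ]; then
        echo "no libreofficeX.X command found on PATH" >&2
        return 1
    fi
    "$lo" --headless --invisible --convert-to pdf "$@"
}
```

With those two functions in a script, "to_pdf recipe.odt notes.docx" should keep working after an LO update, since the version number is looked up each time rather than typed into the script.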

Of course, it was dm+'s use of the libreofficeX.X command that prompted me to make another attempt at using another utility that relies on LO to do its work, called unoconv. As I mentioned earlier, I tried the command years ago as a very green Linux noob, but couldn't figure out back then how to get it to work properly, consistently or reliably. So I revisited the command, and yes, it does work properly.

Here's the command:

unoconv -f pdf -o <output-file.pdf> <input-file(s)>

The "-f pdf" parameter sets the output file format to PDF, while the "-o <output-file.pdf>" parameter sets the filename of the output file. Just like with the "libreofficeX.X" command, you can list multiple input files, and each input file will be converted into its own PDF file.

Now, here's the advantage to using the unoconv command, versus the libreofficeX.X command: you never have to change the command when you update/upgrade your version of LO. The unoconv command stays the same, regardless of which version of LO you have installed. Thus, the unoconv command IS a great candidate for inclusion in a bash script. Other than that, it works exactly like the libreofficeX.X command from dm+, converting from any file format that LO "understands."
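To show what that might look like in a script, here is a small sketch that converts a list of documents one at a time, naming each PDF after its input file. It uses only the "-f" and "-o" parameters described above. The function names pdf_name and convert_all are hypothetical, and the sketch assumes each input filename has an extension to strip.

```shell
# Derive the PDF filename for an input document: report.odt -> report.pdf
# (assumes the filename has an extension)
pdf_name() {
    printf '%s.pdf\n' "${1%.*}"
}

# Convert each document given as an argument to its own PDF file,
# reporting any file that unoconv fails to convert.
convert_all() {
    for doc in "$@"; do
        unoconv -f pdf -o "$(pdf_name "$doc")" "$doc" || echo "failed: $doc" >&2
    done
}
```

Calling "convert_all recipe.odt budget.ods" would then run unoconv once per file, writing recipe.pdf and budget.pdf.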

The PDF files created by these LO "tools" are quite a bit smaller than the same PDF file created with pandoc. In some cases, the PDF files created with these LO tools are less than half of the size of the PDF files created with pandoc.

Now ... On With Editing PDF Files

One of the main attractions of the PDF file format is that it is not easily edited. Compare that with a word processing document or even the lowly plain text file, which are easily edited by virtually anyone, anywhere, at any time.

As a result, many people don't even know that a PDF file CAN be edited. The average computer user is even less aware of where to find the special tools for editing a PDF file. They may have some of these tools already installed on their computer, and not be aware of possessing the ability to edit PDF files.

While some of the command line tools for manipulating PDF files (part three of this short article series) can also be used to help edit PDF files, to keep things "simple," we'll just talk about actual editing tools for PDF files.

Obviously, the absolute best way to edit a PDF file is to make the edits in the program that was used to create the PDF file. That could be in a bona-fide word processor (LO Writer, MS Word, etc.), a desktop publishing program (like Scribus), or even a simple plain text file that gets converted by pandoc.

But sometimes, you don't have access to the "original" files that would allow you to make edits in the resulting PDF. THAT is when you will need a "special" software program to edit PDF files.

Caveats Of A Legal Nature

Please, be sure to respect the copyrights of copyright owners. Not only is it respectful, but it could also save you a literal TON of legal hassles that you might wish to avoid, should the copyright holder decide to pursue legal action against you for using their intellectual property without permission or compensation.

Copyright laws vary widely between jurisdictions, so be sure to maintain compliance with the copyright laws of your jurisdiction (even if you disagree with them). This goes for ALL intellectual property, not just photos. Asking for permission is not only the RIGHT WAY, but it is also legally mandated in most cases where the copyright holder has not explicitly granted the right to republish or reuse their intellectual property.

Here at The PCLinuxOS Magazine, for example, all of the articles are copyrighted, but we explicitly grant republication/reuse without prior consent, following first publication by The PCLinuxOS Magazine, provided that there is a link back to the original article, and the original author is given credit in the byline.

LibreOffice Tools

When you mention LibreOffice, most people immediately think of LO Writer, the word processor program in the FOSS office suite. But, in fact, all of the programs that make up the LibreOffice office suite are capable of producing PDF files.

However, editing PDF files is handled by a pair of programs in that office suite, including one that you might not have considered: LibreOffice Draw. The other program is LibreOffice Writer. Both programs have similar functionality for editing PDF files, but we'll focus mostly on LibreOffice Draw.

LO Draw window before loading a document

LO Draw window with recipe PDF file loaded

Make sure your "Select" pointer (the arrow on the toolbar at the left of the image, near the word "Pages") is selected. Then, click the mouse over the element you want to change or edit. The page element will appear with brackets around it. Click your mouse a second time within the brackets of the selected item to edit it. Make your changes, and then either select "Export as PDF" from the top toolbar (has a PDF symbol on the button), or select "Export as PDF" from the File menu. DO NOT resave the file with a PDF file extension. It will still be "assigned" a *.odg file extension, to indicate a LO Draw file. You MUST export the file as a PDF to maintain the PDF's integrity and identity as a PDF file.

Recipe loaded into LO Draw, with document element selected

After clicking on document element, preparing for edit

Pay particular attention to the properties of the document element you selected. The "Properties" window should appear on the right side of the LO Draw window. Notice in our image below that the document element is recognized as text, and the selected font is Liberation Sans. You will want to be sure that the selected font matches the font of the text you are attempting to edit. Otherwise, your edit will stand out like a sore thumb. In our case (and probably in most cases, since fonts are most likely embedded within the PDF), the selected font is the correct one for this document. Still, it's a good idea ... just to be sure!

Properties for selected document element

When we compile the magazine PDF every month with Scribus, we typically select the option to embed the fonts within the PDF. That helps ensure that the magazine appears as we intended it to appear, and font substitutions don't change that appearance or mess up the layout design. So, with that in mind, exercise great care should you decide to change the font or font properties. It could have unforeseen consequences that totally mess with the layout design.

Of course, you can use LO Draw to "extract" or save images from a PDF file. Just load the PDF file into LO Draw, then select the image. Right click on the selected image, and then select "Save" from the menu that appears. From there, it's just a matter of telling LO Draw where you want to save the file, just as you would when saving any other file.

Save images from the PDF file.

This is an excellent way to recover images where the original image has "disappeared" and can no longer be accessed (failed hard drive, reformatted hard drive, data misplaced, etc.). Still, it bears repeating ... be respectful/aware of copyrights!

Master PDF Editor

Another tool in the PCLinuxOS repository is Master PDF Editor. The version in the repository is the free version, so there are a few limitations on what you can do with it. For example, the free version is not able to save "optimized" PDF files. Optimized PDF files typically display certain document elements at a lower resolution, saving space and creating a smaller PDF file. But, for basic PDF editing, it probably doesn't get much easier than with Master PDF Editor.

The "paid" version is downloadable from the Master PDF Editor website for $69.95 per license for one to nine licenses. I've not attempted to download it for three reasons. First, I don't want to spend $70 on the program. Most of my needs are covered by the "free" version. Second, for no more than I use it, it's difficult to justify spending the money. I might consider spending the money if I used it more often. Third, I'm not sure one of the Linux "versions" is installable on PCLinuxOS (although my "guess" would be to install the OpenSUSE RPM package or to install from the tar.gz file). I didn't try them, mostly because I didn't want to risk borking my installation by installing something from outside the repository. Plus, I'm nervous about running into "dependency hell" during the installation process.

First view upon starting Master PDF Editor

Master PDF Editor with PDF loaded

Just as you did in LO Draw, click your mouse cursor on the document element you want to edit. Then, click your mouse cursor a second time in the document element that you selected to edit that element.

Click on the document element you want to edit to highlight it

Click a second time on the selected document element to begin editing

However, once you're done with your editing of the PDF file, you don't have to re-export the document as a PDF, as you do with the LibreOffice tools. Instead, with Master PDF Editor, all you have to do is select "Save" or "Save As..." from the "File" menu. Personally, I wouldn't use "Save," opting instead for going the "Save As..." route. That way, you don't risk overwriting the original file. I prefer keeping my originals just that ... original. I usually append a lowercase "a" or "b" to the name of the file I'm editing, which allows me to save my edits, but to also keep the "original" intact and unaltered.

Properties of selected document element

Much like with LO Draw, the right side of the Master PDF Editor window shows the "Properties" of the selected document item. As the image above shows, the correct font for the document element is shown. The difference (at least in this particular case) is that where LO Draw shows you the font point size down to 10ths of a point, Master PDF Editor uses an odd size designation. In the above case, it's "Size 1," while under LO Draw the font size was labeled as "9.8 points." While the LO Draw way is easier to understand (after all, most people are used to sizing fonts in points), I did discover that the number in the "Size" field does NOT have to be a whole number. You could, for example, put in 1.25 if you wanted the element 25% larger, or 0.75 if you wanted the element 25% smaller. It's not the most intuitive way to express font size, but it works, as long as you put in the work of figuring out what number goes in the "Size" box (or you could get lucky by taking a W.A.G. at it).

So ... Which Editing Tool Should You Use?

You might think that you can just choose one of these tools and you'd have everything covered. Unfortunately, things just aren't that simple.

LO Draw "feels" a bit clunkier to use for editing PDF files. Now, I realize that's a bit subjective, but that's how it feels to me.

Both LO Draw and Master PDF Editor work in similar ways, and both have their pluses and minuses. Master PDF Editor may or may not allow you to extract images from the PDF. I've found that ability to be rather hit-or-miss. I cannot find a pattern to discern when it will let you save/extract images and when it won't. Sometimes it will allow you to save images found in the PDF, and other times the option to save images is MIA. With LO Draw, the option seems to be available for EVERY image in a PDF file (and I checked multiple PDF files when I was writing this article).

With Master PDF Editor, you don't have to re-export the edited file to a PDF file. Instead, you simply select "Save As..." from the "File" menu. With LO Draw, you have to be sure to re-export the document as a PDF.

LO Draw has the advantage when dealing with text size. Just about everyone is used to expressing the size of fonts in points, and has been doing so ever since Windows 3.1 introduced TrueType fonts. Master PDF Editor abandons using points to size text, replacing it with a much more awkward and foreign sizing method that seems to be based on a fraction of the existing text size. Once you get used to it, it may get easier, but it is more complicated. Say you want to change the font size from 12 points to 10 points in a document. Now you have to go through some mathematical aerobics to figure out that 10 points is 5/6 the size of 12 points. Since 5/6 is about 0.83, or 83%, you would enter 0.83 in the "Size" box.
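If you'd rather let the computer do that arithmetic, a one-line helper will do it. This is just a sketch, and it assumes the "Size" field really does act as a multiplier of the existing text size, which is my reading of how it behaves rather than anything documented. The function name size_factor is made up for this example.

```shell
# Print the value to try in the "Size" box when going from one
# point size to another: new size divided by current size.
size_factor() {
    LC_ALL=C awk -v from="$1" -v to="$2" 'BEGIN { printf "%.2f\n", to / from }'
}

size_factor 12 10   # prints 0.83
size_factor 12 15   # prints 1.25
```

The first argument is the current point size and the second is the size you want, so the 12 point to 10 point example above comes out as 0.83.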

Both programs work perfectly fine to edit PDF files. But overall, I have to give a slight edge to Master PDF Editor for its "feel" and ease of use. If I know that I will want to be extracting images from a PDF file, LO Draw will be my go-to tool.

Having both tools in your PDF editing arsenal is probably a good idea. You will find uses for them both.

Do you need to join, combine, split, merge or otherwise work magic on PDF files? Well, stay tuned. Next month we'll cover how to manipulate PDF files in the third and final article in the PDF series, performing all of these tasks ... and probably more.
