PCLinuxOS Magazine

Unoconv: A Handy Tool For Converting Between Competing Office Document Formats


by Paul Arnote (parnote)


Can you imagine what an utter mess things would be if you could only read CDs or DVDs on the brand of device that created them? Thanks to a formalized standard, CDs and DVDs, for the most part, are readable on every CD or DVD player.

Those “standards” don't exist for office document formats, unfortunately. You may be able to open a document created in Microsoft Word or WordPerfect, for example, but the document formats are not held to any industry-wide standard. Among office suites, which proclaim a focus on productivity, it's the wild, wild west.

In fact, it's anything but “productive” to be tied to one vendor, one software suite. You can be “productive,” just so long as you use/buy THEIR product(s) and use their “approved” third-party vendors.

Only in recent history have things like Google Docs, Zoho Office and other online “office suites” made it possible for computer users to experience anything that even approaches interoperability between office document formats. Both Google Docs and Zoho Office, for example, offer the ability to import and export documents created in “other” office suites, and both offer up their cloud-based office productivity tools for free.

Even more recently, Linux users have been able to access things like Microsoft Office, via Microsoft's online subscription office suite, named Microsoft 365. Of course, many Linux users have a “bad taste” in their mouth when it comes to using ANYTHING from Microsoft, and cringe even more at the prospect of handing over their hard-earned money to Microsoft. The “subscription” to Microsoft 365 will run an individual $70 (U.S.) per year for a single-user plan, or $100 (U.S.) per year for a family plan that accommodates up to six family members.

There is ONE (and, sadly, only one) document format (other than plain text files) that even comes close to “universal” access: PDF files. And, if you're merely sharing information with an audience (small or large), the PDF file is universally accessible across most computing platforms and operating systems.

But what happens when you share a document with another user far, far away, and you need them to be able to edit or contribute to that document? In that case, a PDF file is NOT what you want to use.


LibreOffice Logo

In a more “perfect” world, you and that far, far away person could just use LibreOffice, to help ensure equal access to shared documents. Or, you could both use Google Docs or Zoho Office. But either way, it usually requires that you both are using the same platform.

Or does it?

What about a situation where everyone else is using a Microsoft Office product, but you are using LibreOffice because it's what you're more familiar and comfortable with, and maybe that's all you can afford? Or what if you're having to write a term paper/thesis/dissertation and are required to submit your paper in *.docx format? I'm sure you can think of a thousand other scenarios where users are “locked in” to submitting documents in the format(s) supported by (and created by) the closed-source, proprietary, commercial software packages.

Without a doubt, you could open each *.odt file in LibreOffice Writer and resave it in the Microsoft Word default format of *.docx. And that works very well, if you have one or three documents to convert. But that can be quite time-consuming, especially if you have a lot of documents to convert (maybe you have three or four dozen files to convert). Fortunately, there IS a tool that allows the bulk conversion of files, without opening each one and resaving it.

This utility, named unoconv, is NOT installed by default when you install LibreOffice. It is, however, installable from the PCLinuxOS repository, via Synaptic. Unoconv does one thing, exceptionally well: it reads files in one format, and writes them out to another. It automates document conversion by leveraging an existing installation of LibreOffice to do most of the work. Thus, you will need LibreOffice installed to run unoconv.

Unoconv is a command line utility that can convert any file format that LibreOffice can import, to any file format that LibreOffice is capable of exporting. If you tend to shy away from command line utilities, this may be a good time to relax your command line resistance. The command line to convert files with unoconv is actually pretty easy to implement. Plus, using this utility will save you a TON of time, especially when compared to opening each file in LO and performing the conversions one file at a time.

The “help” listing (image below) for unoconv is relatively short, especially when you compare the number of options to something like MPlayer (whose listing of options goes on overwhelmingly for pages and pages and pages).


Unoconv Help

If you want extra explanations of the options, I'll refer you to the man page for Unoconv.

For most users, the vast majority of these options will NEVER be used. Most likely, you're going to find yourself using just a small handful of them. Those are (in no particular order) -e (--export), -f (--format=format), -o (--output=name), -d (--doctype=type), --password, -v (--verbose), and --show. When applicable, the “shortened” version of the option is given first (not all options have a shortened version), with the full version of the option shown in parentheses. Those are the options that we'll cover in this article. The rest are left for those with special use cases to figure out on their own.

Ordinarily, unoconv goes about doing its job very quietly, meaning you won't see much output in a terminal session. If you want to see more information about the conversion process, use the “-v” option as your first command line option. Personally, I'm quite happy with the lack of details, as I'm mostly interested in the results, rather than the hoops it jumps through to perform the conversions.

If you type “unoconv --show” at a command prompt, unoconv will show you all of the different file formats it is capable of converting. I've broken the listing up in categories, since showing it all here in one contiguous listing would be too long. The image below shows all of the document formats (translate that into word processing document formats) unoconv can convert from and to.


Unoconv Document Formats

The image below shows all of the graphic file formats that unoconv can convert from/to.


Unoconv Graphics File Formats

The next two images (I had to use two images to show them all) show the presentation formats that unoconv can convert to/from.


Unoconv Presentation File Formats 1


Unoconv Presentation File Formats 2

Finally, the image below shows all of the spreadsheet formats that unoconv can convert to/from.


Unoconv Spreadsheet File Formats 2

You have to admit that the list of file formats that unoconv can convert between is long and impressive. As you can see, it's not “just” office document formats, but also graphic file formats.

In its most basic form, the command line for running unoconv is “unoconv -f odt [name-of-file-to-convert]”. That command will convert the ONE file listed at the end to the *.odt format. Even if it's just one file, unoconv is much faster to use to perform file conversions than opening LO and resaving the file in the desired file format. The “-f [format]” parameter tells unoconv what format you want for the final, converted file. The valid format names are the ones listed by “unoconv --show”.

But, you can also use wildcards for the filenames. For example, typing “unoconv -f docx7 *.odt” will cause all of the *.odt (LO Writer) files in the current directory to be converted to the latest version of the MS Word *.docx format.
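If you'd rather know exactly which file tripped things up during a bulk conversion, a small loop is one way to go about it. What follows is only a minimal sketch. The function name convert_odt_to_docx and the UNOCONV variable are my own inventions for illustration (neither is part of unoconv), and the only unoconv option used is the “-f docx7” already shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of unoconv): convert every *.odt file
# in a directory to Word's *.docx format, one file at a time.
convert_odt_to_docx() {
    dir="${1:-.}"
    # UNOCONV is an assumed override hook; set UNOCONV=echo to preview
    # the commands without converting anything.
    tool="${UNOCONV:-unoconv}"
    for f in "$dir"/*.odt; do
        # If nothing matches, the shell leaves the pattern untouched; skip it.
        [ -e "$f" ] || continue
        "$tool" -f docx7 "$f" || echo "conversion failed: $f" >&2
    done
}
```

Calling “convert_odt_to_docx ~/Documents/reports” (a made-up directory) would work through each *.odt file there and name any file that failed, instead of stopping at the first error.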

Sometimes, unoconv can't resolve what format a file is in. If you happen to know, you can help unoconv out by providing the “-d [document-type]” parameter. There are four document types that are valid: document, graphics, presentation, and spreadsheet. So, specifying “-d document” will specify a word processing document, while “-d presentation” will specify a presentation document. If you use this parameter, make sure it matches up with the type of file you're trying to convert. For example, I tried using “-d document” for an MS PowerPoint presentation, being converted to a LO Impress file (*.odp). Doing so totally confused unoconv, which exited with an error message. Once I changed it to “-d presentation”, unoconv made the proper and appropriate file conversion.
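To avoid the kind of mismatch I ran into, you could let the file extension pick the document type for you. The sketch below is just that, a sketch. The function name doctype_for is hypothetical, and the extension list is my own short, incomplete guess at the common cases, not anything unoconv defines.

```shell
#!/bin/sh
# Hypothetical helper (not part of unoconv): print the "-d" document
# type that matches a filename, based on a short list of extensions.
doctype_for() {
    case "$1" in
        *.odt|*.doc|*.docx) echo document ;;
        *.odp|*.ppt|*.pptx) echo presentation ;;
        *.ods|*.xls|*.xlsx) echo spreadsheet ;;
        *.odg)              echo graphics ;;
        *)                  return 1 ;;   # unknown: let the caller decide
    esac
}
```

You could then type something like “unoconv -d "$(doctype_for slides.pptx)" -f odp slides.pptx”, where slides.pptx is a made-up filename.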

Watch. Your. Filename. Spellings. If you misspell a filename, unoconv will not forgive you. Instead, it will exit with an error, and without doing anything but displaying a cryptic error message.
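One way to catch a misspelled filename before unoconv does is to check that every file actually exists first. This is a minimal sketch using only standard shell tests. The function name check_files is hypothetical, as are the filenames in the usage line.

```shell
#!/bin/sh
# Hypothetical helper: report any argument that is not an existing
# regular file. Returns 0 only if every file was found.
check_files() {
    missing=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "no such file: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

With that defined, “check_files report.odt notes.odt && unoconv -f docx7 report.odt notes.odt” only starts the conversion when both files are really there, and tells you in plain language which one is not.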

If you are using unoconv to convert files that are password protected, you can use the “--password=[password]” option to provide the password to decrypt the file prior to conversion.

Similarly, you can use the “-o [directory-name]” option to specify the output directory for your converted files. Without it, the new files will appear in the same directory as your original files. If your destination directory is a subdirectory of the directory where your original files are stored, precede the directory name with “./”. Otherwise, you can also provide a fully qualified path name, if that destination directory is somewhere else in your /home directory. Also, the destination directory must already exist. Unoconv will not, as far as I can tell, create the destination directory for you “on the fly.”
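Since the destination directory has to exist beforehand, it's easy enough to create it in the same breath. Here's a minimal sketch, assuming the “-f” and “-o” options behave as described above. The function name convert_into and the UNOCONV variable are hypothetical names of my own, not part of unoconv.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of unoconv): make sure the output
# directory exists, then convert the listed files into it.
# Arguments: output-directory format file...
convert_into() {
    outdir="$1"
    fmt="$2"
    shift 2
    # mkdir -p succeeds quietly if the directory is already there.
    mkdir -p "$outdir" || return 1
    # UNOCONV is an assumed override hook; set UNOCONV=echo to preview.
    "${UNOCONV:-unoconv}" -f "$fmt" -o "$outdir" "$@"
}
```

For example, “convert_into ./converted docx7 *.odt” would create a “converted” subdirectory if needed, then ask unoconv to place the new *.docx files there.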

For what it's worth, your original files are retained and should be unchanged as a result of your conversions, if you're careful in your use of unoconv. As far as I can tell, unoconv does not check whether a file already exists with the file name you're attempting to convert to. But using a bit of discipline and common sense here will go a long way towards preserving your original files exactly as they were when they were created.
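If you want a safety net against clobbering an existing file, you can do the checking yourself before calling unoconv. The sketch below assumes the converted file lands next to the original with the same base name and the new extension. That is my assumption, so verify it on a test file first. The function name safe_convert and the UNOCONV variable are hypothetical names, not part of unoconv.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of unoconv): refuse to convert when
# the expected output file already exists.
# Arguments: format output-extension source-file
safe_convert() {
    fmt="$1"
    ext="$2"
    src="$3"
    # Assumed output name: same path, extension swapped.
    target="${src%.*}.$ext"
    if [ -e "$target" ]; then
        echo "skipping $src: $target already exists" >&2
        return 1
    fi
    # UNOCONV is an assumed override hook; set UNOCONV=echo to preview.
    "${UNOCONV:-unoconv}" -f "$fmt" "$src"
}
```

Typing “safe_convert docx7 docx report.odt” (report.odt being a made-up filename) would convert the file only if there is no report.docx sitting in the same directory already.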

Using the “-e PageRange=[start-page]-[end-page]” parameter may (or may not) work as you intend. For example, using “-e PageRange=1-3” to convert only the first three slides of a presentation does not work as you might expect. Instead, the entire file is converted. At that point, the best thing to do will be to open up the converted presentation file and manually delete the slides you do not want.

Keep in mind that if you have MS Office files that use VBA (Visual Basic for Applications) macros, those macros most likely will not be converted. Fortunately for the “rest of us” non-Microsoft users, VBA is an MS-only albatross curse thing. In fact, I've been reading reports recently of Microsoft's plans to deprecate VBScript, a related but separate scripting language. And, to be perfectly honest, I don't know if that extends to VBA, which is the language used to write macros in MS Office.


Summary

So, you may be wondering what brought this topic up. Meemaw retired from her job in mid-June. She was commenting to me that she had some files for work that she needed to convert before handing off her work laptop to her successor. Meemaw, preferring LO over MS Office, used LO to create these files. There's no guarantee that her successor will even know what to do with the *.odp, *.odt, and *.ods files that LO creates. So, she wanted to convert the LO files to their MS Office equivalents, before turning in her work laptop.

Her conversation made me remember seeing unoconv in the PCLinuxOS repository. I quickly installed it, and had her send me a handful of files. On my first “test,” all of the files she sent me were successfully converted to MS Office files, and those converted files opened perfectly in MS Office.

Unoconv will help bridge the office file format gap for a LOT of people. Not everyone can afford to use things like MS Office. Even back in my computer-infancy, I was one of those users who could not afford MS Office. Instead, I discovered (and used) OpenOffice, which provided everything I needed. Plus, it also allowed me to save files in the MS Word format du jour, *.doc files, as well as to load up *.doc files sent to me.


