How to convert PDF to text while keeping the layout on Linux?

By Eric Ma, Mar 24, 2018

Much of the formatting in a PDF is not available in plain text. Still, it is better to keep the relative positions of the text the same. For example, table columns should stay aligned.

The pdftotext tool can convert PDF to text pretty well:

pdftotext – Portable Document Format (PDF) to text converter

with the -layout option:

-layout

Maintain (as best as possible) the original physical layout of the text. The default is to 'undo' physical layout (columns, hyphenation, etc.) and output the text in reading order.

$ pdftotext -layout file.pdf file.txt

After this, file.txt contains the text content of the PDF, with the layout kept as closely as possible.
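If there are many PDFs to convert, the same command can be wrapped in a small loop. The following is only a minimal sketch, not part of pdftotext itself: the function names txt_name and convert_dir are made up for illustration, and it assumes that pdftotext is installed and that the files end in a lowercase .pdf.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of pdftotext.

# Derive the output name from the PDF name: report.pdf -> report.txt
txt_name() {
    printf '%s\n' "${1%.pdf}.txt"
}

# Convert every *.pdf in a directory (default: the current one),
# keeping the layout with -layout as in the command above.
convert_dir() {
    dir=${1:-.}
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue   # the glob matched nothing
        pdftotext -layout "$pdf" "$(txt_name "$pdf")"
    done
}
```

After sourcing these functions, calling convert_dir with a directory path writes a .txt file next to each PDF in that directory.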


One Comment

Larry Bradley says:

Jun 2, 2018 at 7:08 am

For Linux users, nothing works better than using Calibre to convert PDF files to docx (or any number of other formats). After conversion, clean up the docx by using LibreOffice Writer with the Advanced Search and Replace plug-in installed. https://calibre-ebook.com/download_linux
