How to convert PDF to text while keeping the layout on Linux?
Much of the formatting in a PDF cannot be represented in plain text, but the relative positions of the text should be preserved where possible. For example, table columns should stay aligned.
The pdftotext tool can convert PDF to text quite well. Its manual page describes it as:
pdftotext - Portable Document Format (PDF) to text converter
The option to use is -layout, which the manual page documents as:
Maintain (as best as possible) the original physical layout of the text. The default is to 'undo' physical layout (columns, hyphenation, etc.) and output the text in reading order.
$ pdftotext -layout file.pdf file.txt
Here file.txt will contain the main text content of the PDF, with the layout kept as closely as possible.
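To convert several PDFs at once, the same command can be wrapped in a small loop. The following is a minimal sketch, not part of pdftotext itself: the function names txt_name and convert_all are hypothetical helpers made up for illustration, and it assumes pdftotext is installed and on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: derive the output name by replacing a
# trailing .pdf with .txt (e.g. report.pdf -> report.txt).
txt_name() {
    printf '%s\n' "${1%.pdf}.txt"
}

# Hypothetical helper: convert every *.pdf in a directory
# (default: the current directory), keeping the layout.
convert_all() {
    dir="${1:-.}"
    for pdf in "$dir"/*.pdf; do
        # If no PDF matched, the glob stays literal; skip it.
        [ -e "$pdf" ] || continue
        pdftotext -layout "$pdf" "$(txt_name "$pdf")"
    done
}
```

After sourcing these definitions, running convert_all ~/docs would write a .txt file next to each PDF in ~/docs (the directory name is only an example).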