PDF to Text Conversion

JPedal offers automated Java PDF to text conversion. It takes care of all the encoding issues and gives you plain text or XML, with font, colour and spacing information if required.

Text can be extracted from an entire PDF document, from a single PDF page, from within page co-ordinates or from tables. PDF font information and PDF metadata can also be extracted. If a PDF contains text, JPedal can extract it.
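To illustrate what extracting text "from within page co-ordinates" means, here is a minimal conceptual sketch in plain Java. It does not use the JPedal API: `Fragment`, `Region` and `textInRegion` are hypothetical names made up for this illustration. It assumes the default PDF co-ordinate space (origin at the bottom-left of the page, units in points) and treats each piece of text as a single anchor point.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Conceptual sketch only: this is NOT the JPedal API.
 * Fragment, Region and textInRegion are hypothetical names.
 */
public class RegionTextSketch {

    /** A run of text anchored at (x, y) in PDF points, origin at bottom-left. */
    static final class Fragment {
        final String text;
        final double x;
        final double y;

        Fragment(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** A rectangle on the page, given by any two opposite corners. */
    static final class Region {
        final double minX, minY, maxX, maxY;

        Region(double x1, double y1, double x2, double y2) {
            this.minX = Math.min(x1, x2);
            this.minY = Math.min(y1, y2);
            this.maxX = Math.max(x1, x2);
            this.maxY = Math.max(y1, y2);
        }

        boolean contains(Fragment f) {
            return f.x >= minX && f.x <= maxX && f.y >= minY && f.y <= maxY;
        }
    }

    /** Joins the fragments inside the region, top to bottom, then left to right. */
    static String textInRegion(List<Fragment> fragments, Region region) {
        List<Fragment> hits = new ArrayList<>();
        for (Fragment f : fragments) {
            if (region.contains(f)) {
                hits.add(f);
            }
        }
        // PDF y grows upward, so a larger y means higher on the page.
        hits.sort((a, b) -> a.y != b.y
                ? Double.compare(b.y, a.y)
                : Double.compare(a.x, b.x));

        StringBuilder out = new StringBuilder();
        for (Fragment f : hits) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(f.text);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Fragment> page = new ArrayList<>();
        page.add(new Fragment("Footer", 72, 40));
        page.add(new Fragment("World", 130, 700));
        page.add(new Fragment("Hello", 72, 700));
        page.add(new Fragment("Second line", 72, 680));

        // Upper part of a US Letter page (612 x 792 points).
        Region upper = new Region(0, 400, 612, 792);
        System.out.println(textInRegion(page, upper));
    }
}
```

A real extractor also has to deal with text that straddles the region boundary, rotated pages and encoding, which this sketch leaves out on purpose.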

PDF text can be extracted as text or as XML content including font, colour and spacing information.

Key Features

  • Convert text in PDF to XML or UTF-8 text
  • 100% Java and multi-platform
  • Fully automated
  • Structured content extraction from structured PDF files
  • Convert all document pages or specific page range
  • Highly configurable
  • Single JAR
  • Lots of tutorials and monthly new releases
  • XFA support available in XFA version


Example of PDF to Text Extraction


Quick Start

Here is a quick snippet of code which will allow you to extract text from a PDF file (if there is no structure present, a blank file will be returned):

See the full Javadoc or download the trial JAR.

Q & A

Q: Do you offer any PDF to text examples?

Yes, we offer a large range of PDF to text examples.

Q: Are there any Java code examples?

Yes. A guide to structured content extraction is available if your PDF contains this optional metadata, and there is also a guide to PDF metadata if you are extracting metadata from PDF files.

Q: Is there a Free Trial I can download to try the JPedal Java PDF Library?

Yes, you can download the 30-day free trial JAR for the JPedal PDF library.

Q: Do I need to purchase features separately?

No, everything you need to develop an application that requires a world-class Java PDF SDK is included in one complete package. You do not need to purchase additional modules.