SEARCH PDF FILES WITH JAVA
JPedal makes it easy for Java Developers to search PDF files for textual content, including searches that use advanced regular expressions.
Why do Java Developers use JPedal for text search?
JPedal is a Java PDF library which can search the text in even the most complex PDF files. It can handle complex search requests, including regular expressions that match across multiple lines.
Support for PDF 2.0 Specification
JPedal supports all the features in the latest PDF Specification including structure tags, complex fonts, and multiple languages.
Text Handled As Unicode
JPedal removes all the complexity of PDF text encoding. As far as the Developer is concerned, everything is just Unicode.
Text Metrics Returned
JPedal returns the exact bounding boxes of any text found.
Multi-Line Search With Regular Expressions
JPedal is able to search across multiple lines using regular expressions as search terms.
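To illustrate what a multi-line regular expression search means in practice, here is a minimal sketch using only the standard java.util.regex package. It does not use the JPedal API: the class name MultiLineSearchSketch, the method findAcrossLines and the sample text are all hypothetical, and the String simply stands in for text that has already been extracted from a PDF page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of JPedal.
final class MultiLineSearchSketch {

    // Returns every substring of text that matches regex.
    // Pattern.DOTALL lets '.' match line breaks, so a match may span lines.
    static List<String> findAcrossLines(String text, String regex) {
        Pattern pattern = Pattern.compile(regex, Pattern.DOTALL);
        Matcher matcher = pattern.matcher(text);
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Assumed sample text standing in for extracted page content.
        String pageText = "Invoice number:\n12345\nTotal due:\n99.00";
        // \s+ matches the line break between the label and the number.
        System.out.println(findAcrossLines(pageText, "Invoice number:\\s+\\d+"));
    }
}
```

In this sketch the whitespace class \s already matches a line break, while the DOTALL flag is what allows a plain '.' to do the same.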
JPedal Text Search Key Features
JPedal allows developers to search the textual content inside a PDF Document and return the bounds of any matches.
JPedal can search whole Documents, specific pages or specific parts of pages.
Search values can include standard regular expressions.
Multiple Search options
JPedal searches can be case sensitive or case insensitive, and can match whole words only or any text.
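The effect of these two options can be sketched with the standard java.util.regex package. This is an illustration of the concepts only, not JPedal's own API: the class SearchOptionsSketch and its methods build and count are hypothetical names, and the text being searched is an ordinary String.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of JPedal.
final class SearchOptionsSketch {

    // Builds a pattern for a literal search term with the two options applied.
    static Pattern build(String term, boolean caseSensitive, boolean wholeWordsOnly) {
        // Quote the term so any regex metacharacters in it are treated literally.
        String regex = Pattern.quote(term);
        if (wholeWordsOnly) {
            // \b requires a word boundary on each side of the term.
            regex = "\\b" + regex + "\\b";
        }
        // Use Unicode definitions of word characters for \b.
        int flags = Pattern.UNICODE_CHARACTER_CLASS;
        if (!caseSensitive) {
            // UNICODE_CASE extends case-insensitive matching beyond ASCII.
            flags |= Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
        }
        return Pattern.compile(regex, flags);
    }

    // Counts the non-overlapping matches of pattern in text.
    static int count(String text, Pattern pattern) {
        Matcher matcher = pattern.matcher(text);
        int total = 0;
        while (matcher.find()) {
            total++;
        }
        return total;
    }

    public static void main(String[] args) {
        String text = "The cat sat on the Catalogue. CAT!";
        System.out.println(count(text, build("cat", true, false)));
        System.out.println(count(text, build("cat", false, false)));
        System.out.println(count(text, build("cat", false, true)));
    }
}
```

With whole-word matching switched on, "Catalogue" no longer counts as a hit for "cat" because the term is followed by another letter rather than a word boundary.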
Extract Word Positioning
JPedal is able to return the co-ordinates of the bounding boxes for any text matches found.
First Instance Or All Matches
JPedal can search PDF Documents for just the first match or scan the entire document for every match.
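The difference between stopping at the first match and collecting every match can be sketched as follows. This uses only java.util.regex on a plain String and is not JPedal's API: the class FirstOrAllSketch and its methods are hypothetical, and the character offsets returned here merely stand in for the match positions a PDF search would report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of JPedal.
final class FirstOrAllSketch {

    // Returns the start offset of the first match, or -1 if there is none.
    // The scan stops as soon as one match is found.
    static int firstMatch(String text, Pattern pattern) {
        Matcher matcher = pattern.matcher(text);
        return matcher.find() ? matcher.start() : -1;
    }

    // Returns the start offsets of every non-overlapping match, in order.
    static List<Integer> allMatches(String text, Pattern pattern) {
        Matcher matcher = pattern.matcher(text);
        List<Integer> starts = new ArrayList<>();
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("ab");
        System.out.println(firstMatch("ab ab ab", pattern));
        System.out.println(allMatches("ab ab ab", pattern));
    }
}
```

Stopping at the first match avoids scanning the rest of the text, which is useful when the question is only whether a term occurs at all.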
JPedal searches in Unicode so ALL languages are supported regardless of any Encoding issues.
Documentation and Code Examples
Support section showing how to search PDF Documents for text.