How do I run the software directly from the jar?
The single jar file can be placed anywhere - you will just need to ensure Java is installed (we recommend Java 7). You will also need to be able to write to the output location.
You can run the PDF to HTML5 conversion program directly from the jar - for example if you wanted to use it from some other language or script.
Quickstart (if your PDF files do not contain JPEG2000 or TIFF images)
To run it, use:
java -jar $path_to_jar/JPDF2HTML.jar $dir_with_pdf_files $output
$path_to_jar is the location of the jar, $dir_with_pdf_files is the location of the PDF files to convert, and $output is where to place the converted files. If a directory does not exist it will be created.
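If you call the jar from a script, it can help to check the prerequisites above before starting the conversion. The following is a minimal sketch only: the function name convert_pdfs, its return codes and its messages are hypothetical and not part of the software.

```shell
# Hypothetical wrapper around the quickstart command.
# Usage: convert_pdfs <path_to_jar> <dir_with_pdf_files> <output>
convert_pdfs() {
    jar_dir=$1
    pdf_dir=$2
    out_dir=$3

    # The jar must exist in the directory we were given.
    if [ ! -f "$jar_dir/JPDF2HTML.jar" ]; then
        echo "JPDF2HTML.jar not found in $jar_dir" >&2
        return 1
    fi

    # Java must be installed and on the PATH.
    if ! command -v java >/dev/null 2>&1; then
        echo "java not found on PATH" >&2
        return 2
    fi

    # The converter creates missing directories itself, but creating the
    # output directory up front surfaces permission problems early.
    if ! mkdir -p "$out_dir" || [ ! -w "$out_dir" ]; then
        echo "cannot write to $out_dir" >&2
        return 3
    fi

    java -jar "$jar_dir/JPDF2HTML.jar" "$pdf_dir" "$out_dir"
}

# Example call (paths are placeholders):
# convert_pdfs "$path_to_jar" "$dir_with_pdf_files" "$output"
```

The checks run before Java is started, so a missing jar or an unwritable output location is reported by the script rather than by the converter.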
All PDF files
You will need the additional JAI jars - you can download them from here.
java -Dorg.jpedal.jai=true -cp $path_to_jar/JPDF2HTML.jar:$path_to_jai/jai_codec.jar:$path_to_jai/jai_core.jar:$path_to_jai/imageio.jar org.jpedal.examples.html.ExtractPagesAsHTML $dir_with_pdf_files $output
Important note for Windows developers: separate the jars with a semi-colon ; rather than a colon :
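If the same script has to run on both Windows and other platforms, the separator can be chosen at run time. This is a sketch under assumptions: the helper names classpath_separator and join_classpath are hypothetical, and the Windows check assumes a Unix-style shell such as Cygwin, MinGW or MSYS, where uname reports one of those names.

```shell
# Pick the classpath separator for the current platform.
# Assumption: a uname value starting with CYGWIN, MINGW or MSYS means
# we are in a Unix-style shell on Windows, where Java expects ';'.
classpath_separator() {
    case "$(uname -s)" in
        CYGWIN*|MINGW*|MSYS*) printf ';' ;;
        *) printf ':' ;;
    esac
}

# Join the given jar paths with the given separator.
# Usage: join_classpath <separator> <jar> [<jar> ...]
join_classpath() {
    sep=$1
    shift
    result=""
    for jar in "$@"; do
        if [ -z "$result" ]; then
            result=$jar
        else
            result="$result$sep$jar"
        fi
    done
    printf '%s' "$result"
}

# Example use (paths are placeholders):
# cp=$(join_classpath "$(classpath_separator)" \
#     "$path_to_jar/JPDF2HTML.jar" \
#     "$path_to_jai/jai_codec.jar" \
#     "$path_to_jai/jai_core.jar" \
#     "$path_to_jai/imageio.jar")
# java -Dorg.jpedal.jai=true -cp "$cp" \
#     org.jpedal.examples.html.ExtractPagesAsHTML "$dir_with_pdf_files" "$output"
```

Quoting "$cp" matters here, because an unquoted semi-colon would be read by the shell as a command separator.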
My Example
Here is my test example:
java -Xmx512M -Dorg.jpedal.jai=true -jar JPDF2HTML.jar /Users/markee/pdfs/ /Users/markee/output/
You may also want to increase the memory available to the JVM, as the -Xmx512M flag does in this example.
What sort of output will I get?
The code can generate several types of output, including:
- A full conversion of the PDF to HTML5, with text rendered using embedded fonts and shapes rendered on Canvas (TEXT_AS_TEXT). This is the default setting.
- All content rendered to Canvas (TEXT_AS_SHAPE).
- All content rendered to an image with visible text to allow text selection (TEXT_VISIBLE_ON_IMAGE).
- All content rendered to an image with invisible text to allow text selection (TEXT_INVISIBLE_ON_IMAGE).
You can set the mode in the ExtractPagesAsHTML example, or choose it by setting one of the following JVM flags:
- -Dorg.jpedal.pdf2html.textMode="all" (all modes)
- -Dorg.jpedal.pdf2html.textMode="shape"
- -Dorg.jpedal.pdf2html.textMode="visible"
- -Dorg.jpedal.pdf2html.textMode="invisible"
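In a script, the flag can be built from a mode name so that typos are caught before Java starts. This is a sketch only: the function name text_mode_flag and the "default" keyword are hypothetical. Only the four flag values listed above are used, and no flag is passed for the default mode.

```shell
# Map a mode name to the matching JVM flag.
# Prints nothing for "default" (a hypothetical keyword meaning "pass no flag").
text_mode_flag() {
    case "$1" in
        all|shape|visible|invisible)
            printf '%s' "-Dorg.jpedal.pdf2html.textMode=$1"
            ;;
        default)
            ;;
        *)
            echo "unknown text mode: $1" >&2
            return 1
            ;;
    esac
}

# Example use (paths are placeholders):
# flag=$(text_mode_flag shape) || exit 1
# java $flag -jar "$path_to_jar/JPDF2HTML.jar" "$dir_with_pdf_files" "$output"
```

$flag is left unquoted in the example call on purpose, so that an empty value for the default mode does not become an empty argument to java.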
Do I need any additional jars?
Possibly, if the PDF is encrypted or contains TIFF data. Full details are here.
How do I generate SVG instead?
There is a separate tutorial here.
How do I include the HTML5 conversion software in my Java code?
There is a documented Java PDF to HTML5 example included in the jar. Click to view the source code.
How do I get assistance?
Our developers are on the forums to answer your questions.
Will you do the coding for us?
We are happy to provide coding on a commercial basis only.
How do I find the version number of the software?
You can see the version number of the PDF to HTML5 library by running ExtractPagesAsHTML with no parameters, or you can access it directly from the static String variable HTMLDisplay.HTMLversion.
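From a script, the first approach might look like the sketch below. The function name print_converter_version is hypothetical, and the package name org.jpedal.examples.html is assumed from the command used for all PDF files. The exact text printed is not shown here, as it depends on the release.

```shell
# Run ExtractPagesAsHTML with no parameters so that it reports its version.
# Usage: print_converter_version <full path to JPDF2HTML.jar>
print_converter_version() {
    jar=$1
    # Fail early with a clear message if the jar is not where we expect it.
    if [ ! -f "$jar" ]; then
        echo "jar not found: $jar" >&2
        return 1
    fi
    java -cp "$jar" org.jpedal.examples.html.ExtractPagesAsHTML
}

# Example call (path is a placeholder):
# print_converter_version "$path_to_jar/JPDF2HTML.jar"
```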