Convert PDF to HTML or Embed a PDF Viewer?
A guide for development teams choosing how to display PDF content in the browser
Two Ways to Display a PDF in the Browser
If you're building a web application that needs to display PDF content, there are two approaches to choose from.
Approach 1: Client-side viewing sends the PDF file to the browser, where a viewer written in JavaScript or WebAssembly parses and renders it on the spot. PDF.js works this way, as do commercial SDKs from Apryse and Nutrient.
Approach 2: Server-side conversion parses the PDF on a server and produces HTML, CSS, and SVG. The browser never sees the PDF file. It receives web content and renders it like any other page. BuildVu works this way.
Which approach fits depends on what you're building. Most use cases fall into one of three categories.
Which Approach Fits Your Use Case?
1. Viewing files
If your application stores files and users need to view them in the browser, a client-side viewer is the simplest fit. Jira, Salesforce, Dropbox, and Basecamp all work this way. The PDF is sent to the browser and rendered there, which is fine because it's the user's file. Access control happens through authentication.
PDF.js handles this well for most standard PDFs. It's free, open-source, and easy to integrate.
2. Document workflows
If users need to annotate, redline, sign, or otherwise mark up PDF documents in the browser, you need a commercial viewer SDK. Apryse (formerly PDFTron) and Nutrient (formerly PSPDFKit) are the main options here. They're designed to move workflows that used to require desktop software (like Adobe Acrobat) into a web browser.
Client-side viewers also matter when the file must never leave the local device. In some legal and compliance contexts, documents cannot be uploaded to a server for processing. A client-side viewer keeps everything in the browser, with no server round-trip.
3. Publishing PDF content on the web
If the content of a PDF needs to be part of a web experience you control, server-side conversion is the right approach. A conversion engine turns the PDF into HTML, CSS, and SVG on your server. The browser receives standard web content. The original PDF file never reaches the client.
For many organisations, keeping the PDF off the client is the primary reason to convert. Education publishers with subscription-based access to textbooks and journals use this approach. So do companies publishing financial reports and newsletters, where the content should be viewable online but not easily downloaded or redistributed. Because only HTML reaches the browser, you can use standard web techniques to disable printing, right-clicking, downloading, and text selection.
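As a rough illustration of those standard web techniques, the sketch below post-processes one converted page in Python. The function name `add_deterrents` and the assumption that each page has a `</head>` and a `<body>` tag are hypothetical and not part of any conversion tool's API. The CSS and the `oncontextmenu` attribute are ordinary browser features that discourage casual copying and printing; they do not make it impossible.

```python
# Stylesheet that turns off text selection and hides the page when printed.
DETERRENT_CSS = """<style>
body { user-select: none; -webkit-user-select: none; }
@media print { body { display: none; } }
</style>"""


def add_deterrents(html: str) -> str:
    """Insert the deterrent stylesheet before </head> and block the context menu.

    Assumes the page contains one </head> and one <body ...> tag, which is an
    assumption of this sketch rather than a guarantee about converter output.
    """
    if "</head>" not in html or "<body" not in html:
        raise ValueError("expected a page with </head> and <body> tags")
    # Add the stylesheet at the end of the document head.
    html = html.replace("</head>", DETERRENT_CSS + "\n</head>", 1)
    # Returning false from oncontextmenu suppresses the right-click menu.
    return html.replace("<body", '<body oncontextmenu="return false"', 1)
```

Running this once as a post-processing step after conversion keeps the policy in one place, so it can be tightened or relaxed without converting the documents again.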
Converted HTML is also indexable by search engines, which matters for publishers, government agencies, and anyone whose documents need to be found online. Academia.edu chose conversion specifically so their research papers would be indexed and discoverable by Google. And because the output is real HTML rather than canvas-rendered pixels or rasterised page images, it works with screen readers and keyboard navigation without extra effort.
Other Advantages of Conversion
Bandwidth and delivery
Conversion means individual pages can be served on demand rather than transferring an entire document upfront. One education provider serves material to remote parts of Africa where bandwidth is expensive, and serving single HTML pages instead of full PDF files has significantly reduced their costs. The converted pages are also standard web assets, so they cache naturally in the browser and at the CDN level, unlike a viewer, which must re-parse and re-render the PDF on every visit.
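To make the caching point concrete, here is a minimal Python sketch of per-page delivery. The `<doc_id>/<page>.html` layout, the one-day `max-age`, and the helper names are all assumptions made for illustration; your converter's output layout and your caching policy may differ. `Cache-Control`, `ETag`, and the 304 response are standard HTTP.

```python
import hashlib
from typing import Dict, Optional

# Hypothetical policy: let browsers and CDNs keep a page for one day.
# Use "private" instead of "public" if pages sit behind a login.
CACHE_CONTROL = "public, max-age=86400"


def page_path(doc_id: str, page: int) -> str:
    """Map a document ID and 1-based page number to a converted file path."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return f"{doc_id}/{page}.html"


def cache_headers(body: bytes) -> Dict[str, str]:
    """Build caching headers for one converted page."""
    # A content hash makes a stable ETag: same bytes, same tag.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    return {"Cache-Control": CACHE_CONTROL, "ETag": etag}


def status_for(if_none_match: Optional[str], body: bytes) -> int:
    """Return 304 if the client's cached copy is current, else 200."""
    etag = cache_headers(body)["ETag"]
    return 304 if if_none_match == etag else 200
```

A request for page 3 of a document then touches only that page's file, and a repeat visit with a matching `If-None-Match` header can be answered with a 304 and no body.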
Control over the output
You can use the converted HTML as it is, or style it and integrate it into your own web application. The output is standard HTML, and you can modify it as needed. It has been used for things like financial tagging (XBRL), which would be difficult with a canvas-based viewer where the content isn't part of the DOM.
Licensing and lock-in
With a commercial client-side viewer SDK, you pay a license fee for as long as documents need to be viewable. Stop paying and the viewer stops working. With server-side conversion, you pay to convert the documents, but the resulting HTML is yours to host and serve for as long as you need. If you have a fixed set of documents, you can convert them once and never pay again.
Replacing image-based viewers
Many teams adopting conversion are replacing older solutions that displayed PDF pages as rasterised images. Those image-based viewers lack text search, accessibility support, and text selection. Converting to HTML fixes all three.
Viewer and Conversion Tools
PDF.js
PDF.js is Mozilla's open-source PDF renderer for JavaScript, licensed under Apache 2.0. It is developed primarily as the built-in PDF viewer for Firefox; usage outside Firefox is not officially supported by Mozilla. It works well for standard, well-formed PDFs. For production use, the main limitations are the absence of commercial support, inconsistent handling of malformed or unusual PDFs, limited form and annotation support, and canvas-based rendering where content is drawn to a canvas element rather than the DOM.
pdf2htmlEX
pdf2htmlEX is an open-source C++ tool that converts PDFs to HTML. The original project has been abandoned. A community fork exists but receives only sporadic maintenance.
The bigger issue is conversion quality. Many PDFs produce broken or poorly rendered output, and there's nobody to fix it. The project is also dual-licensed GPLv2/GPLv3 (which needs legal review for commercial use) and requires compiling from source or running via Docker.
BuildVu and FormVu
BuildVu is a commercial server-side conversion engine from IDRsolutions. It converts PDF files to HTML, CSS, and SVG. The output includes a lightweight viewer component for displaying the converted documents, but developers can also use the raw HTML directly in their own applications. It's Java-based and runs on-premises or in the cloud.
FormVu (also from IDRsolutions) handles the specific case of converting interactive PDF forms into native HTML5 forms.
Apryse and Nutrient
Apryse (formerly PDFTron) and Nutrient (formerly PSPDFKit) are commercial SDK platforms built for the document-workflow category described above: annotation, redlining, digital signatures, and form filling in the browser. Both also offer server-side rendering capabilities. They solve a different problem from BuildVu, which is focused on publishing and displaying PDF content as HTML.
Conversion vs Viewer: Side-by-Side
| Criteria | Server-Side Conversion | Client-Side Viewer |
|---|---|---|
| SEO | Converted HTML is indexed by search engines like any web page. | Canvas-rendered content is not in the DOM and cannot be indexed. |
| Accessibility | The output is real HTML. Screen readers and keyboard navigation work without extra effort. | Canvas rendering requires additional engineering to meet WCAG requirements. |
| Document protection | The PDF stays on the server. Only HTML reaches the browser. You can restrict printing, downloading, and text selection. | The PDF file is downloaded to the browser. Users can save, copy, or redistribute the original document. |
| Control over output | You receive HTML, CSS, and SVG that you can style, modify, and integrate into your own application. | The viewer controls the rendering. Customisation depends on the SDK's API. |
| Content manipulation | The HTML output can be processed, tagged (e.g. XBRL), or fed into other systems. | Content is rendered to canvas. Extracting or manipulating it programmatically is limited. |
| Deployment simplicity | Requires running a conversion step and storing the converted output. | Serve the PDF file and include a JavaScript library. No server-side processing. |
| Annotation creation | Can display existing annotations. Creating new annotations is not the intended use. | Annotation creation, redlining, and digital signatures are the core use case for commercial viewer SDKs. |
| Licensing lock-in | Convert once, serve the HTML indefinitely. No ongoing runtime dependency on the vendor. | The viewer SDK must be present for as long as documents need to be viewable. Stop paying and the viewer stops working. |
| Real-time preview | Conversion takes time. Not suited to instant preview of newly uploaded files. | Renders in the browser immediately. |
Common Scenarios and Recommendations
Publishing research papers, reports, or whitepapers online? Convert. The content needs to be discoverable by search engines.
Subscription-based access to documents (education, publishing)? Convert. You need content protection, control over the viewing experience, and the ability to integrate the content into your platform.
PDF forms that users fill in online? Convert. HTML5 forms work reliably across browsers and devices. PDF form rendering in browsers does not.
Replacing an image-based PDF viewer? Convert. You get searchable, accessible HTML instead of rasterised images.
File attachments in a SaaS product? Viewer. The files belong to users. PDF.js handles this with no server-side processing.
Legal review with annotation and signatures? Viewer SDK (Apryse or Nutrient). These workflows are what commercial viewer SDKs are built for.
Sensitive documents that cannot leave the local device? Viewer. A client-side viewer can process everything in the browser with no server round-trip.
Quick preview of user-uploaded PDFs? Viewer. A client-side viewer renders the file immediately without waiting for server-side conversion.
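The recommendations above can be condensed into a small decision helper. This Python sketch is one possible encoding: the field names, the three result labels, and especially the order in which conflicting requirements are resolved are assumptions of the sketch, not rules stated in this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    needs_annotation: bool = False       # annotate, redline, or sign in the browser
    must_stay_on_device: bool = False    # file may not be uploaded to a server
    needs_instant_preview: bool = False  # show newly uploaded files immediately
    needs_search_indexing: bool = False  # content must be discoverable online
    needs_protection: bool = False       # original PDF must not reach the client
    has_fillable_forms: bool = False     # users fill in PDF forms online


def recommend(req: Requirements) -> str:
    """Return "viewer-sdk", "viewer", or "convert" for a set of requirements."""
    # Assumed priority: markup workflows need a commercial viewer SDK.
    if req.needs_annotation:
        return "viewer-sdk"
    # Hard constraints that rule out a server-side conversion step.
    if req.must_stay_on_device or req.needs_instant_preview:
        return "viewer"
    # Publishing concerns point to conversion.
    if req.needs_search_indexing or req.needs_protection or req.has_fillable_forms:
        return "convert"
    # Plain file viewing: a client-side viewer such as PDF.js.
    return "viewer"
```

For example, `recommend(Requirements(needs_search_indexing=True))` returns `"convert"`, while `recommend(Requirements())` falls through to `"viewer"`, matching the file-attachment scenario.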
Try BuildVu With Your Own PDFs
We offer evaluation licenses for BuildVu and FormVu. Test with your own documents. Contact our technical team if you'd like to discuss your requirements first.