
C# PDF to HTML Conversion Library
How to read and convert a PDF file to HTML files programmatically, with formatted text, using C#.NET


Create HTML web files (HTML, CSS, and JavaScript files) from a PDF document in a C# .NET program.





On this page, you will learn how to convert and export PDF content to HTML files in .NET Windows and ASP.NET applications using C#:

  • Convert a PDF page or all pages to html files
  • Convert all PDF pages to one single html file
  • Export multiple PDF files to html files

How to convert PDF to html files using C#

  1. Download XDoc.PDF html converter C# library
  2. Install C# library to convert PDF to html text and image files
  3. Step by Step Tutorial












  • C#.NET PDF converter SDK for converting PDF to HTML in Visual Studio .NET
  • Free .NET Framework library for converting PDF to HTML in both C#.NET WinForms and ASP.NET applications
  • Complete sample source code for quick integration and for converting PDF to HTML in a C#.NET class
  • Supports .NET WinForms, ASP.NET MVC in IIS, ASP.NET Ajax, Azure cloud service, DNN (DotNetNuke), and SharePoint
  • Embed converted HTML files in an HTML page or iframe
  • Use JavaScript (jQuery) to control PDF page navigation
  • Cross-browser support, including Chrome, Firefox, IE, Edge, and Safari
  • Embedded zoom settings (fit page, fit width)
  • Turn PDF form data into an HTML form
  • Export PDF images to HTML images
  • Automatically convert hyperlinks (URL links) inside the PDF document to HTML format
  • Full-featured online tools for PDF to HTML conversion, with no email required and no watermark embedded


Our PDF to HTML converter library is a 100% clean .NET document imaging solution designed to help .NET developers convert PDF to HTML web pages using simple C# code. With it, C# developers can convert a multi-page PDF document and save each PDF page as a separate HTML file in a .NET class application.

The HTML files produced by the C#.NET PDF to HTML converter preserve the original anchors, links, bookmarks, and font styles of the source PDF document. The converted HTML pages also keep the original formatting and the relative placement of text and graphical elements.

This C#.NET PDF to HTML conversion library can help avoid the browser crashes that may occur when a web browser tries to display a PDF document directly inside a browser window. Converting a PDF document into HTML web pages also makes its content visible and searchable on the Internet.





How to convert PDF to HTML Files in C#.NET Class


The following C# example source code converts a PDF document to HTML files. Each PDF page produces one HTML file. All web resources (such as .woff web fonts, images, and CSS) are generated in separate folders.



String inputFilePath = "C:\\1.pdf";

PDFDocument doc = new PDFDocument(inputFilePath);
//  Path of the output folder for all HTML files.
String outputFolder = "C:\\Html";
//  Prefix of all output HTML file names. 
String fileNamePrefix = "File-";
//  Convert each page of the PDF to an HTML file with file name: [fileNamePrefix][Page Index].html
//  E.g.: File-0.html, File-1.html, ...
doc.ConvertToVectorImages(ContextType.HTML, outputFolder, fileNamePrefix, RelativeType.HTML);
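
Because each page becomes its own file named [fileNamePrefix][Page Index].html, a small index page can make the output folder easier to browse. The following is only a sketch that uses standard .NET classes; the class HtmlIndexSketch and its methods are hypothetical helpers written for this page and are not part of the XDoc.PDF library. The page index is parsed as a number so that File-10.html sorts after File-2.html.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Text;

// Hypothetical helper class (not part of XDoc.PDF).
public static class HtmlIndexSketch
{
    // Returns the page index encoded in a name such as "File-12.html",
    // or -1 if the name does not follow the [prefix][Page Index] pattern.
    public static int PageIndexOf(string fileName, string prefix)
    {
        string stem = Path.GetFileNameWithoutExtension(fileName);
        if (!stem.StartsWith(prefix, StringComparison.Ordinal)) return -1;
        return int.TryParse(stem.Substring(prefix.Length), out int index) && index >= 0 ? index : -1;
    }

    // Builds a minimal HTML page that links to every matching page file,
    // ordered by numeric page index. Non-matching files are skipped.
    public static string BuildIndexHtml(IEnumerable<string> fileNames, string prefix)
    {
        var pages = fileNames
            .Select(name => new { Name = Path.GetFileName(name), Index = PageIndexOf(name, prefix) })
            .Where(p => p.Index >= 0)
            .OrderBy(p => p.Index);

        var sb = new StringBuilder("<!DOCTYPE html>\n<html><body><ul>\n");
        foreach (var p in pages)
        {
            // Page indexes are zero-based, so the link text shows Index + 1.
            sb.Append("<li><a href=\"").Append(WebUtility.HtmlEncode(p.Name)).Append("\">Page ")
              .Append(p.Index + 1).Append("</a></li>\n");
        }
        return sb.Append("</ul></body></html>\n").ToString();
    }
}
```

A possible use after the conversion above: pass Directory.GetFiles(outputFolder, "File-*.html") and the prefix "File-" to BuildIndexHtml, then save the result with File.WriteAllText as index.html in the same output folder.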






PDF to HTML Converter Options


You can use the method "PDFDocument.ConvertToVectorImages()" or "PDFPage.ConvertToVectorImage()" to convert a multi-page PDF file into HTML web files using C#.
The parameters of ConvertToVectorImages define how the output HTML files are generated.


  1. The 1st parameter of the method MUST BE ContextType.HTML.

  2. Valid RelativeType values for converting a document to HTML file(s):
    HTML: Output HTML file in the standard format.
    HTMLNF: Output HTML file without embedded fonts.
    MOSS: Output HTML file compatible with SharePoint applications.

  3. All font resource files required by the output HTML files are put in a folder named "font" in the same directory as those HTML files.

  4. All image resource files required by the output HTML files are put in a folder named "image" in the same directory as those HTML files.
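
Since the HTML files depend on the "font" and "image" folders next to them, the output folder has to be moved or published as a whole. The sketch below shows one way to copy it recursively with standard System.IO calls. The class OutputDeploySketch and the method CopyDirectory are hypothetical names used for illustration and are not part of XDoc.PDF.

```csharp
using System.IO;

// Hypothetical helper class (not part of XDoc.PDF).
public static class OutputDeploySketch
{
    // Copies every file in sourceDir, including sub-folders such as
    // "font" and "image", into targetDir. Returns the number of files copied.
    public static int CopyDirectory(string sourceDir, string targetDir)
    {
        Directory.CreateDirectory(targetDir);
        int copied = 0;

        foreach (string file in Directory.GetFiles(sourceDir))
        {
            File.Copy(file, Path.Combine(targetDir, Path.GetFileName(file)), overwrite: true);
            copied++;
        }

        // Recurse into sub-folders so resource folders keep their relative location.
        foreach (string dir in Directory.GetDirectories(sourceDir))
        {
            copied += CopyDirectory(dir, Path.Combine(targetDir, Path.GetFileName(dir)));
        }
        return copied;
    }
}
```

For example, CopyDirectory(@"C:\Html", webRootFolder) would copy the HTML files together with their font and image resources, where webRootFolder stands for whatever folder your web application serves.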









How to convert a PDF page to one html file in C#.NET code?


Below are the steps and C# example source code to convert a single PDF page to an HTML file programmatically using C#.

  1. Create a new PDFDocument object from an existing PDF file
  2. Get a PDFPage object for the second PDF page (page indexes are zero-based, so the second page has index 1)
  3. Call method PDFPage.ConvertToVectorImage() to convert the PDF page to an HTML file with the options applied.



String inputFilePath = "C:\\1.pdf";

PDFDocument doc = new PDFDocument(inputFilePath);
//  Path of the output folder for the HTML file.
String outputFolder = "C:\\Html";
//  Prefix of the output file name. 
String fileNamePrefix = "File-";
//  Convert the 2nd page to an HTML file with file name: File-1.html
PDFPage page = (PDFPage)doc.GetPage(1);
page.ConvertToVectorImage(ContextType.HTML, outputFolder, fileNamePrefix, RelativeType.HTML);






How to convert a PDF file to one single html file using C# code?


Below are the steps and C# sample code to convert a multi-page PDF file to one single HTML file programmatically using C#.

  1. Create a new PDFDocument object from an existing PDF file
  2. Call method PDFDocument.ConvertToHtml() to convert all PDF pages to one HTML file.



String inputFilePath = "C:\\1.pdf";

PDFDocument doc = new PDFDocument(inputFilePath);
//  Path of the output folder for the HTML file.
String outputFolder = "C:\\Html";
//  Output file name.
String fileName = "output";
//  Convert the whole PDF document to an HTML file: output.html
doc.ConvertToHtml(outputFolder, fileName, RelativeType.HTML);
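
A single output file such as output.html can be embedded in another web page through an iframe. The following sketch builds a minimal host page as a string using only standard .NET classes. The class EmbedSketch, the method BuildHostPage, and the inline styles are illustrative assumptions, not part of XDoc.PDF.

```csharp
using System.Net;

// Hypothetical helper class (not part of XDoc.PDF).
public static class EmbedSketch
{
    // Returns a minimal HTML host page that shows htmlFileName in a full-size iframe.
    // Both arguments are HTML-encoded before being written into the markup.
    public static string BuildHostPage(string htmlFileName, string title)
    {
        string src = WebUtility.HtmlEncode(htmlFileName);
        string safeTitle = WebUtility.HtmlEncode(title);
        return "<!DOCTYPE html>\n" +
               "<html>\n<head><meta charset=\"utf-8\"><title>" + safeTitle + "</title></head>\n" +
               "<body style=\"margin:0\">\n" +
               "<iframe src=\"" + src + "\" title=\"" + safeTitle +
               "\" style=\"border:0;width:100%;height:100vh\"></iframe>\n" +
               "</body>\n</html>\n";
    }
}
```

Saving the result of BuildHostPage("output.html", "Converted PDF") next to output.html keeps the relative iframe path valid.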






C# converting two or multiple PDF files to html (batch conversion)


        #region pdf to html (batch files, single thread)
        //  Requires: using System.IO;
        internal static void pdfFilesToHtml()
        {
            String inputDirectory = @"C:\input\";
            String outputDirectory = @"C:\output\";
            //  Find every PDF file in the input folder.
            String[] files = Directory.GetFiles(inputDirectory, "*.pdf");
            foreach (String filePath in files)
            {
                //  Use the PDF file name (without folder or extension) as the output file name prefix.
                String docName = Path.GetFileNameWithoutExtension(filePath);
                PDFDocument doc = new PDFDocument(filePath);
                //  Convert each page to an HTML file with file name: [docName][Page Index].html
                doc.ConvertToVectorImages(ContextType.HTML, outputDirectory, docName, RelativeType.HTML);
            }
        }
        #endregion
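
In the batch example above, all documents share one output folder, so their page files and their "font" and "image" resource folders end up side by side. If you prefer to keep each document separate, one option is to give every PDF its own sub-folder and prefix. The sketch below only plans those paths with standard .NET classes; BatchPlanSketch, Job, and Plan are hypothetical names, and whether resource files from different documents would actually clash in a shared folder is an assumption, not something this page confirms.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper class (not part of XDoc.PDF).
public static class BatchPlanSketch
{
    // One planned conversion: where to read from and where to write to.
    public sealed class Job
    {
        public string InputPath { get; set; }
        public string OutputFolder { get; set; }
        public string FileNamePrefix { get; set; }
    }

    // Maps each PDF path to its own output sub-folder and file name prefix,
    // e.g. "report.pdf" -> folder "<outputRoot>/report", prefix "report-".
    public static List<Job> Plan(IEnumerable<string> pdfPaths, string outputRoot)
    {
        var jobs = new List<Job>();
        foreach (string path in pdfPaths)
        {
            string docName = Path.GetFileNameWithoutExtension(path);
            jobs.Add(new Job
            {
                InputPath = path,
                OutputFolder = Path.Combine(outputRoot, docName),
                FileNamePrefix = docName + "-"
            });
        }
        return jobs;
    }
}
```

Each planned job could then be run in a loop: call Directory.CreateDirectory(job.OutputFolder), create a PDFDocument from job.InputPath, and pass job.OutputFolder and job.FileNamePrefix to ConvertToVectorImages in the same way as in the batch example.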