
VB.NET PDF - Extract Text from PDF in VB.NET


How to Extract Text from PDF Files in a VB.NET Application, with Sample Code







Advanced Visual Studio .NET PDF text extraction control, built on .NET Framework 2.0 and compatible with Windows systems


Supports .NET WinForms, ASP.NET MVC in IIS, ASP.NET Ajax, Azure cloud service, DNN (DotNetNuke), and SharePoint


Extract text from Adobe PDF documents in a VB.NET program


Extract all or part of the text content from a PDF file


Extract highlighted text from a PDF document


Image text extraction control extracts text from images in PDF files and from image files


Extract PDF text to another PDF file or to other formats such as TXT and SVG


OCR text from scanned PDF files by working with the XImage.OCR SDK for .NET


VB.NET PDF text extraction SDK library and component that are easy to integrate into .NET WinForms and ASP.NET applications


Online Visual Basic .NET class source code for quick evaluation


To extract text from a PDF document with Visual Basic .NET, you can use this PDF Document Add-On for VB.NET. With it, developers can extract target text content from a source PDF document and save the extracted text to other file formats through VB.NET code.


This page provides a tutorial for extracting text from PDF files using VB.NET; refer to the demo code below. If you are a Visual C# .NET programmer, see the Visual C# tutorial for PDF text extraction in .NET projects.




Extract Text Content from PDF File in VB.NET



Add necessary references:


  RasterEdge.Imaging.Basic.dll


  RasterEdge.Imaging.Basic.Codec.dll


  RasterEdge.Imaging.Drawing.dll


  RasterEdge.Imaging.Font.dll


  RasterEdge.Imaging.Processing.dll


  RasterEdge.XDoc.Raster.dll


  RasterEdge.XDoc.Raster.Core.dll


  RasterEdge.XDoc.PDF.dll


Use corresponding namespaces:


  Imports RasterEdge.Imaging.Basic


  Imports RasterEdge.XDoc.PDF


' The following example code gives a quick test of text extraction from a PDF file in a VB.NET program.




' Open a document (inputFilePath is the path to the source PDF file).
Dim doc As PDFDocument = New PDFDocument(inputFilePath)

' Initialize a text extraction manager from the document object.
Dim textMgr As PDFTextMgr = PDFTextHandler.ExportPDFTextManager(doc)

' Extract text content from page 3.
Dim pageIndex As Integer = 3

' Get all characters in the page.
Dim allChars = textMgr.ExtractTextCharacter(pageIndex)

' Get all words in the page.
Dim allWords = textMgr.ExtractTextWord(pageIndex)

' Get all lines in the page.
Dim allLines = textMgr.ExtractTextLine(pageIndex)

' ...
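
The sample above stops after the text objects are retrieved. As a rough sketch of the "save extracted text to other file formats" step, the following console module writes the lines of one page to a TXT file. It reuses only the calls shown above (PDFDocument, PDFTextHandler.ExportPDFTextManager, ExtractTextLine). Everything else is an assumption made for illustration: the file paths are hypothetical, and the GetContent() method on each returned line object is assumed to return the line's text as a String and should be checked against the SDK's class reference before use.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports RasterEdge.XDoc.PDF

Module ExtractLinesToTextFile
    Sub Main()
        ' Hypothetical paths; replace them with your own files.
        Dim inputFilePath As String = "C:\demo\input.pdf"
        Dim outputFilePath As String = "C:\demo\page-lines.txt"

        ' Stop early if the source PDF is missing.
        If Not File.Exists(inputFilePath) Then
            Console.WriteLine("Input file not found: " & inputFilePath)
            Return
        End If

        ' Same calls as in the sample above.
        Dim doc As PDFDocument = New PDFDocument(inputFilePath)
        Dim textMgr As PDFTextMgr = PDFTextHandler.ExportPDFTextManager(doc)
        Dim pageIndex As Integer = 3
        Dim allLines = textMgr.ExtractTextLine(pageIndex)

        ' Collect the text of each extracted line.
        ' Assumption: each item exposes GetContent() returning a String.
        Dim lineTexts As New List(Of String)()
        For Each textLine In allLines
            lineTexts.Add(textLine.GetContent())
        Next

        ' Write one extracted line per row of the output file.
        File.WriteAllLines(outputFilePath, lineTexts.ToArray())
        Console.WriteLine("Wrote " & lineTexts.Count & " lines to " & outputFilePath)
    End Sub
End Module
```

The sketch relies on Option Infer (as the sample above does with Dim allLines = ...), and it passes an array to File.WriteAllLines so that it also compiles against older .NET Framework versions. It needs the RasterEdge DLLs listed earlier to be referenced by the project.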