PDF Text Reader VB.NET Library
How to read text with coordinates from a PDF file in VB.NET WinForms and WPF
How to Extract Text from PDF with VB.NET Sample Codes in VB.NET Application
In this VB.NET tutorial, you will learn how to read and extract text from a PDF file using Visual Basic in WinForms and WPF Windows applications and in ASP.NET MVC web apps.
- Read text characters, words, lines, paragraphs from PDF file
- Get text from PDF with coordinates information
- Search and extract specified text from PDF document
- Extract text from PDF specified pages, from target page region
- Scan, extract text from scanned PDF document
How to read, extract text from PDF file using VB.NET
- Advanced Visual Studio .NET PDF text extraction control, built on .NET Framework 2.0 and compatible with Windows systems
- View C# sample source code: How to read, extract text from pdf in C# asp.net, windows apps
- Support .NET WinForms, ASP.NET MVC in IIS, ASP.NET Ajax, Azure cloud service, DNN (DotNetNuke), SharePoint
- Extract text from Adobe PDF documents in a VB.NET program
- Extract and get partial and all text content from PDF file
- Extract highlighted text out of PDF document
- Image text extraction control provides text extraction from PDF images and image files
- Extract PDF text to another PDF file or to other formats such as TXT and SVG
- OCR text from scanned PDF by working with XImage.OCR SDK for .NET
- VB.NET PDF text extraction SDK library and component that is easy to integrate into .NET WinForms and ASP.NET projects
- Online Visual Basic .NET class source code for quick evaluation
If you want to extract text from a PDF document using the Visual Basic .NET programming language, you can use this PDF Document Add-On for VB.NET. With this add-on, developers can extract target text content from a source PDF document and save the extracted text to other file formats through VB.NET code.
If you are a C# .NET developer, you can go to this C# tutorial page for details:
How to read, extract text from PDF using C#
How to read, extract text from a PDF page using VB.NET?
The PDF Text Manager class (PDFTextMgr) lets you read and extract text information from a PDF page in VB.NET. You can read the following text units from a PDF document or its pages using VB code:
- Characters: call function ExtractTextCharacter() to get a list of PDFTextCharacter objects.
- Words: call function ExtractTextWord() to get a list of PDFTextWord objects.
- Lines: call function ExtractTextLine() to get a list of PDFTextLine objects.
- Paragraphs: call function ExtractTextParagraph() to get a list of PDFTextParagraph objects.
' open a document
Dim inputFilePath As String = "C:\2.pdf"
Dim doc As PDFDocument = New PDFDocument(inputFilePath)
' get text manager from the document
Dim textMgr As PDFTextMgr = PDFTextHandler.ExportPDFTextManager(doc)
' extract different text content from the first page
Dim pageIndex As Integer = 0
Dim page As PDFPage = doc.GetPage(pageIndex)
' get all characters in the page
Dim allChars As List(Of PDFTextCharacter) = textMgr.ExtractTextCharacter(page)
' report characters
For Each obj As PDFTextCharacter In allChars
    Console.WriteLine("Char: {0}; Boundary: {1}", obj.GetChar(), obj.GetBoundary().ToString())
Next
' get all words in the page
Dim allWords As List(Of PDFTextWord) = textMgr.ExtractTextWord(page)
' report words
For Each obj As PDFTextWord In allWords
    Console.WriteLine("Word: {0}; Boundary: {1}", obj.GetContent(), obj.GetBoundary().ToString())
Next
' get all lines in the page
Dim allLines As List(Of PDFTextLine) = textMgr.ExtractTextLine(page)
' report lines
For Each obj As PDFTextLine In allLines
    Console.WriteLine("Line: {0}; Boundary: {1}", obj.GetContent(), obj.GetBoundary().ToString())
Next
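Searching for specified text can be layered on top of the extracted objects. The following is a minimal sketch, not the library's own search API: FindByKeyword is a hypothetical helper that filters any extracted list by a case-insensitive keyword match, and the plain strings in Main stand in for extracted line content.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module TextSearchSketch
    ' Hypothetical helper (not part of the PDF library): returns the items
    ' whose text contains the keyword, ignoring case.
    Function FindByKeyword(Of T)(items As IEnumerable(Of T),
                                 getText As Func(Of T, String),
                                 keyword As String) As List(Of T)
        Return items.Where(Function(item)
                               Dim text As String = getText(item)
                               Return text IsNot Nothing AndAlso
                                      text.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0
                           End Function).ToList()
    End Function

    Sub Main()
        ' Sample strings only; real input would come from the text manager.
        Dim sampleLines As New List(Of String) From {"Invoice No. 1001", "Total due", "INVOICE date"}
        For Each hit As String In FindByKeyword(sampleLines, Function(s) s, "invoice")
            Console.WriteLine("Match: {0}", hit)
        Next
    End Sub
End Module
```

With the lists from the example above, a call might look like FindByKeyword(allLines, Function(l) l.GetContent(), "invoice"). Calling GetBoundary() on each match then gives its position on the page.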
How to read, get text coordinates from PDF using VB.NET?
After reading and extracting text content from a PDF document or PDF pages, you will get a list of PDFTextParagraph, PDFTextLine, PDFTextWord, or PDFTextCharacter objects.
All four classes share one function, GetBoundary(). Call it to read the coordinates of the text inside the PDF page.
' get all words in the page
Dim allWords As List(Of PDFTextWord) = textMgr.ExtractTextWord(page)
' report the coordinates of each word
For Each obj As PDFTextWord In allWords
    Dim textBoundary As RectangleF = obj.GetBoundary()
    ' use format placeholders: "+" between a String and a number fails in VB.NET
    Console.WriteLine("Text coordinates: top-left point X: {0}", textBoundary.X)
    Console.WriteLine("Text coordinates: top-left point Y: {0}", textBoundary.Y)
    Console.WriteLine("Text coordinates: area width: {0}", textBoundary.Width)
    Console.WriteLine("Text coordinates: area height: {0}", textBoundary.Height)
Next
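Because GetBoundary() returns a RectangleF, standard System.Drawing operations can be used to keep only the text that falls inside a target page region. The following is a minimal sketch, assuming the target area is expressed in the same coordinate system and units as the boundaries. IsInRegion and the sample rectangles are hypothetical and not part of the PDF library.

```vbnet
Imports System
Imports System.Drawing

Module RegionFilterSketch
    ' Hypothetical helper: True when the text boundary lies entirely
    ' inside the target area.
    Function IsInRegion(boundary As RectangleF, targetArea As RectangleF) As Boolean
        Return targetArea.Contains(boundary)
    End Function

    Sub Main()
        ' Sample values only; real boundaries would come from GetBoundary().
        Dim targetArea As New RectangleF(0.0F, 0.0F, 300.0F, 200.0F)
        Dim inside As New RectangleF(50.0F, 40.0F, 80.0F, 12.0F)
        Dim outside As New RectangleF(280.0F, 190.0F, 80.0F, 12.0F)
        Console.WriteLine(IsInRegion(inside, targetArea))   ' True
        Console.WriteLine(IsInRegion(outside, targetArea))  ' False
    End Sub
End Module
```

Inside the word loop above, a check such as If IsInRegion(obj.GetBoundary(), targetArea) Then ... would keep only the words in the area. To also accept words that only partly overlap the area, RectangleF.IntersectsWith can be used in place of Contains.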
How to read text from scanned PDF using VB.NET?
If you want to read and extract text from images inside a PDF file or from a scanned PDF document,
you need to use XImage.OCR to run OCR on the images inside the PDF document. For details, please go to the page
How to read, extract text from scanned PDF file using vb.net.