C# PDF Reader Library
How to read and extract images from a PDF file in C# ASP.NET MVC, WinForms, and WPF applications
C# demo code to read and extract images from a PDF file
In this C# tutorial, you will learn how to read and extract all images from a PDF document, or from a specified page, using C# in ASP.NET MVC web and Windows applications.
- Read and extract all images from a PDF
- Read and select an image by its position on a PDF page
Extract all images from a PDF file using C#
The steps and C# source code below show how to read all images from a PDF using C# in ASP.NET MVC web and Windows applications.
- Create a new PDFDocument object from an existing PDF file
- Call PDFImageHandler.ExtractImages() with the document to extract all images from the PDF document
- Or pass a PDFPage to the same method to get only the images on that page (the first page in the source code below)
// open a document
String inputFilePath = Program.RootPath + "\\" + "3.pdf";
PDFDocument doc = new PDFDocument(inputFilePath);
// extract all images in the document
List<PDFImage> allImages = PDFImageHandler.ExtractImages(doc);
// show information of these images
foreach (PDFImage image in allImages)
{
    Console.WriteLine("Image: page index = " + image.PageIndex);
    Console.WriteLine("  Position: X = " + image.Position.X + ", Y = " + image.Position.Y);
    Console.WriteLine("  Boundary: X = " + image.GetBoundary().X + ", Y = " + image.GetBoundary().Y);
    Console.WriteLine("  Boundary: Width = " + image.GetBoundary().Width);
    Console.WriteLine("  Boundary: Height = " + image.GetBoundary().Height);
}
// extract all images in the first page
int pageIndex = 0;
PDFPage page = (PDFPage)doc.GetPage(pageIndex);
List<PDFImage> allImagesInPage = PDFImageHandler.ExtractImages(page);
// show information of these images
foreach (PDFImage image in allImagesInPage)
{
    Console.WriteLine("Image: page index = " + image.PageIndex);
    Console.WriteLine("  Position: X = " + image.Position.X + ", Y = " + image.Position.Y);
    Console.WriteLine("  Boundary: X = " + image.GetBoundary().X + ", Y = " + image.GetBoundary().Y);
    Console.WriteLine("  Boundary: Width = " + image.GetBoundary().Width);
    Console.WriteLine("  Boundary: Height = " + image.GetBoundary().Height);
}
Read and select an image on a PDF page in C# code
// open a document
String inputFilePath = Program.RootPath + "\\" + "3.pdf";
PDFDocument doc = new PDFDocument(inputFilePath);
// get the first page
int pageIndex = 0;
PDFPage page = (PDFPage)doc.GetPage(pageIndex);
// select image at the position (100F, 100F) in the page
PointF cursorPos = new PointF(100F, 100F);
PDFImage image = PDFImageHandler.SelectImage(page, cursorPos);
if (image == null)
{
    Console.WriteLine("No image has been found!");
}
else
{
    Console.WriteLine("Image: boundary = " + image.GetBoundary().ToString());
}
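To make the idea of selecting an image by a point more concrete, here is a minimal, library-independent sketch of a hit test over image boundaries. It is not the library's implementation: HitTestSketch, TopmostAt, and the sample rectangles are hypothetical names and values, and the sketch assumes that boundaries are listed from bottom to top and that the topmost match is the one returned.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

public static class HitTestSketch
{
    // Returns the index of the topmost boundary containing the point, or -1 if none.
    // Assumption: 'boundaries' is ordered from bottom to top.
    public static int TopmostAt(IReadOnlyList<RectangleF> boundaries, PointF point)
    {
        // Walk from the top of the stack down so the first match is the topmost one.
        for (int i = boundaries.Count - 1; i >= 0; i--)
        {
            if (boundaries[i].Contains(point))
            {
                return i;
            }
        }
        return -1;
    }

    public static void Main()
    {
        // Hypothetical boundaries (X, Y, Width, Height), bottom image first.
        var boundaries = new List<RectangleF>
        {
            new RectangleF(50F, 50F, 200F, 200F),
            new RectangleF(90F, 90F, 50F, 50F),
        };

        int hit = TopmostAt(boundaries, new PointF(100F, 100F));
        Console.WriteLine(hit < 0 ? "No image has been found!" : "Hit image index = " + hit);
    }
}
```

As in the snippet above, the caller has to handle the "nothing found" case explicitly; the sketch signals it with -1 where the library call returns null.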
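The following code selects images within a rectangular region of the page rather than at a single point.
// open a document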
String inputFilePath = Program.RootPath + "\\" + "3.pdf";
PDFDocument doc = new PDFDocument(inputFilePath);
// get the first page
int pageIndex = 0;
PDFPage page = (PDFPage)doc.GetPage(pageIndex);
// define the region (Rectangle [50F, 50F, 300F, 400F]) of the page
RectangleF region = new RectangleF(50F, 50F, 300F, 400F);
// get all images in the region in sequence (from bottom to top)
List<PDFImage> images = PDFImageHandler.SelectImages(page, region);
// select the top image in the region
PDFImage image1 = PDFImageHandler.SelectImage(page, region);
// select the bottom image in the region
int sequenceIndex = 0;
PDFImage image2 = PDFImageHandler.SelectImage(page, region, sequenceIndex);
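The bottom-to-top sequence used by the region selection above can be illustrated with a small, library-independent sketch. It is only a model of the idea, not the library's implementation: RegionSelectSketch, SelectInRegion, and the sample rectangles are hypothetical, and the sketch assumes that an image counts as "in the region" when its boundary intersects the region (the library may instead require full containment).

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

public static class RegionSelectSketch
{
    // Returns the indices of all boundaries that intersect the region,
    // preserving the input order (assumed to be bottom to top).
    public static List<int> SelectInRegion(IReadOnlyList<RectangleF> boundaries, RectangleF region)
    {
        var hits = new List<int>();
        for (int i = 0; i < boundaries.Count; i++)
        {
            if (boundaries[i].IntersectsWith(region))
            {
                hits.Add(i);
            }
        }
        return hits;
    }

    public static void Main()
    {
        // Hypothetical boundaries (X, Y, Width, Height), bottom image first.
        var boundaries = new List<RectangleF>
        {
            new RectangleF(60F, 60F, 100F, 100F),   // inside the region
            new RectangleF(500F, 500F, 50F, 50F),   // outside the region
            new RectangleF(200F, 300F, 80F, 80F),   // inside the region
        };
        var region = new RectangleF(50F, 50F, 300F, 400F);

        List<int> hits = SelectInRegion(boundaries, region);
        Console.WriteLine("Images in region: " + string.Join(", ", hits));
        if (hits.Count > 0)
        {
            // Sequence index 0 is the bottom image; the last entry is the top image.
            Console.WriteLine("Bottom image index = " + hits[0]);
            Console.WriteLine("Top image index = " + hits[hits.Count - 1]);
        }
    }
}
```

In this model, the first entry of the result corresponds to sequence index 0 (the bottom image) and the last entry to the top image, mirroring the comments in the snippet above.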