
C# PDF Image Reader Library
How to read, extract, copy, and paste images from PDF files using C# .NET


A .NET library that supports PDF image extraction from a page, a region on a page, or a whole PDF document in C#










  • C#.NET library for extracting images from Adobe PDF pages in Visual Studio .NET Framework projects
  • Provides trial SDK components for quick integration of PDF image extraction into Visual C#.NET WinForms and ASP.NET projects
  • Free C# source code for extracting an image from a specified position on a PDF page in a .NET class
  • Supports .NET WinForms, ASP.NET MVC in IIS, ASP.NET Ajax, Azure cloud service, DNN (DotNetNuke), and SharePoint
  • Extracts various types of image resources from a PDF file, such as XObject Image, XObject Form, and Inline Image
  • Gets JPEG (JPG) and other high-quality image files from a PDF document
  • Can extract vector images from a PDF in a .NET console application
  • Extracts all images from a whole PDF or from a specified PDF page
  • Captures images from a whole PDF based on special characteristics
  • Scans images to PDF, TIFF, and various image formats
  • Gets image information, such as location, zonal information, and metadata






About PDFImage


Using the XDoc.PDF for .NET SDK, you can read and extract images from a PDF document, a page, or a region of a page. The extracted images are stored in PDFImage objects.

Class PDFImage includes the following properties and methods:

  1. Position: Position of the image on the page. Unit: pixels (at 96 dpi).
  2. IsRotated: Indicates whether the image is rotated.
  3. Image: Gets the embedded image resource related to this object.
  4. IsInlineImage: Indicates whether the resource is an Inline Image.
  5. IsForm: Indicates whether the resource is an XObject Form. Returns false if the resource is an XObject Image or Inline Image.
  6. IsXObjectForm: Same as IsForm.
  7. IsRGB: Indicates whether the image is an XObject Image with ColorSpace DeviceRGB.
  8. IsCMYK: Indicates whether the image is an XObject Image with ColorSpace DeviceCMYK.
  9. IsGray: Indicates whether the image is an XObject Image with ColorSpace DeviceGray.
  10. IsIndexed: Indicates whether the image is an XObject Image with ColorSpace Indexed.
  11. IsCIEBased: Indicates whether the image is an XObject Image with a CIE-based ColorSpace.


  12. RectangleF GetBoundary(): Gets the boundary of the item in device space (Windows-like coordinate system). Unit: pixels (at 96 dpi).
  13. GetBitmap(): Gets the appearance of the page item in device space (Windows-like coordinate system).
  14. GetColorSpaceName(): Gets the color space type of the image. Only valid for PDFImageType.XObjImage.
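
The position and boundary values above are given in pixels at 96 dpi, while PDF page geometry is normally expressed in points (1/72 inch). The following is a minimal sketch of converting between the two units. The class PdfUnits and its method names are hypothetical helpers written for illustration; they are not part of the XDoc.PDF API. The sketch only rescales values and does not change the origin or the axis direction.

```csharp
using System.Drawing;

// Hypothetical helper (not part of XDoc.PDF): converts between
// 96 dpi pixels and PDF points (1/72 inch).
public static class PdfUnits
{
    public const float PointsPerInch = 72f;  // PDF default user space unit
    public const float PixelsPerInch = 96f;  // unit stated for PDFImage positions

    // Convert a length in 96 dpi pixels to points.
    public static float PixelsToPoints(float pixels)
    {
        return pixels * PointsPerInch / PixelsPerInch;
    }

    // Convert a length in points to 96 dpi pixels.
    public static float PointsToPixels(float points)
    {
        return points * PixelsPerInch / PointsPerInch;
    }

    // Rescale every component of a rectangle from pixels to points.
    public static RectangleF PixelsToPoints(RectangleF r)
    {
        return new RectangleF(
            PixelsToPoints(r.X), PixelsToPoints(r.Y),
            PixelsToPoints(r.Width), PixelsToPoints(r.Height));
    }
}
```

For example, a boundary of 96 x 192 pixels corresponds to 72 x 144 points, that is, 1 x 2 inches.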




C#: extract images from a whole PDF document


        #region extract images from whole pdf document
        internal static void extractImagesFromPdfFile()
        {
            String inputFilePath = @"C:\demo.pdf";
            // Open a document.
            PDFDocument doc = new PDFDocument(inputFilePath);

            // Extract all images in the document.
            List<PDFImage> allImages = PDFImageHandler.ExtractImages(doc);
        }
        #endregion




C#: extract images from a specified PDF page


        #region extract images from one pdf page
        internal static void extractImagesFromPdfPage()
        {
            String inputFilePath = @"C:\demo.pdf";
            // Open a document.
            PDFDocument doc = new PDFDocument(inputFilePath);
            // Get the first page (page indexes are zero-based).
            PDFPage page = (PDFPage)doc.GetPage(0);

            // Extract all images on one pdf page.
            List<PDFImage> allImages = PDFImageHandler.ExtractImages(page);
        }
        #endregion




C#: read the image at a specified position (coordinates) inside a PDF document


The XDoc.PDF SDK uses the Windows coordinate system.
Points on the page are described by x- and y-coordinate pairs. The x-coordinates increase from left to right; the y-coordinates increase from top to bottom. The origin (0,0) is the top-left corner of the PDF page.
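
Coordinates taken from other PDF tools often use the PDF default user space instead, where the origin is at the bottom-left corner, y increases upward, and the unit is the point (1/72 inch). The following is a rough sketch of mapping such a point into the top-left, 96 dpi pixel system described above. It assumes an unrotated page whose visible area starts at (0,0) and whose height in points is known (for example, 792 for US Letter). The class PageCoordinates is a hypothetical helper, not part of the XDoc.PDF API.

```csharp
using System.Drawing;

// Hypothetical helper (not part of XDoc.PDF).
public static class PageCoordinates
{
    // Map a point from PDF default user space (bottom-left origin, points)
    // to a top-left origin coordinate system measured in 96 dpi pixels.
    // Assumes an unrotated page whose visible area starts at (0,0).
    public static PointF FromPdfUserSpace(float xPt, float yPt, float pageHeightPt)
    {
        float xPx = xPt * 96f / 72f;                   // rescale only
        float yPx = (pageHeightPt - yPt) * 96f / 72f;  // flip the y axis, then rescale
        return new PointF(xPx, yPx);
    }
}
```

With a page height of 792 points, the user-space point (72, 720) lies one inch from the left edge and one inch below the top edge, so it maps to (96, 96) in pixels.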



        #region read the image from specified position (coordinates) inside pdf document
        internal static void extractImagesFromSpecifiedPosition()
        {
            String inputFilePath = @"C:\demo.pdf";
            // Open a document.
            PDFDocument doc = new PDFDocument(inputFilePath);

            // Get the page at index 3 (the fourth page, since page indexes are zero-based).
            PDFPage page = (PDFPage)doc.GetPage(3);

            // Select the image at the point (50F, 100F).
            PDFImage img = PDFImageHandler.SelectImage(page, new PointF(50F, 100F));

            // ...
        }
        #endregion




C#: read the list of images from a specified area on a PDF page


The region is defined with the constructor public RectangleF(float x, float y, float width, float height).
(x, y) is the coordinate of the rectangle's top-left corner; width and height are the rectangle's width and height.
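
If a region is known by two opposite corners rather than by a corner plus a size, it can be converted first. The following is a small sketch using only System.Drawing types. The class RegionHelper and the method FromCorners are hypothetical names for illustration and are not part of the XDoc.PDF API.

```csharp
using System;
using System.Drawing;

// Hypothetical helper (not part of XDoc.PDF).
public static class RegionHelper
{
    // Build a RectangleF from two opposite corners given in any order.
    public static RectangleF FromCorners(PointF a, PointF b)
    {
        float left = Math.Min(a.X, b.X);
        float top = Math.Min(a.Y, b.Y);
        float right = Math.Max(a.X, b.X);
        float bottom = Math.Max(a.Y, b.Y);
        // FromLTRB computes width and height from the four edges.
        return RectangleF.FromLTRB(left, top, right, bottom);
    }
}
```

For example, the corners (350, 450) and (50, 50) give the same rectangle as new RectangleF(50F, 50F, 300F, 400F).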



//  open a document
String inputFilePath = Program.RootPath + "\\" + "3.pdf";
PDFDocument doc = new PDFDocument(inputFilePath);

//  get the first page
int pageIndex = 0;
PDFPage page = (PDFPage)doc.GetPage(pageIndex);

//  define the region (Rectangle [50F, 50F, 300F, 400F]) of the page
RectangleF region = new RectangleF(50F, 50F, 300F, 400F);

//  get all images in the region in sequence (from bottom to top)
List<PDFImage> images = PDFImageHandler.SelectImages(page, region);

//  select the top image in the region
PDFImage image1 = PDFImageHandler.SelectImage(page, region);

//  select the bottom image in the region
int sequenceIndex = 0;
PDFImage image2 = PDFImageHandler.SelectImage(page, region, sequenceIndex);
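
This page does not state whether SelectImages returns images that merely overlap the region or only images that lie fully inside it. If only fully enclosed images are wanted, the boundaries can be checked explicitly. The following is a sketch in which plain RectangleF values stand in for the results of PDFImage.GetBoundary(). The class BoundaryFilter is a hypothetical helper, not part of the XDoc.PDF API.

```csharp
using System.Collections.Generic;
using System.Drawing;

// Hypothetical helper (not part of XDoc.PDF).
public static class BoundaryFilter
{
    // Keep only the boundaries that lie completely inside the region.
    // Input order is preserved.
    public static List<RectangleF> FullyInside(
        IEnumerable<RectangleF> boundaries, RectangleF region)
    {
        var result = new List<RectangleF>();
        foreach (RectangleF boundary in boundaries)
        {
            // RectangleF.Contains(RectangleF) is true only when the whole
            // rectangle fits inside the region.
            if (region.Contains(boundary))
            {
                result.Add(boundary);
            }
        }
        return result;
    }
}
```

In practice, the same check could be applied to each image's boundary, keeping the PDFImage when region.Contains(image.GetBoundary()) is true.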