C# OCR Library
How to scan image files and get text content in .NET, ASP.NET, and Windows applications
C#.NET Online Tutorial for How to Extract Text from Tiff, Jpeg, Png, Gif, Bmp, and Scanned PDF Files
RasterEdge provides a comprehensive Optical Character Recognition SDK that is fully developed, highly accurate, and easy to work with in C#.NET, VB.NET, ASP.NET web, and .NET WinForms development environments.
This online tutorial covers the high-level OCR toolkit for C# class programming. With this C# imaging OCR SDK, you can extract text from images such as Jpeg, Png, Bmp, Gif, and Tiff, as well as from scanned PDF documents, and output it to a text file, SVG image, or PDF file.
If you need to deploy OCR recognition, RasterEdge XImage.OCR for .NET offers flexible C# options for recognition, detection, and settings to improve performance.
Major Features
RasterEdge XImage.OCR for .NET provides mature functions for recognizing characters in the image and document types supported by the RasterEdge .NET Document Imaging SDK.
Free to implement reliable and high performance Optical Character Recognition in any .NET development environment
Simple to integrate .NET Imaging OCR Software into C# and VB.NET programming applications
Support scanning and recognizing MICR E-13B, OCR-A, and OCR-B fonts from checks at high speed
Support using this OCR SDK to extract image and document text content in various popular languages
Able to recognize images captured by a digital camera, scanned documents, or image-only PDFs using the C# OCR toolkit
Support both monochrome and bitonal color image recognition for scanned documents and pictures in C#
Complete and rapid report of extracted text, including size, font, location, character attribute, etc.
Steps to Extract Text from Image
Initialize the language resources.
Set the enabled languages; the default language is English.
Load an image file or page (a Bitmap or BasePage object).
Recognize characters from the input image.
If no text is recognized, try the following:
Clean up the image, or convert the BasePage to a Bitmap with a higher resolution by calling the method ConvertToImage(int resolution).
// Set the training data path. Put eng.traineddata (for English) in the directory you specify.
OCRHandler.SetTrainResourcePath(@"c:\source\");
// Load an image.
Bitmap img = new Bitmap(@"C:\page.jpeg");
// Set the image resolution (DPI) so the text is clear enough to recognize.
img.SetResolution(192f, 192f);
// If you are loading a document (PDF, TIFF, Office, or other formats),
// get a BasePage and convert it to a bitmap with a high resolution.
BaseDocument doc = new TIFFDocument(@"C:\input.tif");
BasePage page = doc.GetPage(0);
// The default resolution is 96; a higher value makes the text easier to recognize.
Bitmap image = page.ConvertToImage(192);
// Recognize characters from the loaded image (pass "image" instead to use the document page).
// The default language is English.
OCRPage ocrPage = OCRHandler.Import(img);
ocrPage.Recognize();
Console.WriteLine(ocrPage.GetText());
Save the recognized text to a .txt file.
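The save step can be handled with standard .NET file APIs. The following is a minimal sketch, not part of the RasterEdge SDK: the class name OcrTextWriter, the method SaveText, and the output file name are illustrative assumptions, and the placeholder string stands in for the value returned by ocrPage.GetText().

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not a RasterEdge class): writes recognized text to disk.
static class OcrTextWriter
{
    // Writes the text to a UTF-8 .txt file and returns the full path of the file.
    public static string SaveText(string recognizedText, string outputPath)
    {
        if (recognizedText == null)
            throw new ArgumentNullException(nameof(recognizedText));

        string fullPath = Path.GetFullPath(outputPath);

        // Create the target directory if it does not exist yet.
        string directory = Path.GetDirectoryName(fullPath);
        if (!string.IsNullOrEmpty(directory))
            Directory.CreateDirectory(directory);

        File.WriteAllText(fullPath, recognizedText, Encoding.UTF8);
        return fullPath;
    }

    static void Main()
    {
        // Placeholder text; in the tutorial this would be ocrPage.GetText().
        string text = "Sample recognized text";

        // Assumed output location: a file in the current user's temp directory.
        string target = Path.Combine(Path.GetTempPath(), "ocr-output.txt");
        Console.WriteLine("Saved to " + SaveText(text, target));
    }
}
```

In an OCR project, the call would look like OcrTextWriter.SaveText(ocrPage.GetText(), yourOutputPath), where yourOutputPath is any writable location you choose.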
Sample Code
RasterEdge.com provides free sample code for using our .NET OCR SDK. The example below uses Visual C# code to extract text from a Jpeg image, which can then be output to a text file or PDF file. Please note that you first need to add the following assemblies to your C#.NET project as references.
Add References
RasterEdge.XImage.OCR.dll
RasterEdge.XImage.OCR.Tesseract.dll
RasterEdge.Imaging.Basic.dll
RasterEdge.Imaging.Basic.Codec.dll
RasterEdge.Imaging.Drawing.dll
RasterEdge.Imaging.Font.dll
RasterEdge.Imaging.Processing.dll
RasterEdge.XImage.AdvancedCleanup.Core.dll
Using Namespaces
using System;
using System.Drawing;
using RasterEdge.Imaging.Raster.Core;
using RasterEdge.XImage.OCR;
Note: If you get the error "Could not load file or assembly 'RasterEdge.Imaging.Basic' or any other assembly or one of its dependencies. An attempt to load a program with an incorrect format", check your configuration as follows:
If you are using the x64 libraries/dlls, right-click the project -> Properties -> Build -> Platform target: x64.
If you are using the x86 libraries/dlls, the platform target should be x86.
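To confirm which case applies, it can help to print the bitness of the running process and compare it with the dlls you referenced. The sketch below uses only standard .NET members (Environment.Is64BitProcess and IntPtr.Size); the class name PlatformCheck is illustrative and is not part of the RasterEdge SDK.

```csharp
using System;

// Hypothetical diagnostic helper (not a RasterEdge class).
static class PlatformCheck
{
    // Returns "64-bit" or "32-bit" depending on how the current process is running.
    public static string DescribeProcess()
    {
        return Environment.Is64BitProcess ? "64-bit" : "32-bit";
    }

    static void Main()
    {
        // A 64-bit process can only load 64-bit (or platform-neutral) assemblies,
        // and a 32-bit process can only load 32-bit (or platform-neutral) ones.
        Console.WriteLine("Process: " + DescribeProcess());
        Console.WriteLine("Pointer size in bytes: " + IntPtr.Size);
    }
}
```

If the reported bitness does not match the libraries you added as references, change the platform target as described in the note.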
// Set the training data path. Please put eng.traineddata (for English) under the directory you specified.
OCRHandler.SetTrainResourcePath(@"c:\source\");
//Load an image.
Bitmap img = new Bitmap(@"C:\page.jpeg");
// Set the image resolution (DPI) to improve accuracy. If the image is clear enough, skip this.
img.SetResolution(192f, 192f);
// Recognize characters from this image. Default language is English.
OCRPage page = OCRHandler.Import(img);
page.Recognize();
Console.WriteLine(page.GetText());
How To List
Install, Deploy and Distribute SDK
1. System requirements
2. How to install SDK into Visual Studio
3. How to deploy SDK into IIS server
4. How to distribute SDK with your Windows application
Quick to Start
You can download a free trial of RasterEdge XImage.OCR for .NET, add the corresponding dll libraries to your C# application, and then use the free sample code provided for a quick evaluation.
Basic SDK Concept
1. OCRHandler Class
2. OCRRecSetting Class
3. OCRDocument and OCRPage Classes
4. OCRZone Class
Supported Languages
The RasterEdge OCR module supports recognizing text in various languages, such as English, Spanish, French, German, Italian, and Russian. Refer to the full list for all supported languages and their corresponding abbreviations.
Extract Text from Tiff
This Visual C# tutorial page shows how to use the RasterEdge .NET OCR SDK in your application to extract text from a Tiff image file. Extracted text can be output to a Word or PDF document.
Extract Text from Scanned PDF
In addition to raster image files, our OCR toolkit also supports text extraction from PDF. For instance, you can get text content from a whole PDF file, a single PDF page, or a specified zone in a page.
Extract Text from Jpeg, Png, Bitmap Images
This online C# tutorial shows how to use OCR technology to extract text from a raster image file. You can also perform OCR on a specified zone in the loaded image. Raster image formats such as Jpeg, Png, Bitmap, and Gif are supported.
Extract Content from Image
Free Visual C# sample code is provided. You can copy the demos directly into your .NET application to extract content from an image (Tiff, scanned PDF, Jpeg, Png, Bmp, ...) and output it to a text or PDF file.
Recognize MICR E-13B, OCR-A, and OCR-B Fonts
XImage.OCR for .NET allows C# users to scan and recognize common OCR fonts on checks, including MICR E-13B, OCR-A, and OCR-B. For a quick evaluation of this functionality, use the free C# demo code on this page.