Removing white space from a scanned page


I have a scanned page that contains an image and a lot of white space, and I would like to use your library to extract just the image. For example, given a TIFF or PDF the size of a regular A4 page in which the actual content, a driver's license, occupies only a small portion, I would like to keep only that portion and remove all the white space. The goal is to reduce the file size while retaining the quality. Can your product help with this? It would be great if you could point me to an example.



Thank you for your inquiry. Please note that the Aspose.OCR APIs are intended for OCR and OMR operations on scanned images. For image and PDF manipulation, you may use our Aspose.Imaging and Aspose.Pdf libraries. For example, you can use the following code to convert a PDF page to TIFF format and then load the resulting image with Aspose.Imaging.


using Aspose.Pdf;
using Aspose.Pdf.Devices;

// Create the license object (call _lic.SetLicense with the path to your license file to apply it)
Aspose.Pdf.License _lic = new Aspose.Pdf.License();

// Create Resolution object (300 DPI)
Aspose.Pdf.Devices.Resolution resolution = new Aspose.Pdf.Devices.Resolution(300);

// Create TiffSettings object
Aspose.Pdf.Devices.TiffSettings tiffSettings = new TiffSettings();
tiffSettings.Compression = CompressionType.None;
tiffSettings.Depth = Aspose.Pdf.Devices.ColorDepth.Format1bpp;
tiffSettings.Shape = ShapeType.None;
tiffSettings.SkipBlankPages = false;

// Create TIFF device
TiffDevice tiffDevice = new TiffDevice(resolution, tiffSettings);

// Open the document
Document pdfDocument = new Document(@"Scanned.pdf");

// Convert page 1 only and save the image to a file
tiffDevice.Process(pdfDocument, 1, 1, @"Scanned.tif");

// Load the generated TIFF with Aspose.Imaging for further processing
using (var image = Aspose.Imaging.Image.Load(@"Scanned.tif"))
{
    // Image manipulation (for example, cropping) goes here
}

Please see the Convert PDF Pages documentation for details on converting PDF to other formats. You can also post your inquiry directly on the respective product forums, where my colleagues from the respective support teams will guide you accordingly. Links to the support forums:

Aspose.Imaging Support Forum
Aspose.Pdf Support Forum
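
For the white-space removal itself, one possible approach is to find the smallest rectangle that contains every non-white pixel and then crop to it. The following is only a minimal sketch of that detection step in plain C#, not an Aspose API: the class ContentBounds, the method Find, and the whiteThreshold default of 240 are hypothetical names and values chosen for illustration. It assumes the page is available as a row-by-row array of 32-bit ARGB pixels. Real scans usually contain specks and shading, so the threshold will likely need tuning.

```csharp
using System;

public static class ContentBounds
{
    // Hypothetical helper: returns the smallest rectangle (left, top, width, height)
    // that contains every non-white pixel, or null if the whole page is white.
    // argb holds one 32-bit ARGB value per pixel, stored row by row.
    // whiteThreshold is an assumed cut-off; a pixel is "white" only if all of
    // its R, G and B channels are at or above it.
    public static (int Left, int Top, int Width, int Height)? Find(
        int[] argb, int width, int height, int whiteThreshold = 240)
    {
        if (argb.Length != width * height)
            throw new ArgumentException("Pixel count does not match width * height.");

        int minX = width, minY = height, maxX = -1, maxY = -1;

        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                int p = argb[y * width + x];
                int r = (p >> 16) & 0xFF;
                int g = (p >> 8) & 0xFF;
                int b = p & 0xFF;

                // Treat the pixel as content if any channel is darker than the threshold.
                if (r < whiteThreshold || g < whiteThreshold || b < whiteThreshold)
                {
                    if (x < minX) minX = x;
                    if (x > maxX) maxX = x;
                    if (y < minY) minY = y;
                    if (y > maxY) maxY = y;
                }
            }
        }

        if (maxX < 0) return null; // no content found, page is blank

        return (minX, minY, maxX - minX + 1, maxY - minY + 1);
    }
}
```

With Aspose.Imaging, the pixel array could probably be obtained from RasterImage.LoadArgb32Pixels and the resulting rectangle passed to RasterImage.Crop before saving, but please confirm both calls against the Aspose.Imaging API reference for your version. Note also that the conversion settings above use a 1-bit color depth, so a higher depth may be preferable if the cropped image has to retain color.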