Hello,
I need to detect blank pages in a PDF document. Is there an API call that I can use to detect a blank page?
Thanks
Soujanya Kumar
Hello Soujanya,
Thanks for using our products.
To accomplish this, you can traverse all the pages of the PDF document and try to extract text, images, annotations, attachments and watermarks from each one. If a page is empty, nothing will be returned. Please visit the following links for further details on
If you need any further information, please feel free to contact us.
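As a rough illustration of the traversal described above, here is a minimal sketch. It assumes the Aspose.Pdf for .NET types `Document`, `Page` and `TextAbsorber`; the class name `EmptyPageScan` and the helper `LooksEmpty` are made up for this example. It checks only text, image resources and annotations, so a page that contains nothing but vector drawings would still be reported as empty.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

static class EmptyPageScan
{
    // Hypothetical helper: true when the page yields no text,
    // no image resources and no annotations.
    static bool LooksEmpty(Page page)
    {
        // Collect whatever text the page contains.
        var absorber = new TextAbsorber();
        page.Accept(absorber);
        bool hasText = !string.IsNullOrWhiteSpace(absorber.Text);

        return !hasText
            && page.Resources.Images.Count == 0
            && page.Annotations.Count == 0;
    }

    static void Main(string[] args)
    {
        // args[0] is the path to the PDF to inspect.
        var doc = new Document(args[0]);
        foreach (Page page in doc.Pages)
        {
            Console.WriteLine("Page {0}: {1}", page.Number,
                LooksEmpty(page) ? "empty" : "has content");
        }
    }
}
```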
And what if the page contains an empty (= white) image?
I consider this as a blank page but your detection method will not.
Hi Corne,
Thank you for using our product.
Technically, a page that contains an image is not blank, even if the image itself is blank, because the page has image content. However, I have asked our development team whether this specific scenario can be handled, and I will update you once I get a response. I have registered an investigation issue in our issue tracking system with issue ID PDFNEWNET-34418.
Thank You & Best Regards,
Hi Corne,
Document doc = new Document("d:/pdftest/34418.pdf");
foreach (Page page in doc.Pages)
{
    Console.WriteLine("Page {0} is {1}", page.Number, IsBlankPage(page));
}

// Returns true if every color set in the page content stream is pure white.
static private bool HasOnlyWhiteColor(Page page)
{
    foreach (Operator op in page.Contents)
        if (op is Operator.SetColorOperator)
        {
            Operator.SetColorOperator opSC = op as Operator.SetColorOperator;
            System.Drawing.Color color = opSC.getColor();
            if (color.R != 255 || color.G != 255 || color.B != 255)
                return false;
        }
    return true;
}

// Returns true if every pixel of the image is pure white.
static private bool IsWhiteImage(XImage image)
{
    MemoryStream ms = new MemoryStream();
    image.Save(ms);
    System.Drawing.Bitmap bmp = new System.Drawing.Bitmap(ms);
    for (int j = 0; j < bmp.Height; j++)
        for (int i = 0; i < bmp.Width; i++)
        {
            System.Drawing.Color color = bmp.GetPixel(i, j);
            if (color.R != 255 || color.G != 255 || color.B != 255)
                return false;
        }
    return true;
}

static private bool HasOnlyWhiteImages(Page page)
{
    // return true if no images exist or all images are white
    if (page.Resources.Images.Count == 0)
        return true;
    foreach (XImage image in page.Resources.Images)
        if (!IsWhiteImage(image))
            return false;
    return true;
}

// A page is blank if it has no content and no annotations,
// or if everything it draws is white.
static private bool IsBlankPage(Page page)
{
    if ((page.Contents.Count == 0 && page.Annotations.Count == 0) ||
        (HasOnlyWhiteColor(page) && HasOnlyWhiteImages(page)))
        return true;
    return false;
}
Please try it, and if you face any problem or have any further questions, please feel free to contact us.
The issues you have found earlier (filed as PDFNEWNET-34418) have been fixed in Aspose.Pdf for .NET 7.5.0.
Hi,
Has any new feature been added to the latest PDF API to detect blank pages without using TextAbsorber?
Does the file need to be OCRed first?
Hi Aravind,
I am getting this error:
Error CS0426 The type name 'SetColorOperator' does not exist in the type 'Operator'
In the latest versions of the API, the operator classes and methods have been moved to the namespace Aspose.Pdf.Operators. Please use Aspose.Pdf.Operators.SetColorOperator instead.
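For illustration, the color check from earlier in this thread might be adapted to the new namespace roughly as follows. This is only a sketch: it assumes that `getColor()` is still available on `Aspose.Pdf.Operators.SetColorOperator`, as in the older code, and that the returned color exposes `R`, `G` and `B`. The class name `BlankPageChecks` is made up for the example.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Operators;

static class BlankPageChecks
{
    // Returns true if every color set in the page content stream is pure white.
    public static bool HasOnlyWhiteColor(Page page)
    {
        foreach (Operator op in page.Contents)
        {
            // SetColorOperator now lives in Aspose.Pdf.Operators
            // rather than being nested inside the Operator class.
            if (op is SetColorOperator setColor)
            {
                // Assumption: getColor() behaves as in the older API.
                var color = setColor.getColor();
                if (color.R != 255 || color.G != 255 || color.B != 255)
                    return false;
            }
        }
        return true;
    }
}
```

The other helpers from the earlier post (IsWhiteImage, HasOnlyWhiteImages, IsBlankPage) do not reference the nested operator type, so they should not trigger error CS0426.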