Hi,
One of our customers reported that some PDFs were not converted properly: one of the layers (the one containing text) was lost. I managed to reproduce the issue by converting the attached PDF to JPEG, to TIFF and even to PDF/A. The result was the same in every case.
A sample PDF is attached.
For the JPEG conversion I used the code below:
// Open document
Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(srcFile);
pdfDocument.Flatten();
for (int pageCount = 1; pageCount <= pdfDocument.Pages.Count; pageCount++)
{
    // Build the output name: <destFile name>_<page number>.jpg
    string jpegName = Path.Combine(Path.GetDirectoryName(destFile), Path.GetFileNameWithoutExtension(destFile) + "_" + pageCount + ".jpg");
    using (FileStream imageStream = new FileStream(jpegName, FileMode.Create))
    {
        // Create Resolution object
        Resolution resolution = new Resolution(settings.JpegSettings.Resolution);
        // Create JPEG device with the specified resolution and quality,
        // where quality is in [0-100] and 100 is the maximum
        JpegDevice jpegDevice = new JpegDevice(resolution, settings.JpegSettings.JpegQuality);
        // Convert this page and save the image to the stream
        jpegDevice.Process(pdfDocument.Pages[pageCount], imageStream);
        // The using block closes the stream
    }
    files.Add(jpegName);
}
I tried calling pdfDocument.Flatten() before the conversion, but it made no difference.
The Layers property (pdfDocument.Pages[1].Layers) returns null.
After converting this PDF to PDF/A and opening the result in Adobe Reader, the text appears to be placed behind the background: it is still selectable but not visible.
How can this be resolved?
Best regards