Losing layers when converting

Hi,


One of our customers had an issue where some PDFs were not converted properly and one of the layers (containing text) was lost. I managed to reproduce this issue when converting this PDF to JPEG, to TIFF and even to PDF/A. The result was the same in every case: the text layer was missing.

Attached is a sample PDF.

For the JPEG conversion I used the code below:

// Open document
Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(srcFile);
pdfDocument.Flatten();

for (int pageCount = 1; pageCount <= pdfDocument.Pages.Count; pageCount++)
{
    String jpegName = Path.Combine(Path.GetDirectoryName(destFile), Path.GetFileNameWithoutExtension(destFile) + "_" + pageCount + ".jpg");
    using (FileStream imageStream = new FileStream(jpegName, FileMode.Create))
    {
        // Create Resolution object
        Resolution resolution = new Resolution(settings.JpegSettings.Resolution);
        // Create JPEG device with the given resolution and quality,
        // where quality is in [0-100] and 100 is the maximum
        JpegDevice jpegDevice = new JpegDevice(resolution, settings.JpegSettings.JpegQuality);

        // Convert a particular page and save the image to the stream
        jpegDevice.Process(pdfDocument.Pages[pageCount], imageStream);
        // Close stream
        imageStream.Close();
    }
    files.Add(jpegName);
}
I have attempted pdfDocument.Flatten() which made no difference.
When checking the layers property (pdfDocument.Pages[1].Layers) it returns null.

When inspecting the PDF/A output in Adobe Reader, it seems the text is placed behind the background. It is still selectable but not visible.

How could this be resolved?

Best Regards

Hi John,

Thanks for your inquiry. I have tested your scenario with the shared document using Aspose.Pdf for .NET 10.6.0 and was able to reproduce the reported issue. For further investigation, I have logged it in our issue tracking system as PDFNEWNET-39094 and linked your request to it. We will keep you updated on the issue status via this thread.

Please feel free to contact us for any further assistance.

Best Regards,

Hi,


Has there been any progress on this issue?

If it can’t be solved easily, is there a way I can either test the PDF to see if it was properly converted, or check a property before converting to determine whether I should use another conversion method for this kind of PDF?

Best Regards

Hi John,


Thanks for your inquiry. I am afraid the issue you reported is still not resolved, as the product team is busy resolving other issues that were reported earlier and are ahead of it in the queue. However, I have raised the priority of your issue and asked our team to share an ETA as soon as possible. We will notify you as soon as we have made significant progress towards resolving it.

We are sorry for the inconvenience caused.

Best Regards,

Any update on this bug? I have a similar problem with some image objects being lost from a PDF when converting to any other format (TIFF, XPS, etc.).

Hi Charles,

Thanks for contacting support.

The problem reported earlier is still not resolved. However, regarding the issue you are facing, can you please share the resource files so that we can test the scenario in our environment? We are sorry for the inconvenience.

I’ve made my own post here:

PDF conversion to any format drops linked images

https://forum.aspose.com/t/24667
Hi Charles,

We have managed to reproduce the issue using the resource files you shared, and the problem has been logged in our issue tracking system. We will keep you posted on the status of the fix in the forum thread mentioned above.

Should you have any further queries, please feel free to contact us.

The issues you have found earlier (filed as PDFNET-39094) have been fixed in this update.