Issue extracting transparent png image from PDF

louis.higgs · March 2, 2017, 3:54am

Hi

I have been testing Aspose.Pdf for .NET, as my company is looking to purchase several licenses of Aspose for manipulating images within PDFs.

It all works fine, apart from when I have to extract a specific image that is a png with a transparent background. In the provided attachment it is the third image down and is labelled as “info.png”.

When I save this image to file as a jpg, the transparent background turns black, whilst when it is saved as a png, the whole image becomes transparent.

For my company to use Aspose, we would require this to work.

Could you please advise?

Below is the code that I use for the extraction of the images.

// Requires: using System; using System.IO; using System.Drawing.Imaging; using Aspose.Pdf;
private void ExtractImages(string filename, string path)
{
    ImageFormat format = ImageFormat.Png;
    string extension = ".png";

    using (Document pdfDocument = new Document(filename))
    {
        // Aspose.Pdf page and image collections are 1-based.
        for (int i = 1; i <= pdfDocument.Pages.Count; ++i)
        {
            using (Page page = pdfDocument.Pages[i])
            {
                for (int y = 1; y <= page.Resources.Images.Count; ++y)
                {
                    string savefile = Path.Combine(path, string.Format("image_{0}_{1}{2}", i, y, extension));
                    bool success = true;

                    using (FileStream fstream = new FileStream(savefile, FileMode.Create))
                    {
                        XImage image = page.Resources.Images[y];

                        try
                        {
                            image.Save(fstream, format);
                        }
                        catch (Exception)
                        {
                            Console.WriteLine("Failed to handle image at {0}-{1}", i, y);
                            success = false;
                        }
                    }

                    // Delete the partial file only after the stream has been closed.
                    if (!success)
                    {
                        File.Delete(savefile);
                    }
                }
            }
        }
    }
}

tilal.ahmad · March 2, 2017, 11:38pm

Hi there,

Thanks for your inquiry. I have tested the scenario using Aspose.Pdf for .NET 17.2.0 and found that info.png is extracted correctly; please find the output attached for reference. Please share some more details about the issue you are facing, including your API version and output image, so that we can investigate further.

Furthermore, I have noticed that the second image (map.jp2) on the first page, the second image (butterfly.wmf) on the second page, and the third image (the first barcode) on the third page are not being extracted, so I have logged ticket PDFNET-42365 in our issue tracking system for further investigation and rectification.

We are sorry for the inconvenience.

Best Regards,

louis.higgs · March 3, 2017, 2:32am

Hi there

It’s good news that you managed to get the image extracted correctly.

I made sure that I am also using version 17.2.0, with Visual Studio 2015. However, I am still getting an issue with info.png. When you tested it, did you use code that differs from mine?

When I add the image back into the document, I just see a transparent image where it is supposed to be.

I have attached both the extracted image and the pdf with the images reinserted.

If there are any further details you need, please let me know.

Also thanks for investigating the images that don’t extract.

Thanks

Louis

tilal.ahmad · March 5, 2017, 9:18pm

Hi Louis,

Thanks for your feedback. Please note that I used your shared code as-is for testing. I also tested the scenario with VS2015, but I am afraid I am still unable to replicate the issue. I would appreciate it if you could share a sample console project along with your environment details, so that we can investigate further and guide you accordingly.

We are sorry for the inconvenience.

Best Regards,

louis.higgs · March 6, 2017, 3:00am

Hi

Thanks for your help so far.

I am running on a Windows 7 x64 Enterprise Service Pack 1 machine with 16 GB RAM and an Intel Core i5-3570 3.4 GHz processor.

I have tried running my application on another Windows 7 PC, as well as on an XP machine, but in both cases I ended up with the same result.

The project that I have been testing with is attached. Hopefully you can spot something from this.

Thanks

Louis

Edit: The version of Visual Studio 2015 is Version 14.0.24720.00 Update 1

tilal.ahmad · March 6, 2017, 10:49pm

Hi Louis,

Thanks for sharing your source project. I have tested the code and noticed that the issue is caused by the image replacement code, so I have logged ticket PDFNET-42377 in our issue tracking system for further investigation and rectification. We will keep you updated on the resolution progress in this forum thread.

However, as a workaround until the issue is investigated and resolved, you may comment out the image replacement code. This will let you extract the image successfully.

using (Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(filename))
{
    for (int i = 1; i <= pdfDocument.Pages.Count; ++i)
    {
        using (Page page = pdfDocument.Pages[i])
        {
            for (int y = 1; y <= page.Resources.Images.Count; ++y)
            {
                string savefile = Path.Combine("E:/Data", string.Format("image_{0}_{1}{2}", i, y, _extenstion));
                bool success = true;

                using (FileStream fstream = new FileStream(savefile, FileMode.Create))
                {
                    XImage image = page.Resources.Images[y];

                    try
                    {
                        Console.WriteLine("{0} - Colour: {1}, Transparency: {2}", savefile, image.GetColorType(), image.ContainsTransparency);
                        image.Save(fstream, _format);
                    }
                    catch (Exception)
                    {
                        Console.WriteLine("Failed to handle image at {0}-{1}", i, y);
                        success = false;
                    }
                }

                if (!success)
                {
                    File.Delete(savefile);
                }
            }
        }
    }

    pdfDocument.Save("E:/Data/ReplaceImages.pdf");
}

We are sorry for the inconvenience.

Best Regards,

louis.higgs · March 7, 2017, 1:52am

Hi Tilal

Thanks for the workaround. I have tested it out myself and that works.

Thanks

Louis

tilal.ahmad · March 7, 2017, 9:14pm

Hi Louis,

Thanks for your feedback. It is good to know that the workaround helped you accomplish the task. We will keep you updated on the resolution progress of the issues reported above.

Best Regards,
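
On the black background reported at the start of the thread when saving as JPG: JPEG has no alpha channel, so transparent pixels are typically written out as black. The following is a minimal sketch, assuming the image has first been extracted to a stream as PNG (for example with the image.Save(stream, ImageFormat.Png) call used above) and that System.Drawing is available. TransparencyHelper and SaveAsJpegOnWhite are hypothetical names and are not part of Aspose.Pdf; the sketch simply composites the image onto a white canvas before encoding it as JPEG.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

static class TransparencyHelper
{
    // Hypothetical helper (not an Aspose.Pdf API): draws an image that may
    // contain transparency onto an opaque white canvas and saves it as JPEG.
    public static void SaveAsJpegOnWhite(Stream pngStream, string jpegPath)
    {
        // The stream must stay open for as long as the source image is in use.
        using (Image source = Image.FromStream(pngStream))
        using (Bitmap canvas = new Bitmap(source.Width, source.Height, PixelFormat.Format24bppRgb))
        {
            canvas.SetResolution(source.HorizontalResolution, source.VerticalResolution);

            using (Graphics g = Graphics.FromImage(canvas))
            {
                // Fill with white first so transparent areas do not end up black.
                g.Clear(Color.White);
                g.DrawImage(source, 0, 0, source.Width, source.Height);
            }

            canvas.Save(jpegPath, ImageFormat.Jpeg);
        }
    }
}
```

To use it with the extraction loop, one option is to save each XImage into a MemoryStream as PNG, reset the stream position to 0, and pass the stream to SaveAsJpegOnWhite together with the target .jpg path. White is an arbitrary choice here; any opaque background colour works the same way.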