Replacing images in PDF

With the attached PDF:
test_pdf_with_composite_image.pdf (98.6 KB)

- ImagePlacementAbsorber.Count equals 1.

- pdf.Pages.First().Resources.Images.Count equals 2.

- The following code produces a PDF that loses the grayscale content:

        var pdf = new Aspose.Pdf.Document("pdf_with_composite_image.pdf");
        Aspose.Pdf.ImagePlacementAbsorber abs = new();
        pdf.Pages.First().Accept(abs);
        using (var s = File.Create("image.jpeg"))
            abs.ImagePlacements.First().Image.Save(s);
        using (var s = File.OpenRead("image.jpeg"))
            abs.ImagePlacements.First().Replace(s);
        pdf.Save("absorber.pdf");

absorber.pdf (179.0 KB)

- The following code produces a PDF that loses the image altogether:

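        // Assumed: the declaration of index was not shown in the original post.
        // Aspose.PDF image indexes are 1-based, so 1 is used as the starting value.
        int index = 1;
        foreach (var image in pdf.Pages.First().Resources.Images)
        {
            using (var s = File.Create("image.jpeg"))
                image.Save(s);

            using (var s = File.OpenRead("image.jpeg"))
                pdf.Pages.First().Resources.Images.Replace(index++, s);
            break; // only the first image is processed
        }
        pdf.Save(@"resources.pdf");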

resources.pdf (183.5 KB)

- Changing only the first image produces a viewable PDF:

   var page = pdf.Pages.First();
   using (var s = File.Create("image.jpeg"))
       page.Resources.Images[1].Save(s);

   using (var s = File.OpenRead("image.jpeg"))
       pdf.Pages.First().Resources.Images.Replace(0, s);

   pdf.Save("first_image_only.pdf");

first_image_only.pdf (183.3 KB)

What I am trying to do is process and replace both images while retaining the same functionality and visibility in the PDF. I did see a distinct difference in the output PDF. For example, here are the image objects in test_pdf_with_composite_image.pdf:

18 0 obj
<</Type/XObject/Subtype/Image/Width 244/Height 80/ColorSpace/DeviceGray/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 1/BitsPerComponent 8/Columns 244>>/Length 4934>>stream
...
endstream
endobj
19 0 obj
<</Type/XObject/Subtype/Image/Width 244/Height 80/SMask 18 0 R/ColorSpace/DeviceRGB/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 3/BitsPerComponent 8/Columns 244>>/Length 9461>>stream
...
endstream
endobj
...

In Aspose output, the images are always defined this way:

18 0 obj
<</Type/XObject/Subtype/Image/Width 244/Height 80/ColorSpace/DeviceGray/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 1/BitsPerComponent 8/Columns 244>>/Length 4934>>stream
...
endstream
endobj
19 0 obj
<</Filter/DCTDecode/Length 5206/Type/XObject/Subtype/Image/Width 244/Height 80/BitsPerComponent 8/ColorSpace/DeviceRGB>>stream

Thanks for your help!

@Buffer2018
If you open the attached document in Acrobat Pro, it shows only one image (just as ImagePlacementAbsorber does).
4-Image.png (111.0 KB)

The reason is that one XImage (object 18) is used as the SMask for another XImage (object 19).
In the dictionary you quoted for object 19, this corresponds to the substring SMask 18 0 R.
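
The relation can also be read straight from the raw object text. The following is a minimal sketch, not an Aspose.PDF feature: FindSoftMasks is a hypothetical helper that assumes uncompressed objects written as "N 0 obj <<...>> stream", as in the dumps quoted in this thread. It would not work on objects stored inside compressed object streams.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of Aspose.PDF): maps each image object number
// to the object number referenced by its /SMask entry.
static Dictionary<int, int> FindSoftMasks(string pdfText)
{
    var masks = new Dictionary<int, int>();
    // Lazy match up to the ">>" that directly precedes "stream", so nested
    // dictionaries such as /DecodeParms<<...>> stay inside the capture.
    var objects = Regex.Matches(
        pdfText, @"(\d+)\s+0\s+obj\s*<<(.*?)>>\s*stream", RegexOptions.Singleline);
    foreach (Match obj in objects)
    {
        Match mask = Regex.Match(obj.Groups[2].Value, @"/SMask\s+(\d+)\s+\d+\s+R");
        if (mask.Success)
            masks[int.Parse(obj.Groups[1].Value)] = int.Parse(mask.Groups[1].Value);
    }
    return masks;
}

// Shortened copies of the two objects quoted above (stream data elided).
string sample =
    "18 0 obj\n<</Type/XObject/Subtype/Image/Width 244/Height 80/ColorSpace/DeviceGray>>stream\n...\nendstream\nendobj\n" +
    "19 0 obj\n<</Type/XObject/Subtype/Image/Width 244/Height 80/SMask 18 0 R/ColorSpace/DeviceRGB>>stream\n...\nendstream\nendobj\n";

foreach (var pair in FindSoftMasks(sample))
    Console.WriteLine($"object {pair.Key} uses object {pair.Value} as its soft mask");
```

For the sample text this reports that object 19 uses object 18 as its soft mask, which is why only one image is painted on the page.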

Is there an interface for changing this in the Aspose.PDF API?

@Buffer2018
Here is code for replacing the image, if I understood your intentions correctly:

var pdf = new Document(dataDir + "test.pdf");

using (Stream imageStream = File.OpenRead(dataDir + "2.jpg"))
{
    pdf.Pages[1].Resources.Images.Replace(2, imageStream);
}

2.jpg (17.4 KB)
In this case, the replacement image had to be supplied in JPG format.
The main image (obj 19, corresponding to index 2) is the one that must be changed, because the image with index 1 (obj 18) is auxiliary and is only used by the main image. Replacing the auxiliary image does not produce the desired result.

By the way, the FilterType property gives a hint about the type of an XImage:

pdf.Pages[1].Resources.Images[2].FilterType

No, that’s not exactly what I meant…
What I want to achieve is to process all the images in the PDF and replace them. My requirement is that the SMask relations between them are retained after all this processing.
Consider the following hypothetical scenario: I want to add a single black pixel to every image, including the mask images, and retain the PDF image relations when saving. Can this be done in code? This is what I was trying to achieve with the code posted in this thread.

I could add this directly to the PDF file with a simple regex… but… I really prefer not to :slight_smile:

@Buffer2018
The following approach suggests itself:

  • read XImage into an image
  • make edits to the image
  • write the modified image back to XImage

Could you please give me a sample of how to do this for this document?
Let’s say our goal is to take test_pdf_with_composite_image.pdf and produce an output PDF that contains exactly the same image, except that one random pixel is changed in the RGB image and one random pixel is changed in its mask image.
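
To make the pixel-level part of this request concrete, here is a minimal sketch that works on already decoded sample data rather than on the PDF itself. SampleOffset is a hypothetical helper, and the two byte arrays are stand-ins for the decoded streams of obj 19 (DeviceRGB) and obj 18 (DeviceGray); the sketch assumes 8 bits per component and the 244 x 80 size from the dictionaries above. Decoding and re-encoding the FlateDecode streams is not shown.

```csharp
using System;

// Hypothetical helper: byte offset of the pixel at (col, row) in decoded 8-bit
// sample data, where rows are stored top to bottom, one after another.
static int SampleOffset(int col, int row, int columns, int components)
    => (row * columns + col) * components;

const int width = 244, height = 80;

// Stand-ins for the decoded streams (assumption: real data would come from the PDF).
byte[] rgb = new byte[width * height * 3];   // obj 19: DeviceRGB, 3 bytes per pixel
byte[] mask = new byte[width * height];      // obj 18: DeviceGray, 1 byte per pixel
Array.Fill(rgb, (byte)255);                  // start from an all-white image

var random = new Random();
int x = random.Next(width), y = random.Next(height);

// Set one random pixel to black in the colour image ...
int rgbAt = SampleOffset(x, y, width, 3);
rgb[rgbAt] = rgb[rgbAt + 1] = rgb[rgbAt + 2] = 0;

// ... and make the same pixel fully opaque in the mask.
int maskAt = SampleOffset(x, y, width, 1);
mask[maskAt] = 255;

Console.WriteLine($"changed pixel ({x}, {y}): RGB offset {rgbAt}, mask offset {maskAt}");
```

Because both images have the same width and height, the same coordinates address corresponding pixels in the colour image and in its mask.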

@Buffer2018
I will look into this and write to you tomorrow.

@Buffer2018
I wrote code to solve this issue.

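var pdf = new Document(dataDir + "test.pdf");
XImageCollection xImages = pdf.Pages[1].Resources.Images;
// Start at index 2: index 1 (obj 18) is the SMask image and is deliberately skipped.
for (int i = 2; i <= xImages.Count; i++)
{
    Bitmap bmpImage;
    using (var imageStream = new MemoryStream())
    {
        xImages[i].Save(imageStream, 200);
        imageStream.Position = 0;
        // Copy the decoded bitmap so it stays usable after imageStream is disposed.
        using (var decoded = new Bitmap(imageStream))
            bmpImage = new Bitmap(decoded);
    }

    // Paint a 10 x 10 block in the top-left corner as a visible test edit.
    for (int x = 0; x < 10; x++)
        for (int y = 0; y < 10; y++)
            bmpImage.SetPixel(x, y, System.Drawing.Color.Aqua);

    using (var ms = new MemoryStream())
    {
        bmpImage.Save(ms, System.Drawing.Imaging.ImageFormat.Png);
        ms.Position = 0; // rewind so Replace reads from the start
        xImages.Replace(i, ms);
    }
    bmpImage.Dispose();
}

pdf.Save(dataDir + "test-out.pdf");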

However:

  • As you can see, I do not touch the XImage that corresponds to the SMask; otherwise an exception is thrown (which was your original question). I cannot say how to determine programmatically that an XImage is an SMask, or whether and how to process it.
  • The image in the resulting document is noticeably damaged.

I will create a task for the development team about this.

@Buffer2018
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-57540

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

@sergei.shibanov
Your code does not produce the required output.
If you examine test-out.pdf, you can see that the image loses its SMask attribute:

<</Type/XObject/Subtype/Image/Width 244/Height 80/ColorSpace/DeviceGray/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 1/BitsPerComponent 8/Columns 244>>/Length 4934>>stream
<</Filter/DCTDecode/Length 14749/Type/XObject/Subtype/Image/Width 244/Height 80/BitsPerComponent 8/ColorSpace/DeviceRGB>>stream

What I am trying to achieve is to replace both the image and its SMask image while keeping the relation between them in the output. The image definitions in the output file should therefore match those in the input file (with the stream data modified, obviously):

	<</Type/XObject/Subtype/Image/Width 244/Height 80/ColorSpace/DeviceGray/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 1/BitsPerComponent 8/Columns 244>>/Length 4934>>stream
	<</Type/XObject/Subtype/Image/Width 244/Height 80/SMask 18 0 R/ColorSpace/DeviceRGB/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 3/BitsPerComponent 8/Columns 244>>/Length 9461>>stream
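
This requirement can be checked mechanically on the raw dictionaries. The sketch below is only an illustration: SameImageDefinition is a hypothetical helper that compares two dictionary strings textually after masking out the numeric /Length value, so it assumes the keys appear in the same order in both files.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical check: two image dictionaries count as the same definition when
// they are textually identical once the numeric /Length value is masked out.
static bool SameImageDefinition(string a, string b)
{
    static string Normalize(string dict) =>
        Regex.Replace(dict, @"/Length\s+\d+", "/Length N");
    return Normalize(a) == Normalize(b);
}

// Dictionary of obj 19 in the input file, as quoted in this thread.
string original = "<</Type/XObject/Subtype/Image/Width 244/Height 80/SMask 18 0 R/ColorSpace/DeviceRGB/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 3/BitsPerComponent 8/Columns 244>>/Length 9461>>";
// Dictionary found in test-out.pdf after the replacement.
string replaced = "<</Filter/DCTDecode/Length 14749/Type/XObject/Subtype/Image/Width 244/Height 80/BitsPerComponent 8/ColorSpace/DeviceRGB>>";
// What the output should look like: same definition, different stream length
// (9000 is an arbitrary illustrative value).
string desired = original.Replace("/Length 9461", "/Length 9000");

Console.WriteLine(SameImageDefinition(original, replaced)); // False
Console.WriteLine(SameImageDefinition(original, desired));  // True
```

A stricter check would parse the dictionaries into key/value pairs instead of comparing text, but the textual form is enough to show the difference reported here.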

@Buffer2018

Yes, that’s right. I was only looking at the visual representation, although you clearly stated that you need the SMask as well.
I will add this to the description of the created task.

I’m sorry, I don’t understand: which ticket did you open?
I still don’t understand whether the following workflow is possible with the current version of Aspose.PDF:

  • Load a PDF document that has an Image SMask’ing another image,

  • Modify the stream of both the color image and its smask image

  • Save the PDF to a new file, so if the INPUT PDF had the following objects:
    18 0 obj
    <</Type/XObject/Subtype/Image/Width 244/Height 80/ColorSpace/DeviceGray/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 1/BitsPerComponent 8/Columns 244>>/Length 4934>>stream …data…
    endstream
    endobj
    19 0 obj
    <</Type/XObject/Subtype/Image/Width 244/Height 80/SMask 18 0 R/ColorSpace/DeviceRGB/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 3/BitsPerComponent 8/Columns 244>>/Length 9461>>stream …data…
    endstream
    endobj

  • The OUTPUT PDF will have the following objects:
    18 0 obj
    <</Type/XObject/Subtype/Image/Width 244/Height 80/ColorSpace/DeviceGray/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 1/BitsPerComponent 8/Columns 244>>/Length adjusted length>>stream …some other data…
    endstream
    endobj
    19 0 obj
    <</Type/XObject/Subtype/Image/Width 244/Height 80/SMask 18 0 R/ColorSpace/DeviceRGB/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 3/BitsPerComponent 8/Columns 244>>/Length adjusted length>>stream …some other data…
    endstream
    endobj

@Buffer2018
I didn’t find a solution to your question. Maybe there is no such solution, or maybe I just don’t know it, so I created a task for the development team.
I hope I described the request properly; if not, let me know.
InTracker.png (45.1 KB)

@sergei.shibanov
Ok, understood! Thanks :slight_smile:

I think it would help the development team to have the required input and output as described in my previous comment, together with the file attached to this issue, test_pdf_with_composite_image.pdf:

  • Load a PDF document that has an Image SMask’ing another image,
  • Modify the stream of both the color image and its smask image
  • Save the PDF to a new file, so if the INPUT PDF had the following objects:
    18 0 obj
    <</Type/XObject/Subtype/Image/Width 244/Height 80/ColorSpace/DeviceGray/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 1/BitsPerComponent 8/Columns 244>>/Length 4934>>stream …data…
    endstream
    endobj
    19 0 obj
    <</Type/XObject/Subtype/Image/Width 244/Height 80/SMask 18 0 R/ColorSpace/DeviceRGB/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 3/BitsPerComponent 8/Columns 244>>/Length 9461>>stream …data…
    endstream
    endobj
  • The OUTPUT PDF will have the following objects:
    18 0 obj
    <</Type/XObject/Subtype/Image/Width 244/Height 80/ColorSpace/DeviceGray/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 1/BitsPerComponent 8/Columns 244>>/Length adjusted length>>stream …some other data…
    endstream
    endobj
    19 0 obj
    <</Type/XObject/Subtype/Image/Width 244/Height 80/SMask 18 0 R/ColorSpace/DeviceRGB/BitsPerComponent 8/Filter/FlateDecode/DecodeParms<</Predictor 15/Colors 3/BitsPerComponent 8/Columns 244>>/Length adjusted length>>stream …some other data…
    endstream
    endobj

@Buffer2018
Added to the task description.

@Buffer2018

We can get the desired result using the following code snippet:

var pdf = new Document(dir + "test.pdf");
// An ImagePlacementAbsorber finds only the images painted on the page, so soft masks are skipped.
var abs = new ImagePlacementAbsorber();
abs.Visit(pdf.Pages[1]);
Bitmap bmpImage;
var imagePlacement = abs.ImagePlacements[1];
using (var imageStream = new MemoryStream())
{
    // If the image has a mask, it is important to save it in PNG format.
    if (imagePlacement.Image.ContainsTransparency)
    {
        imagePlacement.Image.Save(imageStream, ImageFormat.Png, 200);
    }
    else
    {
        imagePlacement.Image.Save(imageStream, ImageFormat.Jpeg, 200);
    }
    imageStream.Position = 0;
    // Copy the decoded bitmap so it stays usable after imageStream is disposed.
    using (var decoded = new Bitmap(imageStream))
        bmpImage = new Bitmap(decoded);
}
// Paint a 10 x 10 block in the top-left corner as a visible test edit.
for (int x = 0; x < 10; x++)
    for (int y = 0; y < 10; y++)
        bmpImage.SetPixel(x, y, System.Drawing.Color.Aqua);

using (var ms = new MemoryStream())
{
    bmpImage.Save(ms, System.Drawing.Imaging.ImageFormat.Png);
    ms.Position = 0; // rewind so Replace reads from the start
    imagePlacement.Replace(ms);
}
// The old XImage object has been replaced, but the soft mask has not. It is still stored in the document, although nothing references it.
// We can get rid of it with the following code:
var opt = new OptimizationOptions()
{
    RemoveUnusedStreams = true,
};

pdf.OptimizeResources(opt);
pdf.Save("57540.pdf");