What is the best way to convert a customer PDF whose watermark is shared across multiple pages into a PDF with a separate watermark on each page?
I need this because redacting one page of a PDF should affect only that page, not several. The viewer we use for redaction redacts every instance of the watermark, on all pages, when the same watermark is shared between pages.
I guess that copying the PDF pages one by one into a new document MIGHT work, but I am not sure it will, and I am not sure if it is the best way. Do you have any suggestions?
Thank you.
@kgk2000
Could you please share a sample source and an expected output PDF for our reference so that we can try to better understand your requirements? We will also test the scenario in our environment and address it accordingly.
Essentially, I want to convert the document produced by this code (shared watermark):
var pdfDocument = ....; // Aspose PDF document object.
var formattedText = new FormattedText();
formattedText.AddNewLineText("watermark text");
// One TextStamp instance is created once and reused for every page.
var textStamp = new TextStamp(formattedText);
textStamp.Background = Options.IsBackground;
textStamp.XIndent = Convert.ToDouble(Options.XIndent);
textStamp.YIndent = Convert.ToDouble(Options.YIndent);
textStamp.TextState.Font = Aspose.Pdf.Text.FontRepository.FindFont(Options.FontName);
textStamp.RotateAngle = Options.RotateAngle;
textStamp.Opacity = Options.Opacity;
textStamp.TextState.FontSize = Options.FontSize;
textStamp.TextState.ForegroundColor = Aspose.Pdf.Color.FromRgb(Options.ForegroundColor);
foreach (Page pdfPage in pdfDocument.Pages)
{
    pdfPage.AddStamp(textStamp);
}
…into the document produced by this code (duplicated watermarks per page):
var pdfDocument = ....; // Aspose PDF document object.
foreach (Page pdfPage in pdfDocument.Pages)
{
    // A new TextStamp instance is created for each page.
    var formattedText = new FormattedText();
    formattedText.AddNewLineText("watermark text");
    var textStamp = new TextStamp(formattedText);
    textStamp.Background = Options.IsBackground;
    textStamp.XIndent = Convert.ToDouble(Options.XIndent);
    textStamp.YIndent = Convert.ToDouble(Options.YIndent);
    textStamp.TextState.Font = Aspose.Pdf.Text.FontRepository.FindFont(Options.FontName);
    textStamp.RotateAngle = Options.RotateAngle;
    textStamp.Opacity = Options.Opacity;
    textStamp.TextState.FontSize = Options.FontSize;
    textStamp.TextState.ForegroundColor = Aspose.Pdf.Color.FromRgb(Options.ForegroundColor);
    pdfPage.AddStamp(textStamp);
}
My current way of doing it, which seems to be working, is this (copy pages, one by one, into a new document):
var pdfFinal = new Aspose.Pdf.Document();
foreach (var page in pdfOriginal.Pages)
{
    pdfFinal.Pages.Add(page);
}
pdfFinal.Save(outMs);
If you know any better way of doing it, let me know.
Also, if you know of any way of detecting the shared watermarks the 1st code snippet generates, let me know.
Thanks.
@kgk2000
In your scenario, do you already know the watermark text, or do you want to detect it dynamically? Also, it looks like the position and rotation angle values differ between your two code snippets. Could you please specify them so that we can test the scenario accordingly and suggest a better approach if we find one?
I want to detect it or, if that is not possible, treat the file as having one anyway.
The main goal is to NOT share the same watermark object across all pages, but to force a different watermark object to be used for each page. That way, a redaction that is applied to one page and touches the watermark does not affect the watermark on other pages.
Also, it seems that my fix above does not work after all.
Regarding the rotation angles, they can be anything. And the watermark can be anything. It shouldn’t make any difference.
@kgk2000
Have you faced such an issue in any of your PDFs? It looks like the main issue occurs while applying redactions at your end. It would be helpful if you could share a sample source PDF document that has a shared watermark on each page, along with the code for applying redactions. We will test the scenario in our environment and address it accordingly.
Hello.
I don’t have any code for redaction. A 3rd-party HTML viewer that we use does the redaction.
But I now have some code that converts the PDF file into a form that does not cause the issue. I just need to improve its performance a bit:
var pagesFileList = new List<string>(pdfOriginal.Pages.Count);
// Aspose.PDF page indexes are 1-based.
for (int i = 1; i <= pdfOriginal.Pages.Count; i++)
{
    // Save each page as its own single-page PDF file.
    var pdfPage = new Aspose.Pdf.Document();
    pdfPage.Pages.Add(pdfOriginal.Pages[i]);
    var path = GetNewTempFilePathWithExtension(".pdf");
    pdfPage.Save(path);
    pagesFileList.Add(path);
}
// Merge the single-page files back into one document.
var pdfEditor = new PdfFileEditor();
var mergePath = GetNewTempFilePathWithExtension(".pdf");
var documentMergingResult = pdfEditor.Concatenate(pagesFileList.ToArray(), mergePath);
if (!documentMergingResult)
{
    throw new Exception("Failed in document merging");
}
It appears that copying the pages one by one into a new document does not force a separate watermark object to be created, so I have to save each page into a new PDF and read the page back from there.
I think I will try to skip the disk and do everything in memory, then see how well this fix performs.
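For anyone following along, here is a minimal sketch of what the in-memory variant of the save-and-merge workaround above might look like. It assumes the `Document.Save(Stream)` and `PdfFileEditor.Concatenate(Stream[], Stream)` overloads of Aspose.PDF for .NET; the class and method names (`SharedWatermarkSplitter`, `SplitIntoStream`) are made up for illustration. It has not been checked against the redaction viewer, so whether the watermarks end up as separate objects still needs to be verified on a real file.

```csharp
using System.Collections.Generic;
using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Facades;

// Hypothetical helper, not part of Aspose.PDF.
public static class SharedWatermarkSplitter
{
    // Saves every page of pdfOriginal as its own single-page PDF in memory,
    // then concatenates those PDFs into outputStream.
    // Returns the result reported by PdfFileEditor.Concatenate.
    public static bool SplitIntoStream(Document pdfOriginal, Stream outputStream)
    {
        var pageStreams = new List<Stream>(pdfOriginal.Pages.Count);
        try
        {
            // Aspose.PDF page indexes are 1-based.
            for (int i = 1; i <= pdfOriginal.Pages.Count; i++)
            {
                var singlePage = new Document();
                singlePage.Pages.Add(pdfOriginal.Pages[i]);

                // Serialize the single-page document to memory instead of a temp file.
                var pageStream = new MemoryStream();
                singlePage.Save(pageStream);
                pageStream.Position = 0; // rewind so the merge reads from the start
                pageStreams.Add(pageStream);
            }

            var pdfEditor = new PdfFileEditor();
            return pdfEditor.Concatenate(pageStreams.ToArray(), outputStream);
        }
        finally
        {
            // Release the per-page buffers once the merge is done.
            foreach (var stream in pageStreams)
            {
                stream.Dispose();
            }
        }
    }
}
```

A call might look like `SharedWatermarkSplitter.SplitIntoStream(pdfOriginal, outMs)`, with the caller throwing if it returns `false`, as in the disk-based version. All single-page PDFs are held in memory until the merge finishes, so very large documents may still be better served by temp files.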
I am attaching a file that reproduces the problem, just to make this post complete, but I guess you will not be able to redact the file:
1234567.pdf (159.3 KB)
But it’s OK if you can’t help me because of it.
I think I found a quite good fix that should probably do the job.
@kgk2000
Thanks for providing the details of the complete scenario. The approach that you are using looks like a good workaround. We are afraid that the API does not offer a direct way to turn an existing shared watermark into a separate object for each page of a PDF. Therefore, we have logged an investigation ticket as PDFNET-52388 in our issue tracking system. We will analyze it in more detail and let you know as soon as we have some feedback to share. Please give us some time.