Issues with text replacement (formatting lost, missing letters, ...)

Hi Aspose team,

I am currently testing how text replacement works.
Files used for the issues below:

  • original document “Online Payment Processing.pdf”
  • document with replacements “Online Payment Processing_NEW.pdf”

On page 2 you can see that the original phrase “Payment Processing” is in italic, but in the new document the formatting is gone. It is probably the same across the whole document.

On page 3 the replacement text has missing characters: the full text is “NEW LONGER TEXT TO CHECK”, but the document shows “NEW LONGER TE T TO CHECK”. Strangely, when I copy and paste the broken text, the complete text is there, so this seems to be a display issue.

On page 4 the title is bold in the original document and normal in the new document.
All in all, the formatting is gone.

The text that I wanted to replace is “payment processing”. If it is split across 2 lines, what would be your recommendation for making the replacement? If you check the changed document, you can see that where this phrase is split across 2 lines, no changes are made.

To run the application, just press the Start button, as everything is hard-coded.

P.S. I attached the project and test documents (they are in the bin/release folder). The license file has been removed.

Thanks in advance,
Oliver

Hi Oliver,


Thanks for your inquiry. While testing your scenario, we noticed the formatting and missing character issues, so we have logged the following issues for further investigation and resolution.

PDFNEWNET-39020: Text style lost issue
PDFNEWNET-39021: missing character issue

Moreover, regarding searching for and replacing multi-line text, I am afraid it is not supported at the moment. We have already logged a ticket, PDFNEWNET-33625, for this purpose. We have linked your post to the issue ID and will notify you as soon as it is resolved.

Best Regards,

Hi,

One remark: I am using the .NET version, and my understanding is that you opened the bugs for Java (PDFNEWJAVA).

Can you please confirm that this will also be fixed for the .NET version?

Thanks,
Oliver

Hi Oliver,


I am sorry for the confusion; it was a typo. The issues are logged against the .NET version of Aspose.Pdf, and the IDs have been corrected in the reply above.

Best Regards,

Hi,

Any news here?

Hi Oliver,


Thanks for your patience.

The product team has started investigating the reported issues, but I am afraid that, owing to a large number of other high-priority issues, they are not yet resolved. The team plans to investigate the issues in the coming months, and we will let you know as soon as we have further updates.

Please bear with us and allow us a little more time.

@dr_oli

Thank you for being patient.

We have investigated PDFNET-36746 and found that it does not occur with Aspose.PDF for .NET 18.7. Regarding regular expressions, "2013-2014" and "2013-" followed by a line break and then "2014" are two different strings. The regular expression "\d{4}-\d{4}" matches the first but not the second. Please use the regex "\d{4}(\r\n)?-(\r\n)?\d{4}" instead, which also allows a line break directly before or after the hyphen. It matches all the samples. Below is a code snippet for your reference.

// Requires: using System; using Aspose.Pdf; using Aspose.Pdf.Text;
// myDir is the folder that contains the input file (defined elsewhere)
// open the document
Document pdfDocument = new Document(myDir + @"Demo32.pdf");
// create a TextFragmentAbsorber object to find all matches of the regular expression
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(@"\d{4}(\r\n)?-(\r\n)?\d{4}"); // like 1999-2000
// set the text search option to enable regular expression usage
TextSearchOptions textSearchOptions = new TextSearchOptions(true);
textFragmentAbsorber.TextSearchOptions = textSearchOptions;
//accept the absorber for all the pages
pdfDocument.Pages.Accept(textFragmentAbsorber);
//get the extracted text fragments
TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;
Console.WriteLine(textFragmentCollection.Count);
//loop through the fragments
foreach (TextFragment textFragment in textFragmentCollection)
{
    //update text and other properties
    textFragment.Text = "xxxx-xxxx";
    // set the font of the replacement text (Calibri must be available on the system)
    textFragment.TextState.Font = FontRepository.FindFont("Calibri");
    //textFragment.TextState.FontSize = 22;
    //textFragment.TextState.ForegroundColor = Aspose.Pdf.Color.FromRgb(System.Drawing.Color.Blue);
    //textFragment.TextState.BackgroundColor = Aspose.Pdf.Color.FromRgb(System.Drawing.Color.Green);
}
pdfDocument.Save(myDir + @"Demo32_out.pdf");
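
The difference between the two regular expressions mentioned above can be checked without Aspose.PDF at all. The following is only a minimal sketch that uses the standard .NET `Regex` class on plain strings. The sample strings and the constant names are made up for illustration, and it assumes that a line break in the extracted text appears as "\r\n", as the suggested pattern implies. It is written with top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// The original pattern: both years must be on the same line.
const string SingleLinePattern = @"\d{4}-\d{4}";
// The pattern suggested in the reply: an optional "\r\n" before or after the hyphen.
const string LineBreakTolerantPattern = @"\d{4}(\r\n)?-(\r\n)?\d{4}";

// Illustrative samples (not taken from the attached PDF).
string[] samples =
{
    "2013-2014",       // both years on one line
    "2013-\r\n2014",   // line break after the hyphen
    "2013\r\n-2014",   // line break before the hyphen
};

foreach (string sample in samples)
{
    // Make the line break visible in the console output.
    string shown = sample.Replace("\r\n", "\\r\\n");
    bool strict = Regex.IsMatch(sample, SingleLinePattern);
    bool tolerant = Regex.IsMatch(sample, LineBreakTolerantPattern);
    Console.WriteLine($"{shown,-14} single-line: {strict,-5} tolerant: {tolerant}");
}
```

Only the first sample matches the single-line pattern, while all three match the line-break-tolerant one. Whether the same idea carries over to other phrases split across lines depends on how the text is extracted from the PDF, so it should be verified against the actual document.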

The generated file, Demo32_out.pdf, is attached for your reference. Please feel free to contact us if you need any further assistance.