Replacing text using a regular expression

I have the following code to replace any text in my document between ^^ and ^^ with some specific text.

        // Requires: using Aspose.Pdf; using Aspose.Pdf.Text;
        Document pdfDocument = new Document("c:\\source.pdf");
        // true = treat the search phrase as a regular expression
        TextSearchOptions textSearchOptions = new TextSearchOptions(true);

        // Aspose.PDF page and fragment collections are 1-based
        for (int i = 1; i <= pdfDocument.Pages.Count; i++)
        {
            Page page = pdfDocument.Pages[i];

            // Match any text enclosed in ^^ ... ^^
            TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(@"\^\^[^\^]+\^\^");
            textFragmentAbsorber.TextSearchOptions = textSearchOptions;
            page.Accept(textFragmentAbsorber);

            TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;

            // Replace the text of every match found on this page
            for (int j = 1; j <= textFragmentCollection.Count; j++)
            {
                TextFragment textFragment = textFragmentCollection[j];
                textFragment.Text = "my replacement";
            }
        }
        pdfDocument.Save("c:\\modified.pdf");

This almost works, but for some reason it breaks the formatting of the document. Please check the attached modified.pdf file. Only the first match is replaced cleanly. The other matches are replaced too, but some extra indentation seems to be added.
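
To rule out the pattern itself, the regular expression from the code above can be checked against a plain string, independently of the PDF. This is only a minimal sketch: it assumes the search phrase is evaluated with ordinary .NET regex semantics, and the sample line and the `WebItems_Name` placeholder are made up for illustration (only `WebItems_Time` appears in the real document).

```csharp
using System;
using System.Text.RegularExpressions;

public static class PlaceholderCheck
{
    // Same pattern that is passed to TextFragmentAbsorber in the code above.
    public const string Pattern = @"\^\^[^\^]+\^\^";

    // Replaces every ^^...^^ placeholder in a plain string.
    public static string ReplaceAll(string input, string replacement)
    {
        return Regex.Replace(input, Pattern, replacement);
    }

    public static void Main()
    {
        // Hypothetical sample line; the real text comes from source.pdf.
        string sample = "Name: ^^WebItems_Name^^  Time: ^^WebItems_Time^^";

        // List each placeholder the pattern finds.
        foreach (Match match in Regex.Matches(sample, Pattern))
        {
            Console.WriteLine(match.Value);
        }

        // Prints: Name: my replacement  Time: my replacement
        Console.WriteLine(ReplaceAll(sample, "my replacement"));
    }
}
```

If every placeholder is listed and replaced here, the pattern is probably fine and the layout shift is more likely to come from how the library rewrites the matched fragments.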

Could you please let me know what I’m doing wrong?

Thanks
modified.pdf (32.8 KB)
source.pdf (29.8 KB)

@moonlit,

Thank you for contacting support.

The sample code you shared contains some undefined variables. Could you please share an SSCCE (short, self-contained, correct example) of the code you are using, so that we can try to reproduce and investigate the issue in our environment?

I have just modified the code so it should be fixed now.

Please let me know if you need some additional information.

@moonlit

We are looking into it and will get back to you shortly.

Do you guys have an update about this? Thanks.

@moonlit,

We have run the sample code you shared against the source file and generated an output. For further investigation, could you please share the desired result along with a screenshot comparing it to the problematic output, so that we can help you?

It’s very simple. I just need to replace the text between ^^ and ^^ with the text “my replacement” without altering the formatting of the file. You can see in the modified.pdf file that the layout is all messed up. Please check how the WebItems_Time column has been shifted to the right. After 3 days, I thought you were already aware of the issue and were investigating it. Does your reply mean that you don’t see a problem?

@moonlit

We have tested the scenario in our environment using Aspose.PDF for .NET 19.12 and were unable to reproduce the issue. For your reference, the output PDF is attached. Could you please try the latest version of the API and let us know if you still face any issue?

modified_out.pdf (27.7 KB)

I was able to get a temporary license to test with your latest version and it works, so the problem was with the version we were using (10.9.0.0).
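
For anyone who hits the same symptom, it may help to confirm which assembly version is actually loaded at run time before digging into the code. The sketch below uses only standard .NET reflection. The helper names are hypothetical, `typeof(string)` stands in for a type from the library so the example runs on its own, and it assumes release 19.12 is reported as assembly version 19.12.x.

```csharp
using System;
using System.Reflection;

public static class AssemblyVersionCheck
{
    // Returns the version of the assembly that defines the given type.
    public static Version GetLoadedVersion(Type typeFromLibrary)
    {
        return typeFromLibrary.Assembly.GetName().Version;
    }

    // True when the loaded version is older than the version known to work.
    public static bool IsOlderThan(Version loaded, Version knownGood)
    {
        return loaded < knownGood;
    }

    public static void Main()
    {
        // With Aspose.PDF referenced, pass typeof(Aspose.Pdf.Document) instead.
        Console.WriteLine(GetLoadedVersion(typeof(string)));

        // Versions mentioned in this thread (19.12 numbering is an assumption).
        Version inUse = new Version(10, 9, 0, 0);
        Version knownGood = new Version(19, 12);

        // Prints: True
        Console.WriteLine(IsOlderThan(inUse, knownGood));
    }
}
```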

@espinosaluis

It is good to know that your issue has been resolved by using the latest version of the API. Please keep using our API, and if you need any further assistance, please feel free to let us know.