Hi,
I am trying to replace the text between two strings (a start text and an end text) with *** in order to anonymize a document. To do so I use "Replace Text Based on a Regular Expression". Currently my regular expression does not handle new lines. I'm not sure whether it will return the same result for Text1 and for Text2, because of the line breaks:
regular expression = (?<="approved by")(\w)*((.|(\r\n))*?)[ \t]*(?="approved by")
Text1 = " report text.report text .report text .report text approved by ortal approved by"
Text2 = " report text approved by ortal approved by"
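The line-break question above can be checked outside the PDF library with plain .NET `Regex`. The sketch below is a simplified variant of the pattern, not the original one: the class `AnonymizeSketch` and its methods `BuildPattern` and `Mask` are hypothetical names introduced only for illustration. It assumes the start and end markers should be matched literally (hence `Regex.Escape`) and uses `RegexOptions.Singleline` so that `.` also matches `\n`. Whether `TextFragmentAbsorber` evaluates a pattern across line breaks in the same way is not confirmed here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnonymizeSketch
{
    // Builds a lookbehind/lookahead pattern around the markers.
    // Regex.Escape keeps characters such as '.' or '(' in the markers literal.
    public static string BuildPattern(string startText, string endText)
    {
        string start = Regex.Escape(startText);
        if (string.IsNullOrEmpty(endText))
        {
            // No end marker: take everything after the start marker.
            // ".+" rather than ".*" avoids an extra empty match at the end of the input.
            return $"(?<={start}).+";
        }
        // Lazy ".*?" stops at the first occurrence of the end marker.
        return $"(?<={start}).*?(?={Regex.Escape(endText)})";
    }

    // Replaces the text between the markers with " *** ".
    // Singleline makes '.' match '\n' too, so the match may span lines.
    public static string Mask(string input, string startText, string endText)
    {
        string pattern = BuildPattern(startText, endText);
        return Regex.Replace(input, pattern, " *** ", RegexOptions.Singleline);
    }
}
```

With this sketch, `AnonymizeSketch.Mask("report text approved by ortal approved by", "approved by", "approved by")` returns `"report text approved by *** approved by"`, and the same input with `\r\n` around `ortal` gives the same result. Without `RegexOptions.Singleline`, `.` does not match `\n`, which is why the original pattern adds the `(.|(\r\n))` alternative; that alternative covers `\r\n` but not a bare `\n`.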
My code:
I have byte[] pdfDocumentByte, string endText and string startText as input.
I use them to create a new Aspose.Pdf.Document:
Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(new MemoryStream(pdfDocumentByte));
string regular = string.Empty;
if (string.IsNullOrEmpty(endText))
{
regular = string.Format(@"(?<={0})(\w)*((.|(\r\n))*?).*$", startText);
}
else
{
regular = string.Format(@"(?<={0})(\w)*((.|(\r\n))*?)[ \t]*(?={1})", startText, endText);
}
// Create TextAbsorber object to find all the phrases matching the regular expression
Aspose.Pdf.Text.TextFragmentAbsorber textFragmentAbsorber = new Aspose.Pdf.Text.TextFragmentAbsorber(regular);
// Set text search option to specify regular expression usage
Aspose.Pdf.Text.TextOptions.TextSearchOptions textSearchOptions = new Aspose.Pdf.Text.TextOptions.TextSearchOptions(true);
textFragmentAbsorber.TextSearchOptions = textSearchOptions;
// Accept the absorber for a single page
pdfDocument.Pages[1].Accept(textFragmentAbsorber);
// Get the extracted text fragments
Aspose.Pdf.Text.TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;
foreach (Aspose.Pdf.Text.TextFragment textFragment in textFragmentCollection)
{
// Update text and other properties
textFragment.Text = " *** ";
}
Thanks,
Ortal
Attached is an example.
Hi Ortal,
Thanks for contacting support.
To test the scenario of finding a string between a start and an end text pattern, I used the following regular expression and code snippet, and as per my observations the text is replaced properly. However, if your requirement is different, please share some further details and a sample project that can help us reproduce the issue in our environment. We are sorry for the inconvenience.
[C#]
Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document("c:/pdftest/Example.pdf");
string regular = string.Empty;
// if (string.IsNullOrEmpty(endText))
{
// regular = string.Format(@"(?<={0})(\w)*((.|(\r\n))*?).*$", "approve by");
// regular = (?<="approved by")(\w)*((.|(\r\n))*?)[ \t]*(?="approved by");
}
// else
{
regular = string.Format(@"(?<={0})(\w)*((.|(\r\n))*?)[ \t]*(?={1})", "Quick Styles gallery on the Home", "Ortal A");
}
// Create TextAbsorber object to find all the phrases matching the regular expression
Aspose.Pdf.Text.TextFragmentAbsorber textFragmentAbsorber = new Aspose.Pdf.Text.TextFragmentAbsorber(regular);
// Set text search option to specify regular expression usage
Aspose.Pdf.Text.TextSearchOptions textSearchOptions = new Aspose.Pdf.Text.TextSearchOptions(true);
textFragmentAbsorber.TextSearchOptions = textSearchOptions;
// Accept the absorber for a single page
pdfDocument.Pages[1].Accept(textFragmentAbsorber);
// Get the extracted text fragments
Aspose.Pdf.Text.TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;
foreach (Aspose.Pdf.Text.TextFragment textFragment in textFragmentCollection)
{
// Update text and other properties
textFragment.Text = " *** ";
}