The new line symbol \r\n

ortasa · April 29, 2017, 11:19pm

Good morning,

Will I see \r\n in a PDF document when the text moves to a new line because there was not enough room on the line, rather than because the user pressed Enter?

Thanks,

Ortal

codewarior · April 30, 2017, 10:21am

Hi Ortal,

Thanks for contacting support.

Can you please share some details on how you are creating the PDF document? Are you creating it from scratch or manipulating an existing document? Once we have this information, we will be able to reply accordingly.

ortasa · April 30, 2017, 11:55am

Hi,

I have byte[] pdfDocumentByte as input.

// I use it to create a new Aspose.Pdf.Document

Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(new MemoryStream(pdfDocumentByte));

Thanks,

Ortal

codewarior · May 1, 2017, 3:34am

Hi Ortal,

Thanks for sharing the details.

As per your comments above, you are loading an existing PDF file and trying to manipulate it. However, please share some more details on the operation you would like to perform on the PDF document, so that we can reply accordingly.

ortasa · May 1, 2017, 6:23am

Hi,

I am trying to replace the text between two strings (a start text and an end text) with *** in order to anonymize a document. To do so, I use "Replace Text Based on a Regular Expression". Currently my regular expression does not address new lines. I'm not sure whether it will return the same result for Text1 and for Text 2, because the text may break onto a new line:

regular expression = (?<="approved by")(\w)*((.|(\r\n))*?)[ \t]*(?="approved by")

Text1 = " report text.report text .report text .report text approved by ortal approved by"

Text 2= " report text approved by ortal approved by"

My code :

I have byte[] pdfDocumentByte, string endText and string startText as input.

I use it to create a new Aspose.Pdf.Document:

Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(new MemoryStream(pdfDocumentByte));

string regular = string.Empty;

if (string.IsNullOrEmpty(endText))
{
    regular = string.Format(@"(?<={0})(\w)*((.|(\r\n))*?).*$", startText);
}
else
{
    regular = string.Format(@"(?<={0})(\w)*((.|(\r\n))*?)[ \t]*(?={1})", startText, endText);
}

// Create TextAbsorber object to find all the phrases matching the regular expression
Aspose.Pdf.Text.TextFragmentAbsorber textFragmentAbsorber = new Aspose.Pdf.Text.TextFragmentAbsorber(regular);

// Set text search option to specify regular expression usage
Aspose.Pdf.Text.TextOptions.TextSearchOptions textSearchOptions = new Aspose.Pdf.Text.TextOptions.TextSearchOptions(true);
textFragmentAbsorber.TextSearchOptions = textSearchOptions;
// Accept the absorber for a single page
pdfDocument.Pages[1].Accept(textFragmentAbsorber);
// Get the extracted text fragments
Aspose.Pdf.Text.TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;
foreach (Aspose.Pdf.Text.TextFragment textFragment in textFragmentCollection)
{
    // Update text and other properties
    textFragment.Text = " *** ";
}

Thanks,

Ortal

imran.rafique · May 1, 2017, 6:30pm

Hi Ortal,

Thank you for the inquiry. Kindly prepare the input PDF and the expected output PDF. You can attach an archive of the PDFs to your reply post. It will help us to be more specific. We will investigate further and reply to you accordingly.

ortasa · May 3, 2017, 9:50am

Attached an example.

Thanks,

Ortal

codewarior · May 3, 2017, 3:23pm

Hi Ortal,

Thanks for contacting support.

To test the scenario of finding the string between the start and end text patterns, I have used the following regular expression and code snippet, and as per my observations the text is being replaced properly. However, if your requirement is different, please share some further details and a sample project that can help us reproduce the issue in our environment. We are sorry for the inconvenience.

[C#]

Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document("c:/pdftest/Example.pdf");
string regular = string.Empty;
// Branch for an empty end text (not used in this test):
// regular = string.Format(@"(?<={0})(\w)*((.|(\r\n))*?).*$", "approve by");
// regular = (?<="approved by")(\w)*((.|(\r\n))*?)[ \t]*(?="approved by");

// Branch for a start text and an end text (used in this test)
regular = string.Format(@"(?<={0})(\w)*((.|(\r\n))*?)[ \t]*(?={1})", "Quick Styles gallery on the Home", "Ortal A");

// Create TextAbsorber object to find all the phrases matching the regular expression

Aspose.Pdf.Text.TextFragmentAbsorber textFragmentAbsorber = new Aspose.Pdf.Text.TextFragmentAbsorber(regular);

// Set text search option to specify regular expression usage

Aspose.Pdf.Text.TextSearchOptions textSearchOptions = new Aspose.Pdf.Text.TextSearchOptions(true);

textFragmentAbsorber.TextSearchOptions = textSearchOptions;
// Accept the absorber for a single page
pdfDocument.Pages[1].Accept(textFragmentAbsorber);
// Get the extracted text fragments
Aspose.Pdf.Text.TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;

foreach (Aspose.Pdf.Text.TextFragment textFragment in textFragmentCollection)
{
    // Update text and other properties
    textFragment.Text = " *** ";
}
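
As a side note, the pattern used in this thread can be checked outside of Aspose.Pdf with the plain .NET Regex class. The sketch below is only an illustration under stated assumptions: the class and method names (LineBreakRegexDemo, BuildPattern, Anonymize) are hypothetical, Regex.Escape is an added precaution that is not in the original code, and the test strings are made up. It shows how the pattern treats a string with and without a \r\n sequence; it does not show whether Aspose.Pdf passes \r\n to the pattern when text wraps, which this thread does not establish.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineBreakRegexDemo
{
    // Builds the start/end pattern from the thread.
    // Regex.Escape is an assumption added here so that special
    // characters in the start or end text are matched literally.
    public static string BuildPattern(string startText, string endText)
    {
        return string.Format(
            @"(?<={0})(\w)*((.|(\r\n))*?)[ \t]*(?={1})",
            Regex.Escape(startText),
            Regex.Escape(endText));
    }

    // Replaces whatever lies between startText and endText with " *** ".
    public static string Anonymize(string input, string startText, string endText)
    {
        return Regex.Replace(input, BuildPattern(startText, endText), " *** ");
    }

    public static void Main()
    {
        // Hypothetical inputs: the same text on one line and broken by \r\n.
        string oneLine = "report text approved by ortal approved by";
        string wrapped = "report text approved by or\r\ntal approved by";

        Console.WriteLine(Anonymize(oneLine, "approved by", "approved by"));
        Console.WriteLine(Anonymize(wrapped, "approved by", "approved by"));
    }
}
```

With plain strings, both calls should produce "report text approved by *** approved by", because the (.|(\r\n)) alternation lets the match run across a \r\n pair. A break made of a bare \n is not covered by this pattern, since "." does not match \n by default; if that case matters, RegexOptions.Singleline is one option to consider.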

codewarior · May 3, 2017, 3:25pm

Hi Ortal,

For your reference, I have also attached the resultant PDF generated over my end.