Avoid Left Indentations for some Paragraphs during Word to PDF Conversion | C# .NET

Hello,

We have a Word template in which we replace certain keywords with values from our database. We were able to do this and save the document as DOCX successfully. However, when we convert the same document to PDF, we noticed a formatting issue in the PDF output. I have attached the files required to reproduce it. Can you please let us know how we can fix this?

output.zip (74.1 KB)
Docx output Vs Pdf output.png (11.7 KB)

@srinudhulipalla,

Using the latest (21.8) version of Aspose.Words for .NET, we have reproduced this issue on our end. We have logged it in our bug tracking system with ID WORDSNET-22612. Your thread has been linked to this issue, and you will be notified here as soon as it is resolved. Sorry for the inconvenience.

As a workaround, you may try the following C# code:

...
...
Document oDoc = new Document(inputPath);

FindReplaceOptions options = new FindReplaceOptions();
options.Direction = FindReplaceDirection.Forward;
options.MatchCase = false;
options.FindWholeWordsOnly = true;
options.ReplacingCallback = new WordDocReplaceHandler();

string productName = "first line\r\nsecond line\r\nthird line\r\nfourth line";

oDoc.Range.Replace("Variable.ProductName", productName, options);

oDoc.Save(outputPath, SaveFormat.Docx);

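// Workaround: round-trip the document through DOCX in memory before exporting to PDF.
using (MemoryStream stream = new MemoryStream())
{
    oDoc.Save(stream, SaveFormat.Docx);
    stream.Position = 0; // rewind so the saved DOCX can be read from the start

    // Reload the saved DOCX and export the reloaded copy to PDF.
    Document docx = new Document(stream);
    docx.Save(outputPDF, SaveFormat.Pdf);
}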

Thank you for the workaround; it solves the problem for now.

I have a similar issue with space characters in the replacement text. In the same code, the productName variable is defined as follows:

string productName = "one\ntwo\n     three\nfour\n five";

You can see that after the word ‘two’ there is a newline character followed by five spaces. The output Word document inserts the new line correctly but ignores the spaces. I am attaching the source code again for your reference.

Program.zip (1018 Bytes)

Aspose words issue.png (3.9 KB)

Can you please tell me why the spaces are not inserted in the output, and what the workaround would be if this is a bug in the product?

@srinudhulipalla,

To address this problem, we have logged a separate issue with ID WORDSNET-22628. We will look further into the details of this problem and will keep you updated here on the status of the linked issues. We apologize for the inconvenience.

Thank you. If you have any workaround in the meantime, that would be great.

@srinudhulipalla,

Sure, we will inform you here as soon as these issues are resolved or a workaround becomes available.

@awais.hafeez Yes, I understand that you will inform me here once they are resolved in a new version. But my question is: instead of waiting for the issue to be resolved, is there any workaround to overcome the problem temporarily?

@srinudhulipalla,

I am afraid WORDSNET-22628 is currently in the queue, pending analysis. There isn’t any workaround available at the moment. Once the analysis of this issue is completed and the root cause is determined, we may be able to provide you with a workaround. We apologize for any inconvenience.

Sure, thank you for the update. Once the internal analysis is done, please let me know the workaround. I will await your response.

@srinudhulipalla,

Sure, we will keep you posted here on any further updates/workaround.

@srinudhulipalla,

Regarding WORDSNET-22612, we have completed the analysis and decided to close the issue with “not a bug” status. The problem lies in how \n characters in the text are handled. Aspose.Words replaces all \n characters with EmSpace, so the sequence \r\n in a paragraph results in {paragraphinline}{emspace} in the Aspose.Words layout. This {emspace} sits at the start of the line, so the lines visually appear shifted to the right.

If the document is saved to DOCX and reloaded before it is exported to PDF, the problem goes away. This happens because \r\n is exported as <w:cr>, which is then loaded as \r.

We believe the problem is in the code you are using, i.e. there is a misunderstanding of what \r\n in the text does in the document. Either the builder needs to recognize the caller's intent and replace the content appropriately, or it should provide clear rules on how this is handled.

As a workaround, you can re-save the document before the PDF export (this can be done using a memory stream), or you can replace \r\n with \r in the replacement string.

e.Replacement = Regex.Replace(e.Replacement, @"\r\n?|\n", "\r");
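
For reference, here is a minimal sketch of that normalization as a standalone helper, so it can be tried without Aspose.Words. The class name LineBreakNormalizer and the method name Normalize are hypothetical and are not part of the Aspose.Words API; only the regular expression is taken from the line above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper (not part of Aspose.Words): collapses every line-break
// variant (\r\n, lone \r, lone \n) into a single \r.
public static class LineBreakNormalizer
{
    // Same pattern as the one-liner above: "\r" optionally followed by "\n", or a lone "\n".
    private static readonly Regex AnyLineBreak = new Regex(@"\r\n?|\n");

    public static string Normalize(string text)
    {
        return AnyLineBreak.Replace(text, "\r");
    }
}
```

Assuming this helper, the replacement string could be normalized either inside the replacing callback (e.Replacement = LineBreakNormalizer.Normalize(e.Replacement);) or on productName before it is passed to oDoc.Range.Replace.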