Bottom rows jump to next page

Hi

When we convert a Word file to PDF, several bottom rows jump to the next page. Native Word conversion works fine.
This behavior reproduces at "Convert DOC to PDF Free - Doc to Pdf" and in code with Aspose.Words 22.5.0.

Sample code:

var document = new Document(filename);
var saveOptions = (PdfSaveOptions)SaveOptions.CreateSaveOptions(SaveFormat.Pdf);
var resultFilename = Path.ChangeExtension(filename, ".pdf");
using (var outputFile = File.Create(resultFilename))
{
    document.Save(outputFile, saveOptions);
}

Word file:
Source.docx (19.0 KB)

Converted file to Pdf with Aspose:
Dest_Aspose.pdf (44.6 KB)

Converted file to Pdf with Word:
Dest_Word.pdf (169.7 KB)

@directum MS Word 2019 produces the same output as Aspose.Words on my side. Please see the attached files produced by Aspose.Words 22.8 and MS Word 2019: ms.pdf (78.6 KB) out.pdf (47.0 KB)
As far as I can see, your PDF document was produced by MS Word 2016. You can produce the expected output using the following code:

Document doc = new Document(@"C:\Temp\in.docx");
doc.CompatibilityOptions.OptimizeFor(MsWordVersion.Word2016);
doc.Save(@"C:\Temp\out.pdf");

Here is PDF produced by this code: out_optimized_for_2016.pdf (46.5 KB)

My MS Word 2019 converts to PDF and displays the document the same way as MS Word 2016. Look at Source_Word_2019.pdf (77.7 KB) and Word2019.PNG (7.5 KB)

However, Aspose should convert DOC to PDF independently of the Word version.

@directum

Actually, this is not true, because different MS Word versions write different sets of compatibility options into the document, and these options can affect the document layout. Aspose.Words respects the compatibility options and adjusts the document layout accordingly.
In your case, the CompatibilityOptions.BalanceSingleByteDoubleByteWidth option affects the layout of the document. If you reset this option, the document layout is correct in both MS Word 2019 and Aspose.Words:

Document doc = new Document(@"C:\Temp\in.docx");
doc.CompatibilityOptions.BalanceSingleByteDoubleByteWidth = false;
doc.Save(@"C:\Temp\out.docx");

Here are the input (your original document) and output documents opened in MS Word 2019 on my side:

I think the document in your Word 2019 looks different from mine because the compatibility option “Use Word 97 line-breaking rules for Asian text” is enabled in your Word.

I tried to set and unset the compatibility option “UseWord97LineBreakRules” in code and it doesn’t have any effect. Maybe the problem is in this property of the CompatibilityOptions class?

@directum As I mentioned, the problem is in CompatibilityOptions.BalanceSingleByteDoubleByteWidth, which is set in your document. Compatibility options are stored in the document; they are not an MS Word application setting.
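
To check which of the two options discussed here is actually stored in a given file, the values can be printed after loading it. This is only a minimal sketch: the file path is a placeholder, and it reads just the two properties named in this thread rather than the full set of compatibility options.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Settings;

class CompatibilityOptionsDump
{
    static void Main()
    {
        // Placeholder path (assumption); point this at the problematic document.
        Document doc = new Document(@"C:\Temp\in.docx");
        CompatibilityOptions options = doc.CompatibilityOptions;

        // These values come from the document itself, not from the installed MS Word.
        Console.WriteLine("BalanceSingleByteDoubleByteWidth: " + options.BalanceSingleByteDoubleByteWidth);
        Console.WriteLine("UseWord97LineBreakRules: " + options.UseWord97LineBreakRules);
    }
}
```

If BalanceSingleByteDoubleByteWidth prints True for the source file, that matches the explanation above that the option was written into the document.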

I think the problem is somewhere else. Let’s approach it from another side.
If you compare 1) the document shown in Word 2016, 2) the PDF from Word 2016 and 3) the PDF from Aspose with OptimizeFor, then you can see a significant difference in alignment in the right area of the page.
CompareWordAspose.png (50.2 KB)

OptimizeFor and BalanceSingleByteDoubleByteWidth resolve the problem of “jumping” rows, but they do not fix the conversion of spaces.

P.S. I cannot reproduce the problem with a new document in Word 2016, so I agree that the problem is specific to this document.

@directum Could you please attach your Times New Roman font (from the machine where you convert the document using MS Word)? Maybe the problem is caused by different versions of this font on my side and yours. I will check the conversion with your font and let you know the result.
Also, I cannot reproduce the same PDF output as yours on my side in MS Word 2019.
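
A comparison like this can also be run in code by rendering with fonts from a specific folder and logging any font substitution warnings. The sketch below is only an illustration: the paths and the class names FontWarningLogger and FontCheck are assumptions, and it is not necessarily how the check was done on my side.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Fonts;

// Hypothetical helper: prints font substitution warnings raised while rendering.
class FontWarningLogger : IWarningCallback
{
    public void Warning(WarningInfo info)
    {
        if (info.WarningType == WarningType.FontSubstitution)
            Console.WriteLine("Font substitution: " + info.Description);
    }
}

class FontCheck
{
    static void Main()
    {
        // Placeholder paths (assumptions).
        Document doc = new Document(@"C:\Temp\in.docx");

        // Use only the fonts from this folder, so the system fonts do not interfere.
        FontSettings fontSettings = new FontSettings();
        fontSettings.SetFontsFolder(@"C:\Temp\Fonts", false);
        doc.FontSettings = fontSettings;

        // Warnings are reported while the layout is built during Save.
        doc.WarningCallback = new FontWarningLogger();
        doc.Save(@"C:\Temp\out.pdf");
    }
}
```

If no substitution warning mentions Times New Roman, the font from the folder was found and a missing font is unlikely to be the cause of the layout difference.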

@alexey.noskov times_new_roman_font.zip (4.1 MB)
I attached the fonts from both hosts, the one with Word 2016 and the one with Word 2019. On both hosts the test document is displayed identically.

@directum Thank you for the additional information. I have a newer version of the fonts, but it looks like this is not the cause of the problem.
I have tested your document on two other machines with MS Word 2019 and both show the same result as yours. So it looks like my MS Word and Aspose.Words show an incorrect result. To get this corrected, the problem has been logged as WORDSNET-24189. We will keep you updated and let you know once the issue is resolved or we have more information for you.
