Document structure is incorrect after DOCX to PDF conversion using .NET

Hello!
We have been using Aspose.Words to convert DOCX to PDF/A-1a.

We have found that paragraph text is converted incorrectly: each word of the paragraph is placed in a separate tag (see the attached “converted_with_aspose” image). However, if we save the same file from MS Word, all of the paragraph text is placed in one tag (see the attached “save_from_word” image).

Is this known behavior?
Is there any workaround to get the text in one tag after conversion?
Will this behavior be fixed?

The input file is attached.
We also checked with Aspose.Words 19.10, and the behavior is reproduced there as well.

Code example:

    var inputFilePath = "test_doc.docx";
    var outputFilePath = "result.pdf";

    // Load the source DOCX.
    var inputDocument = new Aspose.Words.Document(inputFilePath);

    var pdfSaveOptions = new Aspose.Words.Saving.PdfSaveOptions();
    pdfSaveOptions.OutlineOptions.HeadingsOutlineLevels = 9;
    pdfSaveOptions.DisplayDocTitle = true;
    pdfSaveOptions.DmlRenderingMode = Aspose.Words.Saving.DmlRenderingMode.DrawingML;
    pdfSaveOptions.ExportDocumentStructure = true;
    // Fully qualified so the snippet compiles without a using directive.
    pdfSaveOptions.Compliance = Aspose.Words.Saving.PdfCompliance.PdfA1a;

    // Convert to PDF/A-1a.
    inputDocument.Save(outputFilePath, pdfSaveOptions);

Files
input_file.zip (12.4 KB)
result_from_aspose.pdf (24.5 KB)
result_from_word.pdf (191.2 KB)
converted_with_aspose.png (17.0 KB)
save_from_word.png (8.8 KB)

Thanks.

@uaprogrammer

Please note that the ExportDocumentStructure option is ignored when saving to PDF/A-1a, because this compliance level requires the document structure to be exported in any case.
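As a rough illustration of the note above, the following sketch sets only the compliance level and leaves ExportDocumentStructure untouched. The class name `PdfA1aExport` and its methods are hypothetical helpers introduced here for illustration; the assumption is that the tagged structure is written regardless of that flag when the compliance level is PDF/A-1a. This does not change how words are grouped into tags, which is the issue reported in this thread.

```csharp
using Aspose.Words;
using Aspose.Words.Saving;

// Hypothetical helper, not part of Aspose.Words.
public static class PdfA1aExport
{
    // Builds save options for PDF/A-1a output.
    public static PdfSaveOptions CreateOptions()
    {
        return new PdfSaveOptions
        {
            // Assumption: with PDF/A-1a the ExportDocumentStructure value
            // is ignored, so it is deliberately not set here.
            Compliance = PdfCompliance.PdfA1a
        };
    }

    // Loads a DOCX file and saves it as PDF/A-1a.
    public static void Convert(string inputPath, string outputPath)
    {
        var document = new Document(inputPath);
        document.Save(outputPath, CreateOptions());
    }
}
```

A call such as `PdfA1aExport.Convert("test_doc.docx", "result.pdf");` would mirror the conversion from the original post.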

We have tested the scenario and managed to reproduce the same issue on our side. We have logged this problem in our issue tracking system as WORDSNET-19458. You will be notified via this forum thread once this issue is resolved.

We apologize for the inconvenience.

Thanks! We will wait for the fix.

Hi,

Would you be so kind as to provide us with the status of the WORDSNET-19458 issue?

Best regards,

Oleh

@uaprogrammer

We regret to share with you that your issue (WORDSNET-19458) has been postponed because it depends on a missing feature, WORDSNET-17510 (Aspose.Words does not mimic MS Word for document structure tags).

After WORDSNET-17510 is implemented, we will look into your issue. We will be sure to inform you via this forum thread as soon as this feature is available. We apologize for the inconvenience.

Hi,

Would you be so kind as to provide us with the status of the WORDSNET-19458 issue?

Best regards,

Oleh

@uaprogrammer

Unfortunately, there is no update available on your issue. We will be sure to inform you via this forum thread once this issue is resolved. Thanks for your patience.

Hello,

We are wondering whether there are any updates regarding the issue in Aspose?

Thank you in advance.

Best regards, Oleh

@uaprogrammer

We regret to share with you that there is no update available on this issue. Due to the complexity of the related feature WORDSNET-17510, we are unable to share any ETA with you. We will inform you via this forum thread once there is an update available.

The issue you found earlier (filed as WORDSNET-19458) has been fixed in the Aspose.Words for .NET 21.6 update and the Aspose.Words for Java 21.6 update.