Aspose.PDF - PDF to DOCX conversion issues

When we use the free Aspose.PDF converter at Convert PDF | Online and Free, our PDF converts to DOCX with a nearly identical layout. However, when we convert the same file using C# on .NET Core on Debian Linux, the output is missing some of the signature lines and one page is jumbled. Can you help us understand what differs between the Aspose online conversion tool and our .NET installation, so that we can achieve similar results?

Original PDF document: original.pdf (278.7 KB)

DOCX received from Aspose online conversion (nearly identical): converted-aspose.app.docx (24.6 KB)

DOCX received from Aspose.PDF C# on .NET Core (broken content): converted-NET-core.docx (36.2 KB)

Our .csproj file looks like:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>netcoreapp3.1</TargetFramework>
    <RuntimeIdentifiers>win10-x64;osx.10.12-x64;debian.10-x64;linux-x64</RuntimeIdentifiers>
    <TargetLatestRuntimePatch>true</TargetLatestRuntimePatch>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Aspose.PDF" Version="21.7.0" />
    <PackageReference Include="Aspose.Words" Version="21.8.0" />
    <PackageReference Include="SkiaSharp" Version="2.80.3" />
    <PackageReference Include="SkiaSharp.NativeAssets.Linux" Version="2.80.3" />
    <PackageReference Include="System.Collections" Version="4.3.0" />
    <PackageReference Include="System.Diagnostics.Debug" Version="4.3.0" />
    <PackageReference Include="System.IO.FileSystem.Primitives" Version="4.3.0" />
    <PackageReference Include="System.Runtime.Extensions" Version="4.3.0" />
    <PackageReference Include="System.Runtime.InteropServices" Version="4.3.0" />
  </ItemGroup>

</Project>

We’ve tried using both versions of the code samples at Convert PDF to Microsoft Word Documents in Java|Aspose.PDF for Java. Our latest C# implementation looks like:

// Requires: using System; using System.IO;
public static void PDFToWord(string sourceUri, string destUri)
{
    if (File.Exists(sourceUri))
    {
        // Load the source PDF from disk.
        var pdfDoc = new Aspose.Pdf.Document(sourceUri);

        // Save as DOCX, using Flow mode so the result is editable text
        // rather than positioned text boxes.
        var saveOptions = new Aspose.Pdf.DocSaveOptions
        {
            Format = Aspose.Pdf.DocSaveOptions.DocFormat.DocX,
            Mode = Aspose.Pdf.DocSaveOptions.RecognitionMode.Flow,
            RecognizeBullets = true
        };

        pdfDoc.Save(Path.GetFullPath(destUri), saveOptions);
    }
    else
    {
        // Signal failure to the caller through the process exit code.
        Console.Error.WriteLine("Unable to find file");
        Environment.ExitCode = 1;
    }
}

original.pdf (279 KB)
converted-aspose.app.docx (24.6 KB)
converted-NET-core.docx (36.2 KB)

@josh.kersey

I can reproduce the missing Amy Winters signature on page 2. Could you please share a screenshot of the jumbled page so that we can investigate further on our end?

These previews are from Microsoft Word for Mac, version 16.43. Below are screenshots of each page comparing the Aspose.app conversion (on the left in each screenshot) with the conversion generated locally by Aspose.PDF on .NET Core 3.1 (on the right in each screenshot).

Page 1: page-1-comparison.png (69.0 KB)

  • Missing two signature lines.

Page 2: page-2-comparison.png (71.0 KB)

  • Missing four signature lines.
  • Missing the Amy Winters signature.

Page 3: page-3-comparison.png (90.1 KB)

  • Missing the signature line under Paul Schilpp
  • The text and signature lines in the second and third columns are jumbled.

Page 4: (No screenshot)

  • Renders as expected.

@josh.kersey

A ticket with ID PDFNET-50394 has been created in our issue tracking system to investigate the issue further on our end. This thread has been linked to the ticket so that you will be notified once the issue is fixed.

Would escalating this problem to paid support help us get a prompt resolution? We are currently unable to move forward with our Aspose.PDF integration until this issue is resolved.

@josh.kersey

Issues reported through paid support are handled on an urgent basis and have the highest priority. Please note that paid support does not guarantee an immediate solution, but it does expedite the investigation so that an ETA can be provided. In other words, investigation of the issue will start quickly once you report it through paid support.

The issues you have found earlier (filed as PDFNET-50394) have been fixed in Aspose.PDF for .NET 22.11.
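
To pick up the fix, the package reference in the .csproj shown earlier would need to point at that release or a later one. The fragment below is a minimal sketch of that change; the exact NuGet version string (22.11.0) is an assumption based on the release named above, so please confirm it against the package listing before use. The other package references are left as they were.

```xml
<ItemGroup>
  <!-- Assumed version string for the Aspose.PDF for .NET 22.11 release. -->
  <PackageReference Include="Aspose.PDF" Version="22.11.0" />
</ItemGroup>
```

After changing the version, restore packages and re-run the same conversion against original.pdf to check whether the signature lines and the page 3 layout now match the online result.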