PDF to Excel conversion - inconsistent column behaviour on some pages

Please review the XLSX generated for the attached file.

Even though all pages of this document share the same format, some pages are extracted differently.

In this sample, on the first page the leftmost column is merged with the central description column. On the second and third pages, the left column is kept separate from the description, which is the desired behavior.

Input file:
Test report.pdf (386.6 KB)

Output file:
Test report_2023.07.26-08.30.30.xlsx.zip (14.1 KB)

My code:

public JsonResult PDFtoXLSX(FormCollection formCollection)
{
    var inputFile = Request.Files[0];

    Aspose.Pdf.License useLicence = new();
    useLicence.SetLicense(_localKey);

    var document = new Document(inputFile.InputStream);

    var setting = new ExcelSaveOptions { MinimizeTheNumberOfWorksheets = true };

    var outputPath = ...;

    // Save the document as an Excel workbook (the attached output is XLSX)
    document.Save(outputPath, setting);

    return Json(new
    {
        success = true,
    }, JsonRequestBehavior.AllowGet);
}

@rd1218

We tested the scenario in our environment using Aspose.PDF for .NET 23.7 and noticed that the program kept running without producing any output. Could you please share how long the conversion takes on your end and how much memory it consumes? We will then assist you further.

@asad.ali
It actually took less than 10 seconds to convert this sample.
I’m running .NET Framework 4.8 and Aspose.PDF 23.7.0.

@rd1218

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-55161

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.

Is it possible to get an estimate of when this issue will be addressed?

@rd1218

We would like to share that the issue has been resolved in version 23.8 of the API, which will be available later this month (August 2023). We will notify you once the new release is published.

The issues you have found earlier (filed as PDFNET-55161) have been fixed in Aspose.PDF for .NET 23.8.
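
Since the fix is tied to a specific release, it may help to confirm at runtime that the application really loads 23.8 or later before re-testing the conversion. The following is only a minimal sketch: the class name AsposeVersionCheck and the method ContainsColumnFix are hypothetical helpers, not part of Aspose.PDF, and the check assumes that the assembly version of Aspose.PDF mirrors the release number (for example 23.8.0.0 for release 23.8).

```csharp
using System;
using System.Reflection;

// Hypothetical helper, not part of the Aspose.PDF API.
public static class AsposeVersionCheck
{
    // First release reported in this thread to contain the PDFNET-55161 fix.
    private static readonly Version FirstFixedVersion = new Version(23, 8);

    // Returns true when the given assembly reports version 23.8 or later.
    // Assumption: the assembly version matches the product release number.
    public static bool ContainsColumnFix(Assembly asposePdfAssembly)
    {
        Version loaded = asposePdfAssembly.GetName().Version;
        return loaded != null && loaded >= FirstFixedVersion;
    }
}
```

A possible call site would be inside PDFtoXLSX, just before document.Save, passing typeof(Aspose.Pdf.Document).Assembly to ContainsColumnFix and logging a warning when it returns false.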