PDF.NET issue with converting PDF to PPTX - Tables from PDF are not converted to tables in PPTX

When I convert a PDF to PPTX with Aspose.PDF for .NET, tables in the PDF are not converted to tables in the PPTX. Aspose converts them to individual or combined text boxes plus images of the table cells and background. This makes it very difficult, if not impossible, for the end user to manipulate the data after conversion. Here is the code sample I am using in .NET:

Dim pdfpath As String = "C:\Test\"
Dim FileName As String = "Test_tables.pdf"
Dim PPTXFullpath As String = pdfpath + Replace(FileName, LCase(".pdf"), "") + "_Converted.pptx"

Dim myCombinePdfs As Aspose.Pdf.Document
myCombinePdfs = New Aspose.Pdf.Document(pdfpath + FileName)
Dim PptxSaveOptions As PptxSaveOptions = New PptxSaveOptions()
PptxSaveOptions.SlidesAsImages = False
PptxSaveOptions.OptimizeTextBoxes = True
myCombinePdfs.Save(PPTXFullpath, PptxSaveOptions)

@mrJay

When converting a PDF to PPTX using Aspose.PDF, if tables from the PDF are being converted into individual text boxes or images instead of retaining their table structure, you may want to ensure that the PptxSaveOptions are configured correctly.

In your code, you have set PptxSaveOptions.SlidesAsImages to False, which is correct: it keeps slide content as editable objects instead of rendering each slide as a single image. If the tables are still not converted as expected, check the following:

  1. Optimize Text Boxes: You have already set PptxSaveOptions.OptimizeTextBoxes to True. This option affects how text boxes are produced, but it does not guarantee that tables will be converted as tables.

  2. Check PDF Structure: The way tables are structured in the original PDF can affect how they are converted. If the tables are complex or not well-defined in the PDF, Aspose may struggle to interpret them correctly.

  3. Update Aspose.PDF Library: Ensure that you are using the latest version of the Aspose.PDF library, as updates often include improvements and bug fixes related to document conversion.
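
As a rough diagnostic for points 2 and 3, the sketch below prints the version of the loaded Aspose.PDF assembly and uses TableAbsorber to report how many tables the library detects on each page. It assumes the file path from this thread and a console project that references Aspose.PDF; the module name TableCheck is made up for illustration. A detected table does not mean the PPTX converter will emit a table shape. It only indicates whether the PDF content looks like a table to the library's own table detection.

```vbnet
Imports System
Imports Aspose.Pdf
Imports Aspose.Pdf.Text

Module TableCheck
    Sub Main()
        ' Assumed path, taken from the sample in this thread
        Dim pdfDocument As New Document("C:\Test\Test_tables.pdf")

        ' Version of the Aspose.PDF assembly actually loaded at run time
        Dim libraryVersion As Version = GetType(Document).Assembly.GetName().Version
        Console.WriteLine("Aspose.PDF version: " & libraryVersion.ToString())

        ' Run table detection page by page and report the count
        For Each page As Page In pdfDocument.Pages
            Dim absorber As New TableAbsorber()
            absorber.Visit(page)
            Console.WriteLine("Page " & page.Number.ToString() & ": " &
                              absorber.TableList.Count.ToString() & " table(s) detected")
        Next
    End Sub
End Module
```

If a page reports zero tables, the table is probably drawn as loose text and vector lines in the PDF, which is worth mentioning when you report the problem.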

Here is a sample code snippet that demonstrates how to convert a PDF to PPTX while attempting to retain the table structure:

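Dim pdfPath As String = "C:\Test\"
Dim fileName As String = "Test_tables.pdf"
' Build the output name from the input name without its extension (case-insensitive)
Dim baseName As String = System.IO.Path.GetFileNameWithoutExtension(fileName)
Dim pptxFullPath As String = System.IO.Path.Combine(pdfPath, baseName & "_Converted.pptx")

' Load the source PDF
Dim pdfDocument As New Aspose.Pdf.Document(System.IO.Path.Combine(pdfPath, fileName))

Dim saveOptions As New Aspose.Pdf.PptxSaveOptions()
' Keep slide content as editable objects rather than one image per slide
saveOptions.SlidesAsImages = False
' Ask the converter to optimize the generated text boxes
saveOptions.OptimizeTextBoxes = True

' Save the document
pdfDocument.Save(pptxFullPath, saveOptions)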

If the issue persists, consider sharing the affected PDF files with Aspose support so they can investigate the conversion against your actual documents.

For further reference, you can check out the example code provided in the Aspose GitHub repository for converting PDF to PPTX, which may offer additional insights into the conversion process.

Sources:
[1]: PDFToPPT.cs

Here are the sample files:

  1. Test_tables.pdf - the file to be converted
  2. Test_tables_Converted.pptx - the incorrectly converted file

sample_files.zip (92.1 KB)

@mrJay

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-58946

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.