Need to Convert a PDF File (with Embedded Files) to PPTX in C#

Hello,

We need to convert a PDF file (with embedded files) to PPTX using Aspose. After conversion, we observed that Aspose converts the attachments to images in the output PPTX file.

Below are the required details:

Aspose 23.12.0.0

OS version on which the conversion was performed:

Edition: Windows 10 Enterprise
Version: 21H2
OS build: 19044.2486
Experience: Windows Feature Experience Pack 120.2212.4190.0

.NET target platform in your app:
.NET Framework 4.8

Could you please check this and provide your inputs/solution?

Please let us know in case other information is needed.

@lpappachen,
Thank you for posting your requirements.

To better understand the issue, could you please share a sample PDF document and describe the expected results?

Aspose ticket.zip (1.3 MB)

I'm attaching the sample files for your reference. Please find below the code we are using. Let me know if you have any questions.

Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(dataDir + "input PDF.pdf");

Aspose.Pdf.PptxSaveOptions pptx_save = new Aspose.Pdf.PptxSaveOptions();
pdfDocument.Save(dataDir + "OutPut PPT.pptx", pptx_save);

We have also tried different PptxSaveOptions settings, but the result was the same:

Aspose.Pdf.PptxSaveOptions pptx_save = new Aspose.Pdf.PptxSaveOptions
{
    SeparateImages = true,
};

@lpappachen,
You are using Aspose.PDF, so I’ve moved this forum thread to the Aspose.PDF forum. My colleagues will answer you soon.

Hello,

Is there any update on this request? We would also appreciate it if you could share the ticket or a link to the Aspose.PDF forum thread for this issue, if there is one.

Thanks.

@lpappachen

We were able to reproduce the issue in our environment. Therefore, we have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-56773

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

@lpappachen

Aspose.PDF does not support converting DOCX files to images, but it can extract attachments.
We have prepared a commented code snippet that works around the problem:

// Requires: using System; using System.Collections.Generic; using System.IO;
var pdfDocument = new Aspose.Pdf.Document(dataDir + "56773.pdf");
var extractedFiles = new List<string>();
// Extract the embedded files and write each one to disk
var extractor = new Aspose.Pdf.Facades.PdfExtractor(pdfDocument);
extractor.ExtractAttachment();
foreach (var attachedFile in extractor.GetAttachmentInfo())
{
    var attachName = dataDir + "56773_" + attachedFile.Name;
    extractedFiles.Add(attachName);
    using (var fs = new FileStream(attachName, FileMode.Create))
    {
        attachedFile.StreamContents.CopyTo(fs);
    }
}
var extractedFilesAsPdf = new List<string>();
// Convert the Word attachments to PDF with Aspose.Words
foreach (var attachName in extractedFiles)
{
    // Compare the extension case-insensitively so ".DOCX" is also matched
    var extension = Path.GetExtension(attachName);
    if (extension.Equals(".doc", StringComparison.OrdinalIgnoreCase) ||
        extension.Equals(".docx", StringComparison.OrdinalIgnoreCase))
    {
        // Path.ChangeExtension replaces only the final extension of the path
        var attachConvertedName = Path.ChangeExtension(attachName, ".pdf");
        extractedFilesAsPdf.Add(attachConvertedName);
        var doc = new Aspose.Words.Document(attachName);
        doc.Save(attachConvertedName);
    }
}
// Append the pages of each converted attachment to the original PDF
foreach (var attachName in extractedFilesAsPdf)
{
    using (var attachDoc = new Aspose.Pdf.Document(attachName))
    {
        pdfDocument.Pages.Add(attachDoc.Pages);
    }
}
// Convert the combined PDF to PPTX
var pptx_save = new Aspose.Pdf.PptxSaveOptions();
pdfDocument.Save(dataDir + "56773.pptx", pptx_save);

It uses a combination of Aspose.PDF and Aspose.Words.
56773_code_snippet_result.zip (11.7 KB)
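
The workaround above picks out Word attachments by file extension and derives a .pdf name for each one. As a minimal sketch of that step in isolation, the two helpers below use only System.IO.Path and do not call Aspose at all. The names IsWordAttachment and ToPdfPath and the sample file names are hypothetical, chosen for illustration. The code is written as C# 9 top-level statements for brevity; on .NET Framework 4.8 with the default language version, the helpers would need to be moved into a class.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: true when the path has a .doc or .docx extension,
// compared case-insensitively so that "Report.DOCX" is also accepted.
static bool IsWordAttachment(string path)
{
    string extension = Path.GetExtension(path);
    return extension.Equals(".doc", StringComparison.OrdinalIgnoreCase)
        || extension.Equals(".docx", StringComparison.OrdinalIgnoreCase);
}

// Hypothetical helper: the same path with only its final extension replaced by ".pdf".
static string ToPdfPath(string path)
{
    return Path.ChangeExtension(path, ".pdf");
}

// Sample names only; in the workaround these would be the extracted attachment paths.
var extractedFiles = new List<string>
{
    "56773_Report.DOCX",
    "56773_notes.v2.doc",
    "56773_data.xlsx"
};

foreach (var name in extractedFiles)
{
    if (IsWordAttachment(name))
    {
        Console.WriteLine(name + " -> " + ToPdfPath(name));
    }
    else
    {
        Console.WriteLine(name + " (skipped: not a Word document)");
    }
}
```

Attachments that are not Word documents are skipped here, just as in the workaround, so their content would not reach the PPTX output unless a separate conversion is added for those formats.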