Convert PDF to PPT in C# using Aspose.PDF for .NET - Problems in output files

@stefan.net.test

We would like to share with you that the issue (logged as PDFNET-46584) has been resolved. The fix will be available in Aspose.PDF for .NET 20.2, which will be released in the coming week. We will notify you as soon as the version containing the fix is published.

Regarding the remaining tickets, we regret to share that they are not yet resolved, as they need more time to be investigated. We will inform you as soon as we have additional updates regarding their fixes. Please spare us some time.

We are sorry for the inconvenience.

The issues you have found earlier (filed as PDFNET-46584) have been fixed in Aspose.PDF for .NET 20.2.

We are experiencing a similar problem again. A customer uses the font Equip. The font is installed on our server, but when converting (Word -> PDF -> PPTX) the text is not displayed.
Is this font not supported by you?
Using this font is quite important for the customer.

@stefan.net.test

Certain fonts require a license in order to be used. However, we need to check this particular font that you referred to. Would you kindly share your sample source PDF file with us? We will test the scenario in our environment and address it accordingly.

The font is used correctly in the Word file, but not in the PDF or PPT.
It seems that the font name can be spelled in different ways.
I attached two zip files. Files_v1 uses one spelling, and nothing is displayed in the PPT. Files_v2 uses a different spelling, where the text is displayed but the font is still wrong.
Files_v1.zip (26.4 KB)
Files_v2.zip (63.2 KB)

@stefan.net.test

We have tested the scenario in our environment and were able to replicate the issue. However, could you please provide the font file as well so that we can install it on our system and test again? The fonts used in a document need to be installed on the system where the PDF file is processed.

Furthermore, regarding PDFNET-46567, please provide the MS Mincho font file that you have installed on your system so that we can complete the open investigation for this ticket.

I asked our customer if I can send you the font files. As the license is only for their company, I would like to send the files by mail and not here. To which mail address should I send the files? The customer asked me to emphasize that the files should only be used for the investigation (and should be deleted after that) because they paid for them. Otherwise this would violate the license.

Regarding PDFNET-46567, we don't use the font MS Mincho. All information regarding this problem is in the comment from Jan 22 2020.

@stefan.net.test

As per our understanding, these comments were in response to the PDFNET-46582 and PDFNET-46583 tickets. The issue related to the MS Mincho font was PDFNET-46567 (initially reported by @jayjain). For a complete investigation of this issue, we need the sample font file that is installed on the user's system.

Instead of sending the font file via email, you can send it in a private message by clicking on the username and pressing the blue Message button. We assure you that we will use the files only for investigation purposes and erase them from our system once the investigation is done.

The comments you linked are related to PDFNET-46567. The problem occurs with every font. I think we used Arial.
Thanks. I will send the font files via private message.

@stefan.net.test

Thanks for sharing the font files.

We have tested the scenario in our environment using Aspose.PDF for .NET 20.12 and Aspose.Words for .NET 21.1. The following code snippet was used to carry out the conversions:

// Step 1: convert DOCX to PDF with Aspose.Words (dataDir is the folder holding the test files).
Aspose.Words.Document doc = new Aspose.Words.Document(dataDir + "20201218_Admin_DAS.docx");

Aspose.Words.Saving.PdfSaveOptions saveOption = new Aspose.Words.Saving.PdfSaveOptions();
saveOption.Compliance = Aspose.Words.Saving.PdfCompliance.PdfA1b;
saveOption.UseHighQualityRendering = true;
doc.Save(dataDir + "20201218_Admin_DAS.pdf", saveOption);

// Step 2: convert the generated PDF to PPTX with Aspose.PDF.
Aspose.Pdf.Document document = new Aspose.Pdf.Document(dataDir + "20201218_Admin_DAS.pdf");
document.Save(dataDir + "Converted.pptx", new Aspose.Pdf.PptxSaveOptions());

For Files_V1, we did not notice any issue in either the generated PDF or the PPTX output after installing the fonts on our system. Please check the attached output PDF and PPTX; both contain the desired font.

Files_V1.zip (57.8 KB)

For the second set of files, i.e. Files_v2, neither the output PDF nor the output PPTX had the desired font. The reason seems to be the font name in the DOCX input file: the font name used in the Word file was different from the name of the font installed on the system. However, the issue is more related to Aspose.Words, and we request that you create a topic in the respective category so that it can be investigated further.
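To see which font names the DOCX declares and whether any of them are substituted during rendering, something along these lines may help. This is only a sketch: the `dataDir` path and file names are placeholders, and `FontWarningLogger` is a helper name made up for this example. It relies on the Aspose.Words `FontInfos` collection and the `IWarningCallback` interface.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Fonts;
using Aspose.Words.Saving;

// Hypothetical helper: prints every font substitution warning raised by Aspose.Words.
class FontWarningLogger : IWarningCallback
{
    public void Warning(WarningInfo info)
    {
        if (info.WarningType == WarningType.FontSubstitution)
            Console.WriteLine("Font substitution: " + info.Description);
    }
}

class Program
{
    static void Main()
    {
        // Placeholder folder and file names; adjust to your environment.
        string dataDir = @"C:\Temp\";
        Document doc = new Document(dataDir + "input.docx");

        // Register the callback before saving so layout warnings are captured.
        doc.WarningCallback = new FontWarningLogger();

        // Font names exactly as they are spelled inside the Word file.
        foreach (FontInfo font in doc.FontInfos)
            Console.WriteLine("Declared font: " + font.Name);

        doc.Save(dataDir + "output.pdf", new PdfSaveOptions());
    }
}
```

If a declared name does not match any installed font, a substitution warning for that name would be expected in the output.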

Regarding the other issue, i.e. PDFNET-46581, we have the same situation here as in PDFNET-46582 and PDFNET-46583. A new bullet would appear if you were formatting a list. However, a list is an element of logical structure, which is not recognized by the API. Without this recognition you are formatting just an ordinary paragraph, where those bullets are ordinary characters, not list item markers.

The ticket information has been updated as per your feedback and we will inform you as soon as we have further updates regarding its resolution.

PS: We have uninstalled and removed the fonts from our system after testing the scenario.

I cannot download the files you attached because I'm not the owner of this topic. Can you please send me the files in a private message?

@stefan.net.test

We have shared the files with you in a private message.

Are there any updates on PDFNET-46567?

@stefan.net.test

We have investigated the ticket PDFNET-46567 and found that, if the initial DOC file is generated on the same machine, the font is most likely modified during the DOC->PDF conversion, which is not performed by the Aspose.PDF API. As for Arial in "Detecon_Carr_DE_2020-01-22 (1).pdf", you can see with Adobe Acrobat Reader that the font names are actually "Arial-BoldMT" and "ArialMT". We don't have fonts with these names installed on our machine, so we observed the same issues that you described.

Could you please verify again whether the DOC to PDF conversion preserves the actual fonts at your end? Please let us know your feedback and we will proceed with the investigation accordingly.
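As an alternative to checking in Adobe Acrobat Reader, the font names stored in the PDF can also be listed programmatically. A minimal sketch, assuming an Aspose.PDF for .NET version that provides `Document.FontUtilities.GetAllFonts()`; the file path is a placeholder:

```csharp
using System;

class Program
{
    static void Main()
    {
        // Placeholder path; point it at the PDF produced by the DOC -> PDF step.
        Aspose.Pdf.Document pdf = new Aspose.Pdf.Document(@"C:\Temp\input.pdf");

        // Every font referenced by the document, with its name as stored in the PDF.
        Aspose.Pdf.Text.Font[] fonts = pdf.FontUtilities.GetAllFonts();
        foreach (Aspose.Pdf.Text.Font font in fonts)
            Console.WriteLine(font.FontName + " (embedded: " + font.IsEmbedded + ")");
    }
}
```

Comparing this list with the font names in the DOC file shows whether the conversion step renamed any fonts.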

You're right. The font name in the PDF file is Arial-BoldMT. What do you suggest? What should be the name of the font? Just Arial-Bold?

@stefan.net.test

Yes, the font name should be the same as it is in the DOC file before conversion.
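If the PDF cannot be regenerated with the original font name, one thing that might be worth trying is a font substitution rule on the Aspose.PDF side. This is an untested sketch, not a confirmed fix for this ticket: it assumes that "Arial" is the name of the font installed on the machine and that the PPTX conversion honors the substitution. The file paths are placeholders.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Text;

class Program
{
    static void Main()
    {
        // Map the names found in the PDF to a font assumed to be installed on this machine.
        FontRepository.Substitutions.Add(new SimpleFontSubstitution("ArialMT", "Arial"));
        FontRepository.Substitutions.Add(new SimpleFontSubstitution("Arial-BoldMT", "Arial"));

        // Placeholder paths; adjust to your environment.
        Document document = new Document(@"C:\Temp\input.pdf");
        document.Save(@"C:\Temp\Converted.pptx", new PptxSaveOptions());
    }
}
```

Mapping the bold face to plain "Arial" may lose the bold weight, so the result would need to be checked visually.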

Why are there some characters added in front of the font name? These are added after conversion to PPT. Is it because of the already wrong font from the PDF?
image.png (748 Bytes)

@stefan.net.test

Yes, the change in font name during DOC to PDF conversion is causing it.

I rechecked one thing. If I have a DOC file with a bold font, then after saving it as PDF (directly from Word) the font also gets "MT" added after the name. But this does not happen for every font. For example, it is added for Arial but not for Calibri. So this is not a problem with the software we are using to convert DOC -> PDF.
I also don't think that the other user (who originally reported this bug) uses the same software as we do.
Can you please recheck whether there is a fix or workaround for this?

@stefan.net.test

The DOC to PDF conversion is not done using Aspose.PDF. It is carried out with Aspose.Words, and the right place to get an answer about DOC to PDF conversion is the respective category. Please create a post there with the details of your scenario and you will be assisted accordingly.