Slides -> PDF -> HTML conversion bad result

Hello Aspose Team,


We are currently evaluating Aspose.Pdf together with Aspose.Slides in order to convert a PowerPoint file to HTML and then back to PowerPoint. We convert the PowerPoint file to PDF, and then the PDF to HTML. We edit parts of the HTML within our application, then convert it to PDF using Aspose.Pdf and save it back as PowerPoint.

Our issue is that the resulting HTML file is slightly different from the original PowerPoint file after the conversion, and when we convert it back to PowerPoint, the charts are rendered poorly.

Please find attached a zip file with the original PPTX file and the resulting PDF, HTML, and PPTX files.

Please let us know whether we can reach our goal with Aspose.Pdf and Aspose.Slides.

Below are the code samples we use to convert PPTX to HTML, and then HTML to PPTX.

PPTX -> HTML

using (var pres = new Presentation(file))
{
    // Continue only if the user confirmed the save dialog
    var result1 = sdlg.ShowDialog();
    if (result1 == true)
    {
        PdfOptions pdOpts = new PdfOptions
        {
            SaveMetafilesAsPng = true,
        };

        // Step 1: save the presentation as an intermediate PDF
        var pdf = @"…\PDFFromHtml.pdf";
        pres.Save(pdf, Aspose.Slides.Export.SaveFormat.Pdf, pdOpts);

        // Step 2: load the PDF with Aspose.Pdf and save it as HTML
        using (Document doc = new Document(pdf))
        {
            HtmlSaveOptions hso = new HtmlSaveOptions()
            {
                HtmlMarkupGenerationMode = HtmlSaveOptions.HtmlMarkupGenerationModes.WriteAllHtml,
                DocumentType = HtmlDocumentType.Html5,
                FontSavingMode = HtmlSaveOptions.FontSavingModes.SaveInAllFormats,
                FixedLayout = true,
                RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsExternalPngFilesReferencedViaSvg,
                PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.NoEmbedding,
                RemoveEmptyAreasOnTopAndBottom = true,
                PagesFlowTypeDependsOnViewersScreenSize = false,
                LettersPositioningMethod = HtmlSaveOptions.LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss,
                SaveTransparentTexts = true,
                SaveShadowedTextsAsTransparentTexts = true,
            };
            doc.Save(sdlg.FileName, hso);
        }
    }
}

HTML -> PPTX

HtmlLoadOptions htmloptions = new HtmlLoadOptions(dataDir);
// Landscape A4 page, 55 units shorter, no margins. The chained assignment also sets IsLandscape on PageSize.A4 itself.
htmloptions.PageInfo = new PageInfo
{
    IsLandscape = Aspose.Pdf.PageSize.A4.IsLandscape = true,
    Height = Aspose.Pdf.PageSize.A4.Height - 55,
    Margin = new Aspose.Pdf.MarginInfo { Bottom = 0, Left = 0, Right = 0, Top = 0 }
};

using (Document pdf = new Document(htmlFile, htmloptions))
{
    pdf.PageMode = PageMode.FullScreen;
    pdf.CenterWindow = true;
    pdf.FitWindow = true;
    pdf.PageLayout = PageLayout.SinglePage;
    pdf.PageInfo = new PageInfo
    {
        IsLandscape = Aspose.Pdf.PageSize.A4.IsLandscape = true,
        Height = Aspose.Pdf.PageSize.A4.Height - 55,
        Margin = new Aspose.Pdf.MarginInfo { Bottom = 0, Left = 0, Right = 0, Top = 0 }
    };
    pdf.Save(sdlg.FileName, Aspose.Pdf.SaveFormat.Pptx);
}

Regards,
Vali

Hi Vali,

Thanks for your inquiry. I have tried saving the PPTX to a single HTML file with embedded resources and then converting the HTML to PPTX. The contents of the final PPTX are rendered as expected, but a blank page is included. Please try the scenario at your end with the following HtmlSaveOptions settings and confirm the results, along with your Aspose.Slides and Aspose.Pdf versions, so that we can log the issue in our issue tracking system accordingly.

Aspose.Pdf.HtmlSaveOptions hso = new Aspose.Pdf.HtmlSaveOptions()
{
    // Enable option to embed all resources inside the HTML
    PartsEmbeddingMode = Aspose.Pdf.HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml,

    // This is just optimization for IE and can be omitted
    LettersPositioningMethod = Aspose.Pdf.HtmlSaveOptions.LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss,

    RasterImagesSavingMode = Aspose.Pdf.HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground,

    FontSavingMode = Aspose.Pdf.HtmlSaveOptions.FontSavingModes.SaveInAllFormats
};

We are sorry for the inconvenience.

Best Regards,

Hello Team,


We used the code you posted, and indeed the chart looks better.
However, there are some issues in the converted PowerPoint file:

1. Fonts are not loaded into the PowerPoint file.
2. The table header is not displayed.
3. Chart series overlap.

Attached is a screenshot of the conversion result with the issues highlighted.

As a note, we are using PowerPoint 2013 to edit and view the files.

Regards,
Vali

Hi Vali,


Thanks for your feedback. I have noticed the reported issues in the resultant PPTX and logged a ticket, PDFNET-41404, for further investigation and rectification. We will notify you as soon as it is resolved.

We are sorry for the inconvenience.

Best Regards,

Hello Tilal,


Thanks for your answer.
We are eagerly awaiting an update on this, as we have observed another case where text is missing from the resulting PPTX file or the fonts are not loaded.
See the attached screenshots of the HTML file and the resulting PPTX file.

Regards,
Vali

Hi Vali,


Thanks for your feedback. This appears to be the same issue as above; however, please share your source PPTX here as well. We will test the scenario and provide you with information accordingly.

Best Regards,

Hi Tilal,


Please find attached a slide from one of the PPTX files we are working with.
In addition to the previous issues, I have noticed that text with opacity set is not converted to HTML with its opacity.

Regards,
Vali

Hi Vali,


Thanks for sharing your sample document. We have passed the document and your findings on to our product team. They will consider it during the issue investigation. We will notify you as soon as we have made significant progress towards resolving the issue.

Best Regards,

Hello,


Thank you for your help. We hope to receive some feedback soon regarding our issues.
Meanwhile, we have tested Aspose on a couple of other slides and found some other issues:

1) Text is left-aligned in the resulting slides instead of center-aligned as in the original.
2) Word spacing is not taken into account: words overlap, or the space between words is too small.
3) Images are a bit blurry.
4) Text is misplaced.

Please find attached an archive of the test slides.

Regards,
Vali

Hi Vali,


Thanks for your inquiry. I am looking into the scenario and will update you with my findings soon.

Best Regards,

Hi Tilal,


Do you have any update on the issues mentioned above?

Regards,
Vali

Hi Vali,


I am sorry for the inconvenience. I have started the investigation and will update you shortly.

Best Regards,

Hi Vali,


Thanks for your patience. I have tested the scenario with the shared PPTX files, found the following issues, and logged them accordingly. I am afraid I was unable to reproduce the text alignment and spacing issues; I would appreciate it if you could share screenshots of them. This will help us investigate and address your issue precisely.

PDFNET-41654: White text missing in HTML to PPTX (Slides 2, 27, 35, 57)
PDFNET-41688: White text Z-order in PDF to HTML (Slides 2, 27, 35)
PDFNET-41689: Blurred text in PDF to HTML (Slide 67)
PDFNET-41690: Extra white lines appearing in PDF to HTML (Slide 63)
PDFNET-41691: Image cropped in HTML to PPTX (Slide 27)

We are sorry for the inconvenience.

Best Regards,

The issues you have found earlier (filed as PDFNET-41404) have been fixed in Aspose.Pdf for .NET 17.1.0.

