Free Support Forum - aspose.com

PDF to HTML : Standalone HTML (embed images and css)

Hello,

My client would like to use the Aspose PDF to HTML conversion to display the content of PDF files as HTML.

Constraint:
We have strong security constraints. We can't simply expose all the generated files (HTML, images and CSS) on a website, because anyone could then access those files.
So we need to secure the files depending on the connected user. The best way would be to generate a single HTML file that embeds the images and CSS. Security can then be handled simply with a .NET handler (ashx) that checks the user's session rights.

Needed result:
So my question is: is it possible to generate with Aspose a standalone HTML file from a PDF, with the images and CSS embedded (as base64 content in the HTML)? If so, how?

Note:
I'm already able to work with the APIs: I can choose to generate SVG or images, and one global CSS or one CSS per page, but I'm not able to generate a single HTML file with all the files as embedded (base64) content.
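To make the "base64 content in HTML" idea concrete, here is a minimal sketch of how a resource can be carried inside the HTML as a data URI. It uses only the .NET base class library, not Aspose; the class and method names (`DataUriSketch`, `ToDataUri`, `ToImgTag`) are made up for illustration.

```csharp
using System;

static class DataUriSketch
{
    // Builds a data: URI from raw bytes and a MIME type
    // (for example "image/png" or "text/css").
    public static string ToDataUri(byte[] content, string mimeType)
    {
        return "data:" + mimeType + ";base64," + Convert.ToBase64String(content);
    }

    // Wraps the data URI in an <img> tag, so the image travels
    // inside the HTML file instead of being a separate file.
    public static string ToImgTag(byte[] imageBytes, string mimeType)
    {
        return "<img src=\"" + ToDataUri(imageBytes, mimeType) + "\" />";
    }
}
```

For example, `DataUriSketch.ToDataUri(File.ReadAllBytes("page1.png"), "image/png")` would give a string that can be used directly as the `src` of an image, so a single secured HTML response is enough.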

Thank you for your support.

Well, I found a solution based on rewriting the image/SVG/CSS paths in the HTML file. I used this article to do so:

http://www.aspose.com/docs/display/pdfnet/PDF+to+HTML+-+Save+output+in+Stream+object

I am using a handler to resolve documents through URLs like "http://mysite/DocumentHandler.ashx?document=myDoc.html". The handler returns the content of the document, and the same is done for the images, SVG and CSS files. The handler checks that the user has a session, then returns the content of the document, which it reads locally.

It requires some extra processing, but it works in the end.
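For readers who want to try the same workaround, the path rewriting could look roughly like the sketch below. It is not the code from the post: the class name `ResourcePathRewriter` is hypothetical, the `document` query parameter is taken from the example URL above, and it only handles double-quoted `src`/`href` attributes (references inside CSS such as `url(...)` would need similar treatment).

```csharp
using System;
using System.Text.RegularExpressions;

static class ResourcePathRewriter
{
    // Matches src="..." or href="..." attributes with double-quoted values.
    private static readonly Regex ResourceAttribute =
        new Regex("\\b(?<attr>src|href)=\"(?<url>[^\"]+)\"", RegexOptions.IgnoreCase);

    // Redirects every relative resource reference to the secured handler.
    public static string Rewrite(string html, string handlerUrl)
    {
        return ResourceAttribute.Replace(html, m =>
        {
            string url = m.Groups["url"].Value;

            // Leave absolute URLs, data URIs and in-page anchors untouched.
            if (url.StartsWith("http://", StringComparison.OrdinalIgnoreCase) ||
                url.StartsWith("https://", StringComparison.OrdinalIgnoreCase) ||
                url.StartsWith("data:", StringComparison.OrdinalIgnoreCase) ||
                url.StartsWith("#", StringComparison.Ordinal))
            {
                return m.Value;
            }

            // Pass the original relative path to the handler as a query value.
            return m.Groups["attr"].Value + "=\"" + handlerUrl +
                   "?document=" + Uri.EscapeDataString(url) + "\"";
        });
    }
}
```

With this, `Rewrite("<img src=\"img/p1.png\">", "/DocumentHandler.ashx")` turns the image reference into a request to the handler, which can then check the session before returning the file.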

If someone has a solution to generate a single, standalone HTML file, I would be glad to know, as it would be simpler to implement.

Thank you in advance.

Hi there,


Thanks for your interest in Aspose.

It is good to know that you have managed to find a solution for your scenario. However, we have already logged a feature request in our internal system, PDFNEWNET-36340, to produce a single HTML file with embedded CSS/images/fonts in PDF to HTML conversion. We have also linked your thread to the issue ID and will notify you as soon as it is implemented.

Best Regards,

The issues you have found earlier (filed as PDFNEWNET-36340) have been fixed in Aspose.Pdf for .NET 9.6.0.


This message was posted using Notification2Forum from Downloads module by Aspose Notifier.

Hi there,

Thanks for your patience. As stated above, your reported issue has been fixed, and you can now create a single HTML output file with embedded resources using Aspose.Pdf for .NET 9.6.0. Please download the release and try it as follows.

// Requires: using Aspose.Pdf;
// myDir is the folder that contains the input PDF (defined elsewhere).
Document doc = new Document(myDir + "36608.pdf");
HtmlSaveOptions newOptions = new HtmlSaveOptions();
// Embed all resources into the single HTML output (this is the feature being tested)
newOptions.PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
// This is just an optimization for IE and can be omitted
newOptions.LettersPositioningMethod = HtmlSaveOptions.LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss;
newOptions.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
newOptions.FontSavingMode = HtmlSaveOptions.FontSavingModes.SaveInAllFormats;
string outHtmlFile = myDir + "ExternalTestsData/36340.html";
doc.Save(outHtmlFile, newOptions);
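If you want to confirm that the generated file really is standalone, a rough check like the one below may help. It is only a sketch using the .NET base class library, not part of Aspose: the name `StandaloneHtmlCheck` is hypothetical, it only inspects double-quoted `src`/`href` attributes (not `url(...)` references inside CSS), and ordinary hyperlinks that exist in the PDF will be reported as well.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class StandaloneHtmlCheck
{
    // Matches src="..." or href="..." attributes with double-quoted values.
    private static readonly Regex ResourceAttribute =
        new Regex("\\b(?:src|href)=\"(?<url>[^\"]*)\"", RegexOptions.IgnoreCase);

    // Returns every src/href value that is neither a data: URI
    // nor an in-page anchor, i.e. anything that points outside the file.
    public static List<string> FindExternalReferences(string html)
    {
        var external = new List<string>();
        foreach (Match m in ResourceAttribute.Matches(html))
        {
            string url = m.Groups["url"].Value;
            bool embedded =
                url.StartsWith("data:", StringComparison.OrdinalIgnoreCase) ||
                url.StartsWith("#", StringComparison.Ordinal);
            if (!embedded)
            {
                external.Add(url);
            }
        }
        return external;
    }
}
```

Calling `StandaloneHtmlCheck.FindExternalReferences(File.ReadAllText(outHtmlFile))` and getting an empty list suggests that no image, stylesheet or font is still loaded from a separate file through those attributes.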

Please feel free to contact us for any further assistance.

Best Regards,

The issues you have found earlier (filed as ) have been fixed in this update. This message was posted using BugNotificationTool from Downloads module by MuzammilKhan