Convert PDF to HTML using Aspose.PDF for Java - Generate a single HTML file with all resources

Hi,
We are using the paid version of Aspose.PDF for Java to convert PDF to HTML. We also have another scenario: merging all the split HTML-related files (img, css, woff, eot, html) into one complete HTML document without any dependent files. We are currently trying to do this manually. Can you please suggest whether there is any Aspose API that can merge these into a single file?

@sravan.matta

Please try using the following code snippet, and let us know in case you face any issue.

com.aspose.pdf.HtmlSaveOptions newOptions = new com.aspose.pdf.HtmlSaveOptions();
// Enable option to embed all resources inside the HTML
newOptions.PartsEmbeddingMode = com.aspose.pdf.HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
// This is just optimization for IE and can be omitted
newOptions.LettersPositioningMethod = com.aspose.pdf.LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss;
// Embed raster images as parts of the PNG page background and save fonts in all formats
newOptions.RasterImagesSavingMode = com.aspose.pdf.HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
newOptions.FontSavingMode = com.aspose.pdf.HtmlSaveOptions.FontSavingModes.SaveInAllFormats;
// Prefix for the generated CSS class names
newOptions.CssClassNamesPrefix = "p";
// Open the source PDF (dataDir is the folder that holds the input file)
com.aspose.pdf.Document doc = new com.aspose.pdf.Document(dataDir + "Sample PDF.pdf");
System.out.println("DOCUMENT OPENED");
// Output file path
doc.save(dataDir + "output.html", newOptions);

Hi, but how do I pass in a folder that contains a single HTML file and its dependent files, such as images and .woff and .eot files? Can you please give a complete solution?

@sravan.matta

The shared code snippet is for generating a single HTML file directly from the PDF. We are afraid that Aspose.PDF does not offer any feature to assemble or merge different HTML resources into a single file. You can, however, generate a single HTML file directly from the PDF using the previously shared code snippet. In case you have any further inquiry, please feel free to ask.
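
For existing split output that has to be merged outside of Aspose, one possible manual approach is to replace each local file reference with a base64 data URI. The following is only a rough sketch using the plain JDK, not an Aspose API: the class HtmlInliner and its methods are hypothetical names, the regular expression only covers src="..." attributes and CSS url(...) references, and the MIME type table is an assumption limited to a few common extensions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any Aspose product.
final class HtmlInliner {

    // Matches src="..." attributes and CSS url(...) references (quotes optional in url()).
    private static final Pattern REF = Pattern.compile(
            "(src\\s*=\\s*\")([^\"]+)(\")|(url\\(\\s*['\"]?)([^'\")]+)(['\"]?\\s*\\))");

    /** Guess a MIME type from the file extension (assumed mapping, common types only). */
    public static String mimeType(String name) {
        String n = name.toLowerCase();
        if (n.endsWith(".png")) return "image/png";
        if (n.endsWith(".jpg") || n.endsWith(".jpeg")) return "image/jpeg";
        if (n.endsWith(".gif")) return "image/gif";
        if (n.endsWith(".svg")) return "image/svg+xml";
        if (n.endsWith(".woff")) return "font/woff";
        if (n.endsWith(".eot")) return "application/vnd.ms-fontobject";
        if (n.endsWith(".ttf")) return "font/ttf";
        return "application/octet-stream";
    }

    /** Replace references to files under baseDir with base64 data URIs. */
    public static String inlineResources(String text, Path baseDir) throws IOException {
        Matcher m = REF.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            boolean isSrc = m.group(1) != null;
            String prefix = m.group(isSrc ? 1 : 4);
            String ref = m.group(isSrc ? 2 : 5).trim();
            String suffix = m.group(isSrc ? 3 : 6);
            String replacement = m.group(); // default: leave the reference untouched
            Path file = localFile(ref, baseDir);
            if (file != null) {
                String b64 = Base64.getEncoder().encodeToString(Files.readAllBytes(file));
                replacement = prefix + "data:" + mimeType(ref) + ";base64," + b64 + suffix;
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Return the referenced file if it exists locally, or null for remote/unknown references. */
    private static Path localFile(String ref, Path baseDir) {
        if (ref.startsWith("data:") || ref.contains("://")) return null;
        try {
            Path file = baseDir.resolve(ref).normalize();
            return Files.isRegularFile(file) ? file : null;
        } catch (InvalidPathException e) {
            return null;
        }
    }
}
```

Under these assumptions, the HTML text would be read from the legacy folder, passed through inlineResources together with that folder as baseDir, and written back out as one file. Stylesheets referenced through link elements are not handled here; they would need to be run through the same method and placed in a style element separately.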

Yes, we currently use Aspose.PDF only to convert PDF to a single HTML document. However, we also have old PDF-to-HTML output from our legacy application, split into an HTML file and its dependent files, and we are trying to migrate it to single HTML files for our new platform. I am looking for an API to do this instead of doing it manually.

@sravan.matta

Regretfully, as shared earlier, Aspose.PDF does not offer functionality to merge HTML resources into a single HTML file. However, could you please share your sample files with us in .zip format? We will check the details further and try to assist you where we can.

Hi,

Sorry, I am not asking whether Aspose.PDF will do the merging job or not. I am looking for any other API in the Aspose family that can do it, so that we can purchase that as well. I can't share these files as they belong to the corporate office. You can get an example by converting a PDF to HTML with the dependent files stored in a separate folder.

Thanks,
sravan

@sravan.matta

We have the Aspose.HTML API, which specializes in dealing with HTML files. However, the feature you are looking for still needs to be investigated, and we have logged a feature request as HTMLJAVA-545 in our issue management system for this purpose. We will investigate the feasibility of your requirement and inform you as soon as it is implemented. Please give us some time.