So it takes longer than other documents because of the large number of items that have to be converted.
Conversion time is not fixed: the more complex your document is, the longer it takes to convert.
I ran it in my local environment and my code took:
Aspose PDF API is on-premise software, which means all processing runs locally on the machine that runs the program. The faster the machine, the quicker the execution will be.
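Since the timing depends on both the document and the machine, it can help to measure it yourself. Below is a minimal sketch of a timing wrapper in plain Java; `ConversionTimer` and `timeMillis` are hypothetical names for illustration and are not part of any Aspose API, and the sleep is only a stand-in for your own conversion call.

```java
import java.util.concurrent.TimeUnit;

final class ConversionTimer {
    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Placeholder workload: replace the lambda body with your own conversion call.
        long elapsed = timeMillis(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("Elapsed: " + elapsed + " ms");
    }
}
```

Running the same wrapper around a simple document and a complex one on the same machine gives a like-for-like comparison.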
Will the time be reduced if we remove the images while converting to HTML?
If yes, could you please share the code for that, or for removing the images from the PDF?
I tried, but it is still saving the images at the location:
try(com.aspose.pdf.Document doc = new com.aspose.pdf.Document(fileData))
{
It all depends on the objective. If the goal is to display the document to clients, I would not suggest removing images, since they would see a different document from the original one. So I really think you should not consider this option.
Code Sample:
private void LogicAlt2()
{
    var docWithImages = new Document($"{PartialPath}_input.pdf");
    foreach (Page page in docWithImages.Pages)
    {
        // The image collection is 1-based and shrinks after every delete,
        // so keep deleting the first image until none are left.
        while (page.Resources.Images.Count > 0)
        {
            page.Resources.Images.Delete(1);
        }
    }
    // Convert the image-free document to a single self-contained HTML file
    var saveOptions = new HtmlSaveOptions();
    saveOptions.PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
    saveOptions.LettersPositioningMethod = HtmlSaveOptions.LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss;
    saveOptions.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
    docWithImages.Save($"{PartialPath}WithoutImage_output.html", saveOptions);
}
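When emptying a collection by repeatedly deleting its first element, the loop condition matters: a counter that advances while the collection shrinks stops about halfway through. The sketch below illustrates this with a plain Java list standing in for a page's image collection (0-based here, whereas the image collection is 1-based). `DeleteAllDemo` and its methods are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DeleteAllDemo {
    /** Counter-based loop: the bound shrinks while the counter grows, so it stops early. */
    static int removeWithCounter(List<String> items) {
        for (int i = 0; i < items.size(); i++) {
            items.remove(0); // always removes the current first element
        }
        return items.size(); // number of elements left behind
    }

    /** Loop until empty: removes every element regardless of the starting size. */
    static int removeUntilEmpty(List<String> items) {
        while (!items.isEmpty()) {
            items.remove(0);
        }
        return items.size();
    }

    /** Fresh mutable list with four placeholder entries. */
    static List<String> sample() {
        return new ArrayList<>(Arrays.asList("img1", "img2", "img3", "img4"));
    }

    public static void main(String[] args) {
        System.out.println(removeWithCounter(sample())); // prints 2
        System.out.println(removeUntilEmpty(sample()));  // prints 0
    }
}
```

With four entries, the counter-based loop leaves two behind, which would explain images still appearing in the output after a removal pass.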
If it is for display only, one option may be to convert the pages to images and display those in the browser instead. This process also takes time.
I do not know your machine specs, but I will give you a couple of examples to try.
Code Sample:
private void LogicAlt()
{
    var doc = new Document($"{PartialPath}_input.pdf");
    // The using block disposes the converter, so no explicit Dispose() call is needed
    using (PdfConverter converter = new PdfConverter())
    {
        // Set the resolution to 300 DPI
        converter.Resolution = new Resolution(300);
        // Bind the document and select every page for conversion
        converter.BindPdf(doc);
        converter.StartPage = 1;
        converter.EndPage = doc.Pages.Count;
        converter.DoConvert();
        // Save all selected pages as a single TIFF file
        converter.SaveAsTIFF($"{PartialPath}_output.tiff");
    }
}
Another code sample using PNG:
// Needs System.IO, System.Collections.Generic, Aspose.Pdf and Aspose.Pdf.Devices
private void LogicAlt2()
{
    var doc = new Document($"{PartialPath}_input.pdf");
    var newDocWithImages = new Document();
    var png = new PngDevice(new Resolution(300));
    // Image streams must stay open until the new document has been saved
    var imageStreams = new List<MemoryStream>();
    foreach (Page page in doc.Pages)
    {
        // Render the source page to an in-memory PNG
        var imageStream = new MemoryStream();
        png.Process(page, imageStream);
        imageStream.Position = 0;
        imageStreams.Add(imageStream);
        // Add a blank page of the same size and fill it with the rendered image
        var newPage = newDocWithImages.Pages.Add();
        newPage.SetPageSize(page.Rect.Width, page.Rect.Height);
        newPage.PageInfo.Margin = new MarginInfo(0, 0, 0, 0);
        newPage.Paragraphs.Add(new Aspose.Pdf.Image
        {
            ImageStream = imageStream,
            FixWidth = page.Rect.Width,
            FixHeight = page.Rect.Height
        });
    }
    newDocWithImages.Save($"{PartialPath}_output.pdf");
    var saveOptions = new HtmlSaveOptions();
    saveOptions.PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
    saveOptions.LettersPositioningMethod = HtmlSaveOptions.LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss;
    saveOptions.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
    newDocWithImages.Save($"{PartialPath}AsImage_output.html", saveOptions);
    // Release the rendered images now that both outputs are written
    imageStreams.ForEach(s => s.Dispose());
}
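The samples above render at 300 DPI, and the resolution largely determines how many pixels have to be produced per page, which is likely to affect both conversion time and output size. As a rough sketch, assuming the renderer maps PDF points (1/72 inch) to pixels in the usual way, the arithmetic looks like this. `RenderSize` and `toPixels` are hypothetical helper names, and the A4-like page size is an assumption.

```java
final class RenderSize {
    static final double POINTS_PER_INCH = 72.0;

    /** Converts a length in PDF points to pixels at the given DPI, rounded to the nearest pixel. */
    static long toPixels(double points, int dpi) {
        return Math.round(points / POINTS_PER_INCH * dpi);
    }

    public static void main(String[] args) {
        double widthPt = 595, heightPt = 842; // roughly A4, an assumed page size
        for (int dpi : new int[] {96, 150, 300}) {
            long w = toPixels(widthPt, dpi);
            long h = toPixels(heightPt, dpi);
            System.out.printf("%d DPI -> %d x %d px (%.1f megapixels)%n", dpi, w, h, w * h / 1e6);
        }
    }
}
```

If the pages are only meant for on-screen viewing, trying a lower resolution than 300 may be a reasonable trade-off between sharpness and speed.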