Hi, is there a way to have Aspose.PDF generate as many HTML files as there are pages (given that both SplitIntoPages and FixedLayout are set to true)? Alternatively, is there a way to have it generate a fixed layout with a separate file for each page?
Thank you for contacting support.
You can load a PDF document, iterate through its pages, and convert each page to an HTML file, as in the code snippet below:
Document sourcePDF = new Document(@"Test.pdf");
foreach (Aspose.Pdf.Page page in sourcePDF.Pages)
{
    // Copy the page into a new single-page document and save it as HTML
    Document newDocument = new Document();
    newDocument.Pages.Add(page);
    newDocument.Save(@"Page_" + page.Number + ".html", SaveFormat.Html);
}
We hope this will be helpful. Please feel free to contact us if you need any further assistance.
I will try it, thank you. Eventually I want to generate a fixed-layout EPUB, hence I need the output to be per page.
Please take your time and test the suggested approach in your environment. Please feel free to contact us if you need any further assistance.
Hi, sorry for bothering you, but after exporting each page to HTML I noticed that the links aren’t exported (those pointing to other PDF pages; I still need to check the external ones). Is this feature not supported, or am I doing something wrong?
Later edit: Never mind, I figured it out; I will post it here for future reference:
You need to go through each page’s annotations that have GoToActions and change them into GoToURIActions:
// "list" holds the annotations of the current page
for (int linkCount = list.Count - 1; linkCount >= 0; linkCount--)
{
    LinkAnnotation a = list[linkCount] as LinkAnnotation;
    // Only rewrite link annotations whose action is an in-document GoToAction
    if (a != null && a.Action is GoToAction)
    {
        a.Action = new GoToURIAction("Page_" + page.Number + ".html");
    }
}
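The snippet above builds the target file name from page.Number, which is the page being processed. A variant that reads the target page from the link's own destination might look like the sketch below. It is only a sketch under assumptions: PageLinkRewriter, PageFileName and RewriteInternalLinks are hypothetical names, the destination is assumed to be an ExplicitDestination exposing PageNumber, and the method is presumably best run against the source document before a page is copied, so that destination page numbers still refer to the original PDF.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Annotations;

public static class PageLinkRewriter
{
    // Hypothetical helper: builds the per-page HTML file name used in this thread.
    public static string PageFileName(int pageNumber)
    {
        return "Page_" + pageNumber + ".html";
    }

    // Replaces in-document GoToAction links on one page with URI links
    // that point at the HTML file of the destination page.
    public static void RewriteInternalLinks(Page page)
    {
        foreach (Annotation annotation in page.Annotations)
        {
            LinkAnnotation link = annotation as LinkAnnotation;
            if (link == null) continue;

            GoToAction goTo = link.Action as GoToAction;
            if (goTo == null) continue;

            // Assumption: the destination is an ExplicitDestination with a PageNumber.
            // Other destination kinds (for example named destinations) are left untouched.
            ExplicitDestination destination = goTo.Destination as ExplicitDestination;
            if (destination == null) continue;

            link.Action = new GoToURIAction(PageFileName(destination.PageNumber));
        }
    }
}
```

Calling PageLinkRewriter.RewriteInternalLinks(page) for every page of the loaded document, and only then saving the per-page HTML files, would keep the file-naming rule in one place.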
Thank you for your kind feedback.
We are glad to know that things have started working in your environment. Please feel free to contact us if you need any further assistance and we will be more than happy to assist you.
Hi, I’m back again (sorry for being a nuisance). I noticed that upon export to HTML the links are exported incorrectly: while the spans are correctly placed, the destinations are mixed up. In a TOC, page 9 points to page 13, page 13 points to page 15, and so on, even though they are correct in the PDF.
example.jpg (196.5 KB)
Is there anything I can do to have the correct order of links?
Kindest regards,
Bogdan
Thank you for getting back to us.
Would you please share the source and generated files with us, along with a narrowed-down code snippet reproducing this issue, so that we may investigate further and help you out?
@Farhan.Raza sure
Page_007.pdf (53.9 KB)
As for the code used:
HtmlSaveOptions options = new HtmlSaveOptions();
options.FixedLayout = true;
options.SplitIntoPages = false;
options.SplitCssIntoPages = false;
options.CompressSvgGraphicsIfAny = false;
options.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsExternalPngFilesReferencedViaSvg;
options.FontSavingMode = HtmlSaveOptions.FontSavingModes.SaveInAllFormats;
options.LettersPositioningMethod = HtmlSaveOptions.LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss;
options.HtmlMarkupGenerationMode = HtmlSaveOptions.HtmlMarkupGenerationModes.WriteAllHtml;
options.PreventGlyphsGrouping = false;
options.RemoveEmptyAreasOnTopAndBottom = false;
options.PagesFlowTypeDependsOnViewersScreenSize = false;
options.UseZOrder = true;
options.SaveTransparentTexts = false;
options.SaveShadowedTextsAsTransparentTexts = false;
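foreach (Aspose.Pdf.Page page in doc.Pages)
{
    // Assumption: pageNumber was not defined in the posted snippet; a zero-padded page number would match the attached Page_007.pdf
    string pageNumber = page.Number.ToString("D3");
    Document newDocument = new Document();
    newDocument.Pages.Add(page);
    // Also saved as PDF (this produced the attached file) to check whether the links are correct in the single-page PDF
    newDocument.Save(dirName + @"\Page_" + pageNumber + ".pdf", SaveFormat.Pdf);
    newDocument.Save(dirName + @"\Page_" + pageNumber + ".html", options);
}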
As for the HTML file contents (note that the image paths are modified afterwards by another script; if you think it necessary, I can generate an intermediate file with the raw output from Aspose.PDF):
Page_007.zip (2.0 KB)
We have worked with the data shared by you and have been able to reproduce the issue in our environment. A ticket with ID PDFNET-45219 has been logged in our issue management system for further investigation and resolution. The ticket ID has been linked with this thread so that you will receive notification as soon as the ticket is resolved.
We are sorry for the inconvenience.
The issues you have found earlier (filed as PDFNET-45219) have been fixed in Aspose.PDF for .NET 19.11.