Hi,
My requirement is to highlight certain paragraphs in a PDF document and then convert each page of the highlighted document into an HTML string.
The highlighting part is done and works.
However, converting each page of the PDF document into an HTML string and viewing it in a browser does not work as expected.
Below is the code I use to convert the PDF document to HTML. The highlighting code already works and is fairly long, so I am not sharing it here.
    byte[] byteData = null;
    int pageCount = doc.Pages.Count;
    for (int page = 0; page < pageCount; page++)
    //foreach (Page page in pdfFile.Pages)
    {
        using (MemoryStream pageStream = new MemoryStream())
        {
            // Save each page as a separate document.
            // doc.Pages is 1-based, hence page + 1.
            //Page extractedPage = page;
            Aspose.Pdf.Document extractedPage = new Aspose.Pdf.Document();
            extractedPage.Pages.Add(doc.Pages[page + 1]);

            // Produce a single self-contained HTML file for the page.
            HtmlSaveOptions htmlOptions = new HtmlSaveOptions();
            htmlOptions.FixedLayout = true;
            htmlOptions.PartsEmbeddingMode = Aspose.Pdf.HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
            htmlOptions.RasterImagesSavingMode = Aspose.Pdf.HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
            htmlOptions.RemoveEmptyAreasOnTopAndBottom = true;
            htmlOptions.SplitIntoPages = false;
            htmlOptions.SplitCssIntoPages = false;

            // Use a distinct CSS class prefix per page.
            string cssprefix = "aspose_pdf" + page;
            htmlOptions.CssClassNamesPrefix = cssprefix;
            //htmlOptions.HtmlMarkupGenerationMode = Aspose.Pdf.HtmlSaveOptions.HtmlMarkupGenerationModes.WriteAllHtml;

            extractedPage.Save(pageStream, htmlOptions);
            //pdfFile.Save(pageStream, htmlOptions);
            var pageBytes = pageStream.ToArray();

            // Keep the HTML of the requested page. pageNumber is defined elsewhere;
            // a value of 0 selects the first page.
            if ((pageNumber == 0) && (page == 0))
            {
                byteData = pageBytes;
            }
            if (pageNumber == page + 1)
            {
                byteData = pageBytes;
            }
        }
    }
    // ProcessHtml is my own extension method on byte[].
    string HtmlString = byteData.ProcessHtml();
When I view HtmlString in a browser, this is the output I get:
output.PNG (62.8 KB)
May I know why this happens? Only the highlight color is visible, not the text.
The sample document is:
whitepaper.pdf (335.7 KB)