I use the C# code below to convert a PDF stream to an HTML file, but it generates an additional folder alongside the HTML file, as shown here:
aspose issue.jpg (25.8 KB)
How can I make it generate only the HTML file? Converting PPTX or DOCX to HTML produces just one HTML file. Thanks.
var data = FileProvider.DownloadFile(FileDownMess);
MemoryStream fs = new MemoryStream();
data[0].MemoryStream.Position = 0;
// Copy the downloaded bytes once, then rewind so the copy is read from the start
data[0].MemoryStream.CopyTo(fs);
fs.Position = 0;
Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(fs);
Aspose.Pdf.HtmlSaveOptions htmlOptions = new Aspose.Pdf.HtmlSaveOptions();
htmlOptions.SplitIntoPages = false;
// Save the document
string pdfPath = @"C:\AsposeTest\testPdf.html";
pdfDocument.Save(pdfPath, htmlOptions);
Thanks for contacting support.
Please use the code snippet given in the following documentation article to achieve your requirements. If you face any issue, please feel free to let us know.
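For readers following along, here is a minimal sketch of a single-file PDF to HTML conversion. It is assembled only from the HtmlSaveOptions settings that appear later in this thread, so it may differ from what the linked article shows. The class name PdfToSingleHtml and the method name Convert are hypothetical, chosen for illustration.

```csharp
using System.IO;

public static class PdfToSingleHtml
{
    // Hypothetical helper: converts a PDF held in a stream into one HTML file
    // with images, fonts and CSS embedded instead of written to a side folder.
    public static void Convert(Stream pdfStream, string htmlPath)
    {
        // Rewind first, in case the stream was just written to.
        if (pdfStream.CanSeek)
        {
            pdfStream.Position = 0;
        }

        var pdfDocument = new Aspose.Pdf.Document(pdfStream);

        var options = new Aspose.Pdf.HtmlSaveOptions();
        // Keep all pages in one HTML file.
        options.SplitIntoPages = false;
        // Embed every resource inside the HTML itself.
        options.PartsEmbeddingMode =
            Aspose.Pdf.HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
        // Store raster images as embedded parts of the page background.
        options.RasterImagesSavingMode =
            Aspose.Pdf.HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
        // Settings taken from the later post in this thread.
        options.FontSavingMode =
            Aspose.Pdf.HtmlSaveOptions.FontSavingModes.SaveInAllFormats;

        pdfDocument.Save(htmlPath, options);
    }
}
```

With the stream from the question, the call would look like PdfToSingleHtml.Convert(fs, @"C:\AsposeTest\testPdf.html").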
Thanks for your answer, it works now. By the way, I also want to convert Excel to HTML without generating the additional folder. I use the code below, but it doesn't work and still generates the folder. Can this be done the same way as PDF to HTML? If so, could you give me the link? Thanks.
Workbook excel = new Workbook(fs);
Aspose.Cells.HtmlSaveOptions opts = new Aspose.Cells.HtmlSaveOptions();
opts.ExportImagesAsBase64 = true;
string excelPath = @"C:\AsposeTest\testExcelnew.html";
excel.Save(excelPath, opts);
Regarding Aspose.Cells, you may add a few lines (the ExportActiveWorksheetOnly and ExportSingleTab settings below) to accomplish the task.
Sample code:
Workbook excel = new Workbook(fs);
Aspose.Cells.HtmlSaveOptions opts = new Aspose.Cells.HtmlSaveOptions();
opts.ExportImagesAsBase64 = true;
opts.ExportActiveWorksheetOnly = true;
opts.ExportSingleTab = false;
string excelPath = @"C:\AsposeTest\testExcelnew.html";
excel.Save(excelPath, opts);
Hope, this helps a bit.
Yes, this is what MS Excel also does. If a spreadsheet has multiple sheets and you need to render a single HTML file (with all resources embedded) for it, this is not possible even in MS Excel. Aspose.Cells follows MS Excel standards and specifications when rendering Excel to HTML, so by default it creates a folder containing the resource files for the worksheets in the workbook. You may still use the following approach to accomplish the task:
Export every worksheet in the workbook to its own HTML file, and then combine these individual HTML files into one final HTML file yourself, e.g. with a tab control or your own code. In a loop, set each sheet as the active sheet and render a separate HTML file for it via the Aspose.Cells APIs. Please note that when exporting each worksheet to a separate HTML file, you need to export images in Base64 format (using the HtmlSaveOptions class), otherwise folders will still be created.
Hope, this helps a bit.
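The per-sheet approach described above could be sketched roughly as follows. This is only an illustration, not an official sample: the class name WorkbookToHtmlPerSheet, the method name Export and the "sheet{index}.html" file naming are hypothetical, and combining the resulting files into one page is left to your own code.

```csharp
using System.IO;
using Aspose.Cells;

public static class WorkbookToHtmlPerSheet
{
    // Hypothetical helper: writes one self-contained HTML file per worksheet.
    public static void Export(Workbook workbook, string outputDir)
    {
        var options = new HtmlSaveOptions();
        // Embed images as Base64 so no resource folder is created.
        options.ExportImagesAsBase64 = true;
        // Render only the sheet that is currently active.
        options.ExportActiveWorksheetOnly = true;

        for (int i = 0; i < workbook.Worksheets.Count; i++)
        {
            // Make sheet i the active sheet before saving.
            workbook.Worksheets.ActiveSheetIndex = i;

            // Hypothetical naming scheme: sheet0.html, sheet1.html, ...
            string path = Path.Combine(outputDir, "sheet" + i + ".html");
            workbook.Save(path, options);
        }
    }
}
```

For example, WorkbookToHtmlPerSheet.Export(new Workbook(fs), @"C:\AsposeTest") would write one file per sheet into that folder.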
@Amjad_Sahi
I solved it another way, using the code below:
Workbook excel = new Workbook(fs);
string excelPath = @"C:\AsposeTest\testExcelnew321.mht";
excel.Save(excelPath, Aspose.Cells.SaveFormat.MHtml);
But I have another question: when I convert DOCX to MHTML using the code below, every page's header and footer is missing. Is there any way to keep the page headers and footers along with the content? Thanks.
Aspose.Words.Document docx = new Aspose.Words.Document(fs);
string outFn = @"C:\AsposeTest\test123.mht";
docx.Save(outFn, Aspose.Words.SaveFormat.Mhtml);
If I convert DOCX to HTML using the code below, some of the pictures lose their styling and do not display normally. Is there any way to fix this? Thanks.
Aspose.Words.Document docx = new Aspose.Words.Document(fs);
HtmlFixedSaveOptions options = new HtmlFixedSaveOptions();
options.PageIndex = 0;
options.PageCount = docx.PageCount;
options.ExportEmbeddedImages = true;
options.ExportEmbeddedCss = true;
options.ExportEmbeddedSvg = true;
options.ExportEmbeddedFonts = true;
options.NumeralFormat = NumeralFormat.System;
options.UseHighQualityRendering = true;
options.SaveFormat = Aspose.Words.SaveFormat.HtmlFixed;
options.PageHorizontalAlignment = HtmlFixedPageHorizontalAlignment.Center;
string outFn = @"C:\AsposeTest\Test1122.html";
docx.Save(outFn, options);
Good to know that you have sorted it out now. And yes, MHTML is another option for you; it is a single-file format with its resources embedded in it.
Regarding your other issue for Aspose.Words API, kindly provide your sample document and output file(s) to show the issue, we will check it soon.
@Amjad_Sahi
Below are my sample document and the HTML generated from it:
testdocandhtml.zip (628.7 KB)
In the HTML, the word "Prerequisites" does not match its original position in the document; it is offset to the right. If there are multiple pages with pictures, parts of the pictures are hidden because of this offset. How can I deal with this in code, based on the code I posted last? Thanks.
Please note that Aspose.Words mimics the behavior of MS Word. If you convert your HTML to DOCX using MS Word, you will get the same output.
@tahir.manzoor
I have another problem: is it possible to add a watermark directly to the HTML converted from a PDF file? I tried reusing my earlier code for adding a watermark to a converted PDF, changing the memory stream to one built from the HTML file, but it does not work. Below is my code:
byte[] byteArrDOC = null;
Aspose.Pdf.Document pdfDocumentbyteArrDOC = null;
Aspose.Pdf.TextStamp textStampbyteArrDOC = null;
MemoryStream memStream = null;
Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(fs);
Aspose.Pdf.HtmlSaveOptions newOptions = new Aspose.Pdf.HtmlSaveOptions();
//// Enable option to embed all resources inside the HTML
newOptions.PartsEmbeddingMode = Aspose.Pdf.HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
//// This is just optimization for IE and can be omitted
newOptions.LettersPositioningMethod = Aspose.Pdf.HtmlSaveOptions.LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss;
newOptions.RasterImagesSavingMode = Aspose.Pdf.HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
newOptions.FontSavingMode = Aspose.Pdf.HtmlSaveOptions.FontSavingModes.SaveInAllFormats;
string pdfPath = @"C:\AsposeTest\testPdf.html";
using (FileStream fileStream = System.IO.File.OpenRead(pdfPath))
{
memStream = new MemoryStream();
memStream.SetLength(fileStream.Length);
fileStream.Read(memStream.GetBuffer(), 0, (int)fileStream.Length);
}
byteArrDOC = ObjectToByteArray(memStream);
pdfDocumentbyteArrDOC = new Aspose.Pdf.Document(new MemoryStream(byteArrDOC)); // this line throws an exception ("Startxref not found")
You would need to add a watermark to the source PDF document, save it to a memory stream, and then convert it to an HTML document with the Aspose.PDF for .NET API. Please see the documentation articles below for your reference, and feel free to contact us if you need any further assistance.
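The order of operations described above (stamp the PDF first, save it to a memory stream, then convert that stream to HTML) might look roughly like this. It is a sketch that uses only the Aspose.PDF calls already shown in this thread; the class name WatermarkedPdfToHtml, the method name Convert and its parameters are hypothetical.

```csharp
using System.IO;

public static class WatermarkedPdfToHtml
{
    // Hypothetical helper: stamps every page of a PDF, then converts the
    // stamped PDF (not the HTML) to a single self-contained HTML file.
    public static void Convert(Stream pdfStream, string watermarkText, string htmlPath)
    {
        var pdf = new Aspose.Pdf.Document(pdfStream);

        var stamp = new Aspose.Pdf.TextStamp(watermarkText);
        stamp.Background = false;
        stamp.HorizontalAlignment = Aspose.Pdf.HorizontalAlignment.Center;
        stamp.VerticalAlignment = Aspose.Pdf.VerticalAlignment.Center;
        stamp.RotateAngle = 45;

        // Pages are indexed from 1, as in the code posted in this thread.
        for (int i = 1; i <= pdf.Pages.Count; i++)
        {
            pdf.Pages[i].AddStamp(stamp);
        }

        using (var stamped = new MemoryStream())
        {
            // Save the stamped PDF to memory and rewind before reloading it.
            pdf.Save(stamped);
            stamped.Position = 0;

            var stampedPdf = new Aspose.Pdf.Document(stamped);
            var options = new Aspose.Pdf.HtmlSaveOptions();
            options.PartsEmbeddingMode =
                Aspose.Pdf.HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
            stampedPdf.Save(htmlPath, options);
        }
    }
}
```

The key point is that Aspose.Pdf.Document is only ever given PDF bytes. Loading the already converted HTML file into it is what produces the "Startxref not found" exception in the code above.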
@Farhan.Raza
I can use your solution to add a watermark to a PDF document. But to add a watermark to an Excel file, I convert the Excel file to PDF, add the watermark, and then save it as HTML. The result loses all of the sheet formatting and just shows each sheet's content one after another on a single page. If I save it as an MHT file instead, it shows a mess of code. Could you advise me on how to solve this?
The code is below:
MemoryStream outputforpdf = new MemoryStream();
Workbook excel = new Workbook(fs);
excel.Save(outputforpdf, Aspose.Cells.SaveFormat.Pdf);
byteArrDOC = ObjectToByteArray(outputforpdf);
pdfDocumentbyteArrDOC = new Aspose.Pdf.Document(new MemoryStream(byteArrDOC));
string textStampContent = string.Format("{0}-{1}", "Aspose.Words", DateTime.Now.ToLongTimeString());
textStampbyteArrDOC = new Aspose.Pdf.TextStamp(textStampContent);
//set whether stamp is background
textStampbyteArrDOC.Background = false;
//set stamp size
textStampbyteArrDOC.Height = 100;
textStampbyteArrDOC.Width = 500;
textStampbyteArrDOC.HorizontalAlignment = Aspose.Pdf.HorizontalAlignment.Center;
textStampbyteArrDOC.VerticalAlignment = Aspose.Pdf.VerticalAlignment.Center;
//rotate stamp
textStampbyteArrDOC.RotateAngle = 45;
//set text properties
textStampbyteArrDOC.TextState.Font = FontRepository.FindFont("Arial");
textStampbyteArrDOC.TextState.FontSize = 14.0F;
textStampbyteArrDOC.TextState.ForegroundColor = Aspose.Pdf.Color.Gray;
textStampbyteArrDOC.TextState.StrokingColor = Aspose.Pdf.Color.Gray;
//add stamp to every page
for (var i = 1; i <= pdfDocumentbyteArrDOC.Pages.Count; i++)
{
pdfDocumentbyteArrDOC.Pages[i].AddStamp(textStampbyteArrDOC);
}
MemoryStream temOutputForPdf = new MemoryStream();
pdfDocumentbyteArrDOC.Save(temOutputForPdf);
//convert to mht
excel = new Workbook(temOutputForPdf);
string excelPath = @"C:\AsposeTest\xlswithwatermark.mht";
excel.Save(excelPath, Aspose.Cells.SaveFormat.MHtml);
Please elaborate the problem while sharing respective files as ZIP, along with some screenshots so that we may investigate further.
Please see the attached Excel file and converted HTML file below. Thanks.
AsposeTest.zip (253.0 KB)
Thank you for sharing the data.
We have modified the code snippet to narrow down the problem and figure out which API is causing the problem. Kindly try below code snippet and then elaborate the issue along with screenshots and expected output so that we may proceed further.
FileStream fs = new FileStream(dataDir + "testES_5.4.3_to_6.7.1_upgrade_plan.xlsx", FileMode.Open, FileAccess.Read);
MemoryStream outputforpdf = new MemoryStream();
Aspose.Cells.Workbook excel = new Aspose.Cells.Workbook(fs);
excel.Save(outputforpdf, Aspose.Cells.SaveFormat.Pdf);
var byteArrDOC = ObjectToByteArray(outputforpdf);
var pdfDocumentbyteArrDOC = new Aspose.Pdf.Document(new MemoryStream(byteArrDOC));
string textStampContent = string.Format("{0}-{1}", "Aspose.Words", DateTime.Now.ToLongTimeString());
var textStampbyteArrDOC = new Aspose.Pdf.TextStamp(textStampContent);
//set whether stamp is background
textStampbyteArrDOC.Background = false;
//set stamp size
textStampbyteArrDOC.Height = 100;
textStampbyteArrDOC.Width = 500;
textStampbyteArrDOC.HorizontalAlignment = Aspose.Pdf.HorizontalAlignment.Center;
textStampbyteArrDOC.VerticalAlignment = Aspose.Pdf.VerticalAlignment.Center;
//rotate stamp
textStampbyteArrDOC.RotateAngle = 45;
//set text properties
textStampbyteArrDOC.TextState.Font = FontRepository.FindFont("Arial");
textStampbyteArrDOC.TextState.FontSize = 14.0F;
textStampbyteArrDOC.TextState.ForegroundColor = Aspose.Pdf.Color.Gray;
textStampbyteArrDOC.TextState.StrokingColor = Aspose.Pdf.Color.Gray;
//add stamp to every page
for (var i = 1; i <= pdfDocumentbyteArrDOC.Pages.Count; i++)
{
pdfDocumentbyteArrDOC.Pages[i].AddStamp(textStampbyteArrDOC);
}
MemoryStream temOutputForPdf = new MemoryStream();
pdfDocumentbyteArrDOC.Save(temOutputForPdf);
Aspose.Pdf.Document document = new Document(temOutputForPdf);
HtmlSaveOptions options = new HtmlSaveOptions();
options.PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
options.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
document.Save(dataDir + "PDF.html" , options);