C# HTML to PDF Encoding UTF8 Stream Issue with Aspose.PDF - Index Out of Range

Hi Team/@asad.ali,
I'm facing an issue when converting HTML to PDF using UTF-8 encoding. We get an index out of range exception for the attached file. We are using Aspose.PDF version 19.7.

Please find the HTML file and the code we are using for the conversion.

https://drive.google.com/drive/folders/1CS6fPRyQnRUJs4ReY8qKcoxyJNn_a3f5?usp=sharing

@NPSwaroop

We have used the below code snippet (as some values and objects were undefined in your shared snippet) with Aspose.PDF for .NET 21.1 and did not notice any issue. For your kind reference, an output PDF is also attached:

using System.IO;
using System.Text;
using Aspose.Pdf;

// dataDir is the folder that contains the HTML file; it is also passed as the base path for relative resources.
HtmlLoadOptions options = new HtmlLoadOptions(dataDir);
var htmlBody = File.ReadAllText(dataDir + "PdfInstruction.htm");
var htmlByteArray = Encoding.UTF8.GetBytes(htmlBody);  // The default character encoding for HTML5 is UTF-8.
using (var stream = new MemoryStream(htmlByteArray))
{
 // Load the HTML from the UTF-8 stream and convert it to a PDF document.
 using (Document pdfDocument = new Document(stream, options))
 {
  pdfDocument.OptimizeResources(new Aspose.Pdf.Optimization.OptimizationOptions() // for file size optimizations
  {
   LinkDuplcateStreams = true, // the property name is spelled this way in the API
   RemoveUnusedObjects = true,
   RemoveUnusedStreams = true,
  });
  pdfDocument.Save(dataDir + "output.pdf");
 }
}

output.pdf (194.6 KB)

Please try the latest version of the API and let us know if the issue still persists.
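
When testing an upgrade, it can help to confirm which assembly version the application actually loads, since an older copy may still be referenced. The following is only a minimal sketch using standard .NET reflection; the class AssemblyVersionCheck and its methods are hypothetical names introduced for illustration and are not part of Aspose.PDF.

```csharp
using System;
using System.Reflection;

// Hypothetical helper for illustration; not part of Aspose.PDF.
public static class AssemblyVersionCheck
{
    // Returns the version of the assembly that defines the given type.
    public static Version GetLoadedVersion(Type type)
    {
        if (type == null) throw new ArgumentNullException(nameof(type));
        return type.Assembly.GetName().Version;
    }

    // Returns true when the loaded assembly version is at least major.minor.
    public static bool IsAtLeast(Type type, int major, int minor)
    {
        return GetLoadedVersion(type) >= new Version(major, minor);
    }
}
```

In a project that references Aspose.PDF, a call such as AssemblyVersionCheck.IsAtLeast(typeof(Aspose.Pdf.Document), 21, 1) could be used before running the conversion, assuming the assembly version follows the release number.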

Hi @asad.ali,
This issue was resolved after upgrading to version 21.1. Thanks.

@NPSwaroop

It is good to know that your issue has been resolved. Please keep using our API and feel free to let us know in case you need further assistance.