Convert PDF to HTML using C# with Aspose.PDF - NullReferenceException occurs at Save() in Docker Container

We need to extract the data from a PDF document and convert it to HTML so that it can be shown in the UI.

The functionality works fine on localhost, but on the server, where we run it in a Docker container, it always throws a null reference exception.

I added logging on the dev server, checked the errors, and found that the exception comes from the Aspose.Pdf.Document.Save() method, where it tries to convert the PDF to HTML content.

Below is the code:
doc = new Aspose.Pdf.Document(buffer);
HtmlSaveOptions newOptions = new HtmlSaveOptions();
newOptions.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsPngImagesEmbeddedIntoSvg;
newOptions.SplitIntoPages = true; // Write each page to its own HTML document
newOptions.CustomHtmlSavingStrategy = new HtmlSaveOptions.HtmlPageMarkupSavingStrategy(SavingToStream);
// We can use a non-existent path as the result file name: all real saving is done
// in our custom method SavingToStream() (it follows this one)
doc.Save(dataDir + "OutPutToStream_out.html", newOptions);

private static void SavingToStream(HtmlSaveOptions.HtmlPageMarkupSavingInfo htmlSavingInfo)
{
    byte[] resultHtmlAsBytes = new byte[htmlSavingInfo.ContentStream.Length];
    htmlSavingInfo.ContentStream.Read(resultHtmlAsBytes, 0, resultHtmlAsBytes.Length);
    // Here you can use any writable stream; a file stream is used just as an example
    string fileName = "stream_out.html";
    // File.Create truncates an existing file, so a shorter page leaves no stale bytes behind
    using (Stream outStream = File.Create(fileName))
    {
        outStream.Write(resultHtmlAsBytes, 0, resultHtmlAsBytes.Length);
    }
    string html = Encoding.UTF8.GetString(resultHtmlAsBytes, 0, resultHtmlAsBytes.Length);
}
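
As a side note on the callback above: a single Stream.Read call is not guaranteed to fill the buffer, and with SplitIntoPages = true the callback runs once per page, so writing to one fixed file name keeps only the last page. The following is a minimal sketch, not Aspose's recommended approach, of reading the stream fully and keeping each page's HTML in memory. The class PageHtmlCollector and its members are hypothetical names, and rewinding the stream to position 0 is an assumption about how the content stream is handed over.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class PageHtmlCollector
{
    // Hypothetical holder for the per-page HTML; not part of Aspose.PDF.
    public static readonly List<string> Pages = new List<string>();

    // Reads a stream to its end, regardless of how many bytes each Read call returns.
    public static byte[] ReadAllBytes(Stream source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        // Assumption: the whole stream is wanted, so rewind when the stream supports it.
        if (source.CanSeek) source.Position = 0;
        using (var buffer = new MemoryStream())
        {
            source.CopyTo(buffer);
            return buffer.ToArray();
        }
    }

    // Decodes one page as UTF-8, stores it, and returns it.
    public static string AddPage(Stream contentStream)
    {
        string html = Encoding.UTF8.GetString(ReadAllBytes(contentStream));
        Pages.Add(html);
        return html;
    }
}
```

Inside SavingToStream this would be called as PageHtmlCollector.AddPage(htmlSavingInfo.ContentStream). This only tidies up the callback; it is not claimed to fix the exception reported in this thread.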

We have purchased an Aspose.Total license, and this functionality is critical for us.

Please help us ASAP to resolve this issue.

@VenkateshBT,

Can you please share the source files along with the generated result so that we can investigate further and help you out?

Hi,

SampleApp1.zip (1.1 MB)

I have uploaded the sample application.

The attached code converts a PDF file into HTML files, split by page.

Issue: The attached code works fine and returns "done" when I call the "sendfile" endpoint in ValuesController.cs from my application code in debug mode. But the same code fails with a null exception at the line doc.Save(@"OutPutToStream_out.html", newOptions) when it is executed from the Docker image. My Docker runs Linux containers.

Please find the steps and application details:
Dockerfile --> present in "SampleApp1/Dockerfile"
Command executed to build the image: docker build --no-cache -t pdfval .
Command executed to run the container: docker run -d -p 8091:80
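
Because the failure only appears inside the Linux container, it can help to capture environment details together with the full exception text in one log entry. The following is a minimal sketch of such a wrapper. DiagnosticRunner and RunAndReport are hypothetical names, and the choice of environment fields is an assumption about what is useful to compare between the two machines.

```csharp
using System;
using System.Runtime.InteropServices;

public static class DiagnosticRunner
{
    // Runs an action. On failure, returns a report containing environment details
    // and the full exception text (type, message, stack trace, inner exceptions).
    public static string RunAndReport(string step, Action action)
    {
        try
        {
            action();
            return step + ": ok";
        }
        catch (Exception ex)
        {
            return step + ": failed" + Environment.NewLine
                + "OS: " + RuntimeInformation.OSDescription + Environment.NewLine
                + "Framework: " + RuntimeInformation.FrameworkDescription + Environment.NewLine
                + "Working directory: " + Environment.CurrentDirectory + Environment.NewLine
                + ex;
        }
    }
}
```

In the controller this could be used as DiagnosticRunner.RunAndReport("Save", () => doc.Save(path, newOptions)), where path stands for whatever output name is passed to Save, with the returned string written to the existing log.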

Exception:
System.ArgumentNullException: Value cannot be null.
Parameter name: key
at System.Collections.Generic.Dictionary`2.FindEntry(TKey key)
at System.Collections.Generic.Dictionary`2.TryGetValue(TKey key, TValue& value)
at #=zR$xHhFyvcsZK0Vu$2JlIxzmiTVl4FTrU9Q==.#=zfCZ1MbA=(#=zBaLKh_I= #=zF9ghvp0=)
at #=z2Os1BZv77j3v_gBQ2OMYwlIUfEn8bSyvGA==.#=zt7eq_8YI62aI.#=zO68nwNYBC47E(#=zBaLKh_I= #=zF9ghvp0=, #=zjdbcs$I=& #=zJnyZyrA=)
at #=zaNRKxjmZV_PyVIGrFNSDpWVjbyfBXO$szi7j0js=.#=zXnpjAFAcFnvK(#=zZw88whnMhGnHAru48ffqA7Ij2nrG #=zEfLohd91up1T)
at #=zmM4K3eyhwDCz_UQVsRSQclH8VyEy17BbUwJmr06mFuOE.#=zNlXZ$QD7ytG2()
at #=zmM4K3eyhwDCz_UQVsRSQclH8VyEy17BbUwJmr06mFuOE.#=zTihx$Vg=()
at #=zrb$IqnzKCkb7G2bW6ptsbh5qe9MnV7VQOyUXZUJOqzkm.#=zrnVAUT8=(#=zbA9hs2Ynezupw4dYLPs9bmZhEp68 #=zzghhL38=, #=zwuvQBaqYw7QuVGUmnJZBf1NPrfQy9_fP9Q== #=zRUk5cvU=, #=zv5h22hZBZa0Qt9X$mg9EMbfPSkUPLfGJ61EmdTdb9pBK #=zGom9tp8=)
at #=zRfihhgjPCnm$b5kMlaOBzGDNaAeYY7ZzKA==.#=z8iw6RLE=(Int32 #=ziKTYog4=, IList`1 #=z$Asybwlv8WbXbnEvfw==, #=zm50kPmY8EzhX #=z$VUoXo8=)
at #=zRfihhgjPCnm$b5kMlaOBzGDNaAeYY7ZzKA==.#=zrnVAUT8=()
at #=zgYXDQ2B8nLM0raE9CjYvdchxB3my.#=zSN61mE_yGcU7(Document #=zzghhL38=, #=zv5h22hZBZa0Qt9X$mg9EMbfPSkUPLfGJ61EmdTdb9pBK& #=zwKRLdyoQONbDqiiC2g==, UnifiedSaveOptions #=zRPLrpq0=, Int32& #=zhjEhxZHloN7q, Boolean #=zOGP4Jc0=)
at #=zmoTkc_nXqxzXieGE_k_2sso=.#=zq4O5p8g=(Document #=zzghhL38=, String #=zM5HhAjsJP$gBhk3nCw==, Stream #=zx7C48ht$WPxb, HtmlSaveOptions #=zRPLrpq0=)
at Aspose.Pdf.Document.#=zoj1G29hyRAU7(String #=zbWykx4fq_PsE, SaveOptions #=zRPLrpq0=)
at Aspose.Pdf.Document.Save(String outputFileName, SaveOptions options)
at SampleApp1.Controllers.ValuesController.SendFile(IFormFile file) in /src/SampleApp1/Controllers/ValuesController.cs:line 64
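
Note that the trace reports an ArgumentNullException raised by Dictionary.TryGetValue, not a NullReferenceException: somewhere inside the library a dictionary lookup is made with a null key. The following minimal sketch reproduces that exception type and parameter name using only the .NET base class library. NullKeyDemo and the dictionary contents are hypothetical and say nothing about which key Aspose.PDF is actually looking up.

```csharp
using System;
using System.Collections.Generic;

public static class NullKeyDemo
{
    // Returns the exception type name and parameter name produced by a null-key lookup.
    public static string Describe()
    {
        var map = new Dictionary<string, int> { { "a", 1 } }; // hypothetical contents
        string key = null;
        try
        {
            int value;
            map.TryGetValue(key, out value); // a null key is rejected before any lookup happens
            return "no exception";
        }
        catch (ArgumentNullException ex)
        {
            return ex.GetType().Name + ":" + ex.ParamName;
        }
    }
}
```

Calling NullKeyDemo.Describe() returns "ArgumentNullException:key", which matches the first two lines of the trace above.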

@VenkateshBT

Thanks for contacting support.

I have looked into your issue and would like to inform you that I have created a ticket with ID PDFNET-47665 in our issue tracking system to investigate and resolve this issue as soon as possible.

Any updates on this issue please?

@VenkateshBT

Please note that the issue was added to our issue tracking system only recently. In the Aspose.PDF forum, issues are selected for investigation on a first-come, first-served basis. First priority for scheduling and resolution is given to paid Enterprise and Priority support customers; issues from normal (free) support customers are then scheduled and resolved on a first-come, first-served basis. We will share further information with you as soon as the issue is resolved.

Just to inform you, we have been using the licensed version (SITE OEM).

@VenkateshBT,

I have noted your comments. If you are entitled to priority support, then please visit the Paid Support Helpdesk to get your issue resolved as soon as possible.

We are actually seeing the Document ctor fail with a very similar stack trace.

The code we used is:

MemoryStream inputStream = new MemoryStream(Encoding.UTF8.GetBytes(html));
Aspose.Pdf.HtmlLoadOptions options = new Aspose.Pdf.HtmlLoadOptions
{
    PageInfo = new PageInfo
    {
        Margin = new MarginInfo { Left = 0, Right = 0, Top = 0, Bottom = 0 },
    }
};

Document doc = new Document(inputStream, options);

Is there any update?

@benoisc1,

I would like to inform you that the issue has been logged in our issue tracking system. We will share good news with you soon. I request your patience.