"Object reference not set to an instance of an object" error when converting PDF to HTML

When converting a PDF to HTML, some PDF documents cause an error when calling the Save method.

An example of the exception returned.

Object reference not set to an instance of an object.

at .( , TTFFont , , GlyphID[] , )
at . ( , , )
at .( , List`1 )
at .( , List`1 )
at . ( , List`1 )
at . (​ , , )
at .(Document , & , UnifiedSaveOptions , Int32& )
at .(Document , String , Stream , HtmlSaveOptions )
at Aspose.Pdf.Document.Save(String outputFileName, SaveOptions options)
at pdftohtmlinspector.PdfInspector.ProcessSelectedPDF()

I just finished testing with version 17.8.0 of Aspose.Pdf.dll and the issue is still reproducible.

I have attached 2 PDFs that may demonstrate the issue.
win7error.pdf causes an error on Windows 7 Professional. The stack trace above came from that document.

prodError.PDF causes an error in the production environment. I am not sure what the OS is there, but I believe it is Windows Server 2008 R2.
This document returns the following error:

Object reference not set to an instance of an object.

at Aspose.Fonts.CFF.CFFFontMetrics.get_FontMatrix()
at Aspose.Fonts.CFF.CFFFontMetrics.get_UnitsPerEM()
at . ( , , )
at .( , List`1 )
at .( , List`1 )
at . ( , List`1 )
at . ( , , )
at .(Document , & , UnifiedSaveOptions , Int32& )
at .(Document , String , Stream , HtmlSaveOptions )
at Aspose.Pdf.Document.Save(String outputFileName, SaveOptions options)

Neither of these documents causes an error on Windows 10.

prodError.PDF (386.3 KB)
win7error.pdf (845.0 KB)

@bhgiq,
Both errors appear to be environment-specific. Please share the code snippet, the .NET Framework version, the application type, the installed Chinese fonts, the local language settings, and any other information that could help us replicate the errors in our environment. We will investigate and share our findings with you. We await your response.

Best Regards,
Imran Rafique

I have attached PDFinspector.zip.
This file contains a Windows form and the related support files.
It is written for .NET Framework 4.0.
You should be able to add it to a project as needed. I created it with VS2017.

In addition, there are 2 screenshots in the zip file: one from the Win7 workstation and the other from the Win10 workstation.
Both show the results of processing the same PDF I attached yesterday (win7error.pdf).
The win7 workstation has SimHei and Symbol fonts installed.
The win10 workstation does not, yet it processes the PDF without error.

Neither machine has Helvetica fonts.
I do not have access to the production server to get a font list.
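
Since the installed fonts differ between the machines, one way to compare them is to dump the font file names on each machine and diff the lists. Below is a minimal Python sketch, not part of the attached project. It assumes fonts live in the default `Fonts` folder under the Windows directory, so per-user font folders are not covered, and the function names are purely illustrative. Note that file names are not font family names; SimHei, for example, is typically stored as `simhei.ttf`.

```python
import os
from pathlib import Path

# File extensions treated as fonts in this sketch (an assumption, not exhaustive).
FONT_EXTENSIONS = {".ttf", ".ttc", ".otf", ".fon"}


def list_font_files(font_dir):
    """Return the lower-cased names of font files found directly in font_dir."""
    folder = Path(font_dir)
    if not folder.is_dir():
        return set()
    return {
        entry.name.lower()
        for entry in folder.iterdir()
        if entry.is_file() and entry.suffix.lower() in FONT_EXTENSIONS
    }


def diff_font_sets(machine_a, machine_b):
    """Return (only_on_a, only_on_b) as two sorted lists of file names."""
    return sorted(machine_a - machine_b), sorted(machine_b - machine_a)


if __name__ == "__main__":
    # Assumed default location; adjust if Windows is installed elsewhere.
    windir = os.environ.get("WINDIR", r"C:\Windows")
    for name in sorted(list_font_files(Path(windir) / "Fonts")):
        print(name)
```

Running the script on each machine and saving the output gives two lists that can be loaded into sets and passed to `diff_font_sets` to see which font files exist on only one of them.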

PDFinspector.zip (215.5 KB)

@bhgiq

Thanks for providing details.

We have tested the scenario on Windows 7 EN 64-bit and observed the same exception while converting one of your shared PDFs (win7error.pdf). For the sake of detailed investigation, we have logged an issue as PDFNET-43190 in our issue tracking system.

However, we did not observe any exception while converting the other PDF document (prodError.pdf) with the same application, although some images were missing in the resultant HTML. For your reference, I have attached a screenshot of the test case as well.

PDF2HTML.png (66.4 KB)

We have logged the missing-images issue as PDFNET-43191 in our issue tracking system. We will further investigate both issues and keep you updated on the status of their correction. Furthermore, we will test the scenario on Windows Server 2008 R2 as well and share our findings with you. Please be patient and allow us a little time.

I am following up on this issue as well as a new issue.
Using the PDF document attached, sample3.pdf, and using the same code sample provided previously, we get the following error.
The errors have become a real issue and the lack of a resolution is having an impact on the business.
We really need to get a fix in place for these issues.

Thanks,
Patrick

Parameter is not valid.

at System.Drawing.Bitmap..ctor(Int32 width, Int32 height, PixelFormat format)
at ..ctor( , )
at ​.( , )
at . (​ , , , , )
at ​.(​ )
at ​ .( , , )
at .( )
at . ( , List`1 )
at . (​ , , , )
at .( , )
at . ( )
at .( )
at . ( , List`1 )
at . (​ , , , )
at . ( )
at .( )
at . ( , List`1 )
at . (​ , , )
at .(Document , & , UnifiedSaveOptions , Int32& )
at .(Document , String , Stream , HtmlSaveOptions )
at Aspose.Pdf.Document.Save(String outputFileName, SaveOptions options)

sample3.pdf (464.1 KB)

@bhgiq

Thanks for contacting support.

I have tested the scenario and managed to replicate the issue in the specified environment (i.e. Windows 7 EN x64). Therefore, I have logged an issue as PDFNET-43223 in our issue tracking system and linked it with this forum thread. We will look further into the details of the issue and keep you informed of the status of its correction. Please be patient and allow us a little time.

We are sorry for the inconvenience.

@bhgiq

Thanks for your patience.

Our product team has further investigated the earlier logged issue, PDFNET-43191. As per their findings, the issue is not related to our API but to IE8 on Windows 7. Please note that the default version of IE installed on Windows 7 is IE8, and the WebBrowser application control uses the installed version of IE by default to render HTML content on Windows.

The Aspose.Pdf API generates one background image for all images in the PDF and embeds it into the HTML code (which is rendered in the WebBrowser control) as a base64 URI string. IE8 restricts the size of such URIs. In this case, the base64 URI for the first page is larger than 32 KB, which is the maximum URI size supported by IE8.
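
One way to check whether a generated HTML file would hit this limit is to scan it for base64 data URIs and report those longer than 32 KB. Below is a minimal Python sketch of that idea. The function name, the regular expression, and the treatment of the limit as 32 * 1024 characters are assumptions made for illustration. The pattern only matches simple `data:<type>;base64,` URIs without extra parameters, and the script is not an Aspose tool.

```python
import re
import sys

# 32 KB cap, as cited above for IE8 (interpreted here as 32 * 1024 characters).
IE8_DATA_URI_LIMIT = 32 * 1024

# Matches simple base64 data URIs such as data:image/png;base64,iVBOR...
DATA_URI_PATTERN = re.compile(r"data:[\w/+.-]+;base64,[A-Za-z0-9+/=]+")


def find_oversized_data_uris(html, limit=IE8_DATA_URI_LIMIT):
    """Return (offset, length) for every base64 data URI longer than limit."""
    return [
        (match.start(), len(match.group(0)))
        for match in DATA_URI_PATTERN.finditer(html)
        if len(match.group(0)) > limit
    ]


if __name__ == "__main__":
    # Usage: python check_data_uris.py page1.html [page2.html ...]
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            hits = find_oversized_data_uris(handle.read())
        for offset, length in hits:
            print(f"{path}: data URI at offset {offset} is {length} characters")
```

If the script reports any hits for a page, that page is likely to lose its images when rendered by IE8 or by a WebBrowser control backed by it.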

We ran some experiments: we generated a separate HTML file and opened it in both IE8 and Chrome on the target Windows 7 machine. The file opened correctly in Chrome, but IE8 showed the same problem as the WinForms application. We therefore suggest updating IE on the target OS. For example, after updating to IE11 on Windows 7, the application should start working correctly.

As for the other logged issues, we will let you know once we have definite updates on their resolution. Please allow us a little time.

The issues you have found earlier (filed as PDFNET-43190) have been fixed in Aspose.PDF for .NET 18.6. This message was posted using BugNotificationTool from Downloads module by asad.ali