Problems converting PDF (containing images) to HTML

damong · April 21, 2015, 9:56pm

I’ve upgraded our version of Aspose PDF for .Net recently and am now seeing problems when converting a PDF document to HTML. This only happens for some PDF documents - the ones I have noticed it happening on are documents that have been scanned to PDF (i.e. full page images).

The exception is:

System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.ArgumentException: Parameter is not valid.
   at System.Drawing.Bitmap..ctor(Int32 width, Int32 height, PixelFormat format)
   at .?..ctor(Int32 , Int32 , Single , Single , PixelFormat )
   at ?.?.?( , )
   at ?.?.Convert( , ? )
   at ??.a?.(? , Dictionary`2 )
   at ..?(? )
   at ..?(? )
   at ..?(? )
   at .?.Convert(ArrayList , ? , ? )
   at ?.?.Convert(String , ? , IList`1 , a , )
   at ?..?(Document , String , Stream , HtmlSaveOptions )
   at Aspose.Pdf.Document.Save(Stream outputStream, SaveOptions options)

Does anyone have any pointers as to what may be needed to fix this? It used to work in the older version of the Aspose library.

Our PDF to HTML conversion code is fairly boilerplate (snippet provided below):

var doc = new Document(source);
var borderStyle = new SaveOptions.BorderPartStyle
{
    LineType = SaveOptions.HtmlBorderLineType.Dotted,
    Color = System.Drawing.Color.Gray
};
var options = new HtmlSaveOptions
{
    SpecialFolderForAllImages = _imagesDir,
    SplitCssIntoPages = false,
    SplitIntoPages = false,
    CustomResourceSavingStrategy = SaveFontsAndImages,
    CustomCssSavingStrategy = WriteCssToFolder,
    CustomStrategyOfCssUrlCreation = GetCssUrl,
    LettersPositioningMethod = HtmlSaveOptions.LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss,
    FontSavingMode = HtmlSaveOptions.FontSavingModes.SaveInAllFormats,
    HtmlMarkupGenerationMode = HtmlSaveOptions.HtmlMarkupGenerationModes.WriteAllHtml,
    PageBorderIfAny = new SaveOptions.BorderInfo(borderStyle),
    FixedLayout = true,
    RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground
};
doc.Save(destination, options);
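
The exception reported above is a TargetInvocationException wrapping the ArgumentException thrown by the System.Drawing.Bitmap constructor, so the inner exception is the one worth logging. A minimal sketch of unwrapping it follows. The ExceptionDiagnostics.Unwrap helper is a hypothetical name introduced here for illustration, and the throw statement is only a stand-in for the doc.Save(destination, options) call so that the sample runs without Aspose.Pdf.

```csharp
using System;
using System.Reflection;

static class ExceptionDiagnostics
{
    // Hypothetical helper: walks down through TargetInvocationException
    // wrappers until it reaches the exception that was originally thrown.
    public static Exception Unwrap(Exception ex)
    {
        while (ex is TargetInvocationException && ex.InnerException != null)
        {
            ex = ex.InnerException;
        }
        return ex;
    }
}

static class Program
{
    static void Main()
    {
        try
        {
            // Stand-in for doc.Save(destination, options); simulates the
            // wrapped failure seen in the reported stack trace.
            throw new TargetInvocationException(
                new ArgumentException("Parameter is not valid."));
        }
        catch (Exception ex)
        {
            Exception root = ExceptionDiagnostics.Unwrap(ex);
            // Log the type and message of the underlying failure.
            Console.WriteLine(root.GetType().FullName + ": " + root.Message);
        }
    }
}
```

Logging the unwrapped exception per input file may make it easier to confirm which documents trigger the failure.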

tilal.ahmad · April 22, 2015, 1:11am

Hi Damon,

Thanks for your inquiry. Please share your sample PDF document here so that we can look into it and guide you accordingly.

We are sorry for the inconvenience caused.

Best Regards,

damong · April 22, 2015, 2:06am

Here is a sample PDF that causes it.

I have found that it is pretty consistent that PDFs created by scanning cause it whilst others do not seem to.

codewarior · April 22, 2015, 7:48am

Hi Damon,

Thanks for sharing the resource files.

I have tested the scenario and am able to reproduce the problem. I have logged it in our issue tracking system as PDFNEWNET-38560. We will investigate this issue in detail and keep you updated on the status of a fix.

We apologize for the inconvenience.

damong · May 25, 2015, 10:08pm

Do you have an update on this issue?
We can't release the next version of our software until this is fixed.

tilal.ahmad · May 25, 2015, 11:41pm

Hi Damon,

Thanks for your inquiry. I am afraid your reported issue is still not resolved. Its investigation is pending because other issues, reported earlier, are already being investigated and resolved. However, we have recorded your concern and will notify you as soon as we make significant progress towards a resolution.

We are sorry for the inconvenience caused.

Best Regards,

damong · August 6, 2015, 10:54pm

Has there been any progress on this issue? We cannot upgrade until it is fixed, and it has been some time with no response.

tilal.ahmad · August 7, 2015, 8:42am

Hi Damon,

Thanks for your inquiry. I am afraid your reported issue is still not resolved. Our product team is currently busy resolving other issues in the queue that were reported earlier. We will notify you as soon as we make significant progress towards a resolution.

We are sorry for the inconvenience caused.

Best Regards,

damong · December 7, 2015, 6:30pm

Is there a fix for this yet, or even a workaround for converting these PDF files to HTML?

We have waited a long time for this fix and it is becoming a critical issue. If Aspose PDF is not able to convert PDF files to HTML then we will have to start looking elsewhere for a working solution.

Any help provided would be appreciated.

tilal.ahmad · December 8, 2015, 2:44am

Hi Damon,

Thanks for your inquiry. The issue you reported above has been fixed as a result of another related fix. Please download and try the latest version of Aspose.Pdf for .NET; it should resolve the problem.

Please feel free to contact us for any further assistance.

Best Regards,