Getting Exception While Converting PDF to Searchable PDF


#1

at System.Drawing.Bitmap.SetResolution(Single xDpi, Single yDpi)
at Aspose.Pdf.ImagePlacement.Save(Stream stream, ImageFormat format)
at #=zjHi53l$6uGI3JvQ1j6D1IoBAvGn0NmjDf7rhl0WCwnjsgttVr9WXG3A=.#=zkNlYWlM=(CallBackGetHocr #=zI7htKj75yPRO, Document #=zpJhYYgQ=)
at Aspose.Pdf.Document.Convert(CallBackGetHocr callback)
at Rectify.Core.Utilities.OcrConverterWithAspose.ConvertPDFToSearchable(String inputFilePath, String outputFilePath) in C:\Users\kumar\Documents\rectify-service\Rectify\000-Shared Library\Rectify.Core\Utilities\OcrConverterWithAspose.cs:line 84

I am getting this exception while converting a PDF to a searchable PDF. This is the code I am using:

    public static String ConvertPDFToSearchable(String inputFilePath, String outputFilePath)
    {
        Document doc = new Document(inputFilePath);

        try
        {
            //doc.Validate("validation-result-A1A.xml", PdfFormat.PDF_A_1B);
            // Run OCR through the hOCR callback; this is the call shown in the stack trace above.
            doc.Convert(CallBackGetHocrWidthDLL);
            doc.Save(outputFilePath);
        }
        catch (Exception ex)
        {
            try
            {
                // Fallback: render the PDF to a temporary TIFF and run OCR on that instead.
                String TempPath = String.Format("{0}/{1}.tiff", Path.GetTempPath(), DateTime.Now.ToFileTime());
                TiffHelper.PDFToTiffImageOnOCRError(inputFilePath, TempPath);
                ConvertTiffToSearchable(TempPath, outputFilePath);
                File.Delete(TempPath);
            }
            catch (Exception innerEx)
            {
                // The fallback failed as well; log it and continue.
                LOGMANAGER.LogsManager.Logger.LogProcessInfo(typeof(OcrConverterWithAspose)
                                          , COREDATAMODELS.LogLevel.SubComponent
                                          , nameof(OcrConverterWithAspose)
                                          , nameof(ConvertPDFToSearchable)
                                          , null
                                          , String.Format("Exception occurred: {0}", innerEx.Message)
                                          , COREDATAMODELS.Stage.End);
            }

            // Log the original conversion failure. The exception is not rethrown.
            LOGMANAGER.LogsManager.Logger.LogProcessInfo(typeof(OcrConverterWithAspose)
                                           , COREDATAMODELS.LogLevel.Component
                                           , nameof(OcrConverterWithAspose)
                                           , nameof(ConvertPDFToSearchable)
                                           , null
                                           , String.Format("Exception occurred: {0}", ex.Message)
                                           , COREDATAMODELS.Stage.End);
        }

        return outputFilePath;
    }

This occurs for one particular PDF, which I am attaching: Form NYS45 Complex RegEx Example.pdf (318.0 KB)


#2

@mdalam

Thank you for contacting support.

Would you please share SSCCE code so that we can try to reproduce and investigate the issue in our environment? Before sharing the requested data, please make sure you are using Aspose.PDF for .NET 19.6.


#3

I am sharing an SSCCE with which you can reproduce the same exception. Please find the sample attached.
demo.zip (6.2 MB)


#4

@mdalam

Thank you for sharing the requested data.

We have worked with the application you shared but were unable to observe any exception in our environment. It executes smoothly and produces a PDF document without problems. We have attached the generated file for your reference: output.pdf

We used Aspose.PDF for .NET 19.6 and Tesseract 3.3.0 on Windows 10 with .NET Core 2.2.


#5

Thanks for your valuable time. I was actually handling the exception: a try/catch was in place, which may be why you could not reproduce it. I am sharing the code without the try/catch. Can you please look into this?
Demo.zip (6.1 MB)
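
To illustrate the point about the try/catch, here is a minimal, self-contained sketch of the difference between swallowing an exception and rethrowing it. It does not use Aspose at all: `FailingConvert` is a hypothetical stand-in for the `doc.Convert(...)` call, and the exception type and message it throws are assumptions chosen only for the demonstration.

```csharp
using System;

public static class OcrErrorHandlingSketch
{
    // Hypothetical stand-in for doc.Convert(...): always fails.
    // The exception type and message are illustrative assumptions.
    public static void FailingConvert()
    {
        throw new ArgumentException("simulated conversion failure");
    }

    // Logs and swallows the exception, so the caller sees a normal return.
    public static bool ConvertSwallowing(Action convert)
    {
        try
        {
            convert();
            return true;
        }
        catch (Exception ex)
        {
            Console.WriteLine("Logged and swallowed: " + ex.GetType().Name);
            return false;
        }
    }

    // Logs and rethrows with `throw;`, which keeps the original stack trace.
    public static void ConvertRethrowing(Action convert)
    {
        try
        {
            convert();
        }
        catch (Exception ex)
        {
            Console.WriteLine("Logged and rethrown: " + ex.GetType().Name);
            throw;
        }
    }

    public static void Main()
    {
        // No exception escapes here; only the log line shows that something failed.
        ConvertSwallowing(FailingConvert);

        try
        {
            // Here the failure reaches the caller and can be reproduced.
            ConvertRethrowing(FailingConvert);
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine("Caller saw: " + ex.Message);
        }
    }
}
```

With the swallowing variant, a run looks successful from the outside, which is consistent with the conversion appearing to work in the first sample.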


#6

@mdalam

Thank you for sharing the updated code.

We have been able to reproduce the ArgumentException, and a ticket with ID PDFNET-46561 has been logged in our issue management system for further investigation and resolution. The ticket ID has been linked with this thread so that you will receive a notification as soon as the ticket is resolved.

We are sorry for the inconvenience.


#7

Any update on the above issue?


#8

@dharam116

Please note that the ticket PDFNET-46561 has been logged under the free support model and will be investigated on a first come, first served basis. Therefore, it may take some months to resolve. As soon as we have a definite update or an ETA for the ticket resolution, we will let you know.

Moreover, we also offer Paid Support, where issues are investigated with higher priority. Customers with a paid support subscription report their issues there, and those issues are investigated urgently. If your reported issue is a blocker, please consider subscribing to Paid Support. For further information, please visit Paid Support FAQs.