PDF to HTML is not properly converting images and URLs

Hello
Test PDFs.zip (1.8 MB)

We have been using Aspose PDF v20.5, and for some PDF documents that contain images and/or URLs the output HTML seems broken. I have also tried the latest version of the DLL and it has the same issue. I have attached the PDFs I am using and pasted the code below. Please let me know what we can do to fix this issue.

private string PDFToHTML(string filePath)
{
    Aspose.Pdf.License license = new Aspose.Pdf.License();
    // Apply a license using the embedded resource name.
    // license.SetLicense("Aspose.Pdf.lic");

    Document doc = new Document(filePath);

    HtmlSaveOptions htmlOptions = this.GetHtmlOptions();

    using (var output = new MemoryStream())
    {
        doc.Save(output, htmlOptions);
        var html = Encoding.UTF8.GetString(output.GetBuffer(), 0, (int)output.Length);
        return html;
    }
}

private HtmlSaveOptions GetHtmlOptions()
{
    HtmlSaveOptions htmlOptions = new HtmlSaveOptions
    {
        FixedLayout = true,
        CompressSvgGraphicsIfAny = false,
        SaveTransparentTexts = true,
        SaveShadowedTextsAsTransparentTexts = true,
        FontSavingMode = HtmlSaveOptions.FontSavingModes.DontSave,
        // DefaultFontName = "Comic Sans MS",
        UseZOrder = true,
        LettersPositioningMethod = HtmlSaveOptions.LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss,
        PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml,
        RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground,
        SplitIntoPages = false
    };
    return htmlOptions;
}

Thanks
Vijay

@courtneye

Can you please confirm whether you mean these white lines in the image inside the HTML?
image.jpg (44.4 KB)

Yes, those are the lines. Also, some characters were being replaced by some Unicode character.

@courtneye

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-56654

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.

Thank you Asad. We will look into Paid Support Services.

Hi Asad

We are planning to go ahead and get Paid Support Services. What would be the estimated time for the fix? Our customer would like to know.

Thanks
Vijay

@courtneye

With the paid support option, your issue will be raised to the highest priority and you will receive quicker updates on its investigation as well as the ETA. Once you purchase it, you will be able to log in to the paid support forum with the same email address used for the purchase.

Please allow us to perform some investigation so that we can share how soon it can be resolved in case you go with the paid support option.

Hi Ali

We have purchased the paid support option. Do you want me to create a new ticket on the paid support forum, or can you move the ticket there?

Thanks

@courtneye

You do not need to describe the whole issue there. Please create a topic there with a reference to the ticket ID and it will be escalated to the highest priority.

@courtneye

Right now the best option is to use Aspose.Pdf.Drawing (latest version) instead of Aspose.PDF for .NET. We are working on the issue and need to investigate it further.