GoToURIAction destination encoding issues

Hi Aspose team,

I have a PDF with some text and a few links. When I open the document in Adobe Reader, I see a URL containing “Démo estFolder”. When I use the attached project, Aspose.PDF returns “/Démo%20estFolder2”.

What should I do to get what is really written in the document? Can I influence the encoding done by Aspose.PDF?

Aspose.Bugs.PDF.Encoding.zip (77.2 KB)

Thanks,
Oliver

@dr_oli

Thank you for contacting support.

Please always share SSCCE code so that we can investigate efficiently. We used the code below, and the extracted result is almost similar to what appears in Adobe Acrobat. We have attached a comparison screenshot for your kind reference. Comparison.PNG

// Open the document
Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(@"D:\AsposeFiles\Aspose.Bugs.PDF.Encoding\Aspose.Bugs\bin\Debug\1.pdf");
foreach (Aspose.Pdf.Page objPage in pdfDocument.Pages)
{
    foreach (Aspose.Pdf.Annotations.Annotation objAnnotation in objPage.Annotations)
    {
        // Only link annotations can carry a URI action
        var objLinkAnnotation = objAnnotation as Aspose.Pdf.Annotations.LinkAnnotation;
        if (objLinkAnnotation == null)
            continue;

        // Skip links whose action is not a GoToURIAction (or is missing)
        var objURIAction = objLinkAnnotation.Action as Aspose.Pdf.Annotations.GoToURIAction;
        if (objURIAction == null)
            continue;

        // Check for a missing URI before converting it, so a null value cannot throw
        string address = objURIAction.URI?.ToString() ?? "n.a";
        Console.WriteLine(address);
    }
}

Hi Farhan,

What is SSCCE?

Your screenshot shows exactly what I am saying: in Adobe Acrobat you see Démo and Aspose.PDF is returning Démo.

You cannot say the results are almost similar :wink: they are either the same or they are not.

If this is a bug, a fix would be needed. If you say that you are reading the encoding differently, then the question is which options can be set in Aspose.PDF so that the encoding is interpreted properly.

Thx,
Oliver

@dr_oli

Thanks for writing back.

We are looking into the scenario again and will get back to you shortly.

Hi Asad,

Any news here? Is this a bug, or will you have a workaround for handling the encoding of the PDF content properly?

Thx,
Oliver

@dr_oli

Thank you for elaborating it further.

About SSCCE (a Short, Self-Contained, Correct Example), we had hyperlinked the term in our earlier reply for your kind reference. About the encoding issue, we have also tried the HttpUtility.HtmlDecode method, but the problem persists. Therefore, a ticket with ID PDFNET-46738 has been logged in our issue management system for further investigation and resolution. The ticket ID has been linked with this thread so that you will receive a notification as soon as the ticket is resolved.
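
Until the ticket is resolved, a possible stopgap, offered only as a sketch: the returned text looks like UTF-8 bytes that were read as ISO-8859-1 (so "é" shows up as "Ã©"), with the space still percent-encoded as %20. HtmlDecode only handles HTML entities, so it cannot undo either of those. The helper below assumes that diagnosis is correct. UriTextRepair and Repair are hypothetical names made up for illustration and are not part of the Aspose.PDF API; only standard .NET calls are used.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper, not part of Aspose.PDF.
public static class UriTextRepair
{
    // Assumption: 'raw' is UTF-8 text that was mis-decoded as ISO-8859-1
    // and may still contain percent-escapes such as %20.
    public static string Repair(string raw)
    {
        if (string.IsNullOrEmpty(raw))
            return raw;

        string text = raw;

        // The round trip is only possible when every char fits in one Latin-1 byte.
        if (raw.All(c => c <= 0xFF))
        {
            try
            {
                // Recover the original bytes, then decode them as UTF-8.
                byte[] bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(raw);

                // Strict decoder: throws instead of silently substituting
                // characters when the bytes are not valid UTF-8.
                var strictUtf8 = new UTF8Encoding(false, true);
                text = strictUtf8.GetString(bytes);
            }
            catch (DecoderFallbackException)
            {
                // Not mis-decoded UTF-8 after all; keep the input unchanged.
                text = raw;
            }
        }

        // Turn percent-escapes (for example %20) back into characters.
        return Uri.UnescapeDataString(text);
    }
}
```

With the value from this thread, UriTextRepair.Repair("/Démo%20estFolder2") should give "/Démo estFolder2". A string that is already correct, such as "/Démo", fails the strict UTF-8 check and is returned with only its percent-escapes decoded.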

We are sorry for the inconvenience.

Hi,

I understand the first come, first served approach, but is there any progress regarding the encoding here? I doubt that this is a huge problem.

Thx,
Oliver

@dr_oli

Thank you for getting back to us.

We understand your concerns and realize the significance of this issue. Please note that not every encoding issue has the same cause, as scenarios vary from PDF to PDF. We have recorded your concerns and escalated the issue internally. We will try to schedule it soon and will share our findings with you. Please spare us some time.