Free Support Forum - aspose.com

Include current page object in CallBackGetHocr delegate

Hello,

Currently, the callback delegate Aspose.Pdf.Document.CallBackGetHocr takes only one parameter, the current page image:

delegate string Aspose.Pdf.Document.CallBackGetHocr(System.Drawing.Image img)

Can you please add an overload to include the current page object, e.g.:

delegate string Aspose.Pdf.Document.CallBackGetHocr(System.Drawing.Image img, Aspose.Pdf.Page page)

This would be useful to allow access to further information about the page being processed.

Thanks

Even just the page number of the image being passed would be OK, e.g.:

delegate string Aspose.Pdf.Document.CallBackGetHocr(System.Drawing.Image img, int pageNumber)

@ast3

We will look into your requirements and investigate their feasibility. Could you please share the complete code snippet you are using, so that we can log a feature request accordingly?

Thanks, Asad.

I want to be able to do something like this:

private void ConvertPdf()
{
    var pdf = new Aspose.Pdf.Document("input.pdf");
    pdf.Convert(GetHocr);
}

private string GetHocr(System.Drawing.Image img)
{
    Console.WriteLine("Processing image on page ??"); // How to show current page?
    string hocr = GenerateHocr(img); // function that returns HOCR of image
    return hocr;
}

All this needs is an overload of the CallBackGetHocr delegate that provides the page number. The GetHocr method could then be:

private string GetHocr(System.Drawing.Image img, int pageNumber)
{
    Console.WriteLine("Processing image on page " + pageNumber);
    string hocr = GenerateHocr(img); // function that returns HOCR of image
    return hocr;
}

Or it could provide the Page object instead:

private string GetHocr(System.Drawing.Image img, Aspose.Pdf.Page page)
{
    Console.WriteLine("Processing image on page " + page.Number);
    string hocr = GenerateHocr(img); // function that returns HOCR of image
    return hocr;
}

This would also allow skipping images from certain pages, for example:

private string GetHocr(System.Drawing.Image img, int pageNumber)
{
    // Only generate HOCR for pages 1-3
    if (pageNumber < 4)
    {
        string hocr = GenerateHocr(img); // function that returns HOCR of image
        return hocr;
    }
    else
    {
        return "";
    }
}

@ast3

Thanks for providing further details.

We have logged a feature request as PDFNET-48540 in our issue tracking system. We will investigate the feasibility of your requirements and keep you informed about the status of their implementation. Please be patient and allow us some time.