Free Support Forum - aspose.com

Problems with Text absorber with pdf generated by Amyuni pdf converter

We use Aspose.PDF for .NET with TextAbsorber to extract text from specific regions of a PDF page. We have problems with a customer who produces PDFs from an AS/400 using software (Laser 400) based on Amyuni PDF Converter. We have two categories of files with errors. In the first (Gel_Bolla_ven.pdf), TextAbsorber seems to find no text (at some points in the debugger we see only a series of \0 characters extracted). In the second (Gel_Ord_For.pdf), we find the fixed text (these are invoices generated from a template filled with specific values in the fields) but not the values.
We use the absorber with a rectangle; we also tried a rectangle that covers the whole page:

                var absorber = new TextAbsorber
                {
                    TextSearchOptions =
                    {
                        LimitToPageBounds = true,
                        Rectangle = new Aspose.Pdf.Rectangle(left, pdf.PageInfo.Height - top, right, pdf.PageInfo.Height - bottom)
                    }
                };
                // Accept the absorber for page (1-based)
                pdf.Pages[nPage + 1].Accept(absorber);
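
The rectangle in the snippet flips the vertical coordinates by subtracting them from the page height. A minimal sketch of that conversion in plain C# (no Aspose types), assuming `left`, `top`, `right` and `bottom` are measured from the top-left corner of the page. The helper name `ToPdfCorners` is hypothetical. It also orders the corners so that the lower-left one really is the smaller pair, which the inline expression does not guarantee:

```csharp
using System;

public static class RegionCoordinates
{
    // Converts a region measured from the top-left corner of the page
    // (y grows downwards) into PDF user-space corners (origin at bottom-left).
    // Returns the corners as (llx, lly, urx, ury).
    public static (double Llx, double Lly, double Urx, double Ury) ToPdfCorners(
        double left, double top, double right, double bottom, double pageHeight)
    {
        double y1 = pageHeight - top;
        double y2 = pageHeight - bottom;
        return (Math.Min(left, right), Math.Min(y1, y2),
                Math.Max(left, right), Math.Max(y1, y2));
    }

    public static void Main()
    {
        // Example values only: a page 842 points high, region 100..150 points from the top.
        var r = ToPdfCorners(50, 100, 300, 150, 842);
        Console.WriteLine($"{r.Llx}, {r.Lly}, {r.Urx}, {r.Ury}"); // 50, 692, 300, 742
    }
}
```

The four returned values could then be passed, in that order, to the search rectangle.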
We also tried setting the options

but with no result
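
To tell the two failure modes apart when running over many files, a small check on the extracted string may help. This is only a sketch in plain C#; the helper name `IsEffectivelyEmpty` is hypothetical and not part of Aspose.PDF. It treats a result as empty when it contains nothing but NUL characters and whitespace, which matches the series of \0 described above:

```csharp
using System;
using System.Linq;

public static class ExtractionChecks
{
    // True when the string has no usable characters: it is null, empty,
    // or made up only of NUL ('\0') and whitespace characters.
    public static bool IsEffectivelyEmpty(string text)
    {
        return string.IsNullOrEmpty(text)
            || text.All(c => c == '\0' || char.IsWhiteSpace(c));
    }

    public static void Main()
    {
        Console.WriteLine(IsEffectivelyEmpty("\0\0\0\r\n"));    // True
        Console.WriteLine(IsEffectivelyEmpty("Invoice 123"));  // False
    }
}
```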
Gel_Bolla_Ven.pdf (544.4 KB)
Gel_Ord_For.pdf (517.6 KB)


Can you give me a code snippet that I can run, please?

You are missing some lines in order for me to run your code and replicate the issue.

Here it is, thanks

Program.zip (590 Bytes)


I tried TextAbsorber and TextFragmentAbsorber, but neither worked properly on this PDF document. I will create a ticket for the dev team.

This is the code I used to read it. I drew a rectangle on top of the text just to check whether the coordinates were the correct ones.

private void Logic()
{
    Document doc = new Document($"{PartialPath}_input.pdf");

    var page = doc.Pages[1];

    // Extract plain text from the region
    var ta = new TextAbsorber();
    ta.TextSearchOptions.Rectangle = new Aspose.Pdf.Rectangle(310, 630, 210, 55);
    ta.TextSearchOptions.LimitToPageBounds = true;
    ta.TextSearchOptions.SearchForTextRelatedGraphics = false;
    page.Accept(ta);

    Console.WriteLine($"Text: {ta.Text}");

    // Extract individual text fragments from the same region
    var tfa = new TextFragmentAbsorber();
    tfa.TextSearchOptions.Rectangle = new Aspose.Pdf.Rectangle(310, 630, 210, 55);
    tfa.TextSearchOptions.LimitToPageBounds = true;
    tfa.TextSearchOptions.SearchForTextRelatedGraphics = false;
    page.Accept(tfa);

    int count = 0;
    foreach (var fragment in tfa.TextFragments)
    {
        Console.WriteLine($"Frag {count}: {fragment.Text}");
        count++;
    }

    // Draw a red rectangle over the region to check the coordinates visually
    var pageInfo = page.PageInfo;
    var marginInfo = page.PageInfo.Margin;
    var graph = new Graph((float)pageInfo.Width, (float)pageInfo.Height);
    graph.Left = marginInfo.Left * -1;
    graph.Top = marginInfo.Top * -1;
    var rectangle = new Aspose.Pdf.Drawing.Rectangle(310, 630, 210, 55);
    rectangle.GraphInfo.FillColor = Aspose.Pdf.Color.Red;
    rectangle.GraphInfo.Color = Aspose.Pdf.Color.Black;
    graph.Shapes.Add(rectangle);
    page.Paragraphs.Add(graph);

    // Save output PDF document (the output file name here is an assumption)
    doc.Save($"{PartialPath}_output.pdf");
}
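
One thing worth checking when comparing the search region with the drawn one: as far as I know, `Aspose.Pdf.Rectangle` takes its arguments as corners (llx, lly, urx, ury), while `Aspose.Pdf.Drawing.Rectangle` takes (left, bottom, width, height), so the same four numbers may describe different areas in the two types. A minimal sketch of the conversion in plain C#, with a hypothetical helper name `ToCorners` and no Aspose dependency:

```csharp
using System;

public static class RectangleConventions
{
    // Converts a rectangle given as (left, bottom, width, height)
    // into corner form (llx, lly, urx, ury).
    public static (double Llx, double Lly, double Urx, double Ury) ToCorners(
        double left, double bottom, double width, double height)
    {
        return (left, bottom, left + width, bottom + height);
    }

    public static void Main()
    {
        // The drawing rectangle used above: left 310, bottom 630, width 210, height 55.
        var c = ToCorners(310, 630, 210, 55);
        Console.WriteLine($"{c.Llx}, {c.Lly}, {c.Urx}, {c.Ury}"); // 310, 630, 520, 685
    }
}
```

Under that assumption, a search rectangle covering the same area as the drawn one would use the corner values 310, 630, 520, 685.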

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-53952

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.