Hello,
I use:
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(new TextEditOptions(TextEditOptions.NoCharacterAction.UseStandardFont));
info.Document.Pages[page].Accept(textFragmentAbsorber);
The result in the Text property is as follows:
textFragmentAbsorber.Text ==> " Fecha: 18/01/2012 Longitudinal, 6 Nº 117 Mercabarna"
But in textFragmentAbsorber.TextFragments the date is split across several fragments:
TextFragments.item(9) == 1
TextFragments.item(10) == 8
TextFragments.item(11) == /01/
TextFragments.item(12) == 20
Why is the text split this way?
@manel.gracia
The API extracts text in the form in which it was added to the PDF document. In other words, the extraction result depends on how the text is stored in the structure of the PDF file: if the producer wrote the date as several separate pieces, each piece comes back as its own fragment. Would you please share your sample PDF document with us so that we can test the scenario in our environment and address it accordingly?
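In the meantime, if you need adjacent fragments read as one string, one possible workaround is to join them yourself based on their positions. The following is only a minimal sketch and does not use the Aspose.PDF API: `Piece`, `FragmentMerger`, `maxGap` and `yTolerance` are hypothetical names introduced for illustration, and the thresholds are guesses you would need to tune for your documents. It assumes the pieces arrive in reading order.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for one extracted fragment: its text plus the left edge,
// right edge and baseline of its bounding box, in page units.
public sealed class Piece
{
    public string Text { get; }
    public double Left { get; }
    public double Right { get; }
    public double Baseline { get; }

    public Piece(string text, double left, double right, double baseline)
    {
        Text = text;
        Left = left;
        Right = right;
        Baseline = baseline;
    }
}

public static class FragmentMerger
{
    // Joins consecutive pieces that share a baseline (within yTolerance) and whose
    // horizontal gap is at most maxGap. Both defaults are illustrative assumptions.
    public static List<string> Merge(IEnumerable<Piece> pieces,
                                     double maxGap = 1.0,
                                     double yTolerance = 0.5)
    {
        var result = new List<string>();
        var current = new StringBuilder();
        bool hasPrevious = false;
        double previousRight = 0.0;
        double previousBaseline = 0.0;

        foreach (var piece in pieces)
        {
            bool sameRun = hasPrevious
                && Math.Abs(piece.Baseline - previousBaseline) <= yTolerance
                && piece.Left - previousRight <= maxGap;

            // A gap or a baseline change closes the run collected so far.
            if (!sameRun && current.Length > 0)
            {
                result.Add(current.ToString());
                current.Clear();
            }

            current.Append(piece.Text);
            previousRight = piece.Right;
            previousBaseline = piece.Baseline;
            hasPrevious = true;
        }

        if (current.Length > 0)
        {
            result.Add(current.ToString());
        }
        return result;
    }
}
```

With made-up coordinates, four touching pieces "1", "8", "/01/" and "20" on the same baseline would be joined into "18/01/20", while a piece further to the right would start a new string. In a real project the coordinates would have to be taken from each fragment's position data; please check the API reference for the exact property names.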
@manel.gracia
We tested the scenario in our environment and, as per our observations, the API is extracting the text as expected. However, we have logged an investigation ticket as PDFNET-48965 in our issue tracking system to investigate further and determine how to make the API extract the text the way you want. We will look into the ticket details and keep you posted on the status of its resolution. Please allow us some time.
We are sorry for the inconvenience.