Absorber.Visit() throws out of range error

When the code reaches absorber.Visit(pdfDocument.Pages[1]); it always gives me an error, such as "index out of range" or "internal error".

Hi there,


Thanks for your inquiry. We would appreciate it if you could share a sample document here. We will look into it and guide you accordingly.

We are sorry for the inconvenience caused.

Best Regards,

Hi Tilal,

I am facing the same problem when trying to extract the pages from the PDF.

Below is the sample code:

static void Main(string[] args)
{
    // Load the existing PDF file
    Document pdfDocument = new Document("sample5.pdf");

    // Create a TableAbsorber object to find tables
    TableAbsorber absorber = new TableAbsorber();

    // The index out of range error occurs on the Visit call below
    absorber.Visit(pdfDocument.Pages[2]);

    foreach (var table in absorber.TableList)
    {
        foreach (var row in table.RowList)
            foreach (var cell in row.CellList)
                foreach (var txt in cell.TextFragments)
                    Console.WriteLine(txt.ToString());
    }
}

Sample5.pdf is a multi-page PDF with embedded tables.

Thanks in advance,

Ravi

Hi Ravi,


Thanks for contacting support.

The Index Out Of Range exception occurs when you try to access a resource or page that does not exist in the PDF document. During my testing, when I tried to access the second page of a PDF file that had only a single page, a similar error message was displayed. However, since you state above that the input file is multi-page, this appears to be a document-specific issue.

Could you please share the input file you are using so that we can test the scenario again in our environment? We are really sorry for this inconvenience.
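To rule out the simple case described above, the page number can be checked against the document's page count before calling Visit. This is only a minimal sketch: the class name, the file name, and the requested page number are placeholders, and it assumes the 1-based Pages collection of Aspose.Pdf for .NET. It does not address a document-specific failure.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class PageRangeCheck
{
    static void Main(string[] args)
    {
        // Placeholder file name; substitute the document being tested.
        Document pdfDocument = new Document("sample5.pdf");

        // Hypothetical page to inspect.
        int pageNumber = 2;

        // Pages are numbered from 1 to Pages.Count, so anything outside
        // that range would fail before table extraction even starts.
        if (pageNumber < 1 || pageNumber > pdfDocument.Pages.Count)
        {
            Console.WriteLine("Page {0} does not exist; the document has {1} page(s).",
                pageNumber, pdfDocument.Pages.Count);
            return;
        }

        TableAbsorber absorber = new TableAbsorber();
        absorber.Visit(pdfDocument.Pages[pageNumber]);
        Console.WriteLine("Tables found on page {0}: {1}",
            pageNumber, absorber.TableList.Count);
    }
}
```

If the check passes and Visit still throws, the page index is not the cause and the input file itself needs to be examined.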

Hi Nayyer,

Thanks for your reply.

Below is the code I am using to visit the pages:

// Load the existing PDF file
Document pdfDocument = new Document("sample4.pdf");

// Create a TableAbsorber object to find tables
TableAbsorber absorber = new TableAbsorber();

absorber.Visit(pdfDocument.Pages[2]);

I am able to visit all the pages when I use TextFragmentAbsorber. The issue occurs only when I use TableAbsorber.

Please find attached the PDF that I used for extracting the pages.


Please also find below the stack trace that I observed:



at System.ThrowHelper.ThrowArgumentOutOfRangeException()
at System.Collections.Generic.List`1.get_Item(Int32 index)
at Aspose.Pdf.Text.TableAbsorber. (List`1 )
at Aspose.Pdf.Text.TableAbsorber.Visit(Page page)
at PDFAbsorber.Program.Main(String[] args) in c:\Users\rkumar\Documents\Visual Studio 2012\Projects\PDFAbsorber\PDFAbsorber\Program.cs:line 28
at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()

Hi Ravi,


Thanks for sharing the details.

I have tested the scenario and was able to reproduce the same problem. I have logged it as PDFNEWNET-39715 in our issue tracking system. We will look further into the details and keep you updated on the status of the fix. Please be patient and allow us a little time. We are sorry for this inconvenience.

Hi Nayyer,

Thanks for the info. I'll wait for your update on this.

Regards,
Ravi

I am experiencing the same issue regardless of the page #. Please let me know when there is a fix.

Hi Nayyer,


It seems that a lot of people are facing the same problem. If you could provide a quick solution or a patch, it would be a great help.

Thanks in advance.

Regards,
Ravi
jim.fisher:
I am experiencing the same issue regardless of the page #. Please let me know when there is a fix.
Hi Jim,

Thanks for using our APIs.

Could you please share your sample PDF file so that we can also test the scenario with it? We are sorry for this inconvenience.
ravikaranam27:
It seems that a lot of people are facing the same problem. If you could provide a quick solution or a patch, it would be a great help.
Hi Ravi,

When I run the same code snippet against one of my sample files containing a simple Table instance, I cannot reproduce the issue. However, the issue does appear when extracting tables from a complex PDF document. The team will investigate the reported issue, and we will let you know as soon as we have further updates.
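Until the investigation is complete, a per-page loop can at least show which pages fail and let extraction continue on the others. The following is a hedged sketch, not a fix: the class name and file name are placeholders, and it assumes the failure surfaces as the ArgumentOutOfRangeException shown in the stack trace above. If every page of a document fails, it will simply report that.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class TableAbsorberDiagnostics
{
    static void Main(string[] args)
    {
        // Placeholder file name; substitute the document being tested.
        Document pdfDocument = new Document("sample4.pdf");

        // Pages are numbered from 1 to Pages.Count.
        for (int pageNumber = 1; pageNumber <= pdfDocument.Pages.Count; pageNumber++)
        {
            // Use a fresh absorber for each page so that a failed page
            // cannot leave partial results behind for the next one.
            TableAbsorber absorber = new TableAbsorber();
            try
            {
                absorber.Visit(pdfDocument.Pages[pageNumber]);
                Console.WriteLine("Page {0}: {1} table(s) found.",
                    pageNumber, absorber.TableList.Count);
            }
            catch (ArgumentOutOfRangeException ex)
            {
                // Record the failing page and move on to the next one.
                Console.WriteLine("Page {0}: TableAbsorber failed ({1}).",
                    pageNumber, ex.Message);
            }
        }
    }
}
```

The list of failing page numbers is also useful to include when sharing a sample file for investigation.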

Hi Nayyer,


Thanks for the reply. We will wait for the update. If possible, could you please suggest a workaround for extracting tables from a complex PDF? This is an urgent requirement, so it would be a great help.

Thanks in Advance.

Regards,
Ravi

Attached is the requested PDF sample.

jim.fisher:
Attached is the requested PDF sample.
Hi Jim,

Thanks for sharing the resource file.

I have tested the scenario and was able to reproduce the same problem. I have logged it separately as PDFNEWNET-39736 in our issue tracking system. We will look further into the details and keep you updated on the status of the fix. Please be patient and allow us a little time. We are sorry for this inconvenience.

Hi Nayyer,


Is there any update on the issue I posted? Could you please let me know?

Regards,
Ravi

Hi Ravi,


Thanks for your patience.

As we were only recently able to reproduce this issue, it is still pending review. As soon as we have definite updates regarding its resolution, we will let you know.

It’s been a little over a month, do you have any updates?

Hi Jim,


Thanks for your inquiry. I am afraid the issues reported above are still pending analysis, as our product team is currently busy resolving other issues reported earlier in the queue. We will notify you as soon as we make significant progress towards their resolution.

We are sorry for the inconvenience caused.

Best Regards,

Month #2 … Any update? Any guesses for a timeframe?