LayoutEnumerator omits text on second page of merged table cell

Hi,

the attached document contains a merged table cell which is split over two pages. When working with the LayoutEnumerator class to determine the layout of text in the document, it does not report the text that occurs in the cell on the second page of the document. The problem only seems to occur for merged cells - ordinary cells split over two pages work fine.

To reproduce, run the code below and note that ‘Eight’ and ‘Nine’ do not appear in the program output, which they should.

Hope that this can be fixed :)

cheers,

Robin

using System;
using Aspose.Words;
using Aspose.Words.Layout;

namespace Aspose.Bugs
{
    class Program
    {
        static void Main(string[] args)
        {
            var lic = new License();
            lic.SetLicense("Aspose.lic");

            var doc = new Document("RowOverTwoPages.docx");
            var le = new LayoutEnumerator(doc);

            Visit(le);
        }

        // Depth-first walk over the layout tree, printing the text of every span.
        private static void Visit(LayoutEnumerator le)
        {
            do
            {
                if (le.Type == LayoutEntityType.Span)
                    Console.WriteLine(le.Text);

                if (le.MoveFirstChild())
                {
                    Visit(le);
                    le.MoveParent();
                }
            } while (le.MoveNext());
        }
    }
}


Hi Robin,


Thanks for your inquiry. I tested the scenario and was able to reproduce the problem on my side. I have logged it in our issue tracking system as WORDSNET-11097. Our development team will look into it further, and we will keep you updated on the status of the fix. We apologize for the inconvenience.

Best regards,

Hi Robin,

Thanks for being patient. The content in the second and subsequent parts of the broken cell can be iterated using the LayoutEnumerator.MoveNextLogical method. Currently there is no other way to reach this content from the page where it is rendered. This method moves to the next sibling entity in logical order. For example, when iterating the lines of a paragraph that is broken across pages, it moves to the next line even if that line is on another page.

I hope this helps.

Best regards,

Hi Awais,

thanks for this suggestion - as you say it works when a paragraph is broken over the page break within a merged cell, allowing you to move to the next line of the paragraph.

However, there is also the possibility that within the cell there is a paragraph break that occurs at the point where the cell is split between the two pages. In this case, calling MoveNextLogical() doesn’t advance the LayoutEnumerator at all and there still seems to be no way to access the layout of the second part of the cell. Is there a workaround for this situation too?

thanks,

Robin

Hi Robin,


Thanks for the additional information. We have logged your comment in our issue tracking system and will update you as soon as the required information is available.

Best regards,

Hi Robin,


Thanks for being patient.

The layout model of the document consists of entities described by the LayoutEntityType enum. The MoveNextLogical method moves between entities that are logically linked under a common parent. For example, all lines of a paragraph are linked, so MoveNextLogical moves from one line to the next regardless of which page the next line is on. However, the last line of a paragraph has no link to the next line (the first line of the following paragraph), because that line has a different parent. All spans of a story, on the other hand, are linked together irrespective of their parents: the last span of a line is linked to the next span in logical order, which is the first span of the next line (and that line may be in the next paragraph).

So, technically, moving from the last line of a paragraph which is last in a broken cell to the first line of the paragraph which is first in the next broken part of this cell can be accomplished by moving between spans of these lines. For example,

LayoutEnumerator en = new LayoutEnumerator(doc);
en.MoveFirstChild(); // Moves to the 1st column on the 1st page.
en.MoveFirstChild(); // Assume there is a one-cell broken table at the document start. This moves to its 1st row.
en.MoveFirstChild(); // Moves to the 1st cell of the 1st row. This cell is broken.
en.MoveFirstChild(); // Moves to the 1st line of the cell. This is the only line in the 1st paragraph in the cell.
en.MoveNext(); // Returns false since there is no more content on the page.
en.MoveNextLogical(); // Also returns false since the line is last in its paragraph and is not linked to the next logical line.
en.MoveLastChild(); // Moves to the last span in the line, which is the paragraph break.
en.MoveNextLogical(); // Moves to the 1st span of the next line, which is the 1st line in the 2nd part of the cell on the 2nd page.
en.MoveParent(); // Moves to the parent line, which is the 1st line in the 2nd part of the broken cell on the 2nd page.
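
The steps above could be wrapped in a small helper. This is only a sketch and not part of the Aspose.Words API: the class name LayoutLineNavigator and the method name MoveToNextLineAcrossParagraphs are hypothetical. It uses only the LayoutEnumerator members already shown in this thread (Type, MoveNextLogical, MoveLastChild, MoveParent), and it assumes the enumerator is positioned on a Line entity and that spans are linked across paragraphs as described above.

```csharp
using Aspose.Words.Layout;

// Hypothetical helper for illustration; not part of Aspose.Words.
static class LayoutLineNavigator
{
    // Tries to move 'en' from the current line to the next line in logical order.
    // Falls back to the span link when the current line is the last in its paragraph.
    // Returns false if no next line was reached.
    public static bool MoveToNextLineAcrossParagraphs(LayoutEnumerator en)
    {
        if (en.Type != LayoutEntityType.Line)
            return false;

        // Lines of the same paragraph are linked directly, even across pages.
        if (en.MoveNextLogical())
            return true;

        // Last line of the paragraph: step down to its last span (the paragraph break).
        if (!en.MoveLastChild())
            return false;

        // Spans are linked across paragraphs, so this should reach
        // the first span of the next line, if there is one.
        if (!en.MoveNextLogical())
        {
            en.MoveParent(); // Go back up to the original line.
            return false;
        }

        // Step up from the new span to the line that contains it.
        return en.MoveParent();
    }
}
```

Calling this repeatedly from the first line of the broken cell should walk through the lines in both parts of the cell, provided the span links behave as described. Where the walk should stop (for example at the end of the cell) is left to the caller.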

I hope this helps.

Best regards,

Hi Awais,


thanks for the further update - it makes sense and I will implement it in our product.

cheers,

Robin

Hi Robin,


Thanks for your feedback. We have now closed WORDSNET-11097 with a “Not a Bug” decision.

Best regards,