How to calculate how many lines a paragraph occupies in Word

LayoutEnumerator

Please provide me with an example, thanks.

@Crane

To calculate how many lines a paragraph occupies in a Word document using the LayoutEnumerator class in Aspose.Words for Java, you can follow the example below. This example demonstrates how to traverse the layout entities of a document and determine the number of lines occupied by a specific paragraph.

Example Code

import com.aspose.words.*;
import java.awt.geom.Rectangle2D;

public class ParagraphLineCounter {
    public static void main(String[] args) throws Exception {
        // Load the document
        Document doc = new Document("path/to/your/document.docx");

        // Create a LayoutEnumerator to traverse the layout entities
        LayoutEnumerator layoutEnumerator = new LayoutEnumerator(doc);

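        // Move to the first paragraph in the document.
        // getEntity is a LayoutCollector method, so a collector is created here.
        LayoutCollector layoutCollector = new LayoutCollector(doc);
        layoutEnumerator.setCurrent(layoutCollector.getEntity(doc.getChild(NodeType.PARAGRAPH, 0, true)));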

        // Initialize a counter for the number of lines
        int lineCount = 0;

        // Traverse through the layout entities
        do {
            // Check if the current entity is a span (which represents a line of text)
            if (layoutEnumerator.getType() == LayoutEntityType.SPAN) {
                // Get the rectangle that represents the span's area
                Rectangle2D.Float rect = layoutEnumerator.getRectangle();
                System.out.println("Line: " + layoutEnumerator.getText() + " occupies rectangle: " + rect);
                lineCount++;
            }
        } while (layoutEnumerator.moveNext());

        // Output the total number of lines occupied by the paragraph
        System.out.println("Total lines occupied by the paragraph: " + lineCount);
    }
}

Explanation

  1. Document Loading: The document is loaded from a specified path.
  2. LayoutEnumerator: An instance of LayoutEnumerator is created to navigate through the layout entities of the document.
  3. Current Entity: The enumerator is set to the first paragraph in the document.
  4. Line Counting: A loop is used to traverse through the layout entities. For each span (which represents a line of text), the rectangle dimensions are printed, and a counter is incremented.
  5. Output: Finally, the total number of lines occupied by the paragraph is printed.

This approach allows you to effectively count the number of lines a paragraph occupies in a Word document using Aspose.Words. Make sure to replace "path/to/your/document.docx" with the actual path to your Word document.

If you have any further questions or need additional examples, feel free to ask!

The results obtained are incorrect. :broken_heart:
countLine.docx (16.1 KB)

Document doc = new Document("countLine.docx");

// Create a LayoutEnumerator to traverse the layout entities
LayoutEnumerator layoutEnumerator = new LayoutEnumerator(doc);
LayoutCollector layoutCollector = new LayoutCollector(doc);

// Move to the first paragraph in the document
layoutEnumerator.setCurrent(layoutCollector.getEntity(doc.getChild(NodeType.PARAGRAPH, 0, true)));

// Initialize a counter for the number of lines
int lineCount = 0;

// Traverse through the layout entities
do
{
    // Check if the current entity is a span (which represents a line of text)
    if (layoutEnumerator.getType() == LayoutEntityType.SPAN)
    {
        // Get the rectangle that represents the span's area
        Rectangle2D.Float rect = layoutEnumerator.getRectangle();
        System.out.println("Line: " + layoutEnumerator.getText() + " occupies rectangle: " + rect);
        lineCount++;
    }
} while (layoutEnumerator.moveNext());

// Output the total number of lines occupied by the paragraph
System.out.println("Total lines occupied by the paragraph: " + lineCount);

@Crane You can use the following code to split the document into lines:

Document doc = new Document("C:\\Temp\\in.docx");

// Split all Run nodes in the document so that each one contains no more than one word.
Node[] runs = doc.getChildNodes(NodeType.RUN, true).toArray();
for (Node n : runs)
{
    Run current = (Run)n;
    while (current.getText().indexOf(' ') >= 0)
        current = SplitRun(current, current.getText().indexOf(' ') + 1);
}

// Wrap all runs in the document with bookmarks to make it possible to work with LayoutCollector and LayoutEnumerator
runs = doc.getChildNodes(NodeType.RUN, true).toArray();
    
ArrayList<String> tmpBookmakrs = new ArrayList<String>();
int bkIndex = 0;
for (Node r : runs)
{
    // LayoutCollector and LayoutEnumerator do not work with nodes in headers/footers or in textboxes.
    if (r.getAncestor(NodeType.HEADER_FOOTER) != null || r.getAncestor(NodeType.SHAPE) != null)
        continue;
        
    BookmarkStart start = new BookmarkStart(doc, "r" + bkIndex);
    BookmarkEnd end = new BookmarkEnd(doc, start.getName());
        
    r.getParentNode().insertBefore(start, r);
    r.getParentNode().insertAfter(end, r);
        
    tmpBookmakrs.add(start.getName());
    bkIndex++;
}

// Now we can use collector and enumerator to get runs per line in MS Word document.
LayoutCollector collector = new LayoutCollector(doc);
LayoutEnumerator enumerator = new LayoutEnumerator(doc);
    
Object currentLine = null;
for (String bkName : tmpBookmakrs)
{
    Bookmark bk = doc.getRange().getBookmarks().get(bkName);
        
    enumerator.setCurrent(collector.getEntity(bk.getBookmarkStart()));
    while (enumerator.getType() != LayoutEntityType.LINE)
        enumerator.moveParent();
            
    if (!enumerator.getCurrent().equals(currentLine))
    {
        currentLine = enumerator.getCurrent();
            
        System.out.println();
        System.out.println("-------=========Start Of Line=========-------");
        // Here you can get coordinates of the line.
        System.out.println(enumerator.getRectangle());
    }
        
    Node nextNode = bk.getBookmarkStart().getNextSibling();
    if (nextNode != null && nextNode.getNodeType() == NodeType.RUN)
        System.out.print(((Run)nextNode).getText());
}

// Helper: splits the run at the given character position and returns the second part.
private static Run SplitRun(Run run, int position)
{
    Run afterRun = (Run)run.deepClone(true);
    run.getParentNode().insertAfter(afterRun, run);
    afterRun.setText(run.getText().substring(position));
    run.setText(run.getText().substring(0, position));
    return afterRun;
}
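
The splitting loop can be easier to follow on a plain string. The sketch below does not use Aspose.Words at all; it only mirrors the logic of the loop around SplitRun, cutting just after each space so that every piece holds at most one word plus its trailing space. The class name SplitDemo and the helper splitAfterSpaces are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitDemo {
    // Hypothetical helper: cuts the text just after each space,
    // the same way the loop above repeatedly calls SplitRun on a run.
    static List<String> splitAfterSpaces(String text) {
        List<String> pieces = new ArrayList<>();
        String current = text;
        int space;
        while ((space = current.indexOf(' ')) >= 0) {
            // Keep the space with the word before it.
            pieces.add(current.substring(0, space + 1));
            current = current.substring(space + 1);
        }
        // Whatever is left has no space in it (it may be empty).
        pieces.add(current);
        return pieces;
    }

    public static void main(String[] args) {
        // prints [one , two , three]
        System.out.println(splitAfterSpaces("one two three"));
    }
}
```

Because each piece is at most one word, no piece can wrap across two lines, which is why the later per-run lookup can assign each run to a single line.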

@alexey.noskov

So, is the number of lines calculated based on the character size and line spacing?

@Crane The code provided above splits the document into lines using the layout information provided by the Aspose.Words document layout engine. It takes into account character size, line spacing, and many other parameters.
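
To get from the per-line output to a line count per paragraph, one possible approach is to record, for each run, which paragraph it belongs to and which layout line it landed on, and then count the distinct lines per paragraph. The sketch below shows only that counting step in plain Java, without Aspose.Words. RunPlacement, paragraphIndex, and lineId are hypothetical names; in real code lineId could be the object returned by enumerator.getCurrent() after moving up to the LINE entity, and the strings used here are stand-ins for it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LinesPerParagraph {
    // Hypothetical holder: the paragraph a run belongs to and the layout line it sits on.
    static final class RunPlacement {
        final int paragraphIndex;
        final Object lineId;

        RunPlacement(int paragraphIndex, Object lineId) {
            this.paragraphIndex = paragraphIndex;
            this.lineId = lineId;
        }
    }

    // Counts the distinct lines seen for each paragraph, in order of first appearance.
    static Map<Integer, Integer> countLines(List<RunPlacement> placements) {
        Map<Integer, Set<Object>> linesByParagraph = new LinkedHashMap<>();
        for (RunPlacement p : placements) {
            linesByParagraph
                .computeIfAbsent(p.paragraphIndex, k -> new LinkedHashSet<>())
                .add(p.lineId);
        }
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<Integer, Set<Object>> e : linesByParagraph.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }

    public static void main(String[] args) {
        List<RunPlacement> placements = new ArrayList<>();
        placements.add(new RunPlacement(0, "line-1")); // two runs on the same line
        placements.add(new RunPlacement(0, "line-1"));
        placements.add(new RunPlacement(0, "line-2")); // paragraph 0 wraps to a second line
        placements.add(new RunPlacement(1, "line-3"));
        // prints {0=2, 1=1}
        System.out.println(countLines(placements));
    }
}
```

Using a set per paragraph means several runs on the same line are counted once, which is the difference between counting lines and counting spans.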