Docx to pdf, table shape change

yjsdfsdf · June 28, 2023, 10:55am

Hello. When I use Aspose.Words for Java to convert a DOCX file to PDF, the tables inside come out wrong. There are two cases; please see the pictures below. Thank you.
Downloads.zip (259.1 KB)

alexey.noskov · June 28, 2023, 12:37pm

@yjsdfsdf As far as I can see, Aspose.Words produces the same output as MS Word does. Even in MS Word, the tables in your document extend beyond the page margins. The problem can be partially resolved using the following code:

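// Requires: import com.aspose.words.*;
Document doc = new Document("C:\\Temp\\in.docx");

// Autofit every table in the document (including nested ones) to the window.
Iterable<Table> tables = doc.getChildNodes(NodeType.TABLE, true);
for (Table t : tables)
    t.autoFit(AutoFitBehavior.AUTO_FIT_TO_WINDOW);

doc.save("C:\\Temp\\out.pdf");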

In this case the tables do not go beyond the page margins, but some of them end up with columns that are too narrow. If you have control over how the document is created, I would suggest reformatting the tables in MS Word so that they fit the page.

yjsdfsdf · June 28, 2023, 12:52pm

In case 2, the columns are not narrow, and I don't know why. Do you have any other ideas? Thank you.

alexey.noskov · June 28, 2023, 1:17pm

@yjsdfsdf Aspose.Words autofits the table in case 2 the same way MS Word does, so the Aspose.Words behavior is correct. The only way to improve the table is to reformat it in MS Word.

yjsdfsdf · June 29, 2023, 12:15am

Are there any Aspose APIs that can be used to reformat this table? Thank you.

alexey.noskov · June 29, 2023, 5:03am

@yjsdfsdf Unfortunately, there is no general solution for this issue. Could you please let us know how the tables were generated? It would probably be easier to adjust the table generation process so that the tables look correct.
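As an illustration of what adjusting the generation process could look like, here is a minimal sketch in plain Java, not an Aspose.Words API and not a general fix. It assumes the generator already knows its column widths in points and scales them proportionally so that their sum does not exceed the usable page width. The class name `ColumnWidthScaler`, the method `fitToWidth`, and the numbers in `main` are hypothetical.

```java
import java.util.Arrays;

public class ColumnWidthScaler {

    /**
     * Scales column widths proportionally so that their sum does not exceed
     * availableWidth. Widths that already fit are returned unchanged (as a copy).
     * All values are in points.
     */
    public static double[] fitToWidth(double[] columnWidths, double availableWidth) {
        double total = Arrays.stream(columnWidths).sum();
        if (total <= availableWidth || total == 0) {
            return columnWidths.clone();
        }
        double factor = availableWidth / total;
        return Arrays.stream(columnWidths).map(w -> w * factor).toArray();
    }

    public static void main(String[] args) {
        // Hypothetical page: about 595.3pt wide with 72pt left and right margins.
        double usableWidth = 595.3 - 72 - 72;
        double[] widths = {200, 200, 200};
        System.out.println(Arrays.toString(fitToWidth(widths, usableWidth)));
    }
}
```

In a real document the usable width would come from the section's page setup (page width minus the left and right margins), and the scaled values would then be written back to the cells by the code that builds the table. Proportional scaling keeps the relative column sizes, so it can still leave some columns too narrow for their content.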

yjsdfsdf · June 30, 2023, 5:41am

Excuse me, how can I get the width of the table? Thank you.

alexey.noskov · June 30, 2023, 8:30am

@yjsdfsdf You can use the LayoutCollector and LayoutEnumerator classes to calculate the actual size of tables.
Code like the following gets the bounding box of each table in your document:

// Requires: import com.aspose.words.*; import java.awt.geom.Rectangle2D;

// Open the document.
Document doc = new Document("C:\\Temp\\in.docx");

// Create LayoutCollector and LayoutEnumerator to get layout information of nodes.
LayoutCollector collector = new LayoutCollector(doc);
LayoutEnumerator enumerator = new LayoutEnumerator(doc);

// Calculate the bounding box of each table in the document.
Iterable<Table> tables = doc.getChildNodes(NodeType.TABLE, true);
for (Table t : tables)
{
    // Skip tables in headers and footers (LayoutCollector and LayoutEnumerator do not work with header/footer nodes).
    if (t.getAncestor(NodeType.HEADER_FOOTER) != null)
        continue;

    // Move LayoutEnumerator to the first row.
    enumerator.setCurrent(collector.getEntity(t.getFirstRow().getFirstCell().getFirstParagraph()));
    while (enumerator.getType() != LayoutEntityType.ROW)
        enumerator.moveParent();

    // Get the rectangle of the first row of the table.
    Rectangle2D firstRect = enumerator.getRectangle();

    // Do the same with the last row.
    enumerator.setCurrent(collector.getEntity(t.getLastRow().getFirstCell().getFirstParagraph()));
    while (enumerator.getType() != LayoutEntityType.ROW)
        enumerator.moveParent();

    // Get the rectangle of the last row of the table.
    Rectangle2D lastRect = enumerator.getRectangle();
    // The union of the two rectangles is the bounding box of the table.
    Rectangle2D resultRect = firstRect.createUnion(lastRect);

    System.out.println("Table rectangle : x=" + resultRect.getX() + ", y=" + resultRect.getY() + ", width=" + resultRect.getWidth() + ", height=" + resultRect.getHeight());
}

Please note that the code is simplified to demonstrate the basic technique and handles only tables placed on a single page. In MS Word, tables can span more than one page.
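For a table that spans several pages, one possible extension is to collect a rectangle for every row together with the page it is rendered on, and then build one bounding box per page. The sketch below shows only that grouping step in plain Java with `java.awt.geom.Rectangle2D`. The names `TableBoundsByPage` and `RowBox` are hypothetical, and it assumes the page number of each row has already been obtained from the layout (for example from the enumerator's page index, which should be checked against your Aspose.Words version).

```java
import java.awt.geom.Rectangle2D;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TableBoundsByPage {

    /** One laid-out row: the page it is rendered on and its rectangle in points. */
    public static final class RowBox {
        final int pageIndex;
        final Rectangle2D rect;

        public RowBox(int pageIndex, Rectangle2D rect) {
            this.pageIndex = pageIndex;
            this.rect = rect;
        }
    }

    /** Unions the row rectangles page by page; the result is ordered by page index. */
    public static Map<Integer, Rectangle2D> boundsByPage(List<RowBox> rows) {
        Map<Integer, Rectangle2D> result = new TreeMap<>();
        for (RowBox row : rows) {
            // First row seen on a page starts the box; later rows extend it.
            result.merge(row.pageIndex, row.rect, Rectangle2D::createUnion);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical rows: two at the bottom of page 1, one at the top of page 2.
        List<RowBox> rows = Arrays.asList(
                new RowBox(1, new Rectangle2D.Double(72, 700, 450, 20)),
                new RowBox(1, new Rectangle2D.Double(72, 720, 450, 20)),
                new RowBox(2, new Rectangle2D.Double(72, 72, 450, 20)));

        for (Map.Entry<Integer, Rectangle2D> e : boundsByPage(rows).entrySet()) {
            Rectangle2D r = e.getValue();
            System.out.println("Page " + e.getKey() + ": x=" + r.getX() + ", y=" + r.getY()
                    + ", width=" + r.getWidth() + ", height=" + r.getHeight());
        }
    }
}
```

The width of the table on a given page is then the width of that page's rectangle. A single row that itself breaks across two pages is not handled by this sketch.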