Fit Wide HTML Tables with 100% Width within the Page Bounds of Word DOCX Document | Java

I am using Aspose.Words to convert HTML to DOCX, and I have no control over the HTML input. When the input is very wide (such as a table with many columns), the resulting DOCX is cut off on the right side. Is there a way to shrink it to fit? The HTML tables are set to 100% width, which is fine in a browser but not in Print Layout in Word.

I have a sample Java program (Java 8, Aspose.Words 21.7) that I am happy to post here, but I don’t see a way to attach a zip file, so here are three files: the POM, the program, and the input HTML.

If Aspose does not support this out of the box, is there any workaround I could use?

pom.xml:

 <?xml version="1.0" encoding="UTF-8"?>
 <project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
     <modelVersion>4.0.0</modelVersion>
 
     <groupId>com.sharon</groupId>
     <artifactId>aspose-test</artifactId>
     <version>1.0-SNAPSHOT</version>
 
     <build>
         <plugins>
             <plugin>
                 <groupId>org.apache.maven.plugins</groupId>
                 <artifactId>maven-compiler-plugin</artifactId>
                 <configuration>
                     <source>8</source>
                     <target>8</target>
                 </configuration>
             </plugin>
         </plugins>
     </build>
 
     <dependencies>
     <dependency>
         <groupId>com.aspose</groupId>
         <artifactId>aspose-words</artifactId>
         <version>21.7</version>
         <type>pom</type>
     </dependency>
     </dependencies>
 
 </project>

ConvertHtmlToDocx :

 package com.sharon;
 
 import com.aspose.words.*;
 import java.io.*;
 import java.nio.charset.StandardCharsets;
 import java.util.stream.Collectors;
 
 public class ConvertHtmlToDocx {
 
     public static void main(String[] args) throws IOException {
         System.out.println("Start");
         String html;
         // Read the HTML as UTF-8 (the charset declared in input.html) and close the stream afterwards.
         try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                 ConvertHtmlToDocx.class.getResourceAsStream("/input.html"), StandardCharsets.UTF_8))) {
             html = reader.lines().collect(Collectors.joining("\n"));
         }
         File tempFile = createOutputFile("output.docx");
         htmlToDocx(tempFile, html);
         System.out.println("Done");
     }
 
     private static void htmlToDocx(File finalOutputFile, String html) {
         // try-with-resources closes the output stream even if the conversion fails.
         try (FileOutputStream fileOutputStream = new FileOutputStream(finalOutputFile)) {
             Document finalDoc = new Document();
             DocumentBuilder builder = new DocumentBuilder(finalDoc);
             builder.insertHtml(html);
             finalDoc.save(fileOutputStream, getSaveOptions());
         } catch (Exception e) {
             e.printStackTrace();
         }
     }
 
     private static File createOutputFile(String fileName) {
         File tmpFile = new File("/tmp", fileName);
         tmpFile.getParentFile().mkdirs();
         return tmpFile;
     }
 
     private static SaveOptions getSaveOptions() {
         OoxmlSaveOptions ooxmlSaveOptions = new OoxmlSaveOptions();
         ooxmlSaveOptions.setSaveFormat(SaveFormat.DOCX);
         return ooxmlSaveOptions;
     }
 }

input.html:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title> Foo Bar Title </title>
</head>
<body>
    <div xmlns="" id="edgarForm4" class="edgarFormContainer">
        <BR /><HR WIDTH="90%" SIZE="2" COLOR="#000033" align="left" />
        <table width="100%" border="0" cellspacing="0" cellpadding="4">
            <tr>
                <td width="20%" colspan="2" valign="top" align="center" class="FormName">FORM 4</td>
                <td rowspan="2" width="60%" valign="middle" align="center"><span class="FormTitle">UNITED STATES SECURITIES AND EXCHANGE COMMISSION</span><br /><span class="MedSmallFormText">Washington, D.C. 20549</span><br /><br /><span class="FormTitle">STATEMENT OF CHANGES IN BENEFICIAL OWNERSHIP</span><br /><br /><span class="MedSmallFormText">Filed pursuant to Section 16(a) of the Securities Exchange Act of 1934</span><br /><span class="MedSmallFormText">or Section 30(h) of the Investment Company Act of 1940</span></td>
                <td rowspan="2" width="20%" valign="top" align="center">
                    <table width="100%" border="1" summary="OMB Approval Status Box">
                        <tr>
                            <td class="FormTextC">SOME APPROVAL</td>
                        </tr>
                        <tr>
                            <td>
                                <table width="100%" border="0" summary="OMB Interior Box">
                                    <tr>
                                        <td class="SmallFormText" colspan="3">OMB Number:</td>
                                        <td class="SmallFormTextR">3235-0287</td>
                                    </tr>
                                    <tr>
                                        <td class="SmallFormText">Expires:</td>
                                        <td class="SmallFormTextR" colspan="3">December 31, 2014</td>
                                    </tr>
                                    <tr>
                                        <td class="SmallFormText" colspan="4">Estimated average burden</td>
                                    </tr>
                                    <tr>
                                        <td class="SmallFormText" colspan="3">hours per response:</td>
                                        <td class="SmallFormTextR">0.5</td>
                                    </tr>
                                </table>
                            </td>
                        </tr>
                    </table>
                </td>
            </tr>
            <tr valign="middle">
                <td>
                    <table width="100%" border="1" cellpadding="0" cellspacing="0">
                        <tr>
                            <td>  </td>
                        </tr>
                    </table>
                </td>
                <td class="SmallFormText">
                    Check this box if no longer subject to Section 16. Form 4 or Form 5 obligations may
                    continue.
                    <i>See</i>

                    Instruction 1(b).
                </td>
            </tr>
        </table>
        <table width="100%" border="1" cellspacing="0" cellpadding="4">
            <tr>
                <td rowspan="3" width="35%" valign="top">
                    <span class="MedSmallFormText">1. Name and Address of Reporting Person<sup>*</sup></span><table border="0" width="100%">
                        <tr>
                            <td>LAST FIRST M</td>
                        </tr>
                    </table>
                    <hr width="98%" />
                    <table border="0" width="100%">
                        <tr>
                            <td width="33%" class="MedSmallFormText">(Last)</td>
                            <td width="33%" class="MedSmallFormText">(First)</td>
                            <td width="33%" class="MedSmallFormText">(Middle)</td>
                        </tr>
                    </table>
                    <table border="0" width="100%">
                        <tr>
                            <td><span class="FormData">SOME COMPANY NAME</span></td>
                        </tr>
                        <tr>
                            <td><span class="FormData">P.O. BOX 1234</span></td>
                        </tr>
                    </table>
                    <hr width="98%" /><span class="MedSmallFormText">(Street)</span><table border="0" width="100%">
                        <tr>
                            <td width="33%"><span class="FormData">ABC</span></td>
                            <td width="33%"><span class="FormData">CT</span></td>
                            <td width="33%"><span class="FormData">12345</span></td>
                        </tr>
                    </table>
                    <hr width="98%" />
                    <table border="0" width="100%">
                        <tr>
                            <td width="33%" class="MedSmallFormText">(City)</td>
                            <td width="33%" class="MedSmallFormText">(State)</td>
                            <td width="33%" class="MedSmallFormText">(Zip)</td>
                        </tr>
                    </table>
                </td>
                <td width="35%" valign="top">
                    <span class="MedSmallFormText">
                        2. Issuer Name <b>and</b> Ticker or Trading Symbol
                    </span><br />ANOTHER COMPANY NAME
                    [ <span class="FormData">DAL</span> ]
                </td>
                <td rowspan="2" valign="top">
                    <span class="MedSmallFormText">
                        5. Relationship of Reporting Person(s) to Issuer
                    </span><br /><span class="MedSmallFormText">(Check all applicable)</span><table border="0" width="100%">
                        <tr>
                            <td width="15%" align="center"></td>
                            <td width="35%" class="MedSmallFormText">Director</td>
                            <td width="15%" align="center"></td>
                            <td width="35%" class="MedSmallFormText">10% Owner</td>
                        </tr>
                        <tr>
                            <td align="center"><span class="FormData">X</span></td>
                            <td class="MedSmallFormText">Officer (give title below)</td>
                            <td align="center"></td>
                            <td class="MedSmallFormText">Other (specify below)</td>
                        </tr>
                        <tr>
                            <td colspan="4" align="center"><span class="FormData">President</span><span class="FormData"></span></td>
                        </tr>
                    </table>
                </td>
            </tr>
            <tr>
                <td valign="top">
                    <span class="MedSmallFormText">
                        3. Date of Earliest Transaction
                        (Month/Day/Year)
                    </span><br /><span class="FormData">11/09/2020</span>
                </td>
            </tr>
            <tr>
                <td valign="top">
                    <span class="MedSmallFormText">
                        4. If Amendment, Date of Original Filed
                        (Month/Day/Year)
                    </span><br />
                </td>
                <td valign="top">
                    <span class="MedSmallFormText">
                        6. Individual or Joint/Group Filing (Check Applicable Line)
                    </span><table border="0" width="100%">
                        <tr>
                            <td width="15%" align="center"><span class="FormData">X</span></td>
                            <td width="85%" class="MedSmallFormText">Form filed by One Reporting Person</td>
                        </tr>
                        <tr>
                            <td width="15%" align="center"></td>
                            <td width="85%" class="MedSmallFormText">Form filed by More than One Reporting Person</td>
                        </tr>
                    </table>
                </td>
            </tr>
        </table>
        <table width="100%" border="1" cellspacing="0" cellpadding="4">
            <thead>
                <tr>
                    <th width="100%" valign="top" colspan="11" align="center" class="FormTextC"><b>Table I - Non-Derivative Securities Acquired, Disposed of, or Beneficially Owned</b></th>
                </tr>
                <tr>
                    <th width="36%" valign="top" rowspan="2" align="left" class="MedSmallFormText">
                        1. Title of Security (Instr.
                        3)
                    </th>
                    <th width="6%" valign="top" rowspan="2" align="left" class="SmallFormText">
                        2. Transaction Date
                        (Month/Day/Year)
                    </th>
                    <th width="5%" valign="top" rowspan="2" align="left" class="SmallFormText">
                        2A. Deemed Execution Date, if any
                        (Month/Day/Year)
                    </th>
                    <th width="7%" valign="top" colspan="2" align="left" class="SmallFormText">
                        3. Transaction Code (Instr.
                        8)
                    </th>
                    <th width="19%" valign="top" colspan="3" align="left" class="SmallFormText">
                        4. Securities Acquired (A) or Disposed Of (D) (Instr.
                        3, 4 and 5)
                    </th>
                    <th width="11%" valign="top" rowspan="2" align="left" class="SmallFormText">
                        5.
                        LOOK HERE: The right side of this will be cut off. The right edge of the box is cut off and the next 2 columns are also cut off.
                    </th>
                    <th width="9%" valign="top" rowspan="2" align="left" class="SmallFormText">
                        CANNOT SEE THIS IN PRINT LAYOUT: 6. Ownership Form: Direct (D) or Indirect (I) (Instr.
                        4)
                    </th>
                    <th width="8%" valign="top" rowspan="2" align="left" class="SmallFormText">
                        CANNOT SEE THIS IN PRINT LAYOUT: 7. Nature of Indirect Beneficial Ownership (Instr.
                        4)
                    </th>
                </tr>
                <tr>
                    <th width="4%" align="center" class="SmallFormText">Code</th>
                    <th width="3%" align="center" class="SmallFormText">V</th>
                    <th width="8%" align="center" class="SmallFormText">Amount</th>
                    <th width="5%" align="center" class="SmallFormText">(A) or (D)</th>
                    <th width="6%" align="center" class="SmallFormText">Price</th>
                </tr>
            </thead>
            <tbody>
                <tr>
                    <td align="left"><span class="FormData">Common Stock</span></td>
                    <td align="center"><span class="FormData">13/13/3223</span></td>
                    <td align="center"><span class="FormData"></span></td>
                    <td align="center"><span class="SmallFormData">S</span></td>
                    <td align="center"></td>
                    <td align="center"><span class="FormData">15,155</span></td>
                    <td align="center"><span class="FormData">D</span></td>
                    <td align="center"><span class="FormText">$</span><span class="FormData">12.345</span><a xmlns="http://www.w3.org/1999/xhtml" href="#F1"><sup>(1)</sup></a></td>
                    <td align="center"><span class="FormData">123,456</span></td>
                    <td align="center"><span class="FormData">D</span></td>
                    <td align="left"></td>
                </tr>
            </tbody>
        </table>
        <table width="100%" border="1" cellspacing="0" cellpadding="4">
            <thead>
                <tr>
                    <th width="100%" valign="top" colspan="16" align="center" class="FormTextC"><b>Table II - Derivative Securities Acquired, Disposed of, or Beneficially Owned</b><br /><b>(e.g., puts, calls, warrants, options, convertible securities)</b></th>
                </tr>
                <tr>
                    <th width="13%" valign="top" rowspan="2" align="left" class="SmallFormText">
                        1. Title of Derivative Security (Instr.
                        3)
                    </th>
                    <th width="5%" valign="top" rowspan="2" align="left" class="SmallFormText">
                        2. Conversion or Exercise Price of Derivative Security
                    </th>
                    <th width="5%" valign="top" rowspan="2" align="left" class="SmallFormText">
                        3. Transaction Date
                        (Month/Day/Year)
                    </th>
                    <th width="5%" valign="top" rowspan="2" align="left" class="SmallFormText">
                        3A. Deemed Execution Date, if any
                        (Month/Day/Year)
                    </th>
                    <th width="9%" valign="top" colspan="2" align="left" class="SmallFormText">
                        4. Transaction Code (Instr.
                        8)
                    </th>
                    <th width="10%" valign="top" colspan="2" align="left" class="SmallFormText">
                        5.
                        Number of Derivative Securities Acquired (A) or Disposed of (D) (Instr.
                        3, 4 and 5)
                    </th>
                    <th width="9%" valign="top" colspan="2" align="left" class="SmallFormText">
                        6. Date Exercisable and Expiration Date
                        (Month/Day/Year)
                    </th>
                    <th width="17%" valign="top" colspan="2" align="left" class="SmallFormText">
                        7. Title and Amount of Securities Underlying Derivative Security (Instr.
                        3 and 4)
                    </th>
                    <th width="6%" valign="top" rowspan="2" align="left" class="SmallFormText">
                        8. Price of Derivative Security (Instr.
                        5)
                    </th>
                    <th width="6%" valign="top" rowspan="2" align="left" class="SmallFormText">
                        9.
                        Number of derivative Securities Beneficially Owned Following Reported Transaction(s)
                        (Instr.
                        4)
                    </th>
                    <th width="6%" valign="top" rowspan="2" align="left" class="SmallFormText">
                        10. Ownership Form: Direct (D) or Indirect (I) (Instr.
                        4)
                    </th>
                    <th width="7%" valign="top" rowspan="2" align="left" class="SmallFormText">
                        11. Nature of Indirect Beneficial Ownership (Instr.
                        4)
                    </th>
                </tr>
                <tr>
                    <th width="4%" valign="bottom" align="center" class="SmallFormText">Code</th>
                    <th width="4%" valign="bottom" align="center" class="SmallFormText">V</th>
                    <th width="5%" valign="bottom" align="center" class="SmallFormText">(A)</th>
                    <th width="5%" valign="bottom" align="center" class="SmallFormText">(D)</th>
                    <th width="5%" valign="bottom" align="center" class="SmallFormText">Date Exercisable</th>
                    <th width="4%" valign="bottom" align="center" class="SmallFormText">Expiration Date</th>
                    <th width="10%" valign="bottom" align="center" class="SmallFormText">Title</th>
                    <th width="7%" valign="bottom" align="center" class="SmallFormText">Amount or Number of Shares</th>
                </tr>
            </thead>
        </table>
        <table border="0" width="100%">
            <tr>
                <td class="MedSmallFormText"><b>Explanation of Responses:</b></td>
            </tr>
            <tr>
                <td class="FootnoteData">
                    1. The price reported in column 4 is a weighted average price. The reported shares were
                    sold in multiple transactions at prices ranging from $36.89 to $36.94 per share. The
                    Reporting Person undertakes to provide, upon request, details regarding the number
                    of shares sold at each separate price to the staff of the Company1, Company 2, and some other Companies, Inc.
                </td>
            </tr>
        </table>
        <table width="100%" border="0">
            <tr>
                <td width="60%"></td>
                <td width="20%"><u><span class="FormData">/s/ Mr. Foo as attorney-in-fact for Mr. Bar</span></u></td>
                <td width="20%"><u><span class="FormData">10/11/1212</span></u></td>
            </tr>
            <tr>
                <td></td>
                <td class="MedSmallFormText">** Signature of Reporting Person</td>
                <td class="MedSmallFormText">Date</td>
            </tr>
            <tr>
                <td colspan="3" class="MedSmallFormText">
                    Reminder: Report on a separate line for each class of securities beneficially owned
                    directly or indirectly.
                </td>
            </tr>
            <tr>
                <td colspan="3" class="MedSmallFormText">
                    * If the form is filed by more than one reporting person,
                    <i>see</i>

                    Instruction
                    4

                    (b)(v).
                </td>
            </tr>
            <tr>
                <td colspan="3" class="MedSmallFormText">
                    ** Intentional misstatements or omissions of facts constitute Federal Criminal Violations
                    <i>See</i>

                    18 U.S.C. 1001 and 15 U.S.C. 78ff(a).
                </td>
            </tr>
            <tr>
                <td colspan="3" class="MedSmallFormText">
                    Note: File three copies of this Form, one of which must be manually signed. If space
                    is insufficient,
                    <i>see</i>

                    Instruction 6 for procedure.
                </td>
            </tr>
            <tr>
                <td colspan="3" class="MedSmallFormText">
                    <b>
                        Persons who respond to the collection of information contained in this form are not
                        required to respond unless the form displays a currently valid OMB Number.
                    </b>
                </td>
            </tr>
        </table>
    </div>
</body>
</html>

@sharonkass,

We have logged this problem in our issue tracking system. Your ticket number is WORDSNET-22610. We will further look into the details of this problem and will keep you updated here on the status of the linked ticket. We apologize for any inconvenience.

Thank you again. Is there any update?

@sharonkass,

Regarding WORDSNET-22610, we have completed the analysis of this issue and decided to close it with “not a bug” status. The tables in the source document are too wide to fit on a standard document page. Web browsers render the document without a horizontal scroll bar only if the browser window is wider than 1260 pixels, which is about 13 inches at 96 dpi, while the printable area of a standard page is only about 6.5 inches wide. The only workaround is to increase the page width of the target document (and, optionally, change its orientation):

// Imports needed: com.aspose.words.*, java.nio.charset.StandardCharsets, java.nio.file.Files, java.nio.file.Paths
Document doc = new Document();

// Increase page width.
PageSetup pageSetup = doc.getFirstSection().getPageSetup();
pageSetup.setOrientation(Orientation.LANDSCAPE);
pageSetup.setPaperSize(PaperSize.A3);

DocumentBuilder builder = new DocumentBuilder(doc);
// Assumes the HTML file is UTF-8 encoded.
String html = new String(Files.readAllBytes(Paths.get("in.html")), StandardCharsets.UTF_8);

builder.insertHtml(html);
doc.save("out.docx");
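
The width arithmetic in the reply above can be checked with a few lines of plain Java. This is only a sketch: the class and method names are hypothetical and not part of Aspose.Words, the 1260 px and 96 dpi figures are taken from the reply, and the 6.5 inch figure is assumed to correspond to a US Letter page (8.5 in wide) with 1-inch left and right margins.

```java
// Hypothetical helper, not part of Aspose.Words: unit arithmetic only.
final class PageWidthMath {
    static final double POINTS_PER_INCH = 72.0;

    /** Converts a width in pixels to inches at the given resolution. */
    static double pixelsToInches(double pixels, double dpi) {
        return pixels / dpi;
    }

    /** Converts inches to points (1 pt = 1/72 in). */
    static double inchesToPoints(double inches) {
        return inches * POINTS_PER_INCH;
    }

    /** True when content of the given width fits the available body width. */
    static boolean fits(double contentInches, double bodyInches) {
        return contentInches <= bodyInches;
    }

    public static void main(String[] args) {
        double needed = pixelsToInches(1260, 96);       // 13.125 in
        double letterBody = 8.5 - 2 * 1.0;              // assumed Letter with 1 in margins
        double a3LandscapeBody = 420 / 25.4 - 2 * 1.0;  // A3 long edge is 420 mm, same assumed margins
        System.out.println("Needed: " + needed + " in, " + inchesToPoints(needed) + " pt");
        System.out.println("Fits Letter portrait: " + fits(needed, letterBody));
        System.out.println("Fits A3 landscape: " + fits(needed, a3LandscapeBody));
    }
}
```

Under these assumptions the content needs roughly twice the width a Letter portrait page offers, which is why the suggested workaround switches to A3 landscape rather than only rotating the page.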

Thank you for your reply. Changing the orientation and/or page size is certainly one option. Is there a way to know whether the given HTML would require this change? We would only want to do this for documents that need it, but I’m not sure how to check for this cut-off condition. Can you suggest an approach?

I was hoping Document.renderToScale() would provide the solution. Do you think I could use this API method to scale the document to fit in the standard layout? If so, I have the same question about how to detect whether it is needed.

I hope you can offer a suggestion. Thanks again!

@sharonkass,

We have logged your query in our issue tracking system and will keep you posted here on further updates.

@sharonkass,

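You can use the LayoutEnumerator class to check whether all tables fit within the page body:

// Imports needed: com.aspose.words.*, java.nio.charset.StandardCharsets, java.nio.file.Files, java.nio.file.Paths
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);

// Assumes the HTML file is UTF-8 encoded.
String html = new String(Files.readAllBytes(Paths.get("C:\\Temp\\html file.txt")), StandardCharsets.UTF_8);
builder.insertHtml(html);

// Switch to a larger landscape page only when a table row overflows.
boolean allTablesFitPageBody = PageSizeValidator.Validate(doc);
if (!allTablesFitPageBody) {
    PageSetup pageSetup = doc.getFirstSection().getPageSetup();
    pageSetup.setOrientation(Orientation.LANDSCAPE);
    pageSetup.setPaperSize(PaperSize.A3);
}

doc.save("C:\\temp\\awjava-21.8.docx");

PageSizeValidator: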

import com.aspose.words.Document;
import com.aspose.words.LayoutEntityType;
import com.aspose.words.LayoutEnumerator;

import java.awt.*;

public class PageSizeValidator {
    public static boolean Validate(Document doc) throws Exception {
        return Validate(new LayoutEnumerator(doc), null);
    }

    private static boolean Validate(LayoutEnumerator layoutEnumerator, Rectangle pageSize) throws Exception {
        do {
            if (layoutEnumerator.getType() == LayoutEntityType.PAGE) {
                pageSize = layoutEnumerator.getRectangle().getBounds();
            }

            // Check if a table row crosses the right edge of the page rectangle recorded above.
            if ((layoutEnumerator.getType() == LayoutEntityType.ROW) &&
                    ((layoutEnumerator.getRectangle().getX() + layoutEnumerator.getRectangle().getWidth()) >
                            (pageSize.getX() + pageSize.getWidth()))) {
                return false;
            }

            if (layoutEnumerator.moveFirstChild()) {
                // Recurse into this child element.
                if (!Validate(layoutEnumerator, pageSize)) {
                    return false;
                }
                layoutEnumerator.moveParent();
            }
        } while (layoutEnumerator.moveNext());

        return true;
    }
}
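
If switching every overflowing document to A3 landscape is more than needed, the same kind of measurement could drive the choice of page size. The sketch below is plain Java with no Aspose.Words calls. The class, the candidate list, and the idea of feeding it the right edge of the widest row (in points, as a LayoutEnumerator pass like the one above could report) are assumptions for illustration, not something the reply prescribes. The layout may also shift once the page is resized, so the result is best treated as a starting point and re-validated.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper: picks the narrowest candidate page whose width covers the content.
final class PageChooser {
    /** A candidate page: a label plus its width in points (1 pt = 1/72 in). */
    static final class Candidate {
        final String label;
        final double widthPoints;

        Candidate(String label, double widthPoints) {
            this.label = label;
            this.widthPoints = widthPoints;
        }
    }

    // Ordered narrowest to widest. Letter is 8.5 x 11 in; A3 is 297 x 420 mm.
    static final List<Candidate> CANDIDATES = Arrays.asList(
            new Candidate("Letter portrait", 8.5 * 72),
            new Candidate("Letter landscape", 11.0 * 72),
            new Candidate("A3 portrait", 297 / 25.4 * 72),
            new Candidate("A3 landscape", 420 / 25.4 * 72));

    /**
     * Returns the first candidate wide enough for content whose right edge sits at
     * rightEdgePoints from the left page edge, keeping rightMarginPoints free.
     * The result is empty when no candidate is wide enough.
     */
    static Optional<Candidate> choose(double rightEdgePoints, double rightMarginPoints) {
        return CANDIDATES.stream()
                .filter(c -> rightEdgePoints + rightMarginPoints <= c.widthPoints)
                .findFirst();
    }

    public static void main(String[] args) {
        // Example: the widest row ends 900 pt from the left page edge, with a 72 pt right margin.
        System.out.println(choose(900, 72).map(c -> c.label).orElse("no candidate fits"));
    }
}
```

Mapping the chosen label back to PageSetup calls such as setOrientation(Orientation.LANDSCAPE) and setPaperSize(PaperSize.A3) is left to the caller.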

This looks perfect! Let me give it a try - thanks!

Thanks again - really appreciate your help!
