Move Hidden Text to Previous Paragraph & Remove Extra Vertical Space from Word Table Number List Java

Gptrnt · July 25, 2020, 2:16pm

Hi,

I am formatting an input document (replacing some words in the input with data) and downloading the result as both PDF and Word. While creating the output, I am facing two issues:

1. I am giving the Heading_1 style to table cell items and setting setDefaultBookmarksOutlineLevel to 1, but the PDF is not picking up the default bookmark.
2. I am adding a numbered list for some values in the table, but after the bullet I am getting an extra enter. I tried to "write" the last one, but then the last one is not treated as a list item.

I am attaching my sample code PdfHeadingBookmark.zip (113.3 KB). Please help me to fix the issue.

Thank you

awais.hafeez · July 26, 2020, 6:12am

@Gptrnt,

As a workaround, please use the following Java code to get the desired output:

// Requires: import com.aspose.words.*; import java.util.regex.Pattern;
License lic = new License();
lic.setLicense("Aspose.Total.Product.Family.lic");

Document document = new Document("E:\\Temp\\PdfHeadingBookmark\\input.docx");
NodeCollection<Paragraph> paragraphList = document.getChildNodes(NodeType.PARAGRAPH, true);

FindReplaceOptions findReplaceOptions = new FindReplaceOptions(FindReplaceDirection.BACKWARD);
findReplaceOptions.setReplacingCallback(new FindAndInsertHtml());
// Note: the first range is [a-zA-Z]; the range A-z would also admit the characters [ \ ] ^ _ `
Pattern pattern = Pattern.compile("(\\|[a-zA-Z]+[\\{[a-zA-Z0-9\\,\\-\\/]*\\}]*\\|)", Pattern.CASE_INSENSITIVE);
for (Paragraph paragraph : (Iterable<Paragraph>) paragraphList) {
    Range paragraphRange = paragraph.getRange();
    try {
        paragraphRange.replace(pattern, "", findReplaceOptions);
    } catch (Exception e) {
        System.out.println("Exception occurred during token replacement in method replaceTokensInUploadedDoc(...) : " + e);
        throw new Exception(e);
    }

}

// Let's post-process the output further
int i = 1;
DocumentBuilder builder = new DocumentBuilder(document);
for (Paragraph para : (Iterable<Paragraph>) document.getChildNodes(NodeType.PARAGRAPH, true)) {
    // Bookmark the Heading Paragraphs
    // This way they will appear in PDF document's outline
    if (para.getParagraphFormat().isHeading() && !para.toString(SaveFormat.TEXT).trim().equals("")) {
        builder.moveTo(para);
        builder.startBookmark("give some name_" + i);
        builder.endBookmark("give some name_" + i++);
    }
    // Simulate removal of vertical space (enter) 
    if (para.isEndOfCell() && para.toString(SaveFormat.TEXT).trim().equals("")) {
        // The previous sibling may be a Table rather than a Paragraph, so check the type before casting
        Node prevNode = para.getPreviousSibling();
        if (prevNode instanceof Paragraph && ((Paragraph) prevNode).isListItem()) {
            para.getParagraphBreakFont().setSize(1);
            para.getParagraphFormat().setSpaceAfter(0);
            para.getParagraphFormat().setSpaceBefore(0);
        }
    }
}

PdfSaveOptions options = new PdfSaveOptions();
options.getOutlineOptions().setDefaultBookmarksOutlineLevel(1);
options.getOutlineOptions().setHeadingsOutlineLevels(2);
document.save("E:\\Temp\\PdfHeadingBookmark\\awjava-20.7-.pdf", options);
document.save("E:\\Temp\\PdfHeadingBookmark\\awjava-20.7-.docx", SaveFormat.DOCX);
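
For reference, here is a minimal, library-free sketch of which placeholders the token pattern above is likely to match. It uses only java.util.regex, with the first character range written as [a-zA-Z]. The class name TokenPatternDemo, the helper findTokens and the sample sentence are made up for illustration and are not part of the attached project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenPatternDemo {
    // A pipe, one or more letters, then any mix of letters, digits, ',', '-', '/', '*', '{' and '}', then a closing pipe
    static final Pattern TOKEN = Pattern.compile(
            "(\\|[a-zA-Z]+[\\{[a-zA-Z0-9\\,\\-\\/]*\\}]*\\|)", Pattern.CASE_INSENSITIVE);

    // Collects every non-overlapping token found in the text, in order of appearance
    static List<String> findTokens(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group(1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical sample text, not taken from input.docx
        System.out.println(findTokens("Dear |name|, due |date{dd/MM/yyyy}|."));
        // prints [|name|, |date{dd/MM/yyyy}|]
    }
}
```

A pipe followed by a space, as in "a | b", is not treated as a token because at least one letter must directly follow the opening pipe.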

Gptrnt · August 7, 2020, 10:40am

Hi,

Your solution is working fine.
I am having another issue regarding space. In my output document output.zip (15.7 KB), I am adding HTML in the notes and adding a special character between this HTML. In normal view (without showing the hidden words), you can see extra space in the note content. Is there a way I can remove that space?

I am attaching my sample to reproduce the issue PdfHeadingBookmark (2).zip (125.5 KB)

Thank you

awais.hafeez · August 8, 2020, 5:13am

@Gptrnt,

In addition to my previous answer, you can simply reduce the font size of hidden text in your output document to simulate removal of vertical space:

Document document = new Document("C:\\Temp\\PdfHeadingBookmark (2)\\output.docx");

for (Run run : (Iterable<Run>) document.getChildNodes(NodeType.RUN, true)) {
    if (run.getFont().getHidden()) {
        run.getFont().setSize(1);
    }
}

for (Paragraph para : (Iterable<Paragraph>) document.getChildNodes(NodeType.PARAGRAPH, true)) {
    if (para.isEndOfCell()) {
        para.getParagraphBreakFont().setSize(1);
        para.getParagraphFormat().setSpaceAfter(0);
        para.getParagraphFormat().setSpaceBefore(0);
    }
}

document.save("C:\\Temp\\PdfHeadingBookmark (2)\\awjava-20.7.docx");

Gptrnt · August 19, 2020, 12:46pm

Hi,
I tried your solution, but it is not giving the expected result. output.zip (15.7 KB) is the output I got from your solution. Instead of deleting the extra space, it is deleting the hidden word. I only want to remove the extra space.

Thank you

awais.hafeez · August 20, 2020, 5:22am

@Gptrnt,

The code does not remove the hidden words; instead, it reduces their font size to the bare minimum and trims the extra vertical space by setting the paragraph ‘space after’ and ‘space before’ values to 0.

Please ZIP and attach your actual expected DOCX file showing the desired output here for our reference. You can try to create this document manually by using MS Word. Please also list the complete steps that you performed in MS Word to create the expected document on your end. We will then further investigate the scenario and provide you code to achieve the same by using Aspose.Words.

Gptrnt · August 20, 2020, 7:36am

Hi,

I am attaching my sample code with the output, the expected output and the input document: PdfHeadingBookmark (3).zip (143.5 KB). I am exporting the generated document, so I want the hidden word in the same paragraph.

awais.hafeez · August 20, 2020, 3:33pm

@Gptrnt,

Another way is to first check whether all characters in the last Paragraph of a Cell are hidden, then move them to the end of the previous Paragraph and remove the last Paragraph:

Document document = new Document("C:\\Temp\\PdfHeadingBookmark (2)\\output.docx");

// Requires java.util.ArrayList in addition to com.aspose.words.*
ArrayList<Paragraph> toBeDeleted = new ArrayList<>();
for (Paragraph para : (Iterable<Paragraph>) document.getChildNodes(NodeType.PARAGRAPH, true)) {
    if (para.isEndOfCell() && areAllRunsHidden(para)) {
        if (para.getPreviousSibling() != null &&
                para.getPreviousSibling().getNodeType() == NodeType.PARAGRAPH) {

            for (Node node : (Iterable<Node>) para.getChildNodes(NodeType.ANY, true))
                ((Paragraph) para.getPreviousSibling()).appendChild(node);

            toBeDeleted.add(para);
        }
    }
}

for (Paragraph para : (Iterable<Paragraph>) toBeDeleted) {
    para.remove();
}

document.save("C:\\Temp\\PdfHeadingBookmark (2)\\awjava-20.8.docx");

public static boolean areAllRunsHidden(Paragraph para) {
    boolean flag = true;
    for (Run run : (Iterable<Run>) para.getChildNodes(NodeType.RUN, true)) {
        if (!run.getFont().getHidden()) {
            flag = false;
            break;
        }
    }
    return flag;
}
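
To make the idea easier to follow without the library, here is a small plain-Java model of the same steps. It is only a sketch: Run and Para are hypothetical stand-ins for the Aspose.Words classes, a cell is modelled as a plain list of paragraphs, and mergeTrailingHidden is an illustrative name rather than an Aspose.Words method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class HiddenRunMergeSketch {
    // Hypothetical stand-in for an Aspose.Words Run: some text plus a hidden flag
    static final class Run {
        final String text;
        final boolean hidden;

        Run(String text, boolean hidden) {
            this.text = text;
            this.hidden = hidden;
        }
    }

    // Hypothetical stand-in for an Aspose.Words Paragraph: an ordered list of runs
    static final class Para {
        final List<Run> runs = new ArrayList<>();

        Para(Run... initialRuns) {
            Collections.addAll(runs, initialRuns);
        }
    }

    // Like areAllRunsHidden above, this is true for a paragraph that has no runs at all
    static boolean allRunsHidden(Para para) {
        return para.runs.stream().allMatch(run -> run.hidden);
    }

    // If the last paragraph of the cell is entirely hidden, append its runs
    // to the previous paragraph and drop the now empty last paragraph
    static void mergeTrailingHidden(List<Para> cell) {
        if (cell.size() < 2) {
            return; // no previous paragraph to merge into
        }
        Para last = cell.get(cell.size() - 1);
        if (!allRunsHidden(last)) {
            return; // visible content stays where it is
        }
        Para previous = cell.get(cell.size() - 2);
        previous.runs.addAll(last.runs);
        cell.remove(cell.size() - 1);
    }

    public static void main(String[] args) {
        List<Para> cell = new ArrayList<>();
        cell.add(new Para(new Run("Item one", false)));
        cell.add(new Para(new Run("|token|", true)));
        mergeTrailingHidden(cell);
        System.out.println(cell.size() + " paragraph, " + cell.get(0).runs.size() + " runs");
        // prints 1 paragraph, 2 runs
    }
}
```

In this model the hidden run survives the merge; only the paragraph break that produced the extra vertical space disappears.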

Gptrnt · August 25, 2020, 5:37am

Hi,

Thank you, it's working fine.