Spacing and Indentation in Java

Kusumanchi.Rajesh · August 19, 2015, 7:18am

Hi,

I need the indentation of the body content and the spacing of the document, line by line.

For example, in the content below, “LIST OF SOFTWARES THAT SUPPORTS, office open Xml Word Vs PDF, RTF Vs WPD” has double indentation, and between “December 10, 2005” and “STATEMENT OF ASPOSE OF WORDS[1]” there is spacing 1 with a double Enter. We need this information for every line as we traverse the entire document.

LIST OF SOFTWARES THAT SUPPORTS, office open Xml Word Vs PDF, RTF Vs WPD
Aspose.Words 15.7.0
December 10, 2005
STATEMENT OF ASPOSE OF WORDS[1]
For reference, I have attached the input file.docx.
Thanks,
Rajesh

tahir.manzoor · August 20, 2015, 6:30am

Hi Rajesh,

Thanks for your inquiry. Please note that Aspose.Words mimics the behavior of MS Word.

Could you please share some more detail about your query ‘Indentation of the Content in the body and spacing of the Document in line by line.’?

Please read about specifying formatting from here:
https://docs.aspose.com/words/java/programming-with-documents/

If your query is about line spacing, please use the ParagraphFormat.LineSpacing property to get or set the line spacing (in points) for the paragraph. Line spacing applies to a whole Paragraph, not to a single line within it.

When the LineSpacingRule property is set to AtLeast, the line spacing can be greater than or equal to, but never less than, the specified LineSpacing value.

When the LineSpacingRule property is set to Exactly, the line spacing never changes from the specified LineSpacing value, even if a larger font is used within the paragraph.
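
To make the two rules concrete, here is a minimal sketch in plain Java. It does not use Aspose.Words: the SpacingRule enum and the effectiveLineHeight method are hypothetical names, and naturalHeight is an assumed input that stands for the height the line's content would need on its own.

```java
// Hypothetical stand-in for the two rules described above; not an Aspose.Words type.
enum SpacingRule { AT_LEAST, EXACTLY }

final class LineHeightSketch {
    /**
     * Returns the height (in points) a line ends up with.
     * lineSpacing:   the paragraph's LineSpacing value in points.
     * naturalHeight: the height the line's content needs on its own (assumed input).
     */
    static double effectiveLineHeight(SpacingRule rule, double lineSpacing, double naturalHeight) {
        switch (rule) {
            case AT_LEAST:
                // Never less than lineSpacing, but may grow for larger fonts.
                return Math.max(lineSpacing, naturalHeight);
            case EXACTLY:
                // Always lineSpacing, even if the content is taller.
                return lineSpacing;
            default:
                throw new IllegalArgumentException("Unknown rule: " + rule);
        }
    }

    public static void main(String[] args) {
        // A 20-point line in a paragraph with LineSpacing = 14.
        System.out.println(effectiveLineHeight(SpacingRule.AT_LEAST, 14.0, 20.0)); // 20.0
        System.out.println(effectiveLineHeight(SpacingRule.EXACTLY, 14.0, 20.0));  // 14.0
    }
}
```

With AtLeast the taller line keeps its own height, while with Exactly it is held to the specified value.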

tahir.manzoor · August 21, 2015, 2:53am

Hi Rajesh,

Thanks for sharing the details via live chat. I have created a sample document for LineSpacingRule and attached it to this post for your reference. Please use the following code example to achieve your requirements.

Document doc = new Document(MyDir + "LineSpacingRule.docx");
for (Paragraph para : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true))
{
    ParagraphFormat format = para.getParagraphFormat();
    double spacing = format.getLineSpacing();
    // The named spacings apply only under the Multiple rule, where one line is 12 points.
    if (format.getLineSpacingRule() == LineSpacingRule.MULTIPLE)
    {
        if (spacing == 12)
            System.out.println("Single");
        else if (spacing == 18)
            System.out.println("1.5 Lines");
        else if (spacing == 24)
            System.out.println("Double");
        else if (spacing > 24)
            System.out.println("Multiple");
    }
}

Please check the following details of the LineSpacingRule enumeration:

Multiple: The line spacing is specified in the LineSpacing property as the number of lines. One line equals 12 points.

Exactly: The line spacing never changes from the value specified in the LineSpacing property, even if a larger font is used within the paragraph.

AtLeast: The line spacing can be greater than or equal to, but never less than, the value specified in the LineSpacing property.
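
As an illustration of the "one line equals 12 points" rule, the following plain-Java sketch converts a LineSpacing value to a number of lines and picks a label. The class and method names (MultipleSpacing, toLines, label) are made up for this example and are not part of Aspose.Words. Unlike the if/else chain above, it also gives a label to values that fall between the named spacings, and it compares with a small tolerance instead of exact equality.

```java
final class MultipleSpacing {
    // Under the Multiple rule, one line equals 12 points.
    static final double POINTS_PER_LINE = 12.0;

    /** Converts a LineSpacing value in points to a number of lines. */
    static double toLines(double lineSpacingPoints) {
        return lineSpacingPoints / POINTS_PER_LINE;
    }

    /** Returns a readable label for a LineSpacing value under the Multiple rule. */
    static String label(double lineSpacingPoints) {
        double lines = toLines(lineSpacingPoints);
        if (near(lines, 1.0)) return "Single";
        if (near(lines, 1.5)) return "1.5 Lines";
        if (near(lines, 2.0)) return "Double";
        // Anything else is reported with its line count.
        return "Multiple (" + lines + " lines)";
    }

    // Tolerant comparison, so small floating-point differences do not matter.
    private static boolean near(double a, double b) {
        return Math.abs(a - b) < 1e-6;
    }

    public static void main(String[] args) {
        System.out.println(label(12.0)); // Single
        System.out.println(label(18.0)); // 1.5 Lines
        System.out.println(label(36.0)); // Multiple (3.0 lines)
    }
}
```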

Kusumanchi.Rajesh · August 21, 2015, 4:46am

Hi Tahir,

Thanks for the information.

I need page number and line number information too. Can Aspose do that?

Can you share an example of reading indentation from a DOCX file?

Thanks,
Rajesh

tahir.manzoor · August 21, 2015, 9:03am

Hi Rajesh,

Thanks for your inquiry. Please use the ParagraphFormat.FirstLineIndent property to get or set the value (in points) for a first-line or hanging indent. Use a positive value to set a first-line indent, and a negative value to set a hanging indent.

Use the ParagraphFormat.LeftIndent property to get or set the value (in points) that represents the left indent of the paragraph.
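
A small plain-Java sketch of how the two values combine. It assumes, as in Word's paragraph settings, that the first line starts at the left indent plus the first-line indent; IndentSketch and its methods are hypothetical helpers, not Aspose.Words API.

```java
final class IndentSketch {
    /** Offset of the first line from the left margin, in points (assumed: left + first-line indent). */
    static double firstLineStart(double leftIndent, double firstLineIndent) {
        return leftIndent + firstLineIndent;
    }

    /** Names the kind of indent from the sign of the first-line indent value. */
    static String describe(double firstLineIndent) {
        if (firstLineIndent > 0) return "first-line indent";
        if (firstLineIndent < 0) return "hanging indent";
        return "none";
    }

    public static void main(String[] args) {
        // Left indent 36pt with a 36pt hanging indent: the first line starts at the margin.
        System.out.println(firstLineStart(36.0, -36.0)); // 0.0
        System.out.println(describe(-36.0));             // hanging indent
        System.out.println(describe(18.0));              // first-line indent
    }
}
```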

Use the LayoutCollector.GetStartPageIndex method to get the 1-based index of the page where a node begins. It returns 0 if the node cannot be mapped to a page.

Please check the following code snippet. For line number information, please check my reply from here. Hope this helps you.

Please check the attached image for paragraph indents and spacing. You can get the paragraph format properties using ParagraphFormat class. Please read the members of this class from here:
https://reference.aspose.com/words/java/com.aspose.words/paragraphformat

Document doc = new Document(MyDir + "LineSpacingRule.docx");
// LayoutCollector maps document nodes to the pages they are laid out on.
LayoutCollector collector = new LayoutCollector(doc);
for (Paragraph para : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true))
{
    ParagraphFormat format = para.getParagraphFormat();
    double spacing = format.getLineSpacing();
    // The named spacings apply only under the Multiple rule, where one line is 12 points.
    if (format.getLineSpacingRule() == LineSpacingRule.MULTIPLE)
    {
        if (spacing == 12)
            System.out.println("Single");
        else if (spacing == 18)
            System.out.println("1.5 Lines");
        else if (spacing == 24)
            System.out.println("Double");
        else if (spacing > 24)
            System.out.println("Multiple");
    }
    // Get the left indent and first line indent of the paragraph (in points)
    System.out.println(format.getLeftIndent());
    System.out.println(format.getFirstLineIndent());
    // Get the 1-based page number on which the paragraph starts
    System.out.println(collector.getStartPageIndex(para));
}

Kusumanchi.Rajesh · August 25, 2015, 2:43am

Hi Tahir,

I need to know where each paragraph comes from, i.e. whether it is in the Body, Header, or Footer. Can Aspose do that?

package range;

import com.aspose.words.*;

public class Test_Spacing_Indentations {

    public static void main(String[] args) throws Exception {

        Document doc = new Document("C:\\codereview\\R5.3\\SBC\\Aspose\\Input files\\Test_Semicolon.docx");
        LayoutCollector collector = new LayoutCollector(doc);
        int line_number = 0;
        int index = 0;
        String lasttext = "";
        Boolean check =false;
        for (Paragraph para : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true))
        {
            System.out.println("text-->" + para.getRange().getText());
            collector.getStartPageIndex(para);
            check = para.getAncestor(NodeType.HEADER_FOOTER).getNodeType() == NodeType.HEADER_FOOTER;
            if (check.TRUE)
            {
                System.out.println("Header");
            }
            System.out.println("Start offset -->"+ index);
            String text = para.toString(SaveFormat.TEXT);
            index = text.length();
            System.out.println("end offset -->"+ index);
            System.out.println("PageNo-->" + collector.getStartPageIndex(para));
            System.out.println("Line Number-->" + line_number);
            System.out.println("Line Text-->" + para.getText());
            System.out.println("end offset -->"+ index);
            if(para.getParagraphFormat().getLineSpacing() == 12 && para.getParagraphFormat().getLineSpacingRule() == LineSpacingRule.MULTIPLE)
            {
                System.out.println("Single");
                // index = lasttext.length();
            }
            else if (para.getParagraphFormat().getLineSpacing() == 18 && para.getParagraphFormat().getLineSpacingRule() == LineSpacingRule.MULTIPLE)
            {
                System.out.println("1.5 Lines");
            }
            else if (para.getParagraphFormat().getLineSpacing() == 24 && para.getParagraphFormat().getLineSpacingRule() == LineSpacingRule.MULTIPLE)
            {
                System.out.println("Double");
            }
            else if (para.getParagraphFormat().getLineSpacing() > 24 && para.getParagraphFormat().getLineSpacingRule() == LineSpacingRule.MULTIPLE)
            {
                System.out.println("Multiple");
            }
            line_number++;
            lasttext = para.getText();
            // Get the left indent, right indent, and first line indent of the paragraph
            // System.out.println("left Indent-->" + para.getParagraphFormat().getLeftIndent());
            // System.out.println("Right Indent-->" + para.getParagraphFormat().getRightIndent());
            // System.out.println("First Line Indent" + para.getParagraphFormat().getFirstLineIndent());
        }
    }
}

Thanks,
Rajesh

tahir.manzoor · August 25, 2015, 5:55am

Hi Rajesh,

Thanks for your inquiry. Please use the following code example to achieve your requirements. Note the null check on the result of getAncestor.

Please feel free to ask if you have any question about Aspose.Words, we will be happy to help you.

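Document doc = new Document(MyDir + "Test_Semicolon.docx");
for (Paragraph para : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true))
{
    System.out.println("text-->" + para.getRange().getText());
    // getAncestor returns null when the paragraph has no ancestor of that type,
    // so check for null before using the result.
    Node node = para.getAncestor(NodeType.HEADER_FOOTER);
    boolean inHeaderFooter = node != null && node.getNodeType() == NodeType.HEADER_FOOTER;
    if (inHeaderFooter)
    {
        // A HEADER_FOOTER ancestor covers both headers and footers.
        System.out.println("Header");
    }
}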
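
To show why the null check matters, here is a simplified, self-contained model in plain Java. Kind and SimpleNode are hypothetical stand-ins written for this sketch, not Aspose.Words classes; the only behavior borrowed from the code above is that an ancestor lookup gives null when nothing matches. Calling a method directly on that result, as the earlier code did with getNodeType(), fails with a NullPointerException for every paragraph that is not in a header or footer.

```java
// Hypothetical, simplified node model; these are not Aspose.Words classes.
enum Kind { DOCUMENT, SECTION, BODY, HEADER_FOOTER, PARAGRAPH }

final class SimpleNode {
    final Kind kind;
    final SimpleNode parent;

    SimpleNode(Kind kind, SimpleNode parent) {
        this.kind = kind;
        this.parent = parent;
    }

    /** Walks up the parent chain; returns null when no ancestor has the given kind. */
    SimpleNode getAncestor(Kind wanted) {
        for (SimpleNode n = parent; n != null; n = n.parent) {
            if (n.kind == wanted) return n;
        }
        return null;
    }

    /** Classifies a paragraph without ever dereferencing a null ancestor. */
    static String locate(SimpleNode paragraph) {
        if (paragraph.getAncestor(Kind.HEADER_FOOTER) != null) return "Header/Footer";
        if (paragraph.getAncestor(Kind.BODY) != null) return "Body";
        return "Other";
    }

    public static void main(String[] args) {
        SimpleNode doc = new SimpleNode(Kind.DOCUMENT, null);
        SimpleNode section = new SimpleNode(Kind.SECTION, doc);
        SimpleNode bodyPara = new SimpleNode(Kind.PARAGRAPH, new SimpleNode(Kind.BODY, section));
        SimpleNode headerPara = new SimpleNode(Kind.PARAGRAPH, new SimpleNode(Kind.HEADER_FOOTER, section));
        System.out.println(locate(bodyPara));   // Body
        System.out.println(locate(headerPara)); // Header/Footer
    }
}
```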

Kusumanchi.Rajesh · August 28, 2015, 6:25am

Hi Tahir,

While reading the text from the paragraphs, the footnote content appears in the output along with the body text.

We don't want the footnote content mixed in with the body; it has to be reported separately.

Please find the code below:

Document doc = new Document("C:\\codereview\\R5.3\\SBC\\Aspose\\Input files\\input.docx");
LayoutCollector collector = new LayoutCollector(doc);
int index = -1;
String lasttext = "";
System.out.println("<?xml version=\"1.0\" encoding=\"utf-8\" ?>");
System.out.println("");
for (Paragraph para : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true))
{
    index++;
    System.out.println("");
    try
    {
        if (para.getAncestor(NodeType.HEADER_FOOTER).getNodeType() == NodeType.HEADER_FOOTER)
            System.out.println("Header/Footer");
    }
    catch (Exception e)
    {
        if (para.getAncestor(NodeType.BODY).getNodeType() == NodeType.BODY)
            System.out.println("Body");
    }

    System.out.println("" + index + "");
    String text = para.getRange().getText().replaceAll("null", "").trim();
    System.out.println(text.trim().length());
    index = index + text.trim().length();
    System.out.println("<End Offset>" + index + "</End Offset>");
    System.out.println("Line Text-->" + para.getText().trim());
    if (para.getParagraphFormat().getLineSpacing() == 12 && para.getParagraphFormat().getLineSpacingRule() == LineSpacingRule.MULTIPLE)
    {
        System.out.println("Single");
    }
    else if (para.getParagraphFormat().getLineSpacing() == 18 && para.getParagraphFormat().getLineSpacingRule() == LineSpacingRule.MULTIPLE)
    {
        System.out.println("1.5 Lines");
    }
    else if (para.getParagraphFormat().getLineSpacing() == 24 && para.getParagraphFormat().getLineSpacingRule() == LineSpacingRule.MULTIPLE)
    {
        System.out.println("Double");
    }
    else if (para.getParagraphFormat().getLineSpacing() > 24 && para.getParagraphFormat().getLineSpacingRule() == LineSpacingRule.MULTIPLE)
    {
        System.out.println("Multiple");
    }
    System.out.println("<Left Indentation>" + para.getParagraphFormat().getLeftIndent() + "</Left Indentation>");
    System.out.println("<Right Indentation>" + para.getParagraphFormat().getRightIndent() + "</Right Indentation>");
    System.out.println("");
    System.out.println();
}

Thanks,
Rajesh

tahir.manzoor · August 31, 2015, 6:37am

Hi Rajesh,

Thanks for your inquiry. In this case, I suggest removing the footnotes from the document before reading the paragraphs, as in the following code snippet. Hope this helps you. Please let us know if you have any more queries.

Document doc = new Document(MyDir + "input(2).docx");
// Remove every footnote node so that footnote text no longer appears in the paragraph text.
doc.getChildNodes(NodeType.FOOTNOTE, true).clear();
LayoutCollector collector = new LayoutCollector(doc);
// Your code

Kusumanchi.Rajesh · September 1, 2015, 8:51am

Hi Tahir,

Thanks for sharing the info.

I want to identify the header and footer content in the input document.

I am using the code below, but it's not working:

for (Paragraph para : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true))
{
    try
    {
        if (para.getAncestor(NodeType.HEADER_FOOTER).getNodeType() == NodeType.HEADER_FOOTER)
        {
            if (para.getAncestor(HeaderFooterType.HEADER_PRIMARY).getNodeType() == HeaderFooterType.HEADER_PRIMARY)
                System.out.println("Header");
            else if (para.getAncestor(HeaderFooterType.FOOTER_PRIMARY).getNodeType() == HeaderFooterType.FOOTER_PRIMARY)
                System.out.println("Footer");
        }
    }
    catch (Exception e)
    {
        if (para.getAncestor(NodeType.BODY).getNodeType() == NodeType.BODY)
            System.out.println("Body");
    }
}

Can Aspose validate this?

Thanks,
Rajesh

tahir.manzoor · September 1, 2015, 10:33am

Hi Rajesh,

Thanks for your inquiry. Your code works fine at my end. Could you please share the incorrect output you are getting?

If you just want to get the paragraph nodes of the header/footer or body of a section, please use the following code example. Hope this helps you. Please let us know if you have any more queries.

Document doc = new Document(MyDir + "in.docx");
// Remove footnotes so their text is not mixed into the paragraph output.
doc.getChildNodes(NodeType.FOOTNOTE, true).clear();
for (Section section : doc.getSections())
{
    // Paragraphs of the primary header and primary footer of this section.
    for (HeaderFooter headerfooter : (Iterable<HeaderFooter>) section.getHeadersFooters())
    {
        int type = headerfooter.getHeaderFooterType();
        if (type == HeaderFooterType.HEADER_PRIMARY || type == HeaderFooterType.FOOTER_PRIMARY)
        {
            for (Paragraph para : (Iterable<Paragraph>) headerfooter.getChildNodes(NodeType.PARAGRAPH, true))
            {
                System.out.println(para.toString(SaveFormat.TEXT));
                System.out.println(type == HeaderFooterType.HEADER_PRIMARY ? "Header" : "Footer");
            }
        }
    }
    // Paragraphs of the main text of this section.
    for (Paragraph para : (Iterable<Paragraph>) section.getBody().getChildNodes(NodeType.PARAGRAPH, true))
    {
        System.out.println(para.toString(SaveFormat.TEXT));
        System.out.println("Body");
    }
}

Kusumanchi.Rajesh · September 3, 2015, 2:19am

Hey Tahir,

The expected output for this document is below:

Text--> This document has Abc. II, & 2, de.111 been edited for illustratio

Header

Text--> Page 1 of 1

FOOTER

For this I am using validation like the following.

For header:

for (Paragraph para : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true))
{

    if (para.getAncestor(NodeType.HEADER_FOOTER).getNodeType() == NodeType.HEADER_FOOTER)
    {
        if (para.getAncestor(HeaderFooterType.HEADER_FIRST).getNodeType() == HeaderFooterType.HEADER_FIRST)
        {

For Footer:

for (Paragraph para : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true))
{
    if (para.getAncestor(NodeType.HEADER_FOOTER).getNodeType() == NodeType.HEADER_FOOTER)
    {
        if (para.getAncestor(HeaderFooterType.FOOTER_PRIMARY).getNodeType() == HeaderFooterType.FOOTER_PRIMARY)
        {

tahir.manzoor · September 3, 2015, 8:23am

Hi Rajesh,

Thanks for your inquiry and for sharing the details via live chat. Please use the following code example to achieve your requirements.

Document doc = new Document(MyDir + "inputfile (2).docx");
for (Paragraph para : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true))
{
    // Non-empty paragraphs inside the main text are reported as "Body".
    if (para.getAncestor(NodeType.BODY) != null && !para.toString(SaveFormat.TEXT).trim().equals(""))
    {
        System.out.println("Body");
    }
    else if (para.getAncestor(NodeType.HEADER_FOOTER) != null)
    {
        // The HeaderFooter ancestor tells a header apart from a footer.
        HeaderFooter hf = (HeaderFooter) para.getAncestor(NodeType.HEADER_FOOTER);
        if (hf.getHeaderFooterType() == HeaderFooterType.HEADER_PRIMARY)
            System.out.print("Header" + para.toString(SaveFormat.TEXT));
        else if (hf.getHeaderFooterType() == HeaderFooterType.FOOTER_PRIMARY)
            System.out.print("Footer" + para.toString(SaveFormat.TEXT));
    }
}

The following is the output of this code example. Hope this helps you.

HeaderThis document has Abc. II, & 2, de.111 been edited for illustratio
FooterPage 1 of 1
Body
Body
Body
Body

Kusumanchi.Rajesh · September 24, 2015, 8:59am

Hi Tahir,

Thanks for the information.

I want to get the endnote information from the DOCX file, the same way as for footnotes, through the paragraph API. Is that possible?

for (Paragraph para : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true))
{
    if (para.getAncestor(NodeType.FOOTNOTE).getNodeType() == NodeType.FOOTNOTE)
    {
    }
}

Thanks,
Rajesh

tahir.manzoor · September 25, 2015, 2:09pm

Hi Rajesh,

Thanks for your inquiry. Please use the Footnote.FootnoteType property to get a value that specifies whether this is a footnote or an endnote. Please let us know if you have any more queries.