Hi!
I am using Aspose in my Java project. I have a document with bookmarks and I need to get the content between them. The nodes I get are of type paragraph, so when I append those nodes I get one paragraph for each node (each with its own line break).
@Jagovi In this case you should append only the paragraph content, not the paragraph itself. For example, see the following simple code:
Document doc = new Document("C:\\Temp\\in.docx");
// Get paragraphs in the document.
Iterable<Paragraph> paragraphs = doc.getFirstSection().getBody().getParagraphs();
// Create paragraph we will insert the content to.
Paragraph target = new Paragraph(doc);
// Put content into the target paragraph.
for (Paragraph p : paragraphs)
{
    while (p.hasChildNodes())
        target.appendChild(p.getFirstChild());
}
doc.getFirstSection().getBody().appendChild(target);
doc.save("C:\\Temp\\out.docx");
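The inner while loop above ends because appending a child to the target also removes it from its source paragraph. As a rough, library-free illustration of that move-and-drain pattern (this is not Aspose code; the class name MoveChildrenSketch, the method mergeInto and the use of plain strings as "child nodes" are assumptions made only for this sketch):

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class MoveChildrenSketch {

    /**
     * Hypothetical model: each source "paragraph" is a queue of child texts.
     * All children are moved, in order, into a single target list.
     */
    static List<String> mergeInto(List<Deque<String>> paragraphs) {
        List<String> target = new ArrayList<>();
        for (Deque<String> p : paragraphs) {
            // Taking the first child removes it from the source paragraph,
            // so the loop stops once that paragraph is empty.
            while (!p.isEmpty()) {
                target.add(p.pollFirst());
            }
        }
        return target;
    }
}
```

After the call every source queue is empty and the target holds all children in document order, which is why the merged content ends up in one paragraph.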
Thanks @alexey.noskov for your quick response. I think I didn’t explain myself very well. I will try to be more descriptive.
I have a marker with several sub-markers and I want to go through all of them without generating a new paragraph for each one (with its line breaks).
I have:
@@Marker1([zone1][zone2][zone3])
I need to treat each bookmark as a paragraph, but each sub-bookmark as an inline element with no paragraph or line breaks.
My document (.DOC) is like the following snippet:
[START_ZONE1]
RESUME:
[END_ZONE1]
[START_ZONE2]
Test
[END_ZONE2]
[START_ZONE3]
1
[END_ZONE3]
I get the following output:
RESUME:
Test
1
I expect to get as output:
RESUME: Test 1
(Without paragraph breaks before or after.)
Thank you very much and sorry if I haven’t explained myself correctly.
@Jagovi Yes, I understood your requirements, and my suggestion is to insert only the content of the extracted paragraph instead of the whole paragraph, just as I demonstrated in my simple code example. In this case there will be no redundant line breaks.
Hi!!
I am here again
I have found a case that this code does not cover, and I can’t solve it. If the content to retrieve spans several paragraphs in the document, this code puts all of the content into the same paragraph.
And if I use @@ZONES([ZONE1][ZONE2][ZONE3]) I get:
RESUMEN:TestTest2123456
But I need:
RESUME:Test
Test21
2
345
6
If ZONE1, ZONE2 or ZONE3 contain line breaks or similar, I would like to carry them over to my document. But I don’t want line breaks between ZONE1, ZONE2 and ZONE3.
Is this possible?
Thanks for your attention, and sorry for my poor explanation when I created the thread.
@Jagovi In this case content between tags is represented by several paragraphs, so you should modify your code like this:
public static void insertParaContent(DocumentBuilder builder, Paragraph[] paragraphs)
{
    for (int i = 0; i < paragraphs.length; i++) {
        Paragraph para = paragraphs[i];
        while (para.hasChildNodes())
            builder.insertNode(para.getFirstChild());
        if (i < (paragraphs.length - 1))
            builder.writeln();
    }
}
Alternatively, instead of paragraph breaks in your content you can use soft line breaks (Shift+Enter in MS Word). In this case all content between the tags will be represented by a single paragraph with soft line breaks and will be handled properly by the code suggested in my previous answer.
@Jagovi The above code was provided for demonstration purposes. In your real scenario you extract the paragraphs between the [START_ZONEN] and [END_ZONEN] tags. With the modified method I provided in my previous answer, you should pass the array of extracted paragraphs as the last parameter of the insertParaContent method.
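To make the break rule concrete, here is a library-free sketch of the same logic. It is not Aspose code: the class ZoneJoinSketch and the method joinZones are hypothetical names, each zone is modeled as a list of paragraph texts, and the zone contents are assumed from the expected output posted above.

```java
import java.util.List;

final class ZoneJoinSketch {

    /**
     * Joins zones inline: paragraph breaks inside a zone are kept,
     * but no break is added between one zone and the next.
     */
    static String joinZones(List<List<String>> zones) {
        StringBuilder out = new StringBuilder();
        for (List<String> zone : zones) {
            for (int i = 0; i < zone.size(); i++) {
                out.append(zone.get(i));
                // Same rule as insertParaContent: a break after every
                // paragraph of the zone except the last one.
                if (i < zone.size() - 1) {
                    out.append('\n');
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed zone contents, inferred from the expected output in the thread.
        List<List<String>> zones = List.of(
            List.of("RESUME:"),
            List.of("Test", "Test2"),
            List.of("1", "2", "345", "6"));
        System.out.println(joinZones(zones));
    }
}
```

With these assumed inputs the result has five lines: RESUME:Test, Test21, 2, 345 and 6, matching the output requested above.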
We are very close to the final solution.
If the text between the marks starts with a table or an image, for example, I lose that element in the final document. How can I correct it?
@Jagovi Images should be handled properly using the provided code, since images are inline nodes (are children of paragraphs). Tables are block level nodes and are on the same level as paragraphs. So to handle them, you should modify the code like this:
// Requires: import java.util.Iterator;
public static void insertParaContent(DocumentBuilder builder, Iterable<Node> paragraphs)
{
    Iterator<Node> iter = paragraphs.iterator();
    while (iter.hasNext())
    {
        Node current = iter.next();
        if (current.getNodeType() == NodeType.PARAGRAPH)
        {
            // If the current node is a paragraph we process it as earlier.
            Paragraph para = (Paragraph)current;
            while (para.hasChildNodes())
                builder.insertNode(para.getFirstChild());
            // As in the array version, insert a paragraph break only
            // when this is not the last node.
            if (iter.hasNext())
                builder.writeln();
        }
        else
        {
            // If the node is not paragraph, insert a paragraph break and insert
            // this node before the newly created paragraph.
            builder.writeln();
            builder.getCurrentParagraph().getParentNode().insertBefore(current, builder.getCurrentParagraph());
        }
    }
}
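The difference between inline content and block-level nodes can be traced with a small library-free model. This is only a sketch, not Aspose code: BlockInsertSketch, Block and insert are hypothetical names, a paragraph is reduced to its text, and the model follows the rule of the array version (a break after each paragraph except the last node).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class BlockInsertSketch {

    /** Hypothetical stand-in for a document node: a paragraph or a table. */
    static final class Block {
        final boolean isTable;
        final String text;
        Block(boolean isTable, String text) {
            this.isTable = isTable;
            this.text = text;
        }
    }

    /**
     * Inserts blocks at the end of a paragraph that already holds prefix.
     * Returns the resulting block sequence as printable labels.
     */
    static List<String> insert(String prefix, List<Block> blocks) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder(prefix);
        Iterator<Block> iter = blocks.iterator();
        while (iter.hasNext()) {
            Block b = iter.next();
            if (!b.isTable) {
                // Paragraph: its content goes inline into the current paragraph.
                current.append(b.text);
                // Close the paragraph only when more nodes follow.
                if (iter.hasNext()) {
                    out.add("P[" + current + "]");
                    current.setLength(0);
                }
            } else {
                // Table: close the current paragraph, then place the table
                // before the new, still empty, current paragraph.
                out.add("P[" + current + "]");
                current.setLength(0);
                out.add("TABLE[" + b.text + "]");
            }
        }
        // Whatever is left is the last paragraph.
        out.add("P[" + current + "]");
        return out;
    }
}
```

In this model a zone that starts with a table keeps the table as its own block right after the paragraph holding the preceding text. A paragraph directly followed by a table leaves an empty paragraph in between, because both branches emit a break.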
@Jagovi textBetweenMarks.getChildNodes(NodeType.BODY, true) gets all Body nodes. In your case you need to get the children of Body, i.e. paragraphs and tables.