Hi,
I have a scenario where I need to extract the text between a starting text and an ending text. The start and end can be on different pages. Is there a way to do this in Aspose.Words?
Regards,
Hi Rajeev,
Yes, you can do that. Please check https://docs.aspose.com/words/java/extract-selected-content-between-nodes/ for more details and let us know if you see any issue.
Best Regards,
Hi Ijaz,
Thanks for your response. I tried the code and was able to extract text in a simple scenario. I am facing the following challenges.
Regards,
Hi Rajeev,
Hi Ijaz,
Thanks for providing the solutions. Using the first approach, I have been bookmarking the start and end of the search text. In a scenario where there is no ending text, I need to get 100 characters or words counted from the starting bookmark. How can I achieve that using Aspose.Words for Java?
Regards,
Hi Rajeev,
Can you please share your sample document and expected output string (after getting 100 characters or words)?
Best Regards,
Hi Ijaz,
I am using the following code to add bookmarks to an existing document, and then I pass the starting and ending nodes to extractContent, but it does not seem to be working. After adding the bookmark I save the document, but when I open it I see the square bracket for the starting bookmark and not for the ending bookmark.
DocumentBuilder docBuilder = new DocumentBuilder(document);
// Move cursor to document start and insert bookmark
docBuilder.moveToDocumentStart();
NodeCollection paragraphs = summaryStatement.getChildNodes(NodeType.PARAGRAPH, true);
// Look through all paragraphs to find those that begin with the marker text.
for (Paragraph paragraph : (Iterable<Paragraph>) paragraphs)
{
    if (paragraph.toString(SaveFormat.TEXT).trim().startsWith("Starting Text"))
    {
        docBuilder.startBookmark("BookMark1");
    }
    if (paragraph.toString(SaveFormat.TEXT).trim().startsWith("Ending Text"))
    {
        docBuilder.endBookmark("BookMark1");
        break;
    }
}
Bookmark bookmark1 = summaryStatement.getRange().getBookmarks().get("BookMark1");
ArrayList nodes = null;
Document newdoc = null;
nodes = extractContent(bookmark1.getBookmarkStart(), bookmark1.getBookmarkEnd(), true);
newdoc = generateDocument(summaryStatement, nodes);
When I print newdoc.getText(), it is blank. Even bookmark1.getText() is blank.
Can you please help me understand if I am doing something wrong?
Regards,
Hi Rajeev,
It looks like you first move to the start of the document (docBuilder.moveToDocumentStart) and then start and end the bookmark without moving the cursor again, so both ends of the bookmark land at the same place and there is no text inside it. You need to move the cursor to the intended starting and ending positions before inserting each end of the bookmark, as in the following code.
DocumentBuilder docBuilder = new DocumentBuilder(doc);
docBuilder.moveToDocumentStart();
NodeCollection paragraphs = doc.getChildNodes(NodeType.PARAGRAPH, true);
// Look through all paragraphs to find those that begin with the marker text.
for (Paragraph paragraph : (Iterable<Paragraph>) paragraphs)
{
    if (paragraph.toString(SaveFormat.TEXT).trim().startsWith("StartText"))
    {
        // Move the cursor to the first run of the starting paragraph, then open the bookmark there.
        docBuilder.moveTo(paragraph.getChildNodes(NodeType.RUN, true).get(0));
        docBuilder.startBookmark("BookMark1");
    }
    if (paragraph.toString(SaveFormat.TEXT).trim().startsWith("EndText"))
    {
        // Move the cursor to the first run of the ending paragraph, then close the bookmark there.
        docBuilder.moveTo(paragraph.getChildNodes(NodeType.RUN, true).get(0));
        docBuilder.endBookmark("BookMark1");
        break;
    }
}
// Print the text now enclosed by the bookmark.
System.out.print(doc.getRange().getBookmarks().get("BookMark1").getText());
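For the earlier case where there is no ending text, one possible fallback is to work on the plain text instead of on nodes. The following is only a minimal sketch in plain Java and does not call any Aspose API. It assumes the text has already been obtained as a String (for example with toString(SaveFormat.TEXT), as used in the code above), and the names MarkerExtractor, between and firstWords are hypothetical helpers introduced for illustration.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Aspose.Words: operates on plain strings only.
final class MarkerExtractor {
    private MarkerExtractor() {}

    /**
     * Returns the text between startMarker and the next endMarker.
     * If endMarker does not occur after startMarker, returns at most
     * maxChars characters following startMarker instead.
     * Returns an empty string if startMarker is not found.
     */
    static String between(String text, String startMarker, String endMarker, int maxChars) {
        int start = text.indexOf(startMarker);
        if (start < 0) {
            return "";
        }
        int from = start + startMarker.length();
        int end = text.indexOf(endMarker, from);
        if (end >= 0) {
            return text.substring(from, end);
        }
        // No end marker: fall back to a fixed number of characters.
        int to = Math.min(text.length(), from + Math.max(0, maxChars));
        return text.substring(from, to);
    }

    /** Word-based fallback: the first maxWords whitespace-separated words of text. */
    static String firstWords(String text, int maxWords) {
        String trimmed = text.trim();
        if (trimmed.isEmpty() || maxWords <= 0) {
            return "";
        }
        String[] words = trimmed.split("\\s+");
        int n = Math.min(words.length, maxWords);
        return String.join(" ", Arrays.copyOf(words, n));
    }
}
```

With these helpers, MarkerExtractor.between(text, "StartText", "EndText", 100) would return the text between the markers when both exist, and up to 100 characters after "StartText" otherwise. Note that this returns text only; formatting and node structure are lost, so it is not a substitute for extractContent when the extracted content must remain a document.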
Best Regards,