The problem:
The extracted paragraph content is incomplete.
微信图片_20210625110141.png (96.0 KB)
微信图片_20210625110150.png (2.8 KB)
Code:
public static void ExtractParagraph02() {
    // Open an existing PDF file
    Document doc = new Document(FilePdfName);
    // Instantiate ParagraphAbsorber and run it over the whole document
    ParagraphAbsorber absorber = new ParagraphAbsorber();
    absorber.visit(doc);
    for (PageMarkup markup : absorber.getPageMarkups()) {
        int i = 1; // section index within the current page
        for (MarkupSection section : markup.getSections()) {
            int j = 1; // paragraph index within the current section
            for (MarkupParagraph paragraph : section.getParagraphs()) {
                StringBuilder paragraphText = new StringBuilder();
                // Each line is a list of text fragments; join them in order
                for (java.util.List<TextFragment> line : paragraph.getLines()) {
                    for (TextFragment fragment : line) {
                        paragraphText.append(fragment.getText());
                    }
                    paragraphText.append("\r\n");
                }
                paragraphText.append("\r\n");
                System.out.println("Paragraph " + j + " of section " + i + " on page " + markup.getNumber() + ":");
                System.out.println(paragraphText.toString());
                j++;
            }
            i++;
        }
    }
}
Can Aspose.PDF for Java extract the content inside the box marked on the image separately?
QQ图片20210625110845.png (141.4 KB)