Step 1:
I created a PDF containing text fragments with superscript, underline, strikethrough, and subscript formatting.
Code for creating the PDF:
public void createNewDocument() {
    Document document = new Document();
    // Add page
    Page page = document.getPages().add();

    // One fragment per formatting flag
    TextFragment t1 = new TextFragment("superscript");
    t1.getTextState().setSuperscript(true);
    TextFragment t2 = new TextFragment("underline");
    t2.getTextState().setUnderline(true);
    TextFragment t3 = new TextFragment("strikethrough");
    t3.getTextState().setStrikeOut(true);
    TextFragment t4 = new TextFragment("Subscript");
    t4.getTextState().setSubscript(true);

    page.getParagraphs().add(t1);
    page.getParagraphs().add(t2);
    page.getParagraphs().add(t3);
    page.getParagraphs().add(t4);

    // Save updated PDF
    document.save("HelloWorld_out.pdf");
}
The resulting sample file is attached.
Step 2:
I traversed the PDF using ParagraphAbsorber and found that the getters for all four formatting properties do not work as expected:
boolean subscript= fragment.getTextState().isSubscript();
boolean superscript= fragment.getTextState().isSuperscript();
boolean getUnderline=fragment.getTextState().getUnderline();
boolean isUnderline=fragment.getTextState().isUnderline();
boolean strikethrough=fragment.getTextState().getStrikeOut();
Code used for traversing:
HelloWorld_out.pdf (2.0 KB)
public void paragraphExtract() {
    // Open document
    Document pdfDocument = new Document("HelloWorld_out.pdf");
    // Get the first page (page indexes are 1-based)
    Page pdfPage = pdfDocument.getPages().get_Item(1);
    ParagraphAbsorber paraAbsorber = new ParagraphAbsorber();
    paraAbsorber.visit(pdfPage);
    for (PageMarkup page : paraAbsorber.getPageMarkups()) {
        int i = 1;
        for (MarkupSection section : page.getSections()) {
            for (MarkupParagraph paragraph : section.getParagraphs()) {
                System.out.println("####### paragraph number " + i + " ######### .............. \n" + paragraph.getText());
                i++;
                for (TextFragment fragment : paragraph.getFragments()) {
                    //System.out.println("XXXXXXXXXXXXXXXXXXXXXXX :"+fragment.getEndNote().toString());
                    // Flags read at fragment level
                    boolean subscript = fragment.getTextState().isSubscript();
                    boolean superscript = fragment.getTextState().isSuperscript();
                    boolean getUnderline = fragment.getTextState().getUnderline();
                    boolean isUnderline = fragment.getTextState().isUnderline();
                    boolean strikethrough = fragment.getTextState().getStrikeOut();
                    for (TextSegment segment : fragment.getSegments()) {
                        // Same flags read at segment level
                        boolean subscript1 = segment.getTextState().isSubscript();
                        boolean superscript1 = segment.getTextState().isSuperscript();
                        boolean isUnderline1 = segment.getTextState().isUnderline();
                        boolean strike = segment.getTextState().getStrikeOut();
                    }
                }
            }
        }
    }
}
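The traversal code above reads the flags into local variables but never prints them, so the observed values are not visible in the output. A minimal sketch of a helper for logging them might look like the following. Note that FormattingReport and describe are hypothetical names introduced here for illustration; they are not part of Aspose.PDF, and the helper only formats booleans that have already been read from a TextState.

```java
// Hypothetical helper (not an Aspose.PDF class): formats the four
// formatting flags read from a TextState into one readable line.
final class FormattingReport {
    private FormattingReport() {
        // Utility class, not meant to be instantiated
    }

    // text: the fragment or segment text; the booleans are the values
    // returned by the corresponding TextState getters.
    static String describe(String text, boolean subscript, boolean superscript,
                           boolean underline, boolean strikeOut) {
        return String.format(
                "\"%s\": subscript=%b, superscript=%b, underline=%b, strikeOut=%b",
                text, subscript, superscript, underline, strikeOut);
    }
}
```

Inside the fragment loop this could be called as System.out.println(FormattingReport.describe(fragment.getText(), subscript, superscript, isUnderline, strikethrough)); so that each fragment's reported flags can be compared with the values set in Step 1.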
Could you try to reproduce this and fix it if you see the same behavior, or let me know if I am missing something?
Note: I am using the com.aspose:aspose-pdf:20.12 dependency with Java 17.