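import com.aspose.pdf.Document;
import com.aspose.pdf.TextFragmentAbsorber;

// Load the PDF and extract all text from page 1 (Aspose.PDF pages are 1-indexed)
Document document1 = new Document(String.valueOf(file));
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber();
document1.getPages().get_Item(1).accept(textFragmentAbsorber);
String text1 = textFragmentAbsorber.getText();
System.out.println(text1);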
This code returns the text element by element, but in this order: header, footer, para, cell, para, para, textbox.
On the PDF page the order is: header, para, textbox, cell, para, para, footer.
Output I got for this file (Document12.pdf, 107.1 KB) when running this code:
Header- Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore.
Footer- Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore.
Para1- Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore.
Cell1- Lorem ipsum dolor sit amet,
consectetur adipiscing elit, sed do
eiusmod tempor incididunt ut labore.
Cell2- Lorem ipsum dolor sit amet, consectetur
adipiscing elit, sed do eiusmod tempor incididunt
ut labore.
Para2- Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore.
Para3- Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore.
TextBox-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut
labore.
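
One possible workaround, as a minimal sketch rather than a confirmed fix: collect each piece of text together with its position on the page and sort by position instead of relying on the order in which the absorber returns it. The `ReadingOrder` class, the `Piece` type, and the coordinates below are hypothetical and only illustrate the sorting step. With Aspose.PDF the positions would presumably come from each `TextFragment` in `textFragmentAbsorber.getTextFragments()` via `getRectangle()`; that mapping is an assumption I have not tested against Document12.pdf.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Aspose.PDF: orders text pieces top-to-bottom.
class ReadingOrder {

    // Stand-in for one extracted fragment: its text plus the lower-left corner
    // of its bounding box in PDF user space (origin at the bottom-left of the page).
    static final class Piece {
        final String text;
        final double x;
        final double y;

        Piece(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    // Returns a new list sorted by descending y (top of the page first),
    // then by ascending x (left to right) for pieces at the same height.
    static List<Piece> sortTopToBottom(List<Piece> pieces) {
        List<Piece> sorted = new ArrayList<>(pieces);
        sorted.sort(Comparator
                .comparingDouble((Piece p) -> p.y)
                .reversed()
                .thenComparingDouble(p -> p.x));
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up coordinates, listed in the order the absorber returned the text.
        List<Piece> pieces = new ArrayList<>();
        pieces.add(new Piece("Header", 72, 780));
        pieces.add(new Piece("Footer", 72, 40));
        pieces.add(new Piece("Para1", 72, 700));
        pieces.add(new Piece("Cell1", 72, 560));
        pieces.add(new Piece("Cell2", 300, 560));
        pieces.add(new Piece("Para2", 72, 480));
        pieces.add(new Piece("Para3", 72, 420));
        pieces.add(new Piece("TextBox", 72, 630));

        for (Piece p : sortTopToBottom(pieces)) {
            System.out.println(p.text);
        }
    }
}
```

A plain top-to-bottom sort like this only suits single-column pages. Multi-column layouts, or table cells whose baselines differ slightly, would need the pieces grouped into rows or columns first.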