Hello,
I’m currently faced with a problem.
I open a document and list the text content of each paragraph in the main text of the document.
After printing the text of a paragraph (through the Paragraph.getText() method), I do the same for each individual run contained in that paragraph.
My problem is that, for some documents, the text printed for the paragraph and the text printed for the complete set of its runs are different: some of the text disappears when listing the runs.
This is a big problem, as I’m trying to transform the paragraph text into another text while strictly maintaining style formatting (for example, if you had "something" originally and the transformation adds the letter "s" to the end, then the final document should have "somethings" with the original formatting intact).
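To illustrate the kind of transformation I mean, here is a minimal sketch in plain Java. It does not use Aspose.Words at all: StyledRun, joinText and appendToLastRun are hypothetical names made up for this illustration, and the split of "something" into a plain run and a bold run is only an assumed example. The point is that the change is applied to a run, so the added text keeps that run's formatting, which only works if the runs together cover the whole paragraph text.

```java
import java.util.ArrayList;
import java.util.List;

public class RunTransformSketch {

    /** Hypothetical stand-in for a formatted run (text plus a bold flag). Not an Aspose.Words class. */
    static final class StyledRun {
        String text;
        final boolean bold;

        StyledRun(String text, boolean bold) {
            this.text = text;
            this.bold = bold;
        }
    }

    /** Concatenates the run texts, i.e. the paragraph text as seen through its runs. */
    static String joinText(List<StyledRun> runs) {
        StringBuilder sb = new StringBuilder();
        for (StyledRun r : runs) {
            sb.append(r.text);
        }
        return sb.toString();
    }

    /** Appends a suffix to the last run so the new characters inherit that run's formatting. */
    static void appendToLastRun(List<StyledRun> runs, String suffix) {
        if (runs.isEmpty()) {
            return; // nothing to attach the suffix to
        }
        StyledRun last = runs.get(runs.size() - 1);
        last.text = last.text + suffix;
    }

    public static void main(String[] args) {
        // Assumed example: "some" is plain and "thing" is bold.
        List<StyledRun> runs = new ArrayList<>();
        runs.add(new StyledRun("some", false));
        runs.add(new StyledRun("thing", true));

        appendToLastRun(runs, "s");

        System.out.println(joinText(runs));   // prints "somethings"
        System.out.println(runs.get(1).bold); // prints "true": the bold run now holds "things"
    }
}
```

If some text is missing from the runs, a transformation like this cannot reproduce the full paragraph, which is why the mismatch matters to me.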
The code I used for listing the contents of the file is the following:
import java.io.File;
import java.io.FileInputStream;
import java.util.Scanner;
import com.aspose.words.Document;
import com.aspose.words.Paragraph;
import com.aspose.words.Run;
import com.aspose.words.Section;

public static void main(String[] args) throws Exception
{
    // This method loads the Aspose license. It has been tested and validated.
    ProgramConfigurationSet.initAsposeLicense();
    String nextFile = null;
    Scanner s = new Scanner(System.in);
    System.out.print("File to read:");
    nextFile = s.next();
    // Load the document from the stream, then release the file handle.
    FileInputStream inputDocumentStream = new FileInputStream(new File(nextFile));
    Document doc = new Document(inputDocumentStream);
    inputDocumentStream.close();
    doc.joinRunsWithSameFormatting();
    for (Section sec: doc.getSections())
    {
        System.out.println("Retrieved a section");
        for (Paragraph p: sec.getBody().getParagraphs())
        {
            // Print the paragraph text, then the text of each of its runs for comparison.
            System.out.println("Retrieved a paragraph. Text: |" + p.getText() + "| Number of runs in paragraph:" + p.getRuns().getCount());
            for (Run r: p.getRuns())
            {
                System.out.println("Retrieved a run. Text: |" + r.getText() + "|");
            }
        }
    }
}
I’m also attaching a sample file for which I get the described behaviour (sample.doc), as well as a file with the output produced by running the code above (sample.log.txt). If you check this last file, you can easily spot the symptoms by comparing the text retrieved for each paragraph against the text retrieved for its runs.
My initial suspicion fell on the doc.joinRunsWithSameFormatting(); instruction, so I’ve already tried running this code without it, but I got the same result.
I’ve also tried different OS and JVM versions, and the result is always the same.
I’ve searched the forum for any similar threads but came up empty.
Can anyone help me?
Cheers,
António Russo