We’ve run into a problem where we can no longer extract all of the text from a slide. This was working in aspose-slides 16.9.0 but doesn’t seem to work in 18.2 or 18.3.
Do we need to use a different API to extract the text now?
Or is this a regression?
To reproduce, use the .java file and the .ppt file from the zip below.
Observed: With aspose-slides 16.9.0, we see the following output:
Example
Powerpoint
text1
text2
text3
text4
Last slide
goal3
goal2
goal1
But if we change pom.xml to use aspose-slides 18.2 or 18.3, we only see the following output, which is missing some of the text fields:
Example
Powerpoint
Last slide
For reference, here’s the java source that we’re using:
package com.aspose;

import com.aspose.slides.*;

public class DumpAllText {
    public static void main(String[] args) {
        com.aspose.slides.License slidesLicence = new com.aspose.slides.License();
        slidesLicence.setLicense(AsposeUtils.getLicenceData());
        //ExStart:EndParaGraph
        // Instantiate a Presentation class that represents the .ppt file
        Presentation presentation = new Presentation("test.ppt");
        ISlideCollection slides = presentation.getSlides();
        // Print every text portion of every text box on every slide
        for (ISlide slide : slides) {
            for (ITextFrame textFrame : SlideUtil.getAllTextBoxes(slide)) {
                for (IParagraph paragraph : textFrame.getParagraphs()) {
                    for (IPortion portion : paragraph.getPortions()) {
                        System.out.println(portion.getText());
                    }
                }
            }
        }
    }
    //ExEnd:EndParaGraph
}
If we use SlideUtil.getAllTextFrames instead, we get similar results: most of the text boxes in this .ppt are skipped in aspose-slides 18, but they are picked up in 16.9.0.
I have looked at your presentation. To extract the text, you need to traverse all slides and their respective shapes. Could you please try the following sample code on your end?
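The sample code referred to in this reply is not reproduced in the thread. As a rough sketch of what "traverse all slides and their respective shapes" could look like, the example below walks each slide's shape collection, descends into group shapes, and reads the text frame of each auto shape. The class name ShapeTextDump and the helper collectText are hypothetical names chosen for illustration. The sketch also assumes the missing text lives in auto shapes or groups; tables, SmartArt and other shape types are not handled here, and it has not been verified against the attached test.ppt.

```java
import com.aspose.slides.*;
import java.util.ArrayList;
import java.util.List;

public class ShapeTextDump {

    // Hypothetical helper (not part of Aspose.Slides): collects the text of
    // every portion found in auto shapes, recursing into group shapes.
    static void collectText(IShapeCollection shapes, List<String> out) {
        for (IShape shape : shapes) {
            if (shape instanceof IGroupShape) {
                // A group has its own shape collection, so walk it recursively.
                collectText(((IGroupShape) shape).getShapes(), out);
            } else if (shape instanceof IAutoShape) {
                ITextFrame textFrame = ((IAutoShape) shape).getTextFrame();
                if (textFrame == null) {
                    continue; // auto shape without any text
                }
                for (IParagraph paragraph : textFrame.getParagraphs()) {
                    for (IPortion portion : paragraph.getPortions()) {
                        out.add(portion.getText());
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        // "test.ppt" is the file name used in the original report.
        Presentation presentation = new Presentation("test.ppt");
        try {
            List<String> texts = new ArrayList<>();
            for (ISlide slide : presentation.getSlides()) {
                collectText(slide.getShapes(), texts);
            }
            for (String text : texts) {
                System.out.println(text);
            }
        } finally {
            presentation.dispose();
        }
    }
}
```

Unlike SlideUtil.getAllTextBoxes, this approach makes the traversal explicit, so any shape type that is skipped is skipped by a visible branch in collectText rather than inside the library.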
I have read your comments. I regret to inform you that SlideUtil.getAllTextBoxes(slide) has some restrictions. We have created an internal investigation ticket for this. Please use the shared sample code as a workaround.