Hi,
We are extracting text from a PPT file using Aspose.Slides, but we ran into some problems.
Here is my code:
// extract text from slides
ISlideCollection slides = presentation.getSlides();
StringBuilder textBuilder = new StringBuilder();
for (ISlide slide : slides) {
    ITextFrame[] textFrames = SlideUtil.getAllTextBoxes(slide);
    if (textFrames != null && textFrames.length > 0) {
        for (int index = 0; index < textFrames.length; index++) {
            for (IParagraph paragraph : textFrames[index].getParagraphs()) {
                for (IPortion portion : paragraph.getPortions()) {
                    textBuilder.append(portion.getText()).append("\n");
                }
            }
        }
    }
}
// extract text from master slide and layout slide
IMasterSlideCollection masters = presentation.getMasters();
if (masters != null && masters.size() > 0) {
    for (IMasterSlide masterSlide : masters) {
        getContentTextFromSlide(masterSlide, textBuilder);
        IMasterLayoutSlideCollection masterLayoutSlides = masterSlide.getLayoutSlides();
        if (masterLayoutSlides != null && masterLayoutSlides.size() > 0) {
            for (ILayoutSlide masterLayoutSlide : masterLayoutSlides) {
                getContentTextFromSlide(masterLayoutSlide, textBuilder);
            }
        }
    }
}

private void getContentTextFromSlide(IBaseSlide slide, StringBuilder textBuilder) {
    ITextFrame[] textFrames = SlideUtil.getAllTextBoxes(slide);
    if (textFrames != null && textFrames.length > 0) {
        for (int index = 0; index < textFrames.length; index++) {
            for (IParagraph paragraph : textFrames[index].getParagraphs()) {
                for (IPortion portion : paragraph.getPortions()) {
                    textBuilder.append(portion.getText()).append("\n");
                }
            }
        }
    }
}
1. There is no “0” in my PPT slides, but a “0” appears in the extracted slide text.
2. When I extract text from the master slides and layout slides, “‹#›” is extracted as “*”.
This is my PPT file:
testfile.zip (661.8 KB)