Cannot reliably extract slide titles

Hello,

I received a bug report today for a project I built last year using Aspose.Slides for Java. The issue is that my code cannot always extract the slide title from a slide. Here’s my code that extracts the titles from the slides and appends them to an XML response (this is in a Java servlet):

for (int i = 0; i < slideCount; i++) {
    Slide slide = p.getSlideByPosition(i + 1);
    if (slide != null) {
        Shapes slideShapes = slide.getShapes();
        Placeholders slidePHs = slide.getPlaceholders();
        LOG.info("extracting slide text");
        boolean foundTitle = false;

        // First attempt: take the title from the first placeholder.
        for (int j = 0; j < slidePHs.size(); j++) {
            String text = null;
            Placeholder ph = slidePHs.get(j);
            if (ph != null && ph instanceof TextHolder) {
                TextHolder th = (TextHolder) ph;
                text = th.getText().trim();
            }

            if (j == 0 && text != null) {
                Element titleElm = new Element("title");
                // 0xb (vertical tab) marks a line break inside the text
                titleElm.setText(text.replace((char) 0xb, '\n'));
                slideElm.addContent(titleElm);
                foundTitle = true;
            }
        }

        // Fallback: take the title from the first shape's text frame.
        for (int j = 0; j < slideShapes.size(); j++) {
            String text = null;
            Shape s = slideShapes.get(j);

            TextFrame tf = s.getTextFrame();
            if (tf != null) {
                text = tf.getText().trim();
                if (j == 0 && text != null && !foundTitle) {
                    Element titleElm = new Element("title");
                    titleElm.setText(text.replace((char) 0xb, '\n'));
                    slideElm.addContent(titleElm);
                    foundTitle = true;
                }
            }
        }
    }
}

I have tried to include only the relevant code sections. Basically, I find that sometimes the title is in the slide’s first Placeholder, and sometimes it is in the slide’s first Shape. Other times, the first placeholder is not a TextHolder, and if I fall back on taking the first Shape, sometimes the first shape is not the title (one time it was actually something from the footer of the slide). Can you recommend an approach to getting the slide titles reliably? I sent the PPT file in case you think it’s a file format anomaly. Thanks,

Greg

Dear Greg,

The title is the text inside the title placeholder, which is the placeholder at index 0.

I

When a user wants to add a title, they add a slide with some placeholders and enter text in the title placeholder. In that case, Aspose.Slides can be sure that the text is a title.

II

However, a user can also use a textbox instead of the title placeholder and enter any text. Aspose.Slides then has no way of knowing that this text is meant as a title, because it does not know the semantics of the text. The most it can do is give you access to that text.

In short, when the text is inside a textbox (a text frame in Aspose.Slides terms), Aspose.Slides can’t tell whether the text is a title or something else, because that depends on the meaning of the text.

You can, however, guess by checking the formatting: title text normally has a larger font size than the other text on the slide and is often bold.
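As a rough illustration of the font-size guess, here is a minimal sketch. It does not use the Aspose.Slides API: TextCandidate, guessTitle, and score are hypothetical names, and the sketch assumes you have already copied each text frame's text, font size, and bold flag into plain objects. It simply picks the non-empty text with the largest font, using bold as a small tie-breaker.

```java
import java.util.ArrayList;
import java.util.List;

public class TitleGuesser {

    /** Hypothetical holder for one text box's data; NOT an Aspose.Slides type. */
    static final class TextCandidate {
        final String text;
        final double fontSize; // in points
        final boolean bold;

        TextCandidate(String text, double fontSize, boolean bold) {
            this.text = text;
            this.fontSize = fontSize;
            this.bold = bold;
        }
    }

    /** Larger font scores higher; bold adds a small bonus to break ties. */
    static double score(TextCandidate c) {
        return c.fontSize + (c.bold ? 0.5 : 0.0);
    }

    /** Returns the trimmed text of the most title-like candidate, or null if none has text. */
    static String guessTitle(List<TextCandidate> candidates) {
        TextCandidate best = null;
        for (TextCandidate c : candidates) {
            if (c.text == null || c.text.trim().isEmpty()) {
                continue; // skip empty text boxes
            }
            if (best == null || score(c) > score(best)) {
                best = c; // on equal scores the earlier candidate is kept
            }
        }
        return best == null ? null : best.text.trim();
    }

    public static void main(String[] args) {
        List<TextCandidate> boxes = new ArrayList<>();
        boxes.add(new TextCandidate("Confidential", 10, false));
        boxes.add(new TextCandidate("Quarterly Results", 36, true));
        boxes.add(new TextCandidate("Revenue grew in all regions", 20, false));
        System.out.println(guessTitle(boxes)); // prints "Quarterly Results"
    }
}
```

This is only a heuristic: a slide with a large decorative caption would fool it, so it is probably best used as a fallback when the title placeholder yields no text.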

III

One more thing: sometimes a user enters text in the title placeholder in MS PowerPoint 2007 and saves the file in PPT format, but PowerPoint converts the placeholders into textboxes (text frames) while saving, which causes the same problem as in II.

For case III, also see this post:

https://forum.aspose.com/t/99341