Extract Text from PowerPoint and Find Coordinates, Width, and Height of Each Word

I have a PPT document. I need to read its text and, for each word, find:

a. X, Y coordinates
b. width & height
c. slide number
d. each slide's width & height

Earlier I worked with Aspose.Words and was able to achieve this using LayoutEnumerator.
It looks like there isn't an equivalent of LayoutEnumerator in Aspose.Slides.

Please advise. I look forward to purchasing a license once I have demonstrated a working proof of concept of this requirement to my team.

@muhammad.sabir2: can you please help me here?

@sashi1211,
Thank you for contacting support.

To make sure that I understand the issue correctly, could you please share a sample presentation and screenshots showing the expected result? We will do our best to help you.

The attached sample pptx contains 2 slides with some text. What I am looking for is to split the entire text into space-delimited words and get the X, Y coordinates, height & width of each word, the number of slides, and each slide's width & height.

For example, for the word “Expleo”: X = 72.000, Y = 252.4534, height = 283.2133, width = 256.23123, slide = 1, slideHeight = 787.0000, slideWidth = 658.1212

Oops, uploading pptx files is not allowed on this site, so I zipped the pptx file and uploaded it instead: test.zip (28.1 KB)

Please help. Thank you.

@sashi1211,
Thank you for the details. I am working on the issue and will get back to you as soon as possible.

Please let me know, this feature is critical to my team… thank you

@sashi1211,
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): SLIDESNET-44028

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.

Please note that you can use the ISlideSize interface to get the slide size from a presentation like this:

var slideSize = presentation.SlideSize.Size;
var slideWidth = slideSize.Width;
var slideHeight = slideSize.Height;

If you have a reference to a shape, you can get the number of a slide containing the shape like this:

// Here, shape is a text box, for example
var slideNumber = ((ISlide)shape.Slide).SlideNumber;

Thank you. I am eagerly looking forward to a solution to the problem. If this issue is resolved, we will purchase an Aspose.Slides license after demonstrating a working solution.

@sashi1211,
I hope our developers will implement such a feature and the issue will be resolved on your end as soon as possible. Thank you for your patience.

Just to reiterate my requirement: I have a PowerPoint document with many slides that contain text. I need to extract each word from the text (space-delimited) along with its top (Y) and left (X) coordinates, the height & width of each word, the height & width of each slide, etc.

@sashi1211,

Thanks for specifying your requirements/task again. We have noted it down and logged it against your existing ticket “SLIDESNET-44028” into our database.

Once we have an update on it, we will let you know.

Gentlemen, can you please let me know when I can expect a solution? The sooner I get a resolution, the sooner we will purchase a license.

@sashi1211,
The issue is still open. I requested plans for the issue from our developers. We will let you know as soon as possible.

Gentlemen, my temporary license is about to expire in 3 days. Can you please try to expedite the solution?
The evaluation version doesn't allow me to process the pptx: it truncates all the text except the first 3 characters of the entire text.

Please advise.

@sashi1211,
We are still waiting for an answer from our development team. Unfortunately, I have no additional information yet.

@sashi1211,
It is possible to get the rectangle of each Portion object using the Portion.GetRect method like this:

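using System;
using System.Drawing;
using Aspose.Slides;

using (Presentation pres = new Presentation("presentation.pptx"))
{
    foreach (ISlide slide in pres.Slides)
    {
        foreach (IShape shape in slide.Shapes)
        {
            // Only auto shapes with a text frame are processed; skip everything else.
            var autoShape = shape as IAutoShape;

            if (autoShape == null || autoShape.TextFrame == null)
                continue;

            foreach (IParagraph paragraph in autoShape.TextFrame.Paragraphs)
            {
                foreach (IPortion portion in paragraph.Portions)
                {
                    // Bounding rectangle (X, Y, Width, Height) of this portion
                    RectangleF rect = portion.GetRect();
                    Console.WriteLine($"\"{portion.Text}\", {rect}");
                }
            }
        }
    }
}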

Also see: Portion

Please let us know if this approach does not work for you.
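
Note that GetRect returns one rectangle per portion, and a portion can contain several space-delimited words. If per-word rectangles are needed, one rough approximation is to divide the portion rectangle among its words in proportion to character count. The sketch below is only an illustration under strong assumptions: the portion sits on a single line and every character (including spaces) has the same width. WordRectEstimator and Split are hypothetical names, not part of Aspose.Slides, and the results are estimates rather than true layout measurements.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

// Hypothetical helper (not an Aspose.Slides API): estimates per-word rectangles
// from a portion's text and its bounding rectangle.
public static class WordRectEstimator
{
    public static List<(string Word, RectangleF Rect)> Split(string text, RectangleF portionRect)
    {
        var result = new List<(string Word, RectangleF Rect)>();
        if (string.IsNullOrEmpty(text))
            return result;

        // Assumption: all characters share the same width on a single line.
        float charWidth = portionRect.Width / text.Length;

        int i = 0;
        while (i < text.Length)
        {
            // Skip the space delimiters between words.
            if (text[i] == ' ')
            {
                i++;
                continue;
            }

            // Consume one word: everything up to the next space or the end.
            int start = i;
            while (i < text.Length && text[i] != ' ')
                i++;

            var rect = new RectangleF(
                portionRect.X + start * charWidth, // estimated left edge of the word
                portionRect.Y,                     // same top as the portion
                (i - start) * charWidth,           // estimated word width
                portionRect.Height);               // same height as the portion

            result.Add((text.Substring(start, i - start), rect));
        }

        return result;
    }
}
```

Inside the inner loop above, it could be called as WordRectEstimator.Split(portion.Text, portion.GetRect()). Portions that wrap across lines or use proportional fonts will not be estimated accurately this way.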