Removing Pages from a Word Document

Hi, can you please let me know how we can remove pages from a document in Aspose? Suppose we have a Word document with 10 pages and we only need the first 4 pages converted to HTML. How can this be done?

Hello
Thanks for your request. A Word document is a flow document and does not contain any information about its layout into lines and pages. Therefore, technically, there is no “Page” concept in a Word document.
Aspose.Words uses its own rendering engine to lay out documents into pages. Unfortunately, there is no public API that allows you to determine where a page starts or ends. Your request has been linked to the appropriate issue. You will be notified as soon as it is supported.
As a workaround, you can try the code provided by Adam in this thread:
https://forum.aspose.com/t/58199
Best regards,

Hi Andrey,
Thanks for your reply. Can you please provide us with a workaround along these lines: “If a section has 10 pages, treat the 10 pages as 10 page breaks. Once the count reaches the 10th page break, delete the remaining content in that section and in the remaining sections of the Word document.” How can this be done using Aspose?
Thanks,
Tanveer

Hi Andrey,
I have checked the link that you mentioned in your last post, but when I try to use it I’m unable to find the class PageNumberFinder. I’m using Aspose 7. Is this class available in any other version of Aspose?
Thanks,

Hello
Thanks for your request. You can find this class as an attachment to Adam’s post:
https://forum.aspose.com/t/58199
Best regards,

Hi Andrey,
Thanks a lot for the help you have provided so far. We have been trying to use the PageFinder class you sent, but it gives the attached error (Cannot implicitly convert type ‘Aspose.Words.Fields.FieldStart’ to ‘Aspose.Words.Fields.Field’). I believe this is because we are using Aspose 7. Could you please share the Page Finder class for Aspose 7, or the fix for the attached issue?
Your help in this regard is greatly appreciated.
Thanks,

Hi
Thanks for your request. In older versions of Aspose.Words, the DocumentBuilder.InsertField method returns a FieldStart instead of a Field object, so you should change the code accordingly.
I would also advise you to use the latest version of Aspose.Words. You can download it from here:
https://releases.aspose.com/words/net
Best regards,

Hi there,
In addition, if you are looking to extract the content between explicit page breaks, you may be able to do this using the code found on this page instead:
https://blog.aspose.com/words/extract-word-document-content-separated-by-page-breaks-using-csharp/
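For the narrower case asked about earlier (keep everything before the Nth page break and delete the rest), a minimal sketch could look like the one below. It is not the code from the linked page. PageBreakTrimmer and TruncateAtPageBreak are hypothetical names, and the sketch assumes that the breaks are explicit page breaks stored in the run text as ControlChar.PageBreak. Breaks produced by layout, by "page break before" paragraph formatting, or by section breaks are not counted.

```csharp
using System;
using Aspose.Words;

public static class PageBreakTrimmer
{
    // Hypothetical helper: keeps the content that precedes the Nth explicit
    // page break in the main text and deletes everything after it.
    // Returns true if the Nth break was found.
    public static bool TruncateAtPageBreak(Document doc, int breakNumber)
    {
        int seen = 0;
        // Iterate over a snapshot, because nodes are removed during the walk.
        foreach (Run run in doc.GetChildNodes(NodeType.Run, true).ToArray())
        {
            // Only look at runs in the main text, not in headers or footers.
            if (run.GetAncestor(NodeType.Body) == null)
                continue;

            int pos = run.Text.IndexOf(ControlChar.PageBreak, StringComparison.Ordinal);
            while (pos >= 0)
            {
                seen++;
                if (seen == breakNumber)
                {
                    // Drop the break itself and any text that follows it in this run.
                    run.Text = run.Text.Substring(0, pos);
                    RemoveEverythingAfter(run);
                    return true;
                }
                pos = run.Text.IndexOf(ControlChar.PageBreak, pos + 1, StringComparison.Ordinal);
            }
        }
        return false;
    }

    // Climbs from the node up to its section and removes all later siblings
    // at each level, which also removes all following sections.
    private static void RemoveEverythingAfter(Node node)
    {
        for (Node current = node;
             current != null && current.NodeType != NodeType.Document;
             current = current.ParentNode)
        {
            // Do not touch the siblings of the Body, so headers and footers survive.
            if (current.NodeType == NodeType.Body)
                continue;

            while (current.NextSibling != null)
                current.NextSibling.Remove();
        }
    }
}
```

After calling TruncateAtPageBreak(doc, 4), the document could then be saved with doc.Save("out.html", SaveFormat.Html). Note that the result only matches "the first 4 pages" if every page in the source ends with an explicit page break.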
Thanks,

Hi guys,
Thanks a lot for the help provided and for responding so quickly. I have used the PageFinder class and also downloaded Aspose 10, and it works fine now.
However, I have one question about licensing. We have been using Aspose 7, and now that we have downloaded Aspose 10, will the licence for Aspose 7 work for Aspose 10? Or do we have to purchase a separate licence for Aspose 10?
Thanks,

Hello
Thanks for your inquiry. Every Aspose license carries a one-year subscription for free upgrades to new versions released during that time. So, you can check the expiration date of your license and upgrade to the newest version it covers. To check the expiration date of your license, open the license file in Notepad. You will see the following tag:
<SubscriptionExpiry>20110218</SubscriptionExpiry>

This means that you can upgrade for free to any version of Aspose.Words published before 02/18/2011.
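If you prefer to check this in code rather than by eye, here is a minimal sketch using only the .NET base class library. LicenseCheck and its methods are hypothetical names, and the sketch assumes the date is stored as yyyyMMdd in an element named SubscriptionExpiry. The thread says "published before", so the comparison is strict; whether a release on the expiry day itself is covered is not stated here.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

public static class LicenseCheck
{
    // Reads the yyyyMMdd expiry date from the license XML text.
    // Assumption: the date lives in an element named SubscriptionExpiry.
    public static DateTime ReadSubscriptionExpiry(string licenseXml)
    {
        XDocument xml = XDocument.Parse(licenseXml);
        XElement element = xml.Descendants("SubscriptionExpiry").FirstOrDefault();
        if (element == null)
            throw new FormatException("No SubscriptionExpiry element found.");

        return DateTime.ParseExact(element.Value.Trim(), "yyyyMMdd",
                                   CultureInfo.InvariantCulture);
    }

    // True if a version published on releaseDate came out before the expiry date.
    public static bool CoversRelease(DateTime expiry, DateTime releaseDate)
    {
        return releaseDate.Date < expiry.Date;
    }
}
```

For example, pass the text of the license file (File.ReadAllText on its path) to ReadSubscriptionExpiry, then compare the result with the publication date of the version you want to install.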
Best regards,

Hi Andrey,
First of all, thanks for the help provided by you and Adam. We have another question for you: how can we ignore Word objects (text boxes, WordArt, shapes, etc.) when converting a Word document to HTML? We only need the images and text (paragraphs, tables, etc.) to come across in the output.
Thanks,

Hello
Thanks for your request. To achieve this, you should loop through all shapes in your document and remove the ones you do not need. For example, you can try code like the following:

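// Requires: using Aspose.Words; using Aspose.Words.Drawing;
Document doc = new Document("C:\\Temp\\in.doc");
// GetChildNodes returns a live collection, so iterate over a snapshot while removing nodes.
foreach (Shape shape in doc.GetChildNodes(NodeType.Shape, true).ToArray())
{
    // Remove WordArt shapes only; all other shapes are left in place.
    if (shape.IsWordArt)
        shape.Remove();
}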

You can also use a switch statement inside the loop, like this:

switch (shape.ShapeType)
{
    case ShapeType.TextBox:
        shape.Remove();
        break;
    default:
        // do nothing.
        break;
}

You can find the ShapeType enumeration here:
https://reference.aspose.com/words/net/aspose.words.drawing/shapetype/
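Since you want to keep only images and text, another option is to remove every shape that does not carry an image instead of listing the shape types one by one. The sketch below assumes that the Shape.HasImage property is available in your version of Aspose.Words. ShapeFilter and RemoveNonImageShapes are hypothetical names.

```csharp
using Aspose.Words;
using Aspose.Words.Drawing;

public static class ShapeFilter
{
    // Hypothetical helper: removes every shape that holds no image
    // (text boxes, WordArt, drawing shapes) and returns how many were removed.
    public static int RemoveNonImageShapes(Document doc)
    {
        int removed = 0;
        // Iterate over a snapshot, because the live collection changes on removal.
        foreach (Shape shape in doc.GetChildNodes(NodeType.Shape, true).ToArray())
        {
            if (!shape.HasImage)
            {
                shape.Remove();
                removed++;
            }
        }
        return removed;
    }
}
```

You could call ShapeFilter.RemoveNonImageShapes(doc) right before doc.Save("C:\\Temp\\out.html", SaveFormat.Html). Be aware that removing a text box also removes anything inside it, including pictures placed in the text box.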
Best regards,

The issues you have found earlier (filed as WORDSNET-2978) have been fixed in this .NET update and this Java update.
