Aspose.Words PageCount and Document Corruption

Hello!

I am prototyping some work with Aspose.Words that involves passing real CVs through Aspose to find out how many pages each document has and to create an image of each page. While trying to work out the average number of pages in a CV, we came across an interesting but potentially show-stopping bug.

Some documents are reported as having far more pages than would ever be expected for a CV, occasionally up to 250! When the original copy of such a document is opened, it has nowhere near the reported number of pages. When these documents are saved through Aspose they come out corrupted: a number of blank pages are added, text is compressed to one character per line, and there are various other strange results.

From a very quick investigation it would appear (at least in the case of a couple of documents) that the issue relates to section and page breaks; document 35277 is a prime example of this problem. Removing the section breaks from this document and re-saving it results in the correct number of pages being reported and no corruption. Also, if you save this document through Aspose in RTF format it appears correctly in Word, but it still has the wrong PageCount when re-opened in Aspose.

I have attached a number of anonymized CVs which exhibit the issues I am experiencing. Each folder has a text file which contains the number of pages reported by PageCount, an “x_aspose.doc” file which is the document saved through Aspose, and an “x_original.doc” file which is the original document.

We are using the very latest fully licensed version of Aspose.Words, 7.0.0.0.

I hope I have provided enough information to find the root of the problem. Please don’t hesitate to contact me if any further information, examples, etc. are required.

Many Thanks,

Martyn

Hi Martyn,

Thank you for reporting this problem to us. I managed to reproduce the problem on my side. Your request has been linked to the appropriate issue. You will be notified as soon as it is resolved.
The problem occurs because, for some reason, Aspose.Words cannot read the width of text columns from these documents, so the width of the text columns is set to 0. As a workaround, you can reset the width of the text columns in these documents. For example, you can try using code like the following:

Document doc = new Document(@"Test001\in.doc");
// Loop through all sections.
foreach (Section section in doc.Sections)
{
    // Skip sections whose first text column already has a non-zero width.
    if (section.PageSetup.TextColumns[0].Width != 0)
        continue;
    int textColumnsCount = section.PageSetup.TextColumns.Count;
    // Calculate the usable text width: page width minus margins and gutter.
    double pageWidth = section.PageSetup.PageWidth - section.PageSetup.LeftMargin -
        section.PageSetup.RightMargin - section.PageSetup.Gutter;
    // Divide the usable width evenly among the text columns.
    double width = pageWidth / textColumnsCount;
    // Set the width of each text column.
    for (int i = 0; i < textColumnsCount; i++)
        section.PageSetup.TextColumns[i].Width = width;
}
// Save the output document.
doc.Save(@"Test001\out.doc");
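
To make the arithmetic in the workaround concrete, here is a minimal standalone sketch of the same width calculation with no Aspose.Words dependency. The class and method names (ColumnWidthSketch, EvenColumnWidth) and the sample page dimensions are hypothetical and chosen only for illustration; all values are assumed to be in points. Like the workaround above, it does not account for any spacing between columns.

```csharp
using System;

public static class ColumnWidthSketch
{
    // Hypothetical helper: splits the usable text width evenly among columns.
    // Usable width = page width minus left margin, right margin and gutter.
    public static double EvenColumnWidth(
        double pageWidth, double leftMargin, double rightMargin,
        double gutter, int columnCount)
    {
        if (columnCount <= 0)
            throw new ArgumentOutOfRangeException(
                nameof(columnCount), "At least one text column is required.");

        double usableWidth = pageWidth - leftMargin - rightMargin - gutter;
        return usableWidth / columnCount;
    }

    public static void Main()
    {
        // Assumed sample values: a 612pt-wide page, 72pt margins, no gutter, 2 columns.
        Console.WriteLine(EvenColumnWidth(612, 72, 72, 0, 2)); // prints 234
    }
}
```

With a single column the same call would return the full usable width (468 points for these sample values), which is the case most CVs are likely to fall into.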

Hope this helps.
Best regards.

Excellent, thanks for the quick response. I will look into applying the temporary fix listed above and will keep an eye out for the bug fix.

Thanks again,

Martyn

The issues you have found earlier (filed as WORDSNET-2885) have been fixed in this .NET update and in this Java update.

This message was posted using Notification2Forum from Downloads module by aspose.notifier.