Document.PageCount returns Incorrect Value | Extract Pages from DOCX using .NET

I have a DOCX file (please see the attachment), and I use the code below to get the page count:

var doc = new Aspose.Words.Document(fileFullName);

int pageCount = doc.PageCount;

for (int page = 0; page < pageCount; page++)
{
    // Save each page as a separate document.
    var extractedPage = doc.ExtractPages(page, 1);
    var toFileName = Path.Combine(@"O:\", $"SplitDocument.PageByPage_{ page + 1}.docx");
    extractedPage.Save(toFileName);
}

The PageCount property returns 2, which is incorrect, and the document was split into 2 DOCX files. Please also see the attachments.

I don't know what is wrong with my code. Can you please help me find out why the PageCount property is incorrect?

Thank you very much.

SplitDocument.PageByPage_1.docx (16.9 KB)
SplitDocument.PageByPage_2.docx (16.4 KB)

common_FOOTER-Final.docx (22.9 KB)

@vs6060_qq_com

Please note that Aspose.Words requires TrueType fonts when rendering a document to fixed-page formats (JPEG, PNG, PDF or XPS) and when calling Document.PageCount or the Document.ExtractPages method. You need to install the fonts used in your document on the machine where you extract the pages. Please refer to the following articles:

Using TrueType Fonts
Manipulating and Substitution TrueType Fonts
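
When the fonts cannot be installed system-wide, one option is to point Aspose.Words at a folder that contains them. The following is a minimal sketch, not a confirmed fix for this thread: the folder C:\MyFonts is a hypothetical placeholder, and the document name is taken from the attachment above.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Fonts;

// Hypothetical folder holding the font files the document needs (assumption).
const string extraFontsFolder = @"C:\MyFonts";

// Keep the system fonts and add the extra folder, scanned recursively.
FontSettings.DefaultInstance.SetFontsSources(new FontSourceBase[]
{
    new SystemFontSource(),
    new FolderFontSource(extraFontsFolder, true)
});

// The page count is computed from the layout, which depends on the fonts found.
var doc = new Document("common_FOOTER-Final.docx");
Console.WriteLine(doc.PageCount);
```

Setting the sources before loading the document keeps the layout and the later ExtractPages calls on the same set of fonts.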

Hi Manzoor,
Thank you for the reply. Following your suggestion, I did the steps below.
0) Prepared my test document: common_FOOTER-Final.docx (20.3 KB)

  1. Got all the font names provided by FontSettings.DefaultInstance with the code below:

         var fontSettings = FontSettings.DefaultInstance;
         var defaultFontNames = new List<string>();
         fontSettings.GetFontsSources().ToList().ForEach(t =>
         {
             // Inner lambda parameter renamed so it does not shadow the outer "t".
             var thisFontNames = t.GetAvailableFonts().Select(f => f.FullFontName).ToList();
             defaultFontNames.AddRange(thisFontNames);
         });
    

    I got a list of names; please see the attachment (defaultnames.docx) (19.6 KB)

  2. Got the fonts that the document uses but that are missing from that list:

         var fileFullName = @"common_FOOTER-Final.docx";
         var doc = new Document(fileFullName);
         var documentFontNames = doc.FontInfos.Select(t => new
         {
             Name = t.Name,
             AltName = t.AltName
         }).ToList();
    
         var fontSettings = FontSettings.DefaultInstance;
    
         var defaultFontNames = new List<string>();
         fontSettings.GetFontsSources().ToList().ForEach(t =>
         {
             var thisFontNames = t.GetAvailableFonts().Select(f => f.FullFontName).ToList();
             defaultFontNames.AddRange(thisFontNames);
         });
         var notFoundNames = documentFontNames.Where(t => !defaultFontNames.Contains(t.Name)).ToList();
         notFoundNames = notFoundNames.Where(t => !string.IsNullOrEmpty(t.AltName) && !defaultFontNames.Contains(t.AltName)).ToList();
    

The value of "notFoundNames" was:
Name = "等线", AltName = "DengXian"
Name = "隶书", AltName = "微软雅黑"

  3. I checked the default font name list and found similar fonts:

DengXian Regular
DengXian Bold
DengXian Light
Microsoft YaHei

  4. I thought the next step should be to use the font substitution feature, with the code below:

         var fileFullName = @"common_FOOTER-Final.docx";
         var fontSettings = FontSettings.DefaultInstance;
         fontSettings.SubstitutionSettings.FontConfigSubstitution.Enabled = true;
         var xml = @"
             <TableSubstitutionSettings xmlns=""Aspose.Words""> 
                 <SubstitutesTable> 
                     <Item OriginalFont=""等线"" SubstituteFonts=""DengXian Regular"" /> 
                     <Item OriginalFont=""隶书"" SubstituteFonts=""Microsoft YaHei"" /> 
                     <Item OriginalFont=""微软雅黑"" SubstituteFonts=""Microsoft YaHei"" /> 
                 </SubstitutesTable> 
             </TableSubstitutionSettings>";
         var bytes = Encoding.UTF8.GetBytes(xml);
         using var stream = new MemoryStream(bytes);
         fontSettings.SubstitutionSettings.TableSubstitution.Load(stream);
         var loadOptions = new LoadOptions();
         loadOptions.FontSettings = fontSettings;
         var doc = new Aspose.Words.Document(fileFullName, loadOptions);
         doc.UpdatePageLayout();
         doc.UpdateTableLayout();
         int pageCount = doc.PageCount;
         for (int page = 0; page < pageCount; page++)
         {
             // Save each page as a separate document.
             var extractedPage = doc.ExtractPages(page, 1);
             var toFileName = Path.Combine(@"O:\", $"SplitDocument.PageByPage_{ page + 1}.docx");
             extractedPage.Save(toFileName);
         }
    

But I still got 2 pages.

Could you please help me find out where the problems are?

Thank you very much.
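
One possible reason the check in step 2 reports "DengXian" as missing is that it compares document font names against FullFontName, which includes the style ("DengXian Regular"), while documents usually store the family name. The sketch below compares against PhysicalFontInfo.FontFamilyName instead and treats a font as missing only when neither its name nor its alternative name is available. Whether the localized name "等线" is exposed as a family name is an assumption that has not been verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Aspose.Words;
using Aspose.Words.Fonts;

var doc = new Document("common_FOOTER-Final.docx");

// Collect family names (not full names) from every configured font source.
var availableFamilies = new HashSet<string>(
    FontSettings.DefaultInstance.GetFontsSources()
        .SelectMany(source => source.GetAvailableFonts())
        .Select(font => font.FontFamilyName),
    StringComparer.OrdinalIgnoreCase);

// Missing = the name is not available, and the alternative name is empty or not available either.
var missing = doc.FontInfos
    .Where(info => !availableFamilies.Contains(info.Name)
                && (string.IsNullOrEmpty(info.AltName) || !availableFamilies.Contains(info.AltName)))
    .Select(info => $"{info.Name} (alt: {info.AltName})")
    .ToList();

missing.ForEach(Console.WriteLine);
```

Unlike the original two-step filter, this version also reports missing fonts that have no alternative name.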

@vs6060_qq_com

We have tested the scenario and reproduced the issue on our side. We have logged this problem in our issue tracking system as WORDSNET-24858. You will be notified via this forum thread once the issue is resolved.

We apologize for your inconvenience.

Thank you, Manzoor. I am looking forward to this issue being resolved.

Hi,

Is there any update on this issue?

@vs6060_qq_com

Your issue has been resolved and its fix will be available in the next version of Aspose.Words i.e. 21.8. Hopefully, this release will be available at the start of August 2021.

Okay, sounds good. Please let me know when the new version 21.8 is released.

Hi, I noticed that the new version 21.8 has been released, but I still get the same page count (2) with it. Can you please help me resolve this issue?

@vs6060_qq_com

We have tested the scenario using the following code example and could not reproduce the issue. Please check the attached output documents.
21.8.pdf (40.9 KB)
SplitDocument.PageByPage_1.docx (15.4 KB)

var doc = new Aspose.Words.Document(MyDir + "common_FOOTER-Final (1).docx");
doc.Save(MyDir + "21.8.pdf");

doc.UpdatePageLayout();
doc.UpdateTableLayout();
int pageCount = doc.PageCount;
for (int page = 0; page < pageCount; page++)
{
    // Save each page as a separate document.
    var extractedPage = doc.ExtractPages(page, 1);
    var toFileName = Path.Combine(MyDir, $"SplitDocument.PageByPage_{ page + 1}.docx");
    extractedPage.Save(toFileName);
}

Hi Tahir,
Thank you for your help, but on my side the page count still returns 2. To find where the problem is, I removed all unnecessary code, as shown here: Capture.PNG (39.1 KB)

I also zipped the whole project:
ConsoleApp21.7z (17.8 KB)

Can you please check this issue again?

@vs6060_qq_com

While using the latest version of Aspose.Words for .NET 21.8, we have not found the shared issue. Please make sure that the fonts used in your document are installed on the machine where you are executing the application.

@Tahir,

I am trying to remove the fonts that were not found from the document:
[0]: "等线,DengXian"
[1]: "隶书,微软雅黑"

Is there any Aspose.Words API to remove these fonts?

@vs6060_qq_com

Your document does not have embedded fonts. Could you please share why you want to remove fonts from the document?

If you want to change the font name, you can use the Run.Font.Name property to change the font of Run nodes. For paragraphs, please use the ParagraphFormat.Style property.
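
The Run.Font.Name approach could look roughly like the sketch below. It assumes the goal is to replace "隶书" with "Microsoft YaHei" (the pair used elsewhere in this thread) and only touches direct run formatting; fonts that come from styles or the theme are not changed. The output file name is a placeholder.

```csharp
using Aspose.Words;

var doc = new Document("common_FOOTER-Final.docx");

// Visit every Run node in the document, including headers and footers.
foreach (Run run in doc.GetChildNodes(NodeType.Run, true))
{
    // East Asian text uses NameFarEast, so check both names.
    if (run.Font.Name == "隶书" || run.Font.NameFarEast == "隶书")
    {
        // Setting Name updates the ASCII, Far East, Bi and Other font names together.
        run.Font.Name = "Microsoft YaHei";
    }
}

// Placeholder output name (assumption).
doc.Save("common_FOOTER-Final.replaced.docx");
```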

Hi Tahir,
According to your reply:
" we have not found the shared issue. Please make sure that the fonts used in your document are installed on the machine where you are executing the application.".

It seems there is only one possible cause of the wrong page count: missing fonts.

So I thought that after removing these missing fonts, I should get the correct page count.

That is why I want to remove these 2 fonts from my document.

I also investigated two more things:

  1. With the FontSettings.DefaultInstance API I can get the full list of fonts loaded by Aspose.Words. I can't find "等线" in that list, but I can find it on my computer: Capture.PNG (37.1 KB)
    Perhaps the FontSettings.DefaultInstance API does not work correctly?

  2. I also used the code below to force the correct fonts:
    var fileFullName = @"common_FOOTER-Final.docx";
    var fontSettings = FontSettings.DefaultInstance;
    fontSettings.SubstitutionSettings.FontConfigSubstitution.Enabled = true;
    var xml = @"
        <TableSubstitutionSettings xmlns=""Aspose.Words"">
            <SubstitutesTable>
                <Item OriginalFont=""等线"" SubstituteFonts=""DengXian Regular"" />
                <Item OriginalFont=""隶书"" SubstituteFonts=""Microsoft YaHei"" />
                <Item OriginalFont=""微软雅黑"" SubstituteFonts=""Microsoft YaHei"" />
            </SubstitutesTable>
        </TableSubstitutionSettings>";
    var bytes = Encoding.UTF8.GetBytes(xml);
    using var stream = new MemoryStream(bytes);
    fontSettings.SubstitutionSettings.TableSubstitution.Load(stream);
    var loadOptions = new LoadOptions();
    loadOptions.FontSettings = fontSettings;
    var doc = new Aspose.Words.Document(fileFullName, loadOptions);
    doc.UpdatePageLayout();
    doc.UpdateTableLayout();
    int pageCount = doc.PageCount;

But I still can't get the correct page count.

@vs6060_qq_com

We have tested again the same scenario using the shared code example and have not found page count issue. Please check the attached screenshot. PageCount.png (82.5 KB)

  • Please make sure that you are using the latest version of Aspose.Words for .NET 21.8.
  • Please make sure that you are using the same document that you attached in this thread.

If you still face the problem, please share the following details for testing:

  • Your working environment e.g. operating system, .NET version etc.
  • Please open the document in MS Word and share its screenshot.
  • Please implement IWarningCallback interface and let us know if you face any missing fonts notification.
  • Please save the document to PDF and attach the PDF here for our reference.

We will investigate the issue and provide you more information on it.
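
For the IWarningCallback request above, a minimal sketch might look like this. The class name FontWarningCollector is made up for illustration; IWarningCallback, WarningInfo and WarningType.FontSubstitution are Aspose.Words types.

```csharp
using System;
using System.Collections.Generic;
using Aspose.Words;

var collector = new FontWarningCollector();
var doc = new Document("common_FOOTER-Final.docx");
doc.WarningCallback = collector;

// Building the page layout is when fonts are resolved and warnings are raised.
doc.UpdatePageLayout();

foreach (var message in collector.Messages)
    Console.WriteLine(message);

// Hypothetical helper: collects only font substitution warnings.
class FontWarningCollector : IWarningCallback
{
    public List<string> Messages { get; } = new List<string>();

    public void Warning(WarningInfo info)
    {
        if (info.WarningType == WarningType.FontSubstitution)
            Messages.Add(info.Description);
    }
}
```

An empty Messages list after layout suggests that no font was substituted on that machine.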

@tahir.manzoor

My working environment is Windows 10, Visual Studio 2019, .NET 5.

A screenshot of my document: Capture.PNG (6.9 KB)

I have implemented the IWarningCallback interface, and no missing fonts are reported.

The PDF file is: common_FOOTER-Final.pdf (76.2 KB)

(I just noticed that there is a copyright remark in the PDF. Could that be the reason?)

@vs6060_qq_com

Yes, your understanding is correct. Please get the 30-day temporary license and apply it before importing the document into the Aspose.Words DOM.
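
Applying the license before loading the document could be sketched as follows. The license file name "Aspose.Words.lic" is a placeholder for wherever the temporary license is stored. The assumption here, consistent with the reply above, is that the evaluation text added without a license shifts the layout and therefore the page count.

```csharp
using System;
using Aspose.Words;

// Apply the license first, so the document is not loaded in evaluation mode.
var license = new License();
license.SetLicense("Aspose.Words.lic"); // placeholder path (assumption)

var doc = new Document("common_FOOTER-Final.docx");
Console.WriteLine(doc.PageCount);
```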

The issues you have found earlier (filed as WORDSNET-22491) have been fixed in this Aspose.Words for .NET 21.8 update and this Aspose.Words for Java 21.8 update.

Thank you @tahir.manzoor, I will try it later.