Different FontInfoCollection for DOC and DOCX versions of the same file

I have a problem with fonts.
When I load the same file saved as both DOC and DOCX, the FontInfoCollection is different for each.
Here is the code:

// Distinct variable names so that both snippets compile in the same scope.
Document docDocument = new Document("D:/The.doc");
FontInfoCollection docFonts = docDocument.getFontInfos();
// docFonts = {Arial=2, Calibri=4, Cambria=3, Cambria Math=5, Symbol=1, Times New Roman=0}

Document docxDocument = new Document("D:/The.docx");
FontInfoCollection docxFonts = docxDocument.getFontInfos();
// docxFonts = {Calibri=0, Cambria=2, Times New Roman=1}

Can someone please explain this to me?

Hi
Thanks for your request. Could you please also attach your test documents here? We will check the issue and provide you with more information.
Best regards,

OK,
sorry, I forgot to do that earlier.

Hi
Thank you for the additional information. Aspose.Words reads FontInfos from the document’s font table and does not update the font table unless it is necessary. In your case, the font tables of these two documents contain different sets of fonts. You can easily check the font table of the DOCX document: just unzip it and open the fontTable.xml file. I have posted its content here for your convenience:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<w:fonts xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" mc:Ignorable="w14">
	<w:font w:name="Calibri">
		<w:panose1 w:val="020F0502020204030204" />
		<w:charset w:val="00" />
		<w:family w:val="swiss" />
		<w:pitch w:val="variable" />
		<w:sig w:usb0="E10002FF" w:usb1="4000ACFF" w:usb2="00000009" w:usb3="00000000" w:csb0="0000019F" w:csb1="00000000" />
	</w:font>
	<w:font w:name="Times New Roman">
		<w:panose1 w:val="02020603050405020304" />
		<w:charset w:val="00" />
		<w:family w:val="roman" />
		<w:pitch w:val="variable" />
		<w:sig w:usb0="E0002AFF" w:usb1="C0007841" w:usb2="00000009" w:usb3="00000000" w:csb0="000001FF" w:csb1="00000000" />
	</w:font>
	<w:font w:name="Cambria">
		<w:panose1 w:val="02040503050406030204" />
		<w:charset w:val="00" />
		<w:family w:val="roman" />
		<w:pitch w:val="variable" />
		<w:sig w:usb0="E00002FF" w:usb1="400004FF" w:usb2="00000000" w:usb3="00000000" w:csb0="0000019F" w:csb1="00000000" />
	</w:font>
</w:fonts>

As you can see, the font table of the DOCX document contains the same fonts (even in the same order) as Aspose.Words returned.
The same applies to the DOC file. However, there is no equally easy way to check the font table of a DOC file, because DOC is a low-level binary format.
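If you prefer to check the DOCX font table programmatically instead of unzipping the file by hand, the following is a minimal sketch that uses only the standard Java library. It is not part of the Aspose.Words API; the class name FontTableNames and the method readFontNames are made up for this illustration, and it assumes the font table is stored in the usual word/fontTable.xml part.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not an Aspose.Words class): lists the fonts declared in a DOCX font table.
public class FontTableNames {
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the w:name of every w:font element in word/fontTable.xml, in file order.
    // Returns an empty list if the package has no such part.
    static List<String> readFontNames(InputStream docx) throws Exception {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().equals("word/fontTable.xml")) {
                    continue;
                }
                // Copy the entry into memory so the XML parser cannot close the ZIP stream.
                ByteArrayOutputStream xml = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int read;
                while ((read = zip.read(chunk)) != -1) {
                    xml.write(chunk, 0, read);
                }
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                org.w3c.dom.Document dom = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(xml.toByteArray()));
                NodeList fonts = dom.getElementsByTagNameNS(W_NS, "font");
                for (int i = 0; i < fonts.getLength(); i++) {
                    names.add(((Element) fonts.item(i)).getAttributeNS(W_NS, "name"));
                }
                break;
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("Usage: java FontTableNames <file.docx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            for (String name : readFontNames(in)) {
                System.out.println(name);
            }
        }
    }
}
```

Run against the DOCX file, this should print the same names, in the same order, as the w:font elements shown above.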
Best regards,

Thank you, for now!

I still have a problem with fonts.
Here is a test class that collects all runs from a DOCX document. The fonts reported for the runs differ from those shown in Word for the same DOCX file. When I save the same file as DOC, everything is normal.

WordDocxFontTest.java
-------------------------------

import java.util.ArrayList;
import java.util.List;
import org.junit.Test;
import com.aspose.words.Document;
import com.aspose.words.DocumentVisitor;
import com.aspose.words.Run;

public class WordDocxFontTest
{
    // Collects every Run in the document, in document order.
    protected List<Run> getDocumentRuns(Document document)
    {
        final List<Run> runs = new ArrayList<Run>();
        try
        {
            document.accept(new DocumentVisitor()
            {
                @Override
                public int visitRun(Run run) throws Exception
                {
                    runs.add(run);

                    return super.visitRun(run);
                }
            });
        } catch (Exception e) {
            throw new RuntimeException(e);
        }

        return runs;
    }

    @Test
    public void testFont() throws Exception
    {
        Document document = new Document("d:/pp.docx");

        // A typed list is required here; with a raw List, get(i) returns Object and getFont() does not compile.
        List<Run> runs = getDocumentRuns(document);

        for (Run run : runs)
        {
            System.out.println(run.getFont().getName());
        }
    }
}

I think this is an error in Aspose.Words! The Calibri and Cambria fonts are reported as Times New Roman.

Hi
Thanks for your request. The problem occurs because the Calibri and Cambria fonts are set via themes in the DOCX document. Since the DOC format does not support themes, formatting specified via themes is converted to direct formatting when the document is saved as DOC, which is why the DOC file reports the expected fonts.
Unfortunately, there is currently no way to access themes using Aspose.Words. We will consider exposing theme information. Your request has been linked to the appropriate issue, and you will be notified as soon as it is resolved.
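In the meantime, one possible workaround is to read the theme fonts directly from the DOCX package with the standard Java library. The sketch below is not an Aspose.Words API. The class name ThemeFontReader and its methods are made up for this illustration, and it assumes the theme is stored in word/theme/theme1.xml (the usual name, although the package relationships decide it) and that only the Latin typefaces of the major (heading) and minor (body) fonts are of interest.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not an Aspose.Words class): reads the Latin theme fonts of a DOCX file.
public class ThemeFontReader {
    static final String A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main";
    // Assumption: the theme part has its usual name.
    static final String THEME_PART = "word/theme/theme1.xml";

    // Returns {major Latin typeface, minor Latin typeface}, or null if the theme part is missing.
    static String[] readThemeFonts(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().equals(THEME_PART)) {
                    continue;
                }
                // Copy the entry into memory so the XML parser cannot close the ZIP stream.
                ByteArrayOutputStream xml = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int read;
                while ((read = zip.read(chunk)) != -1) {
                    xml.write(chunk, 0, read);
                }
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                org.w3c.dom.Document dom = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(xml.toByteArray()));
                return new String[] {
                    latinTypeface(dom, "majorFont"),
                    latinTypeface(dom, "minorFont")
                };
            }
        }
        return null;
    }

    // Returns the typeface of the first a:latin element under the given font scheme element, or null.
    static String latinTypeface(org.w3c.dom.Document dom, String schemeElement) {
        NodeList scheme = dom.getElementsByTagNameNS(A_NS, schemeElement);
        if (scheme.getLength() == 0) {
            return null;
        }
        NodeList latin = ((Element) scheme.item(0)).getElementsByTagNameNS(A_NS, "latin");
        return latin.getLength() == 0 ? null : ((Element) latin.item(0)).getAttribute("typeface");
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("Usage: java ThemeFontReader <file.docx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String[] fonts = readThemeFonts(in);
            System.out.println(fonts == null
                    ? "No theme part found"
                    : "Major (headings): " + fonts[0] + ", minor (body): " + fonts[1]);
        }
    }
}
```

This only reports the fonts that the theme defines. Deciding which runs actually use them would still require inspecting the run and style formatting in the document XML.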
Best regards,

The issues you have found earlier (filed as WORDSNET-3312) have been fixed in this .NET update and this Java update.
