Problem with extracting fonts

bram-1 · April 3, 2014, 4:45am

When extracting embedded fonts from a document (docx) we stumbled upon strange behaviour. I'm hoping someone else can help us with the following question.

When iterating through all fonts in the document (see attached file) using

    doc = Document()
    for font_info in doc.getFontInfos():
        print font_info.getName()

A list is printed including the font 'Helvetica'.

However, we then extract the embedded font using

    doc = Document()
    for font_info in doc.getFontInfos():
        if font_info.getName() == 'Helvetica':
            font_data = font_info.getEmbeddedFont(EmbeddedFontFormat.OPEN_TYPE, EmbeddedFontStyle.BOLD)

We then save font_data to a file (in this case 'Helvetica_Bold.ttf'). I know it isn't really a TTF file, but this approach has worked fine until now.

When viewing the saved font file with a font viewer, the font identifies itself as 'Arial'.

How can it be that an embedded font is identified by Aspose.Words with the name 'Helvetica' and the extracted font data is identified as 'Arial'?

tahir.manzoor · April 4, 2014, 3:27am

Hi Bram,

Thanks for your inquiry.

I have tested the scenario and reproduced the issue on my side. I have logged this problem in our issue tracking system as WORDSNET-9935 and linked this forum thread to it. You will be notified via this thread once the issue is resolved.

We apologize for the inconvenience.

tahir.manzoor · April 29, 2014, 2:31am

Hi Bram,

Thanks for your inquiry via live chat session.

Please note that issues are addressed and resolved on a first-come, first-served basis. Your issue is currently in the queue, pending analysis. We will update you via this forum thread as soon as there is any news on it.

Thank you for your patience.

tahir.manzoor · June 13, 2014, 7:33am

Hi Bram,

Thanks for your patience.

Our development team has completed its work on the issue (WORDSNET-9935) and concluded that the behavior you are observing is not a bug in Aspose.Words. We have therefore closed this issue as 'Not a Bug'.

If you unzip this DOCX document and open fontTable.xml, you will see that the Helvetica font has w:embedBold->id = rId8. If you then decrypt font8.odttf, you will see that it is the same Arial font that Aspose.Words returns.
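The manual check described above can be sketched in plain Python 3 without Aspose.Words. This is only a rough illustration, under a few assumptions: that the embedded font reference carries `r:id` and `w:fontKey` attributes, that the relationships live in word/_rels/fontTable.xml.rels, and that the .odttf part uses the usual Office obfuscation (the first 32 bytes XORed with the reversed bytes of the font key GUID). The helper names `deobfuscate` and `extract_embedded_font` are made up for this example.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs used by WordprocessingML and package relationships.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG = "http://schemas.openxmlformats.org/package/2006/relationships"


def deobfuscate(data, font_key):
    """Undo the assumed ODTTF obfuscation.

    The key is the GUID's hex digits read as bytes and reversed; it is
    XORed over bytes 0-15 and again over bytes 16-31 of the font data.
    """
    key = bytes.fromhex(font_key.strip("{}").replace("-", ""))[::-1]
    head = bytes(b ^ key[i % 16] for i, b in enumerate(data[:32]))
    return head + data[32:]


def extract_embedded_font(docx, font_name, style="embedBold"):
    """Return (part_name, font_bytes) for one embedded style, or None."""
    with zipfile.ZipFile(docx) as z:
        table = ET.fromstring(z.read("word/fontTable.xml"))
        rels = ET.fromstring(z.read("word/_rels/fontTable.xml.rels"))
        # Map relationship ids (e.g. rId8) to targets (e.g. fonts/font8.odttf).
        targets = {r.get("Id"): r.get("Target")
                   for r in rels.iter("{%s}Relationship" % PKG)}
        for font in table.iter("{%s}font" % W):
            if font.get("{%s}name" % W) != font_name:
                continue
            embed = font.find("{%s}%s" % (W, style))
            if embed is None:
                return None
            # Targets are relative to the word/ folder of the package.
            part = posixpath.join("word", targets[embed.get("{%s}id" % R)])
            raw = z.read(part)
            return part, deobfuscate(raw, embed.get("{%s}fontKey" % W))
    return None
```

For example, calling `extract_embedded_font("document.docx", "Helvetica")` (the file name is a placeholder) and writing the returned bytes to disk should give a file that a font viewer can open. If that file identifies itself as Arial, the mismatch is already present in the document itself, which is consistent with the 'Not a Bug' conclusion.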