OfficeMath formatting is lost when converting it to EMF using C#

Hi Aspose support,
Our product has functionality that converts equation content to images and replaces each OMathPara/OMath node with the resulting image shape.
This is similar to what MS Word does when it converts a DOCX document with equations to DOC/RTF format.
While testing this approach we ran into several conversion issues, mainly incorrect or missing symbols in the resulting image.

Please see the issues:

Case #1:
After converting an OMathPara/OMath node to EMF, some symbols in some equations are rendered as double-struck characters instead of regular keyboard letters.
Document: Equations from Office.com.zip (13.5 KB)
See results of conversion by Aspose and Word:
Page#1
Case#1Doc1Page1.png (48.6 KB)
Page#2
Case#1Doc1Page2.png (33.0 KB)

Document: Equations in Linear format Unicode.zip (11.5 KB)
Page#1
Case#1Doc2Page1.png (46.2 KB)

Document: Equations in Professional format.zip (12.1 KB)
Page#1
Case#1Doc3Page1.png (32.1 KB)

Document: Ink Equations.zip (10.3 KB)
Page#1
Case#1Doc4Page1.png (17.7 KB)
Expected: the conversion should work as in Word.

Case #2:
After converting an OMathPara/OMath node to EMF, one row of the equation is shifted to the right.
Document: Equations from Office.com.zip (13.5 KB)
See results of conversion by Aspose and Word:
Page#1
Case#2Page1.png (10.3 KB)
Expected: the conversion should work as in Word.

Case #3:
After converting an OMathPara/OMath node to EMF, spaces are missing in some equations.
Document: Equations from Office.com.zip (13.5 KB)
See results of conversion by Aspose and Word:
Page#2
Case#3Page2.png (55.2 KB)
Expected: the conversion should work as in Word.

Case #4:
After converting an OMathPara/OMath node to EMF, twelve arrows are missing.
Document: Equations with non-math text.zip (10.1 KB)
See results of conversion by Aspose and Word:
Page#1
Case#4Page1.png (27.5 KB)
Expected: the conversion should work as in Word.

Case #5:
After converting an OMathPara/OMath node to EMF, the integral sign is missing in one equation.
Document: Ink Equations.zip (10.3 KB)
See results of conversion by Aspose and Word:
Page#1
Case#5Page1.png (13.6 KB)
Expected: the conversion should work as in Word.

Case #6:
After converting an OMathPara/OMath node to EMF, fields inserted into equations are converted incorrectly.
Document: Equations with different fields.zip (47.9 KB)
See results of conversion by Aspose and Word:
Page#1, 3
Case#6Page1.png (48.5 KB)
Expected: the conversion should work as in Word.

Case #7:
After converting an OMathPara/OMath node to EMF, some symbols inserted into equations are cropped.
Document: Equations with different symbols.zip (12.3 KB)
See results of conversion by Aspose and Word:
Page#1
Case#7Page1.png (36.5 KB)
Expected: the conversion should work as in Word.

Case #8:
After converting an OMathPara/OMath node to EMF, text with different effects inserted into equations is converted incorrectly.
Document: Equations with text with different languages and formatting.zip (15.6 KB)
See results of conversion by Aspose and Word:
Page#1
Case#8Page1.png (257.3 KB)
Expected: the conversion should work as in Word.

Here is our code that performs the conversion:

        // Namespaces used: System.Drawing, System.IO, Aspose.Words, Aspose.Words.Math,
        // Aspose.Words.Rendering, Aspose.Words.Saving
        private void ReplaceFormulas(Document doc, ImageSaveOptions options, DocumentBuilder builder)
        {
            NodeCollection mathCollection = doc.GetChildNodes(NodeType.OfficeMath, true);

            // Loop from the last node to the first, since nodes are removed along the way
            for (int i = mathCollection.Count - 1; i >= 0; i--)
            {
                try
                {
                    OfficeMath math = (OfficeMath)mathCollection[i];

                    // Replace only top-level equations: a display formula (OMathPara), or an inline
                    // formula (OMath) whose parent is not itself an OfficeMath node
                    if ((math.MathObjectType == MathObjectType.OMathPara) ||
                        ((math.MathObjectType == MathObjectType.OMath) && (math.ParentNode?.NodeType != NodeType.OfficeMath)))
                    {
                        using (MemoryStream stream = new MemoryStream())
                        {
                            // Get the renderer once and reuse it for both the image and its size
                            OfficeMathRenderer renderer = math.GetMathRenderer();
                            renderer.Save(stream, options);
                            SizeF mathSize = renderer.SizeInPoints;

                            // Rewind defensively before the image is read back from the stream
                            stream.Position = 0;

                            builder.MoveTo(math);
                            builder.InsertImage(stream, mathSize.Width, mathSize.Height);

                            math.Remove();
                        }
                    }
                }
                catch
                {
                    // Ignore the exception and continue processing the math collection
                }
            }
        }


The code which calls this method:
        public void Replace()
        {
            try
            {
                // Save equations in EMF format as the optimal format for them
                ImageSaveOptions optionsEmf = new ImageSaveOptions(SaveFormat.Emf);
                optionsEmf.MetafileRenderingOptions.ScaleWmfFontsToMetafileSize = true;
                optionsEmf.MetafileRenderingOptions.RenderingMode = MetafileRenderingMode.Vector;
                optionsEmf.UseGdiEmfRenderer = false;

                DocumentBuilder builder = new DocumentBuilder(_doc);
                ReplaceFormulas(_doc, optionsEmf, builder);
            }
            catch (Exception e)
            {
                throw new DrawingReplacerException(e.Message);
            }
        }

Tested on Aspose.Words.dll v19.8.
The same issues also occur with PNG output, so it looks like the ImageSaveOptions settings do not matter.

Could you help with solving these problems?
Thanks in advance.

@licenses

We are working on your query and will get back to you soon.

@licenses

We converted the first document to DOC format using the following code example and managed to reproduce the same issue on our end. We have logged this problem in our issue tracking system as WORDSNET-19143 so that it can be corrected. You will be notified via this forum thread once this issue is resolved. We apologize for the inconvenience.

Document document = new Document(@"Equations from Office.com.docx");
document.Save(@"Equations from Office.com.doc", SaveFormat.Doc);

If you want to save DOCX to DOC file format, you do not need to convert each OfficeMath to image. Could you please share why you want to convert each OfficeMath node to image?

Please also share to which file format you want to save your final document. We will then log your issues accordingly in our issue tracking system.

Unfortunately, re-saving the document in DOC format doesn’t work for us. The document format has to stay the same; otherwise, for instance after DOCX to DOC conversion, too much information from the document can be lost.
The goal is to convert all OfficeMath objects into images without changing the resulting file format. These images are required for further processing in our product.

Also, please note that your suggestion doesn’t fix the issues described in the cases above. When saving DOCX to DOC using Aspose.Words, the problems stay the same. It looks like the problem is somewhere in the functionality that converts an OfficeMath range to an image (OfficeMathRenderer).

@licenses,

We have tested the scenario and managed to reproduce the same issues on our side. We have logged these problems in our issue tracking system as follows so that they can be corrected. You will be notified via this forum thread once these issues are resolved. We apologize for the inconvenience.

Issue ID is WORDSNET-19147.

Issue ID is WORDSNET-19148.

Issue ID is WORDSNET-19149.

Issue ID is WORDSNET-19150.

We are investigating your remaining cases and will share the issue IDs soon.

@licenses

We have logged the following issues in our issue tracking system.

The issue ID is WORDSNET-19155.

The issue ID is WORDSNET-19156.

The issue ID is WORDSNET-19157.

Your issues have been logged in our issue tracking system as follows.

The issue ID is WORDSNET-19158.

The issue ID is WORDSNET-19159.

The issue ID is WORDSNET-19160.

The issue ID is WORDSNET-19162.

@licenses,

In this case, there are two problems:

  • The main problem is rendering of the “Letterlike” symbols ⅆⅇⅈⅉⅅ (will be fixed in WORDSNET-19147)
  • Rendering of the 〗「」』『【】ⓘ symbols. We have created a new ticket for this issue as WORDSNET-19950.

We will inform you once these issues are resolved.

The issues you have found earlier (filed as WORDSNET-19147, WORDSNET-19150, WORDSNET-19148) have been fixed in this Aspose.Words for .NET 20.3 update and this Aspose.Words for Java 20.3 update.

The issues you have found earlier (filed as WORDSNET-19156) have been fixed in this Aspose.Words for .NET 20.4 update and this Aspose.Words for Java 20.4 update.

The issues you have found earlier (filed as WORDSNET-19155) have been fixed in this Aspose.Words for .NET 20.6 update and this Aspose.Words for Java 20.6 update.

Hi @tahir.manzoor,
I checked the latest Aspose.Words.dll v20.7.0 and can confirm that the following cases are fixed:
WORDSNET-19147
WORDSNET-19148
WORDSNET-19149
WORDSNET-19150
WORDSNET-19155
WORDSNET-19156

@licenses

Thanks for your feedback. Please let us know if you have any more queries.

The issues you have found earlier (filed as WORDSNET-19950) have been fixed in this Aspose.Words for .NET 21.8 update and this Aspose.Words for Java 21.8 update.

The issues you have found earlier (filed as WORDSNET-19158, WORDSNET-19157) have been fixed in this Aspose.Words for .NET 21.11 update also available on NuGet.

The issues you have found earlier (filed as WORDSNET-19160) have been fixed in this Aspose.Words for .NET 22.2 update also available on NuGet.