Rendering to Emf and pasting into Word causes pdf document to be malformed

Hi,


This is a somewhat complex problem, but quite a serious one for me.

The gist is this:

When rendering an Excel Table as Emf, the resulting image, when pasted into a Word document, causes the exported Pdf of that Word document to be badly formatted.

Here is the code:

// Render the first worksheet to an EMF file with Aspose.Cells
var book = new Workbook("Test.xlsx");
Worksheet sheet = book.Worksheets[0];
var options = new ImageOrPrintOptions { HorizontalResolution = 300, VerticalResolution = 300, ImageFormat = ImageFormat.Emf };
var renderer = new SheetRender(sheet, options);

const string ImagePath = "out.emf";
renderer.ToImage(0, ImagePath);

// Insert the EMF into a new Word document with Aspose.Words and export to PDF
var doc = new Document();
var docBuilder = new DocumentBuilder(doc);

var image = Image.FromFile(ImagePath);

docBuilder.MoveToDocumentStart();
docBuilder.InsertImage(image);
doc.Save("out.pdf");

Now I am confident this is an Aspose.Cells issue because of the following:
If you manually paste the resulting out.emf into Word and save as Pdf using the Word UI you get the same issue. This issue does not exist with emf files that are not created by Aspose.Cells (for example, you can use PowerPoint to create an emf of the table in Test.xlsx, the resulting pdf after copying it into Word is of extremely high quality and without any formatting issues).
The issue also does not seem to exist if you export the Excel table as Png.

Best regards,

Hi Pit,


Thank you for contacting Aspose support.

We have evaluated your scenario using the latest version, Aspose.Cells for .NET 8.5.0, to generate the EMF, and then inserted it manually into a Word document to generate the PDF. We are able to observe the problem: the text in the resultant PDF appears to overlap. We have logged this incident in our bug tracking system under the ticket CELLSNET-43732 for further investigation. As soon as we have completed the analysis, we will share our results here for your reference.

Hi,


Have there been any updates on this issue?
The output pdf/xls files are pretty much unusable due to this bug.

Best regards,

Hi,

Thanks for your posting and using Aspose.Cells.

We are afraid there is no update for you regarding this issue. However, we have logged your comment in our database against this issue and requested the product team to provide some fix or ETA for this issue. Once there is some news for you, we will let you know asap.

Hi,


Are there any updates on this?
It has been 2 months…

Best regards,

Hi,

Thanks for your posting and using Aspose.Cells.

Please download and try the latest version, Aspose.Cells for .NET v8.5.1.5, and see if it makes any difference and resolves your issue.

I have tested this issue with the following sample code using the latest version and the output emf image looks good. I have attached the emf image for your reference. Please remove its .txt extension after downloading it.

C#


var book = new Workbook("Test.xlsx");
Worksheet sheet = book.Worksheets[0];
var options = new ImageOrPrintOptions { HorizontalResolution = 300, VerticalResolution = 300, ImageFormat = ImageFormat.Emf };
var renderer = new SheetRender(sheet, options);

const string ImagePath = "out.emf";
renderer.ToImage(0, ImagePath);

Hi,


If you look at the code I posted in the original post, the problem is not viewing the emf image. Your test code is different from the one I posted and does not do the same thing.
The problem is when the image is then pasted into a Word document and converted to Pdf.

If you run the code I posted above, even on the latest 8.5.2 the resulting emf file will create a corrupted image when exported to Pdf (or xps).

EDIT: As further proof that it is Aspose’s emf feature that is causing the problem:
If you open the emf file in Inkscape and resave the file, the bug does not appear anymore when converting to PDF (with or without ‘text to path’ checked). So clearly Aspose is badly exporting to emf.

Best regards,

Hi John,

Thank you for testing the latest release of Aspose.Cells for .NET 8.5.2. Please note that the originally posted problem (CELLSNET-43732: Rendering to Emf and pasting into Word causes text overlapping in resultant PDF) is currently unresolved; due to the complexity of the issue, it is taking us considerable time to find a solution. Moreover, at the time of recording the defect, we used your provided code segment for verification. We also tested the case by generating the EMF with the Aspose.Cells for .NET API and inserting it into a Word document manually, then converting it to PDF using MS Word. In this case the result is not correct either.

Regarding Inkscape, please allow me some time to install the application and verify your findings; I will then record my observations.

Hi again,


This is to inform you that I am able to confirm your recent findings by re-saving the Aspose.Cells generated EMF with Inkscape. The resultant EMF, when inserted into a DOCX file (manually or using the Aspose.Words API), produces the correct PDF file. I have logged my observations against the ticket CELLSNET-43732 and have requested the product team to share their analysis at the earliest. As soon as we get updates, we will post here for your reference.

Hi John,


We have further investigated the scenario on our side. Please note that the Aspose.Cells APIs generate EMF files in EmfPlusDual format. It seems that when an EmfPlusDual file is inserted into an MS Word document (using Aspose.Words or manually), the image gets corrupted while converting the document to PDF format. While we are working to fix this problem, you can convert the Aspose.Cells generated EmfPlusDual file to EmfOnly using the ReSaveImageToEmfOnly method as demonstrated below.

C#

var book = new Workbook("D:/Test.xlsx");
Worksheet sheet = book.Worksheets[0];
var options = new ImageOrPrintOptions { HorizontalResolution = 300, VerticalResolution = 300 };
options.ImageFormat = ImageFormat.Emf;
var renderer = new SheetRender(sheet, options);
string ImagePath = "D:/out.emf";
using (MemoryStream ms = new MemoryStream())
{
    renderer.ToImage(0, ms);
    ReSaveImageToEmfOnly(ms, ImagePath);
}
var doc = new Aspose.Words.Document();
var docBuilder = new Aspose.Words.DocumentBuilder(doc);
var image = Image.FromFile(ImagePath);
docBuilder.MoveToDocumentStart();
docBuilder.InsertImage(image);
doc.Save("D:/out.pdf");

static void ReSaveImageToEmfOnly(Stream srcStream, String destPath)
{
    Bitmap dummyBitmap = null;
    Graphics dummyGfx = null;
    IntPtr hdc = IntPtr.Zero;
    System.Drawing.Imaging.Metafile metafile = null;

    try
    {
        dummyBitmap = new Bitmap(1, 1);
        dummyGfx = Graphics.FromImage(dummyBitmap);
        hdc = dummyGfx.GetHdc();
        Image srcImage = Image.FromStream(srcStream);
        Rectangle rect = new Rectangle(0, 0, srcImage.Width, srcImage.Height);
        metafile = new System.Drawing.Imaging.Metafile(destPath, hdc, rect, System.Drawing.Imaging.MetafileFrameUnit.Pixel, EmfType.EmfOnly);
        Graphics graphic = Graphics.FromImage(metafile);
        graphic.DrawImage(srcImage, rect);
        srcImage.Dispose();
        graphic.Dispose();
    }
    finally
    {
        if (metafile != null)
        {
            metafile.Dispose();
        }
        if (hdc != IntPtr.Zero)
        {
            dummyGfx.ReleaseHdc(hdc);
        }
        if (dummyGfx != null)
        {
            dummyGfx.Dispose();
        }
        if (dummyBitmap != null)
        {
            dummyBitmap.Dispose();
        }
    }
}
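
To check whether a given .emf file is EmfOnly or EmfPlusDual (for example, before and after re-saving), a minimal sketch along the following lines may help. It uses only the standard System.Drawing MetafileHeader API on Windows (GDI+) and no Aspose calls; the class name EmfTypeCheck and the default path out.emf are placeholders chosen for illustration.

```csharp
using System;
using System.Drawing.Imaging;

// Hypothetical helper for illustration; not part of any Aspose API.
static class EmfTypeCheck
{
    static void Main(string[] args)
    {
        // Placeholder default path; pass the real file as the first argument.
        string path = args.Length > 0 ? args[0] : "out.emf";

        using (var metafile = new Metafile(path))
        {
            MetafileHeader header = metafile.GetMetafileHeader();

            // header.Type is a MetafileType value such as Emf, EmfPlusOnly or EmfPlusDual.
            Console.WriteLine("Metafile type: " + header.Type);
            Console.WriteLine("Is EmfPlusDual: " + header.IsEmfPlusDual());
            Console.WriteLine("Is plain EMF:   " + header.IsEmf());
        }
    }
}
```

Running this against the file written by SheetRender and against the file written by ReSaveImageToEmfOnly should make the difference between the two visible without opening them in a viewer.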

Hi,


Thanks for your reply.

I have tested the workaround you posted and it does indeed fix the pdf export issue.
Sadly, it seems to produce different emf output to the ‘original’ emf file.

I have attached a new Test.xlsx and outputs for the EmfOnly and the default Aspose output (as I can’t attach emf files I attached the docx after inserting the resultant emf files).
It can be run using this code:

var book = new Workbook("Test2.xlsx");
Worksheet sheet = book.Worksheets[0];
var options = new ImageOrPrintOptions { HorizontalResolution = 300, VerticalResolution = 300, ImageFormat = ImageFormat.Emf };
var renderer = new SheetRender(sheet, options);

const string EmfOnlyImagePath = "EmfOnlyOut.emf";
const string DefaultImagePath = "DefaultOut.emf";
using (MemoryStream ms = new MemoryStream())
{
    renderer.ToImage(0, ms);
    ReSaveImageToEmfOnly(ms, EmfOnlyImagePath);
}
renderer.ToImage(0, DefaultImagePath);

As you can see from the output, some character sequences are now much too close together, such as the “ea” in “Beares”, the closing parenthesis in “(MM)”, the “es” in “Sales” and the “et” in “Target”. The original emf did not have that problem.

Best regards,

Hi,


We have tested the recently shared scenario on our side, and we regret to inform you that the presented behavior is actually a limitation of the EmfOnly format. Please note that the Aspose.Cells APIs generate the EMF in EmfPlusDual format, which contains both EmfOnly and EmfPlus data, precisely to overcome such situations.

Please check the attached snapshot for a comparison of the DefaultOut.emf (provided by you) in IrfanView (which supports only EmfOnly) and Microsoft Office Picture Manager (which supports both EmfOnly and EmfPlus).

Hi,


So if I understand correctly we have the following:

* If converted to EmfOnly we get formatting issues in the emf file. This is a limitation with EmfOnly.
* If using Aspose to generate in EmfPlusDual format, Aspose has a bug that corrupts the images when exported to pdf.
* InkScape exports to EmfPlusDual (?) without this bug.

Is this correct?

In this case I will have to wait for you to fix the original bug in Aspose’s emf export mechanism.

Best regards,

Hi,


Regarding your point "InkScape exports to EmfPlusDual (?) without this bug": we have checked, and the EMF file saved by Inkscape is in EmfOnly format; however, the EmfOnly records have been changed considerably.

Regarding your point "If using Aspose to generate in EmfPlusDual format, Aspose has a bug that corrupts the images when exported to pdf": we find the behavior strange, because the EmfPlusDual image generated by Aspose.Cells looks good in viewers such as Microsoft Paint, Microsoft Office Picture Manager and Microsoft Word. The image only gets corrupted when the document containing the said EMF is converted to PDF format. We will keep investigating this scenario and hope to fix it in a future release of the Aspose.Cells APIs.

Hi,


Thanks for looking into it.

I consider this a very serious issue as it does not only happen with the specific document attached, but with any document. I tried it using the Calibri font and I get the same issues with exporting to Pdf.

This means that a workflow of pasting vector graphics (emf) into a Word document and exporting to Pdf is completely broken, which is a massive issue.

Best regards,

Hi John,

We understand your situation, and we have already informed the product team of your concerns by raising the priority of the said ticket to Critical. We have also requested them to share an alternative approach, as the conversion of the Aspose.Cells generated EMF to EmfOnly format is not acceptable to you. As soon as we get more updates in this regard, we will post here for your reference.

Hi again,

This is to inform you that we have done further research on the presented scenario. We used the plain .NET APIs to generate two EmfPlusDual files, one with PageUnit set to Pixel and one with PageUnit set to Point. Both resultant EMF files look the same when viewed in image viewers such as Microsoft Office Picture Manager and Paint. However, when they are inserted into a Word document and converted to PDF (manually with Office 2010, and using the Aspose.Words APIs), the image generated with PageUnit set to Point shows problems: the location of the text seems to be wrong when the document is converted with Office 2010, whereas with the Aspose.Words APIs the location of the text seems to be correct but the font size is different. Please check the attachment for all the files used/generated for this test case.

C#


// Note: imagePath (string) and pageUnit (GraphicsUnit) are supplied by the caller.
Bitmap dummyBitmap = null;
Graphics dummyGfx = null;
System.IntPtr hdc = IntPtr.Zero;
System.Drawing.Imaging.Metafile metafile = null;
int width = 200;
int height = 200;

// horizontal offset between the two strings, in pixels
float loc = 34;
PointF p1, p2;
try
{
    dummyBitmap = new Bitmap(1, 1);
    dummyGfx = Graphics.FromImage(dummyBitmap);
    hdc = dummyGfx.GetHdc();

    metafile = new System.Drawing.Imaging.Metafile(imagePath, hdc, new RectangleF(0, 0, width, height), MetafileFrameUnit.Pixel, EmfType.EmfPlusDual);
    Graphics mGr = Graphics.FromImage(metafile);
    mGr.Clear(Color.White);
    mGr.PageUnit = pageUnit;

    p1 = new PointF(50, 100);
    p2 = new PointF(50 + loc, 100);

    if (mGr.PageUnit == GraphicsUnit.Point)
    {
        // pixel to point: 72 points per inch, using the DPI of the matching axis
        p1 = new PointF(p1.X * 72 / mGr.DpiX, p1.Y * 72 / mGr.DpiY);
        p2 = new PointF(p2.X * 72 / mGr.DpiX, p2.Y * 72 / mGr.DpiY);
    }

    System.Drawing.Font font = new System.Drawing.Font("Arial", 11);
    Brush brush = new SolidBrush(Color.Red);
    mGr.DrawString("Test", font, brush, p1);
    mGr.DrawString("Test", font, brush, p2);

    mGr.Dispose();
}
finally
{
    if (metafile != null)
    {
        metafile.Dispose();
    }
    if (hdc != IntPtr.Zero)
    {
        dummyGfx.ReleaseHdc(hdc);
    }
    if (dummyGfx != null)
    {
        dummyGfx.Dispose();
    }
    if (dummyBitmap != null)
    {
        dummyBitmap.Dispose();
    }
}

That being said, we suggest that you open a ticket in the Aspose.Words support forum so that the Aspose.Words product team can also schedule an investigation in parallel.

Hi,


Has there been any progress on this?

Cheers,

Hi John,

Thanks for your posting and using Aspose.Cells.

We are very sorry to say that there is no further update for you. However, we have logged your comment in our database against this issue and requested the product team to provide you some ETA or fix for this issue. We are hopeful you will hear some good news soon. Please pardon us for delays as we have been working on lots of customer issues lately. Have a good day and cheers.

Hi,

Thanks for using Aspose.Cells.

As we posted in https://forum.aspose.com/t/41172 , please follow up with the Aspose.Words product team to see whether they can make some progress.
