MHT conversion issues/discrepancies with Aspose.Words

Hi,

I am evaluating Aspose.Words (and eventually the rest of the Aspose product family) to replace Office Automation. The first conversion I am looking to implement, which will ultimately determine the go-ahead, is MHT, as this accounts for a large percentage of the volume (from email traffic).

In my current office automation solution I am performing the following steps which I have attempted to replicate with Aspose.Words:

  1. Adjust the margins from 1 inch to 1 cm
  2. Change page orientation to landscape if there are any tables that exceed the page width
  3. Auto fit tables that have a fixed size that still exceed the page width
  4. Resize (to scale) any inline images to fit within the page width/height

I have been able to determine most of the equivalent code in Aspose.Words (I have yet to look at changing orientation, but I don’t think this will be an issue). However, I have a few issues with the output that I hope you can assist with:

I have attached a sample MHT file that I have been working off containing 2 tables.

  1. When I adjust the margins and save to TIF or PDF, the larger table does not re-fit itself; it retains the size it had prior to the margin adjustment. However, when I save as docx the table does re-adjust.
  2. Only the first cell has a border, all other cells do not.
  3. It appears that some of the defaults applied to certain formatting differ between Aspose.Words and Microsoft Word, such as the following:
    1. Paragraph spacing is set to Auto in most cases whereas Word is set to 0. This results in a more spaced out output
    2. Paragraph blocks are justified whereas Word does not appear to do this
    3. The bullet points at the bottom are not indented whereas they are with Word

I could obviously work around some of these issues in code; however, in doing so I may introduce issues elsewhere. For example, if I auto-fit tables after adjusting the margins, both tables will auto-fit, which is not the intention. Likewise for setting table borders, etc.

Otherwise, happy with the ease of use and performance so far.

Look forward to your response and potential solutions…

Regards
Trent

Hi Trent,

Thanks for your inquiry and sorry for the delayed response. First of all, I would suggest reading the following article, which describes why Aspose components are a better alternative to automation:
https://docs.aspose.com/words/net/aspose-words-or-other-solutions/

  1. Sure, please use the PageSetup.LeftMargin, PageSetup.RightMargin, PageSetup.TopMargin and PageSetup.BottomMargin properties to set the distances (in points) between the edges of the page and the boundary of the body text:
    https://reference.aspose.com/words/net/aspose.words/pagesetup/

  2. Sure, you can use the PageSetup.Orientation property for this purpose.

  3. I would suggest reading the following article on adjusting table widths:
    https://docs.aspose.com/words/net/applying-formatting/

  4. Please try the following code snippet to resize the images in your document:

// Requires: using System; using System.Drawing; using System.IO;
//           using Aspose.Words; using Aspose.Words.Drawing;
Document doc = new Document(@"C:\Temp\in.doc");
NodeCollection shapes = doc.GetChildNodes(NodeType.Shape, true);
foreach (Shape shape in shapes)
{
    if (shape.HasImage)
    {
        // Dispose the stream and both images once the shape has been updated
        using (MemoryStream stream = new MemoryStream(shape.ImageData.ImageBytes))
        using (Image img = Image.FromStream(stream))
        using (Image resized = ResizeImage(img, 100, 100))
        {
            shape.ImageData.SetImage(resized);
            // Reuse the resized dimensions so the shape keeps the image's
            // aspect ratio instead of being forced to 100 x 100
            shape.Width = resized.Width;
            shape.Height = resized.Height;
        }
    }
}
doc.Save(@"C:\Temp\out.docx");

// Scales sourceImage to fit inside targetWidth x targetHeight, preserving the aspect ratio
private static Image ResizeImage(Image sourceImage, int targetWidth, int targetHeight)
{
    float ratioWidth = (float)sourceImage.Width / targetWidth;
    float ratioHeight = (float)sourceImage.Height / targetHeight;
    if (ratioWidth > ratioHeight)
        targetHeight = (int)(targetHeight * (ratioHeight / ratioWidth)); // width is the limiting side
    else if (ratioWidth < ratioHeight)
        targetWidth = (int)(targetWidth * (ratioWidth / ratioHeight));   // height is the limiting side
    return sourceImage.GetThumbnailImage(targetWidth, targetHeight, null, IntPtr.Zero);
}
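
The snippet above scales to a fixed 100 x 100 box, whereas the original question asks for images to fit within the page width and height. As a minimal sketch of that calculation only, the helper below computes a size that fits a bounding box while keeping the aspect ratio and never enlarging. The class and method names (ImageFit, ScaleToFit) are hypothetical and not part of Aspose.Words.

```csharp
using System;

// Hypothetical helper (not part of Aspose.Words): computes a size that fits
// inside a bounding box while preserving the aspect ratio.
public static class ImageFit
{
    // All four values must use the same unit (for example, points).
    // A size that already fits is returned unchanged, so small images
    // are never enlarged.
    public static (double Width, double Height) ScaleToFit(
        double width, double height, double maxWidth, double maxHeight)
    {
        if (width <= 0 || height <= 0 || maxWidth <= 0 || maxHeight <= 0)
            throw new ArgumentException("All dimensions must be positive.");

        // The tighter of the two constraints decides the scale factor.
        double scale = Math.Min(1.0, Math.Min(maxWidth / width, maxHeight / height));
        return (width * scale, height * scale);
    }
}
```

With Aspose.Words, the bounding box would presumably be the section's PageSetup.PageWidth minus the left and right margins (and PageHeight minus the top and bottom margins), and the result would be assigned to Shape.Width and Shape.Height, all of which are expressed in points.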

Regarding the issues you see when rendering the MHT document to fixed-page formats (e.g. PDF, TIFF), I am looking into them and will get back to you as soon as possible.

Best Regards,

Hi Trent,

Trent:
1. When I adjust the margins and save to TIF or PDF, the larger table does not re-fit itself, it retains the size prior to the margin adjustment. However, when I save as docx the table does re-adjust.

I was unable to reproduce this issue with Aspose.Words version 11.9.0 on my side. For example, when I set the width of the ‘Long Table’, it was correctly reflected in PDF. Moreover, I used the following code for testing:

Table longTable = doc.FirstSection.Body.Tables[doc.FirstSection.Body.Tables.Count - 1];
longTable.PreferredWidth = PreferredWidth.FromPercent(50);

Could you please also provide the code snippet to be able to reproduce the same issue on my side?

Best Regards,

Hi Trent,

Trent:
2. Only the first cell has a border, all other cells do not.

While using the latest version of Aspose.Words i.e. 11.9.0, I managed to reproduce this issue on my side. I have logged this issue in our bug tracking system. The issue ID is WORDSNET-7230. Your request has been linked to this issue and you will be notified as soon as it is resolved. Sorry for the inconvenience.

Best Regards,

Hi Trent,

Trent:
3.1 Paragraph spacing is set to Auto in most cases whereas Word is set to 0. This results in a more spaced out output

While using the latest version of Aspose.Words i.e. 11.9.0, I managed to reproduce this issue on my side. I have logged this issue in our bug tracking system. The issue ID is WORDSNET-7231. Your request has been linked to this issue and you will be notified as soon as it is resolved. Sorry for the inconvenience.

Best Regards,

Hi Trent,

Trent:
3.2 Paragraph blocks are justified whereas Word does not appear to do this

While using the latest version of Aspose.Words i.e. 11.9.0, I managed to reproduce this issue on my side. I have logged this issue in our bug tracking system. The issue ID is WORDSNET-7232. Your request has been linked to this issue and you will be notified as soon as it is resolved. Sorry for the inconvenience.

Best Regards,

Hi Awais,

awais.hafeez:
Could you please also provide the code snippet to be able to reproduce the same issue on my side?

See code snippet below:

Document doc = new Document("Test.mht");

foreach (Section section in doc.Sections)
{
    section.PageSetup.LeftMargin = section.PageSetup.LeftMargin / 2.54;
    section.PageSetup.RightMargin = section.PageSetup.RightMargin / 2.54;
    section.PageSetup.TopMargin = section.PageSetup.TopMargin / 2.54;
    section.PageSetup.BottomMargin = section.PageSetup.BottomMargin / 2.54;
}

doc.Save("Test.docx");
doc.Save("Test.pdf");

All I am doing is setting the new margins, and I am expecting the long table to automatically adjust to the full page width without any code (as the table is already set to AutoFit). As you can see, the docx file is OK but the PDF is not: the table width is still the width of the page prior to the margin update.
Thanks
Trent

Hi Trent,

Trent:
3.3 The bullet points at the bottom are not indented whereas they are with Word

I managed to reproduce this issue as well. I have logged this issue in our bug tracking system. The issue ID is WORDSNET-7233. Your request has been linked to this issue and you will be notified as soon as it is resolved. Sorry for the inconvenience.

Best Regards,

awais.hafeez:
Hi Trent,

Trent:
1. When I adjust the margins and save to TIF or PDF, the larger table does not re-fit itself, it retains the size prior to the margin adjustment. However, when I save as docx the table does re-adjust.

I was unable to reproduce this issue with Aspose.Words version 11.9.0 on my side. For example, when I set the width of the ‘Long Table’, it was correctly reflected in PDF. Moreover, I used the following code for testing:

Table longTable = doc.FirstSection.Body.Tables[doc.FirstSection.Body.Tables.Count - 1];
longTable.PreferredWidth = PreferredWidth.FromPercent(50);

Could you please also provide the code snippet to be able to reproduce the same issue on my side?
Best Regards,

Hi Awais,

Not sure if you saw the code snippet I sent, as it got caught up with a subsequent reply from yourself?

Regards
Trent

Hi Trent,

Thanks for your inquiry. In this case, you should call the Document.UpdateTableLayout method before rendering to PDF. Note that this method is only needed in the rare cases where you have confirmed that tables appear incorrectly in the output document. I hope calling this method corrects the output. Here is how to use it:

Document doc = new Document(@"C:\Temp\test.mht");
foreach(Section section in doc.Sections)
{
    section.PageSetup.LeftMargin = section.PageSetup.LeftMargin / 2.54;
    section.PageSetup.RightMargin = section.PageSetup.RightMargin / 2.54;
    section.PageSetup.TopMargin = section.PageSetup.TopMargin / 2.54;
    section.PageSetup.BottomMargin = section.PageSetup.BottomMargin / 2.54;
}
doc.UpdateTableLayout();
doc.Save(@"C:\Temp\out.pdf");
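
A side note on the margin arithmetic: dividing the current margin by 2.54 yields 1 cm only when the margin starts at exactly 1 inch (72 points). A section that starts at 0.5 inch would end up at roughly 0.5 cm instead. The sketch below shows the absolute conversion; the class and method names (Units, CentimetersToPoints) are hypothetical and only illustrate the calculation.

```csharp
using System;

// Hypothetical helper (not part of Aspose.Words) for unit conversion.
public static class Units
{
    private const double PointsPerInch = 72.0;
    private const double CentimetersPerInch = 2.54;

    // Converts centimeters to points: 1 cm = 72 / 2.54, roughly 28.35 pt.
    public static double CentimetersToPoints(double centimeters)
    {
        return centimeters / CentimetersPerInch * PointsPerInch;
    }
}
```

Assigning Units.CentimetersToPoints(1) to each margin property would give 1 cm regardless of the starting value. Aspose.Words also ships its own converter, ConvertUtil.MillimeterToPoint, so ConvertUtil.MillimeterToPoint(10) should give the same result without a custom helper.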

Best Regards,

The issues you have found earlier (filed as WORDSNET-7230; WORDSNET-7232) have been fixed in this .NET update and this Java update.

This message was posted using Notification2Forum from Downloads module by aspose.notifier.

Hi, thanks for the update. Are you able to provide a status on the other two issues and whether they are likely to be resolved soon?

WORDSNET-7231
WORDSNET-7233

Thanks

Hi Trent,

Thanks for your inquiry. Please find below the progress of your issues:

WORDSNET-7231: Our development team has completed the analysis of this issue and the root cause has been identified. However, the implementation of the fix of this issue has been postponed till a later date. We will inform you as soon as this is resolved.

WORDSNET-7233: The same applies to this issue: our development team has completed the analysis and identified the root cause. However, the implementation of the fix has been postponed till a later date. We will inform you as soon as this is resolved.

We apologize for any inconvenience.

Best regards,

Hi,

Do you have an update on a date for a fix to:

WORDSNET-7231
WORDSNET-7233

Thanks
Trent

Hi Trent,

Thanks for your inquiry. Unfortunately, these issues are not resolved yet. I have checked their status in our issue tracking system and regret to say that they have been postponed till a later date. However, fixes for these problems may be added to the product roadmap in the future. We apologize for any inconvenience.

Best regards,

Hi Awais,

I must say this is very disappointing that there are no plans to fix this any time soon, nor are you able to provide a timeline.

Are you able to advise the root cause of the issue? Is it related to the creation of the MHT file (Aspose.Email) or to the interpretation of the MHT file (Aspose.Words)? If it relates to the creation and I know what the issue is, perhaps I can implement a workaround whilst waiting for your development team to address it.

Regards
Trent

Hi Trent,

Thanks for your inquiry.

Regarding WORDSNET-7231, this problem occurs when converting MHT to PDF format using Aspose.Words. The problematic paragraphs (for example, those starting with the text ‘Small Table’ and ‘Long Table’) have ParagraphFormat.SpaceAfterAuto set to True, whereas their ParagraphFormat.SpaceAfter property should be set to 0 pt in the output document generated by Aspose.Words. This leads to extra vertical space between paragraphs.
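
Until a fix is released, one possible workaround is to clear the automatic spacing in code before saving. The following is only a sketch: it uses the ParagraphFormat.SpaceAfterAuto and ParagraphFormat.SpaceAfter properties named above, but the blanket rule (every automatically spaced paragraph becomes 0 pt) is an assumption that suits this sample and may be wrong for documents where automatic spacing is intended. The file paths are placeholders.

```csharp
using Aspose.Words;

Document doc = new Document(@"C:\Temp\test.mht");

// Assumption: every paragraph imported with automatic "space after" should
// have 0 pt instead, matching what Word shows for this sample.
foreach (Paragraph paragraph in doc.GetChildNodes(NodeType.Paragraph, true))
{
    if (paragraph.ParagraphFormat.SpaceAfterAuto)
    {
        paragraph.ParagraphFormat.SpaceAfterAuto = false;
        paragraph.ParagraphFormat.SpaceAfter = 0;
    }
}

doc.Save(@"C:\Temp\out.pdf");
```

Restricting the loop to a known set of paragraphs, rather than the whole document, would reduce the risk of changing spacing elsewhere.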

Regarding WORDSNET-7233, this problem also occurs when converting MHT to PDF format using Aspose.Words. It occurs because Aspose.Words currently doesn’t apply the required left indentation to paragraphs when importing “blockquote” tags from MHT.

Moreover, I have asked our development team for an ETA on these issues; I will update you as soon as I have any further information. We apologize for the inconvenience.

Best regards,

Hi Trent,

Thanks for being patient. Our development team has responded: they plan to integrate the fix for WORDSNET-7231 in the September 2013 release of Aspose.Words and the fix for WORDSNET-7233 in the October 2013 release. If everything goes to plan, we hope to include the fixes in Aspose.Words 13.9.0 and 13.10.0 respectively.

Best regards,

Hi Awais,

Thanks for the update, I look forward to the September and October releases.

Regards
Trent

The issues you have found earlier (filed as WORDSNET-7231) have been fixed in this .NET update and this Java update.
