Duplicate Content is shown in the converted images

Hi Aspose Team,

My project requires converting an HTML file to image files.
I am using the following piece of code to accomplish this.

using Aspose.Words;
using Aspose.Words.Tables;

Aspose.Words.License license = new Aspose.Words.License();
// Load the HTML file into Aspose.Words
Aspose.Words.LoadOptions lo = new Aspose.Words.LoadOptions();
lo.LoadFormat = Aspose.Words.LoadFormat.Html;
Aspose.Words.Document doc = new Aspose.Words.Document(@"D:\Aspose\RDP\NEWFiles\ImagePDF.html", lo);
// Make the text display in individual rows
NodeCollection tables = doc.GetChildNodes(NodeType.Table, true);
foreach (Aspose.Words.Tables.Table tbl in tables)
{
    // tbl.AutoFit(AutoFitBehavior.FixedColumnWidths);
    tbl.Alignment = TableAlignment.Left;
}
// Generate one image per page
for (int pageCounter = 0, stop = doc.PageCount; pageCounter < stop; pageCounter++)
{
    Aspose.Words.Saving.ImageSaveOptions options2 = new Aspose.Words.Saving.ImageSaveOptions(SaveFormat.Png);
    options2.PageIndex = pageCounter;
    options2.PrettyFormat = true;
    // Images are named <prefix><zero-padded page number>.png, e.g. MyImage02.png
    doc.Save(string.Format("{0}{1}{2}{3:d2}.png", "D:\\Aspose\\RDP\\NEWFiles\\Images\\", "", "MyImage", pageCounter + 1), options2);
}

I am using the latest version of Aspose.Words.dll.
I am able to generate the images successfully.
However, some content appears repeatedly in the converted image files.
Please find the attached documents for reference.
I request you to look into this and help me resolve the issue.

Hi Siddi,

Thanks for your inquiry. As I understand it, you consider the two similar comments on the first page to be the duplicate content. If so, please note that Aspose.Words renders the title attribute as a comment, and your HTML contains two anchors with the same title. Try changing the following markup:

<a class="HelpQuestion" href="" target="garbage" onmouseout="window.status = '';return true" title="View help for this window" onmouseover="window.status='View help for this window'; return true;" onclick="return openHelpWindow('/jetstream/500/presentation/1033/Asp/Help/Cands_help.asp#cand_view_cand_forms')">?</a>
<a class="HelpText" href="" target="garbage" onmouseout="window.status = '';return true" title="View help for this window" onmouseover="window.status='View help for this window'; return true;" onclick="return openHelpWindow('/jetstream/500/presentation/1033/Asp/Help/Cands_help.asp#cand_view_cand_forms')">Help</a>

I hope this helps.

Best Regards,

Hi Siddi,

Thanks for your inquiry. Additionally, before saving your HTML to images, you can also try removing comments by using the following code:

NodeCollection comments = doc.GetChildNodes(NodeType.Comment, true);
comments.Clear(); // removes the collected comment nodes from the document
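
To see what was imported as comments before removing anything, a minimal sketch along these lines may help. It assumes the same input path as in the original post and that the imported title attributes are reachable as NodeType.Comment nodes; the console output is only for inspection and is not part of the conversion.

```csharp
using System;
using Aspose.Words;

// Assumption: same input file as in the original post.
Document doc = new Document(@"D:\Aspose\RDP\NEWFiles\ImagePDF.html");

// Walk every comment node in the document and print its text,
// so repeated titles can be spotted before they are removed.
foreach (Comment comment in doc.GetChildNodes(NodeType.Comment, true))
{
    Console.WriteLine(comment.GetText().Trim());
}
```

If the same text is printed more than once, the repeated title attributes in the HTML are the likely source.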

Best Regards,

Hi Hafeez,
Thanks for looking into this issue.
The issue I am describing is not about the comments being shown twice.
If you compare the HTML content with the generated images, you will see that portions of the HTML are duplicated many times in the images.

Please find the input html file: "ImagePDF.html"

And I have attached all the images in the “Images” folder.
For example, in “MyImage02.png” the “Final” block is shown twice. Much of the HTML content is repeated in the same way.
Please let me know if you need any more information.

Hi Siddi,

Thanks for your inquiry. This problem occurs because the structure of your HTML file is not well formed. For example, please see the following markup:

<table cellpadding="0" cellspacing="0" width="100%">
        <td width="85%">
        <td width="100%">
            American Indian or Alaskan

I have manually corrected your HTML file and attached it here for your reference.

Please let me know if I can be of any further assistance.

Best Regards,

Thanks Hafeez,
It is working fine now. Thanks for resolving this issue.

The issues you have found earlier (filed as WORDSNET-5146) have been fixed in this .NET update and this Java update.
