Free Support Forum - aspose.com

Missing Text in Word File

Hi there,
Here is my input HTML file. When I convert it to Word, some of the text is missing because one table row contains too much text. Since the table row cannot split across more than one page, only the text that fits on a single page is displayed and the rest of the row's text is missing. Please tell me if you have a possible solution.

Hi Ahmed,

Thanks for your inquiry. I have tested the scenario and managed to reproduce the same issue on my side. I have logged this problem in our issue tracking system as WORDSNET-12244 and linked this forum thread to it, so you will be notified here once the issue is resolved. We apologize for the inconvenience.

Please use the RowFormat.AllowBreakAcrossPages property as shown below to get the correct output.

// Load the HTML input.
Document doc = new Document(MyDir + "in.html");
// Allow every row of every table (including nested tables) to break across pages.
foreach (Table table in doc.GetChildNodes(NodeType.Table, true))
{
    foreach (Row row in table.Rows)
    {
        row.RowFormat.AllowBreakAcrossPages = true;
    }
}
doc.Save(MyDir + "Out.docx");

Hi Tahir,

Thank you so much for your help. These lines of code solved the missing text problem, but they introduced another issue: many elements now split across more than one page. No doubt elements with too much data to fit on a single page need to split across pages. But an element with short text should not be divided; it should start on a new page instead.
Please give me any idea how to do this.

Hi Ahmed,

Thanks for your inquiry. Your input document has the RowFormat.AllowBreakAcrossPages property set to false. This is why you are getting the missing data issue. The value of this property is true if the text in a table row is allowed to split across a page break.

Could you please share some more details about this issue, along with screenshots of the problematic section of the output document? We will then provide you with more information.
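
As a rough sketch of how the property described above could be set per row rather than for the whole document, the code below allows only rows with a lot of text to break across pages. The class name RowBreakSketch, the method Convert, and the threshold MaxCharsForUnbrokenRow are hypothetical names chosen for illustration, and a character count is only a crude stand-in for the real rendered height of a row, so the threshold would need tuning against the actual document.

```csharp
using Aspose.Words;
using Aspose.Words.Tables;

public static class RowBreakSketch
{
    // Hypothetical threshold (an assumption, not an Aspose setting):
    // rows containing more characters than this are allowed to split.
    private const int MaxCharsForUnbrokenRow = 2000;

    public static void Convert(string inputHtmlPath, string outputDocxPath)
    {
        Document doc = new Document(inputHtmlPath);

        foreach (Table table in doc.GetChildNodes(NodeType.Table, true))
        {
            foreach (Row row in table.Rows)
            {
                // Rows with long text may break across pages;
                // rows with short text are kept on a single page.
                bool isLongRow = row.GetText().Length > MaxCharsForUnbrokenRow;
                row.RowFormat.AllowBreakAcrossPages = isLongRow;
            }
        }

        doc.Save(outputDocxPath);
    }
}
```

With AllowBreakAcrossPages left false, a short row that does not fit in the space remaining on a page should move to the next page as a whole, which appears to be the behavior requested for short elements.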

Hi Ahmed,

Thanks for your patience.

Our product team has completed its work on the issue (WORDSNET-12244) and has concluded that the undesired behavior you’re observing is actually not a bug in Aspose.Words. So, we have closed this issue as ‘Not a Bug’. I am quoting the developer’s comments and solution for the shared issue here for your reference:

MS Word 2013 opens the customer’s document as expected. To mimic MS Word’s behavior, the customer should optimize the document for MS Word 2013 after loading the HTML:

doc.CompatibilityOptions.OptimizeFor(MsWordVersion.Word2013);

and, before saving to DOCX, the customer should provide the following save options:

OoxmlSaveOptions so = new OoxmlSaveOptions();
so.Compliance = OoxmlCompliance.Iso29500_2008_Transitional;

Putting both steps together:

// Load the HTML input.
Document doc = new Document(MyDir + @"Part 2 - June 4, Testing Idle for 20 minutes.html");
// Mimic MS Word 2013 behavior.
doc.CompatibilityOptions.OptimizeFor(MsWordVersion.Word2013);
// Save as DOCX with ISO/IEC 29500:2008 Transitional compliance.
OoxmlSaveOptions so = new OoxmlSaveOptions();
so.Compliance = OoxmlCompliance.Iso29500_2008_Transitional;
doc.Save(MyDir + "Out.docx", so);