Problem in paragraph formatting while using HTML Tags

Hi Alexey,

Thanks for your continuous support & service.

I am having a problem while formatting a paragraph.

The text in paragraph is formatted using HTML tags.

I am using Aspose.Words version 7.0.0.0.

Please refer to the attached Word document, which shows the actual problem.

If you look at the document carefully, you will see that the paragraph's left, top & right spacing is applied only to the initial text. The paragraph formatting is not applied to the bulleted text or to the paragraph after the bulleted text.

The sample code is as follows:

private void button19_Click(object sender, EventArgs e)
{
    string path = Path.GetDirectoryName(System.Windows.Forms.Application.ExecutablePath);

    // Initialize the Aspose licenses.
    Aspose.Words.License lic = new Aspose.Words.License();
    lic.SetLicense(path + @"\Aspose.Total.lic");
    Aspose.Pdf.License lic1 = new Aspose.Pdf.License();
    lic1.SetLicense(path + @"\Aspose.Total.lic");

    Document doc = new Document();
    DocumentBuilder documentBuilder = new DocumentBuilder(doc);

    // Paragraph formatting that is expected to apply to the inserted HTML.
    documentBuilder.ParagraphFormat.LeftIndent = 45;
    documentBuilder.ParagraphFormat.RightIndent = 40;
    documentBuilder.ParagraphFormat.SpaceBefore = 45;
    documentBuilder.ParagraphFormat.SpaceAfter = 0;

    // HTML to insert (list markup assumed to be <ul>/<li>).
    string htmlText =
        "Now the company has released an ‘open source programming language’ for the developers and named it ‘Go’. It is an excellent news for the developers yet sources stated that the programming language is still in an experimental stage. Experts are conducting studies on it and one can expect the language to become better once it reaches the final stage." +
        "<ul><li>one</li><li>Two</li><li>Three</li></ul>" +
        "In a blog post, Google officials have given an overview of the new programming language that is expected to benefit the developers. It has been stated that ‘Go’ is launched with the aim of combining C++, a compiled language, with another computer language called Python.";

    documentBuilder.InsertHtml(htmlText);

    doc.Save("sample.doc");
    StartWord("sample.doc");
}

1> Is there any workaround to solve this problem?

Waiting for your reply,

Dwarika.

Hi

Thanks for your request. When you insert a paragraph using the API (the Writeln method or InsertBreak), the inserted paragraph inherits the formatting of the previous paragraph or the formatting you specified in the DocumentBuilder properties. On the other hand, when you use InsertHtml, paragraph formatting is taken from the HTML, so the inserted paragraphs can end up with different formatting.

As a workaround, you can perform some post-processing to apply the same formatting to all paragraphs in your document.
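For illustration, here is a minimal sketch of such post-processing. It assumes the two-argument GetChildNodes overload is available in your version of Aspose.Words; the class and method names (ParagraphPostProcessor, ApplyToAllParagraphs) are hypothetical and not part of the library.

```csharp
using Aspose.Words;

public static class ParagraphPostProcessor
{
    // Hypothetical helper: writes the given indent and spacing values
    // onto every paragraph in the document, including list items.
    public static void ApplyToAllParagraphs(
        Document doc,
        double leftIndent,
        double rightIndent,
        double spaceBefore,
        double spaceAfter)
    {
        // "true" makes the search deep, so paragraphs nested in tables are included.
        foreach (Paragraph paragraph in doc.GetChildNodes(NodeType.Paragraph, true))
        {
            ParagraphFormat format = paragraph.ParagraphFormat;
            format.LeftIndent = leftIndent;
            format.RightIndent = rightIndent;
            format.SpaceBefore = spaceBefore;
            format.SpaceAfter = spaceAfter;
        }
    }
}
```

You would call it after InsertHtml and before saving, for example ParagraphPostProcessor.ApplyToAllParagraphs(doc, 45, 40, 45, 0). Note that this touches every paragraph in the document, and setting LeftIndent on list items may override the indentation that the list itself defines.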

Best regards.

Hi Alexey,

Thank you for your suggestion.

Could you please send us some sample code with which we can achieve the functionality described below (the workaround solution)?

Requirements:

1> The paragraph formatting of the initial text should be applied to the HTML-formatted bulleted text & the text after the bulleted text.

2> It should not be applied to all paragraphs within the document; it should affect the current paragraph only (since the document may consist of a number of paragraphs with different formatting).

Waiting for your reply…!!!

Thanks & Regards,

Dwarika.

Hi Dwarika,

Thanks for your request. I think in your case you can try using the Document.NodeInserted event handler. Please see the following link for more information:

http://www.aspose.com/documentation/.net-components/aspose.words-for-.net-and-java/aspose.words.documentbase.nodeinserted.html

Here is sample code:

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);

// Formatting that the inserted paragraphs should inherit.
builder.ParagraphFormat.LeftIndent = 45;
builder.ParagraphFormat.RightIndent = 40;
builder.ParagraphFormat.SpaceBefore = 45;
builder.ParagraphFormat.SpaceAfter = 0;

// Create InsertParagraphHelper.
InsertParagraphHelper helper = new InsertParagraphHelper(builder);

// HTML to insert (list markup assumed to be <ul>/<li>).
string htmlText =
    "Now the company has released an ‘open source programming language’ for the developers and named it ‘Go’. It is an excellent news for the developers yet sources stated that the programming language is still in an experimental stage. Experts are conducting studies on it and one can expect the language to become better once it reaches the final stage." +
    "<ul><li>one</li><li>Two</li><li>Three</li></ul>" +
    "In a blog post, Google officials have given an overview of the new programming language that is expected to benefit the developers. It has been stated that ‘Go’ is launched with the aim of combining C++, a compiled language, with another computer language called Python.";

helper.InsertHtml(htmlText);

// Save output document.
doc.Save(@"Test001\out.doc");

===================================================================

// Requires: using System.Collections; using Aspose.Words;
private class InsertParagraphHelper
{
    public InsertParagraphHelper(DocumentBuilder builder)
    {
        mBuilder = builder;
        mFormat = builder.ParagraphFormat;
        // Track every node inserted into the builder's document.
        builder.Document.NodeInserted += doc_NodeInserted;
    }

    public void InsertHtml(string html)
    {
        mBuilder.InsertHtml(html);
        ResetFormatting();
    }

    private void doc_NodeInserted(object sender, NodeChangedEventArgs e)
    {
        // Check if the inserted node is a paragraph.
        if (e.Node.NodeType == NodeType.Paragraph)
            mParagraphs.Add(e.Node);
    }

    private void ResetFormatting()
    {
        foreach (Paragraph paragraph in mParagraphs)
        {
            paragraph.ParagraphFormat.LeftIndent = mFormat.LeftIndent;
            paragraph.ParagraphFormat.RightIndent = mFormat.RightIndent;
            paragraph.ParagraphFormat.SpaceBefore = mFormat.SpaceBefore;
            paragraph.ParagraphFormat.SpaceAfter = mFormat.SpaceAfter;
            paragraph.ParagraphFormat.SpaceAfterAuto = mFormat.SpaceAfterAuto;
            paragraph.ParagraphFormat.SpaceBeforeAuto = mFormat.SpaceBeforeAuto;
        }

        // Forget the processed paragraphs so a later call does not touch them again.
        mParagraphs.Clear();
    }

    private DocumentBuilder mBuilder;

    /// <summary>
    /// Paragraph format which should be inherited by inserted paragraphs.
    /// </summary>
    private ParagraphFormat mFormat;

    /// <summary>
    /// Collection which contains the paragraphs that should be processed.
    /// </summary>
    private ArrayList mParagraphs = new ArrayList();
}

Hope this helps.

Best regards.

Hi Alexey,

Thanks for your suggestion.

I tried your code on my system, but I am sorry to tell you that the workaround you suggested does not give the required output.

If the text contains <p> & </p> tags, then the entire text is rendered according to the parent paragraph formatting without recognising the <p> & </p> tags.

If the text contains <p> & </p>, then it must be treated as a new paragraph & the parent paragraph formatting must not be applied to it.

1> Is there any other way by which the HTML bulleted text will follow the paragraph formatting when it is not surrounded by <p> & </p>?

2> If the text contains <p> & </p> tags, then this should be treated as a new paragraph which does not contain any formatting of the parent paragraph.

3> Is this problem due to an HTML limitation or is it an Aspose limitation?

Thanks & Regards,

Dwarika.

Hi Dwarika,

Thank you for additional information.

  1. No, there is no other way. As I mentioned earlier, when you insert HTML into your document, all formatting is taken from the HTML, not from the paragraph where the DocumentBuilder cursor is located. This is correct behavior.

  2. The same answer applies to this question. <p> tags should be treated as paragraphs, and formatting is taken from the <p> tag. If no formatting is specified, the default formatting is applied.
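Since the formatting comes from the HTML itself, one option is to put the indents and spacing into the markup. The sketch below assumes that InsertHtml maps the CSS margin properties on a <p> tag to paragraph indents and spacing in your version of Aspose.Words; please verify this against your output. The class name, sample text and output file name are hypothetical.

```csharp
using Aspose.Words;

public static class StyledHtmlExample
{
    public static void Main()
    {
        Document doc = new Document();
        DocumentBuilder builder = new DocumentBuilder(doc);

        // Assumption: these CSS margins are translated into
        // LeftIndent, RightIndent, SpaceBefore and SpaceAfter.
        const string style =
            "margin-left:45pt; margin-right:40pt; margin-top:45pt; margin-bottom:0pt";

        // Each <p> carries its own formatting; the list is left unstyled here.
        string html =
            "<p style=\"" + style + "\">Text before the list.</p>" +
            "<ul><li>one</li><li>Two</li><li>Three</li></ul>" +
            "<p style=\"" + style + "\">Text after the list.</p>";

        builder.InsertHtml(html);

        // Hypothetical output file name.
        doc.Save("styled.doc");
    }
}
```

With this approach, paragraphs without a style attribute keep the default formatting, so only the paragraphs you mark up are affected.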

Best regards.