Problem with Aspose Words with extra spacing after bulleted list

AsposeScreenshot.png (18.6 KB)
Hello,

We are having an issue where extra space is inserted after a bulleted list in Aspose.Words. We have tried several HTML combinations but cannot seem to remove the extra space. Can you help?

Paul

@pdeangelo Be aware that some visual changes may occur when converting from MS Word to HTML (flow document). Please use HtmlFixedSaveOptions if you want both documents to look the same.
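
For reference, a minimal sketch of saving with HtmlFixedSaveOptions, which applies only when a Word document is being converted to HTML. The file names are hypothetical placeholders, not taken from this thread.

```csharp
using Aspose.Words;
using Aspose.Words.Saving;

// Hypothetical paths, for illustration only.
Document doc = new Document("input.docx");

// Fixed-page HTML keeps the Word page layout instead of reflowing the content.
HtmlFixedSaveOptions saveOptions = new HtmlFixedSaveOptions();
doc.Save("output.html", saveOptions);
```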

Additionally, could you please attach the source document (DOC, DOCX, etc.) that you are converting to HTML? If you are inserting the bulleted list programmatically, please post that piece of code.

Hi - we have a Word document that we are inserting HTML into, and we are getting extra space after the inserted HTML. Below is a snippet of the code that inserts the HTML into the Word document, and I have attached the Word document template.

Thank you for your help!

Document mainDoc = (Aspose.Words.Document)parentNode.Document;
// Add a handler to MergeField event
mainDoc.MailMerge.FieldMergingCallback = new InsertDocumentAtMailMergeHandler();
DocumentBuilder builder = new DocumentBuilder(mainDoc);
builder.MoveTo(childNode);

// This CSS is used to fix the line spacing issue which could occur with the list insertion into the old versioned Word templates
//string defaultCSS = "<style type=\"text/css\"> ol,ul,li {margin: 0px 0px 0px 0px;}</style>";
string defaultCSS = "<style type=\"text/css\"> ol,ul {margin: 0px 0px -100px 0px;} </style>";
html = string.Format("{0}<span style='font-family: \"{1}\"; font-size: {2}pt;'>{3}</span>", defaultCSS, builder.Font.Name, builder.Font.Size, html);

builder.InsertHtml(html, HtmlInsertOptions.UseBuilderFormatting);

ABC Health Plan - Provider Approval Fax_1013 2022 Update.docx (21.7 KB)

@pdeangelo in the code that you posted you are using MailMerge operations, but the document does not contain any MailMerge fields. Can you please check that this is the correct document?
Additionally, can you please add the fragment of code where you insert the list?

This is where I am inserting the list
builder.InsertHtml(html, HtmlInsertOptions.UseBuilderFormatting);

@pdeangelo can you please add the string representation of the HTML code of the list that you are inserting?
Also, please address the following 3 points:

  1. In your code you are using MailMerge operations, but the document that you posted does not contain any MailMerge fields. Can you please verify that you posted the right document?
  2. Could you please add the code where you assign a value to the variable childNode?
  3. Additionally, it would be helpful if you attached your current output file.

Hi,

Here is the html:

<style type="text/css"> 
    ol,ul {margin: 0px 0px -100px 0px;} 
</style>
<span style='font-family: "Times New Roman"; font-size: 12pt;'>
    Test Drug 8/31/2022 a quantity of 90 for 30 days has been approved for 6 months from 4/18/2023 to 10/15/2023.
    <br>    <br>
    New paragraph test.
    <br>    <br> 
    <ol>
         <li>Line 1</li>
         <li>Line 2</li> 
    </ol>
</span> 

We have two types of Word documents: one uses mail merge and one uses find and replace. This example is find and replace.

In this code we are finding the ChildNode that contains $Rationale, then we insert the HTML into that child node and remove the $Rationale text.

This is the HTML we are inserting. You can see that there is no extra line break after the ordered list:

<style type="text/css"> 
    ol,ul {margin: 0px 0px -100px 0px;} 
</style>
<span style='font-family: "Times New Roman"; font-size: 12pt;'>
    Test Drug 8/31/2022 a quantity of 90 for 30 days has been approved for 6 months from 4/18/2023 to 10/15/2023.
    <br>    <br> 
    New paragraph test.
    <br>    <br> 
    <ol>
         <li>Line 1</li>
         <li>Line 2</li>
    </ol>
</span>

This is the original document template with the location we are inserting the html in yellow. You can see there is only one line between $Rationale and Date + 60, but after the html is inserted, there are two lines between.

This is a screenshot of the resulting image with the extra space in yellow:

@pdeangelo thank you for the additional information, everything is clear now. To remove that extra space you can use the HtmlInsertOptions.RemoveLastEmptyParagraph option in the InsertHtml method. Please check the following example:

using Aspose.Words;
using Aspose.Words.Replacing;

var html = @"<style type=""text/css""> 
                ol,ul {margin: 0px 0px -100px 0px;} 
            </style>
            <span style='font-family: ""Times New Roman""; font-size: 12pt;'>
                Test Drug 8/31/2022 a quantity of 90 for 30 days has been approved for 6 months from 4/18/2023 to 10/15/2023.
                <br>    <br> 
                New paragraph test.
                <br>    <br> 
                <ol>
                        <li>Line 1</li>
                        <li>Line 2</li>
                </ol>
            </span>";
Document doc = new Document("C:\\Temp\\input.docx");

FindReplaceOptions opt = new FindReplaceOptions()
{
    ReplacingCallback = new ReplaceWithHtmlHandler()
};

doc.Range.Replace("$Rationale", html, opt);

doc.Save("C:\\Temp\\output.docx");

public class ReplaceWithHtmlHandler : IReplacingCallback
{
    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs args)
    {
        DocumentBuilder builder = new DocumentBuilder((Document)args.MatchNode.Document);
        Paragraph parent = ((Run)args.MatchNode).ParentParagraph;
        parent.RemoveAllChildren();

        Run tempRunNode = new Run(parent.Document);
        parent.AppendChild(tempRunNode);
        builder.MoveTo(tempRunNode);
        builder.InsertHtml(args.Replacement, HtmlInsertOptions.RemoveLastEmptyParagraph);
        tempRunNode.Remove();

        return ReplaceAction.Skip;
    }
}

output.docx (22.8 KB)

I tried builder.InsertHtml(html, HtmlInsertOptions.RemoveLastEmptyParagraph); last week and that didn’t work, why is this version different?

@pdeangelo the code that I posted generates the expected output (you can check it in the file that I attached). If you post the version of your code where you add that option (please don’t use images for it, as that makes the code really hard to read and use), I’ll review it and point out the differences.

I tried that and it works for the list, but we had also added special code to remove the <p> tags because they were adding extra line breaks. I tried it with the <p> tags in place and I get spaces before and after:
image.png (14.5 KB)

  var html = @"<style type=""text/css""> 
                ol,ul {margin: 0px 0px -100px 0px;} 
            </style>
            <span style='font-family: ""Times New Roman""; font-size: 12pt;'>
                <p>Test Drug 8/31/2022 a quantity of 90 for 30 days has been approved for 6 months from 4/18/2023 to 10/15/2023.
                </p>
                <p>New paragraph test.</p>
                <br>    <br> 
                <ol>
                        <li>Line 1</li>
                        <li>Line 2</li>
                </ol>
            </span>";

using Aspose.Words;
using Aspose.Words.Replacing;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            var html = @"<style type=""text/css""> 
                ol,ul {margin: 0px 0px -100px 0px;} 
            </style>
            <span style='font-family: ""Times New Roman""; font-size: 12pt;'>
                <p>Test Drug 8/31/2022 a quantity of 90 for 30 days has been approved for 6 months from 4/18/2023 to 10/15/2023.
                </p>
                <p>New paragraph test.</p>
                <br>    <br> 
                <ol>
                        <li>Line 1</li>
                        <li>Line 2</li>
                </ol>
            </span>";
            Document doc = new Document("C:\\Projects\\Paul\\ABC Health Plan - Provider Approval Fax.docx");

            FindReplaceOptions opt = new FindReplaceOptions()
            {
                ReplacingCallback = new ReplaceWithHtmlHandler()
            };

            doc.Range.Replace("$Rationale", html, opt);

            doc.Save("C:\\Temp\\output.docx");
        }
    }

    public class ReplaceWithHtmlHandler : IReplacingCallback
    {
        ReplaceAction IReplacingCallback.Replacing(ReplacingArgs args)
        {
            DocumentBuilder builder = new DocumentBuilder((Document)args.MatchNode.Document);
            Paragraph parent = ((Run)args.MatchNode).ParentParagraph;
            parent.RemoveAllChildren();

            Run tempRunNode = new Run(parent.Document);
            parent.AppendChild(tempRunNode);
            builder.MoveTo(tempRunNode);
            builder.InsertHtml(args.Replacement, HtmlInsertOptions.RemoveLastEmptyParagraph);
            tempRunNode.Remove();

            return ReplaceAction.Skip;
        }
    }
}

@pdeangelo The HTML here is different, because it now contains explicit paragraphs. If you want to remove the extra space after the paragraphs, you can also use the option HtmlInsertOptions.UseBuilderFormatting, which preserves the paragraph styles that you have been using in the document.

builder.InsertHtml(args.Replacement, HtmlInsertOptions.RemoveLastEmptyParagraph | HtmlInsertOptions.UseBuilderFormatting);

Please note that you have <br> <br> tags, which add extra space between the paragraph and the list.
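
If that gap before the list is unwanted, one option is to strip the <br> tags from the HTML before calling InsertHtml. The sketch below is only an illustration: HtmlCleanup and RemoveBreaksBeforeLists are hypothetical names, not part of Aspose.Words, and a regular expression is a rough tool that assumes simple, well-formed markup like the sample in this thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlCleanup
{
    // Hypothetical helper: drops any run of <br> tags (and the whitespace
    // that follows them) sitting directly before an <ol> or <ul> element.
    public static string RemoveBreaksBeforeLists(string html)
    {
        return Regex.Replace(
            html,
            @"(?:<br\s*/?>\s*)+(?=<(?:ol|ul)\b)",
            string.Empty,
            RegexOptions.IgnoreCase);
    }
}
```

In the callback it could be applied as builder.InsertHtml(HtmlCleanup.RemoveBreaksBeforeLists(args.Replacement), ...), leaving <br> tags elsewhere in the HTML untouched.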

Yes, I changed the HTML because we have code that strips the <p> tags because of this issue, but we need them back in. It didn’t work as expected with your change; see below:

@pdeangelo that is the expected output for the HTML that you are using. Can you post a Word document (DOC, DOCX) with your expected output and the real version of the HTML that you want to use?

Additionally, it would be appreciated if you formatted the HTML in your posts as code (using the editor toolbar or the markup syntax). Otherwise the forum interprets it as real HTML and produces undesired output in your posts.

Hi @eduardo.canal it has been a while but I have another question. After I insert the html using RemoveLastEmptyParagraph, the last paragraph has paragraph spacing of 12pt after, is there a way to remove that?

@pdeangelo you can use SpaceAfterAuto and SpaceAfter properties of the ParagraphFormat to remove the space after a paragraph.

Thanks. Is there an easy way to find the last paragraph that was inserted by InsertHtml?

I looped through the paragraphs before the html insert and the spacing after was 8, but after the insert it was changed to 12. Do you know why that is?
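
A check like the one described above can be sketched as follows. SpacingDump and Print are hypothetical names used only for illustration. The sketch prints the SpaceAfter and SpaceAfterAuto values of every paragraph so the two states can be compared; it does not explain why the value changes.

```csharp
using System;
using Aspose.Words;

public static class SpacingDump
{
    // Hypothetical helper: lists the space-after settings of every paragraph
    // in the document, including paragraphs nested in tables.
    public static void Print(Document doc, string label)
    {
        Console.WriteLine(label);
        int index = 0;
        foreach (Paragraph paragraph in doc.GetChildNodes(NodeType.Paragraph, true))
        {
            ParagraphFormat format = paragraph.ParagraphFormat;
            Console.WriteLine(
                $"{index++}: SpaceAfter={format.SpaceAfter}, SpaceAfterAuto={format.SpaceAfterAuto}");
        }
    }
}
```

Calling SpacingDump.Print(doc, "before") ahead of doc.Range.Replace and SpacingDump.Print(doc, "after") once it returns shows which paragraphs picked up different spacing.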

@pdeangelo when you insert an element using the DocumentBuilder class, it automatically moves the cursor to the last element inserted, so you can use the following code to get the last paragraph:

builder.InsertHtml(args.Replacement, HtmlInsertOptions.RemoveLastEmptyParagraph | HtmlInsertOptions.UseBuilderFormatting);

if(builder.CurrentParagraph != null)
{
    builder.CurrentParagraph.ParagraphFormat.SpaceAfter = 0;
    builder.CurrentParagraph.ParagraphFormat.SpaceAfterAuto = false;
}