Convert embedded HTML like <p>, <strong>, <lt> in a string to proper formatting and styling in TextFragment

How can I convert HTML embedded in a string into plain text with proper formatting and styling?
In XSL, we can use <xsl:value-of select="Value" disable-output-escaping="yes"/>.
Is there similar logic or a property in Aspose.PDF that does the same?

Aspose.Pdf.Document doc = new Aspose.Pdf.Document();
// Specify the left margin info for the PDF file
doc.PageInfo.Margin.Left = 40;
// Specify the Right margin info for the PDF file
doc.PageInfo.Margin.Right = 40;
Aspose.Pdf.Page page = doc.Pages.Add();
string s = "Header";

        HtmlFragment heading_text = new HtmlFragment(s);
        page.Paragraphs.Add(heading_text);

        FloatingBox box = new FloatingBox();
        // Use two columns in the floating box
        box.ColumnInfo.ColumnCount = 2;
        // Set the spacing between the columns
        box.ColumnInfo.ColumnSpacing = "5";
        box.ColumnInfo.ColumnWidths = "250 250";

        TextFragment text1 = new TextFragment(@"embedded html like <p>,<strong><lt>");
        
        text1.TextState.FontSize = 8;
        text1.TextState.LineSpacing = 4;
        box.Paragraphs.Add(text1);
        text1.TextState.FontSize = 10;
        text1.TextState.FontStyle = FontStyles.Italic;
        // Create a Graph object to draw a line
        Aspose.Pdf.Drawing.Graph graph2 = new Aspose.Pdf.Drawing.Graph(200, 10);
        // Specify the coordinates for the line
        float[] posArr2 = new float[] { 1, 10, 100, 10 };
        Aspose.Pdf.Drawing.Line l2 = new Aspose.Pdf.Drawing.Line(posArr2);
        graph2.Shapes.Add(l2);

        // Add the graph (with its line) to the Paragraphs collection of the floating box
        box.Paragraphs.Add(graph2);
        HtmlFragment text2 = new HtmlFragment(html);
        box.Paragraphs.Add(text2);
        page.Paragraphs.Add(box);
        TextFragment text12 = new TextFragment(lt.value);
        page.Paragraphs.Add(text12);

        string xmlpath1 = xmlpath + "CreateMultiColumnPdf_19.1.pdf";
        var ms1 = new MemoryStream();
        doc.Save(ms1, SaveFormat.Pdf);
        

        foreach (Aspose.Pdf.Page p in doc.Pages)
        {
            p.Header = new HeaderFooter();
            p.Header.Margin = marginInfo(5f, 5f, 0f, 0f);
            p.Header.Paragraphs.Add("header");
            p.Footer = new HeaderFooter();
            p.Footer.Margin = marginInfo(5f, 5f, 0f, 0f);
            p.Footer.Paragraphs.Add("footer");
        }
        // Save PDF file
        doc.Save(xmlpath1);

Please advise. Thanks in advance.

@jitendrapatil2514

Thanks for contacting support.

As we understand it, you want to display the HTML tags as-is in the PDF document, so that they do not affect the text they surround. You can use a TextFragment to add such text to the PDF, as follows:

    Aspose.Pdf.Document doc = new Aspose.Pdf.Document();
    var page = doc.Pages.Add();
    page.Paragraphs.Add(new TextFragment(@"embedded html like<p>,<strong><lt>"));
    doc.Save(dataDir + "plainHTML.pdf"); 

plainHTML.pdf (1.9 KB)

We also tried to run your code snippet, but it produced errors because some values (html, lt) were missing. If our understanding differs from what you require, please share a sample of the expected output PDF along with a complete working code snippet. We will test the scenario in our environment and address it accordingly.
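The distinction in this reply is between text that is shown literally and text that is interpreted as markup. As a minimal sketch of the "show it literally" side: if a string ever has to pass through an HTML renderer and still display its tags as text, it can be escaped first with the standard .NET method System.Net.WebUtility.HtmlEncode. The class and method names EscapeDemo and ToLiteralHtml are hypothetical, and whether HtmlFragment displays the escaped string exactly as typed is an assumption that was not tested in this thread.

```csharp
using System;
using System.Net;

public static class EscapeDemo
{
    // Hypothetical helper (not part of Aspose.PDF): returns the HTML-escaped
    // form of a string, so that an HTML renderer would show the tags as
    // literal text instead of applying them.
    public static string ToLiteralHtml(string text)
    {
        return WebUtility.HtmlEncode(text);
    }

    public static void Main()
    {
        string raw = "embedded html like <p>,<strong><lt>";
        Console.WriteLine(ToLiteralHtml(raw));
        // prints: embedded html like &lt;p&gt;,&lt;strong&gt;&lt;lt&gt;
    }
}
```

With a TextFragment no escaping is needed, since the string is added to the page as-is.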

Hi @asad.ali

Thanks for the reply.

Please look at the updated code below. As you can see, text1 contains embedded HTML in the string, such as paragraph and strong tags:

Aspose.Pdf.Document doc = new Aspose.Pdf.Document();
doc.PageInfo.Margin.Left = 40;
doc.PageInfo.Margin.Right = 40;
Aspose.Pdf.Page page = doc.Pages.Add();

FloatingBox box = new FloatingBox();
box.ColumnInfo.ColumnCount = 2;
box.ColumnInfo.ColumnSpacing = "5";
box.ColumnInfo.ColumnWidths = "250 250";

TextFragment text1 = new TextFragment(@"<p><strong>The Opportunity</strong></p><p>Africa is on the move: with some of the fastest expanding economies in the world with a rapidly growing young population; in Kenya the population is set to double to 85 million by 2050. Kenya is one of the most stable economies and is a business hub for East Africa, with an average GPD growth of 5 per cent per annum.<Kenya’s achievement of lower-middle-income status in 2015, combined with a predominantly young population accounting for two thirds of its inhabitants, presents a unique opportunity to leverage the skills, resources and capacities of millions of children and adolescents. With the right government policies, investment, and strong technical support, Kenya’s children and adolescents can yield a ‘demographic dividend’ and contribute to a national vision of prosperity.</p><p>Schools for Africa (SFA) is a global initiative which aims to achieve quality education across sub-Saharan Africa, ensuring that all children, including the most marginalized, are learning and gaining the skills for succeeding in life and work. SFA is a global partnership convened with business, governments and individuals, and has a proven track record in partnering with the private sector to achieve education results&nbsp;for children.&nbsp;</p>");
box.Paragraphs.Add(text1);
page.Paragraphs.Add(box);

string xmlpath1 = xmlpath + "CreateMultiColumnPdf_19.1.pdf";
var ms1 = new MemoryStream();
doc.Save(ms1, SaveFormat.Pdf);
doc.Save(xmlpath1);

Using <xsl:value-of select="Value" disable-output-escaping="yes"/> in XSL with XML data works as expected; please check the PDF below. With that approach, however, there is an issue with Aspose.HTML when converting HTML to PDF. I have already raised issues (PDFNET-45892 and HTMLNET-1708) with the Aspose team. Since they may take time to resolve, I am trying another approach without XSL and XML.
ComponentCopy.pdf (42.5 KB)

However, with the code above (without XSL), the embedded HTML tags appear in the PDF; please check the PDF below.
CreateMultiColumnPdf_19.1.pdf (69.7 KB)

Please advise.

@jitendrapatil2514

Thanks for writing back.

If you need the content rendered according to its HTML tags, please pass the HTML string to an HtmlFragment instead of a TextFragment. Please check the following code snippet and the attached output PDF for your reference:

Aspose.Pdf.Document doc = new Aspose.Pdf.Document();
var page = doc.Pages.Add();
page.Paragraphs.Add(new HtmlFragment(@"<p><strong>The Opportunity</strong></p><p>Africa is on the move: with some of the fastest expanding economies in the world with a rapidly growing young population; in Kenya the population is set to double to 85 million by 2050. Kenya is one of the most stable economies and is a business hub for East Africa, with an average GPD growth of 5 per cent per annum.<Kenya’s achievement of lower-middle-income status in 2015, combined with a predominantly young population accounting for two thirds of its inhabitants, presents a unique opportunity to leverage the skills, resources and capacities of millions of children and adolescents. With the right government policies, investment, and strong technical support, Kenya’s children and adolescents can yield a ‘demographic dividend’ and contribute to a national vision of prosperity.</p><p>Schools for Africa (SFA) is a global initiative which aims to achieve quality education across sub-Saharan Africa, ensuring that all children, including the most marginalized, are learning and gaining the skills for succeeding in life and work. SFA is a global partnership convened with business, governments and individuals, and has a proven track record in partnering with the private sector to achieve education results&nbsp;for children.&nbsp;</p>"));
doc.Save(dataDir + "plainHTML.pdf");

plainHTML.pdf (80.9 KB)

If you face any issue with the suggested approach, please feel free to let us know.

Hi @asad.ali

Thanks for the reply.
I tried the code above and it works fine, but there is an issue when I use a floating box. I have to add a FloatingBox for multi-column text flow (flowing the data into 2 columns).

        FloatingBox box = new FloatingBox();
        // Use two columns in the floating box
        box.ColumnInfo.ColumnCount = 2;

Please test with a floating box and advise.

@jitendrapatil2514

Thanks for contacting support.

Would you please share the complete updated code snippet that you tried in your environment and that showed the issue? Please also share the generated PDF document. We will test the scenario in our environment and address it accordingly.

@asad.ali

Please check the updated code below, which I tried and which still shows the embedded HTML:
Aspose.Pdf.Document doc = new Aspose.Pdf.Document();
doc.PageInfo.Margin.Left = 40;
doc.PageInfo.Margin.Right = 40;
Aspose.Pdf.Page page = doc.Pages.Add();

FloatingBox box = new FloatingBox();
box.ColumnInfo.ColumnCount = 2;
box.ColumnInfo.ColumnSpacing = "5";
box.ColumnInfo.ColumnWidths = "250 250";

TextFragment text1 = new TextFragment(@"<p><strong>The Opportunity</strong></p><p>Africa is on the move: with some of the fastest expanding economies in the world with a rapidly growing young population; in Kenya the population is set to double to 85 million by 2050. Kenya is one of the most stable economies and is a business hub for East Africa, with an average GPD growth of 5 per cent per annum.<Kenya’s achievement of lower-middle-income status in 2015, combined with a predominantly young population accounting for two thirds of its inhabitants, presents a unique opportunity to leverage the skills, resources and capacities of millions of children and adolescents. With the right government policies, investment, and strong technical support, Kenya’s children and adolescents can yield a ‘demographic dividend’ and contribute to a national vision of prosperity.</p><p>Schools for Africa (SFA) is a global initiative which aims to achieve quality education across sub-Saharan Africa, ensuring that all children, including the most marginalized, are learning and gaining the skills for succeeding in life and work. SFA is a global partnership convened with business, governments and individuals, and has a proven track record in partnering with the private sector to achieve education results&nbsp;for children.&nbsp;</p>");
box.Paragraphs.Add(text1);
page.Paragraphs.Add(box);

string xmlpath1 = xmlpath + "CreateMultiColumnPdf_19.1.pdf";
var ms1 = new MemoryStream();
doc.Save(ms1, SaveFormat.Pdf);
doc.Save(xmlpath1);

@jitendrapatil2514

We noticed that your latest code snippet uses TextFragment instead of HtmlFragment to render the HTML content. As shared earlier, you need to use HtmlFragment so that the HTML tags in the string take effect on the enclosed text.

We ran your code snippet in our environment after replacing TextFragment with HtmlFragment, as follows, and did not notice any issue. An output PDF is also attached for your reference.

Aspose.Pdf.Document doc = new Aspose.Pdf.Document();
doc.PageInfo.Margin.Left = 40;
doc.PageInfo.Margin.Right = 40;
Aspose.Pdf.Page page = doc.Pages.Add();

FloatingBox box = new FloatingBox();
box.ColumnInfo.ColumnCount = 2;
box.ColumnInfo.ColumnSpacing = "5";
box.ColumnInfo.ColumnWidths = "250 250";

HtmlFragment text1 = new HtmlFragment(@"<p><strong>The Opportunity</strong></p><p>Africa is on the move: with some of the fastest expanding economies in the world with a rapidly growing young population; in Kenya the population is set to double to 85 million by 2050. Kenya is one of the most stable economies and is a business hub for East Africa, with an average GPD growth of 5 per cent per annum.<Kenya’s achievement of lower-middle-income status in 2015, combined with a predominantly young population accounting for two thirds of its inhabitants, presents a unique opportunity to leverage the skills, resources and capacities of millions of children and adolescents. With the right government policies, investment, and strong technical support, Kenya’s children and adolescents can yield a ‘demographic dividend’ and contribute to a national vision of prosperity.</p><p>Schools for Africa (SFA) is a global initiative which aims to achieve quality education across sub-Saharan Africa, ensuring that all children, including the most marginalized, are learning and gaining the skills for succeeding in life and work. SFA is a global partnership convened with business, governments and individuals, and has a proven track record in partnering with the private sector to achieve education results&nbsp;for children.&nbsp;</p>");
            
box.Paragraphs.Add(text1);
page.Paragraphs.Add(box);

string xmlpath1 = dataDir + "CreateMultiColumnPdf_19.1.pdf";
doc.Save(xmlpath1);

CreateMultiColumnPdf_19.1.pdf (81.1 KB)

If you find any issue in the shared output, please share a screenshot highlighting the errors. We will look further into the scenario and help you accordingly.

@asad.ali

Thank you so much. It is working fine.
I have a query and an issue related to the same code:

  1. When I replace TextFragment with HtmlFragment it works fine, but I am not able to apply styling. With TextFragment the code below works fine. Please use the above code.

    HtmlFragment text1 = new HtmlFragment("text");
    text1.TextState = new TextState();
    text1.TextState.FontSize = 12;
    text1.TextState.Font = FontRepository.FindFont("TimesNewRoman");
    text1.TextState.BackgroundColor = Aspose.Pdf.Color.LightGray;
    text1.TextState.ForegroundColor = Aspose.Pdf.Color.Cyan;

  2. I want to add a background image, with styling, to the same HtmlFragment text.

Please advise.

@jitendrapatil2514

Thanks for getting back to us.

We were able to observe that the TextState did not honor the foreground and background color values. We have logged an issue as PDFNET-45972 in our issue tracking system for correction. We will look further into the details of the issue and keep you posted on the status of its correction. Please be patient and allow us some time.

Meanwhile, you may use inline CSS styles as a workaround to format your HTML string, as follows:

Aspose.Pdf.Document doc = new Aspose.Pdf.Document(); 
doc.PageInfo.Margin.Left = 40; 
doc.PageInfo.Margin.Right = 40; 
Aspose.Pdf.Page page = doc.Pages.Add(); 
FloatingBox box = new FloatingBox(); 
box.ColumnInfo.ColumnCount = 2; 
box.ColumnInfo.ColumnSpacing = "5"; 
box.ColumnInfo.ColumnWidths = "250 250"; 
HtmlFragment text1 = new HtmlFragment(@"<p style='color:cyan;background-color:lightgray;font-family:serif;font-size:12pt;'><strong>The Opportunity</strong></p>"); 
box.Paragraphs.Add(text1); 
page.Paragraphs.Add(box); 
string xmlpath1 = dataDir + "CreateMultiColumnPdf_19.1.pdf"; 
doc.Save(xmlpath1);
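
When the HTML string comes from data rather than being typed by hand, the inline style used in the workaround above can be attached programmatically. The following is a minimal sketch using only plain string handling; the class InlineCss and its Wrap method are hypothetical names, not part of Aspose.PDF, and it assumes the renderer applies a style placed on a wrapping <div> to the paragraphs inside it, which was not verified here.

```csharp
using System;
using System.Linq;

public static class InlineCss
{
    // Hypothetical helper (not part of Aspose.PDF): wraps an HTML string in a
    // <div> whose inline style carries the given CSS declarations, in the
    // order they were passed.
    public static string Wrap(string html, params (string Property, string Value)[] styles)
    {
        string css = string.Join(";", styles.Select(s => s.Property + ":" + s.Value));
        return "<div style='" + css + "'>" + html + "</div>";
    }

    public static void Main()
    {
        string styled = Wrap(
            "<p><strong>The Opportunity</strong></p>",
            ("color", "cyan"),
            ("background-color", "lightgray"),
            ("font-family", "serif"),
            ("font-size", "12pt"));

        // The resulting string could then be passed to new HtmlFragment(styled),
        // as in the snippet above.
        Console.WriteLine(styled);
    }
}
```

The style values are the same ones used in the workaround snippet; they are inserted without any escaping, so values containing a single quote would need extra handling.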

We are sorry for the inconvenience.