Client logo and CSS are not applied

Hi Aspose Team,

My project needs to convert an HTML file to images and insert those images into a PDF document.

The code I am using is as follows:


Aspose.Words.License license = new Aspose.Words.License();
license.SetLicense(@"C:\Jetstream\500\Resources\net3.5_ClientProfile\Aspose.Words.lic");

Aspose.Pdf.License license2 = new Aspose.Pdf.License();
license2.SetLicense(@"C:\Jetstream\500\Resources\Aspose.Pdf.Kit.lic");

//load the html file into Aspose.Words
Aspose.Words.LoadOptions lo = new Aspose.Words.LoadOptions();
lo.LoadFormat = Aspose.Words.LoadFormat.Html;

//Aspose.Words.Document doc = new Aspose.Words.Document(@"D:\Aspose\RDP\NEWFiles\Depp_backup.htm", lo);
Aspose.Words.Document doc = new Aspose.Words.Document(@"D:\Aspose\RDP\NEWFiles\ImagePDF2.html", lo);

//Make the text display in individual rows

NodeCollection tables = doc.GetChildNodes(NodeType.Table, true);
foreach (Aspose.Words.Tables.Table tbl in tables)
{
    tbl.AllowAutoFit = true;
    tbl.AutoFit(AutoFitBehavior.AutoFitToWindow);
    // tbl.AutoFit(AutoFitBehavior.AutoFitToContents);
    //tbl.AutoFit(AutoFitBehavior.FixedColumnWidths);
    //tbl.Alignment = TableAlignment.Left;
    tbl.Alignment = TableAlignment.Left;

    foreach (Aspose.Words.Tables.Row r in tbl.Rows)
    {
        r.RowFormat.AllowBreakAcrossPages = false;
        // r.RowFormat.AllowAutoFit = true;

        //NodeCollection cells = r.Cells;
        foreach (Aspose.Words.Tables.Cell c in r.Cells)
        {

            c.CellFormat.WrapText = false;
            //CellFormat.PreferredWidth = PreferredWidth.FromPercent(100);
            //c.CellFormat.PreferredWidth = PreferredWidth.FromPoints(1000); 
        }
    }
}
////Generate the images
doc.Save(@"D:\Aspose\RDP\NEWFiles\Depp_DOc.doc");
for (int pageCounter = 0, stop = doc.PageCount; pageCounter < stop; pageCounter++)
{
    Aspose.Words.Saving.ImageSaveOptions options2 = new Aspose.Words.Saving.ImageSaveOptions(SaveFormat.Png);
    options2.PageIndex = pageCounter;
    options2.PrettyFormat = true;
    // images are of the format \<0-padded-page-index>.png, e.g. (somepath\myfile02.png)
    doc.Save(string.Format("{0}{1}{2}{3:d2}.png", "D:\\Aspose\\RDP\\NEWFiles\\Images\\", "", "MyImage", pageCounter + 1), options2);
}

//Aspose.Words.License license = new Aspose.Words.License();
//license.SetLicense(@"D:\jetstream\500\Resources\Aspose.Words.lic");

Aspose.Pdf.Generator.Pdf pdf = new Aspose.Pdf.Generator.Pdf();
string[] fileEntries = Directory.GetFiles(@"D:\Aspose\RDP\NEWFiles\Images\", "*.png");

int length = fileEntries.GetLength(0);
int counter;

for (counter = 0; counter < length; counter++)
{
    // Create a section object
    Aspose.Pdf.Generator.Section section1 = pdf.Sections.Add();
    // set the page size to A4 with no margins
    section1.PageInfo.PageWidth = Aspose.Pdf.Generator.PageSize.A4Width;
    section1.PageInfo.PageHeight = Aspose.Pdf.Generator.PageSize.A4Height;

    section1.PageInfo.Margin.Top = 0;
    section1.PageInfo.Margin.Left = 0;
    section1.PageInfo.Margin.Bottom = 0;
    section1.PageInfo.Margin.Right = 0;

    Aspose.Pdf.Generator.Image image1 = new Aspose.Pdf.Generator.Image(section1);
    image1.ImageInfo.File = fileEntries[counter];
    image1.ImageInfo.ImageFileType = Aspose.Pdf.Generator.ImageFileType.Png;
    image1.ImageInfo.Alignment = Aspose.Pdf.Generator.AlignmentType.Center;

    image1.ImageInfo.FixHeight = section1.PageInfo.PageHeight - section1.PageInfo.Margin.Top - section1.PageInfo.Margin.Bottom;
    // specify the image width equal to the section width minus the left and right page margins
    image1.ImageInfo.FixWidth = section1.PageInfo.PageWidth - section1.PageInfo.Margin.Left - section1.PageInfo.Margin.Right;

    // Create a BitMap object in order to get the information of image file
    //Bitmap myimage = new Bitmap(fileEntries[counter]);

    //// check if the width of the image file is greater than Page width or not
    //if (myimage.Width > section1.PageInfo.PageWidth)
    //// // if the Image width is greater than page width, then set the page orientation to Landscape
    // section1.IsLandscape = true;
    //else
    //// // if the Image width is less than page width, then set the page orientation to Portrait
    section1.IsLandscape = false;
    //// add the image to paragraphs collection of the PDF document
    //section1.IsLandscape = false;
    section1.Paragraphs.Add(image1);
}
pdf.Save(@"D:\Aspose\RDP\NEWFiles\Images_to_PDF_Conversion2.pdf");

The above code works for “ImagePDF2.html”: the CSS is applied, but the client logo is missing. (Please see the attached folder named “working”.)
The above code does not work for “Depp_backup.htm”: neither the CSS nor the client logo is applied. (Please see the attached folder named “No Working”.)

If you look at “Depp_backup.htm” in the browser, the fields are placed side by side, i.e. in multiple columns.
The CSS used for this purpose is: https://staging.brassring.com/WelcomePages.UserInterface/CSS/LayoutForms.css

Can you please look into the code and suggest the changes needed to get the client logo displayed and the fields laid out in multiple columns?

Thanks,
siddi.

Hi Siddi,

Thanks for your query. Your code seems correct to me. The only issue is that you are loading the HTML from the local D drive while the HTML file references its CSS and logo through live URLs. You can achieve your requirement with either of the following two methods.

  1. Save the HTML file, CSS and logos on the local hard drive and use the same code to achieve the correct output. Please see the attachment.
  2. Use the method OpenDocumentFromUrl to load the document from the URL.

Hope this helps. Let me know if you have any more queries.

Hi Tahir,

Thanks for looking into this.

I have tried both of the approaches mentioned above, but with no success.

I am still not able to get the CSS applied. Below are the details of my implementation.

Please look into this and suggest any changes.

Approach 1:
-----------
Save the HTML file, CSS and logos on the local hard drive and use the same code to achieve the correct output.

With this approach I get the client logo but not the CSS.

Actual: The output shows labels and form options in a single left-justified column instead of two columns.

Expected: The output PDF should look like the input HTML, including the column layout, font type, font style, etc.

Please see the attached PDF file “NoCSSApplied.pdf”.

Approach 2:
-----------
Use the method OpenDocumentFromUrl to load the document from the URL.


I have used the following code:

private Document OpenDocumentFromUrl(string url)
{
    //Prepare the web page we will be asking for
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "GET";
    request.ContentType = "text/html";
    request.UserAgent = "Mozilla/4.0+(compatible;+MSIE+5.01;+Windows+NT+5.0";

    //Execute the request and copy the whole response body into a MemoryStream.
    //response.ContentLength can be -1 when the server omits the header,
    //so read until the end of the stream instead of relying on it.
    MemoryStream docStream = new MemoryStream();
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (Stream resStream = response.GetResponseStream())
    {
        byte[] buffer = new byte[8192];
        int read;
        while ((read = resStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            docStream.Write(buffer, 0, read);
        }
    }

    //Save the file to the local disk in order to replace the relative paths with absolute paths
    SaveMemoryStream(docStream, @"D:\test.htm");

    //Rewind the stream and open the document from it
    docStream.Position = 0;
    Document doc = new Document(docStream);

    return doc;
}

private void SaveMemoryStream(MemoryStream ms, string FileName)
{
    //File.Create truncates an existing file, so no stale bytes are left behind
    using (FileStream outStream = File.Create(FileName))
    {
        ms.WriteTo(outStream);
    }
}
Aspose.Words.LoadOptions lo = new Aspose.Words.LoadOptions();
lo.LoadFormat = Aspose.Words.LoadFormat.Html;

string readText = File.ReadAllText(@"D:\test.htm");

string dataToRemove = "href=\"/jetstream";
string dataToRemove2 = "src=\"/jetstream";
string dataToRemove3 = "../../../../../../..";

//replace the relative paths with absolute paths.
readText = Regex.Replace(readText, dataToRemove, "href=\"https://" + "VIZSPOTHUGADT01" + "/jetstream", RegexOptions.IgnoreCase);
readText = Regex.Replace(readText, dataToRemove2, "src=\"https://" + "VIZSPOTHUGADT01" + "/jetstream", RegexOptions.IgnoreCase);
//escape the pattern, because an unescaped "." matches any character
readText = Regex.Replace(readText, Regex.Escape(dataToRemove3), "https://VIZSPOTHUGADT01");

File.WriteAllText(@"D:\test2.htm", readText);

Aspose.Words.Document doc2 = new Aspose.Words.Document(@"D:\test2.htm", lo);

//convert the HTML to image files.

for (int pageCounter = 0, stop = doc2.PageCount; pageCounter < stop; pageCounter++)
{
    Aspose.Words.Saving.ImageSaveOptions options2 = new Aspose.Words.Saving.ImageSaveOptions(SaveFormat.Png);
    options2.PageIndex = pageCounter;
    options2.PrettyFormat = true;
    // images are of the format \<0-padded-page-index>.png, e.g. (somepath\myfile02.png)
    doc2.Save(string.Format("{0}{1}{2}{3:d2}.png", "D:\\Aspose\\RDP\\NEWFiles\\Images\\", "", "MyImagetest", pageCounter + 1), options2);
}

With this approach I am also not able to get the CSS applied.

Can you please look into this and help me convert the HTML file to image files with all the CSS applied?

Thanks,
Siddi.
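
The path rewriting in Approach 2 can also be sketched with System.Uri instead of hard-coded string replacements. This is only an illustration, not Aspose.Words functionality: the class name HtmlUrlRewriter, the method name MakeUrlsAbsolute and the example.test host are hypothetical. A regular expression is not an HTML parser, so the sketch only handles double-quoted href and src attributes.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Words.
public static class HtmlUrlRewriter
{
    // Matches href="..." or src="..." attribute values in double quotes.
    private static readonly Regex UrlAttribute = new Regex(
        "(?<attr>href|src)=\"(?<url>[^\"]*)\"",
        RegexOptions.IgnoreCase);

    // Rewrites every href/src value so that it is absolute with respect to baseUri.
    public static string MakeUrlsAbsolute(string html, Uri baseUri)
    {
        return UrlAttribute.Replace(html, delegate(Match m)
        {
            Uri absolute;
            // Uri.TryCreate resolves values such as "/css/a.css" or "../img/logo.jpg"
            // against the base; values that are already absolute keep their target.
            if (Uri.TryCreate(baseUri, m.Groups["url"].Value, out absolute))
            {
                return m.Groups["attr"].Value + "=\"" + absolute.AbsoluteUri + "\"";
            }
            return m.Value; // leave anything that cannot be parsed as it was
        });
    }
}
```

A call might look like HtmlUrlRewriter.MakeUrlsAbsolute(readText, new Uri("https://example.test/a/b/page.htm")), where the base URI is the address the page was downloaded from. Note that System.Uri normalizes the host name to lower case.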

Hi Siddi,

I have downloaded the CSS and logo files to the local disk and have successfully converted the HTML to .doc and image files. Please see the attachment. Your HTML, CSS and logo files should all be located either on the local disk or on the remote server.

Hi Siddi,

Please find attached the .doc and image files generated from the http://www.aspose.com/ web page.

Document doc = OpenDocumentFromUrl("http://www.aspose.com/");
//Make the text display in individual rows
NodeCollection tables = doc.GetChildNodes(NodeType.Table, true);
foreach (Aspose.Words.Tables.Table tbl in tables)
{
    tbl.AllowAutoFit = true;
    tbl.AutoFit(AutoFitBehavior.AutoFitToWindow);
    // tbl.AutoFit(AutoFitBehavior.AutoFitToContents);
    //tbl.AutoFit(AutoFitBehavior.FixedColumnWidths);
    //tbl.Alignment = TableAlignment.Left;
    tbl.Alignment = TableAlignment.Left;
    foreach (Aspose.Words.Tables.Row r in tbl.Rows)
    {
        r.RowFormat.AllowBreakAcrossPages = false;
        // r.RowFormat.AllowAutoFit = true;
        //NodeCollection cells = r.Cells;
        foreach (Aspose.Words.Tables.Cell c in r.Cells)
        {
            c.CellFormat.WrapText = false;
            //CellFormat.PreferredWidth = PreferredWidth.FromPercent(100);
            //c.CellFormat.PreferredWidth = PreferredWidth.FromPoints(1000); 
        }
    }
}
////Generate the images
doc.Save(MyDir + "Depp_DOc.doc");
for (int pageCounter = 0, stop = doc.PageCount; pageCounter < stop; pageCounter++)
{
    Aspose.Words.Saving.ImageSaveOptions options2 = new Aspose.Words.Saving.ImageSaveOptions(SaveFormat.Png);
    options2.PageIndex = pageCounter;
    options2.PrettyFormat = true;
    doc.Save(string.Format("{0}{1}{2}{3:d2}.png", MyDir, "", "MyImage", pageCounter + 1), options2);
}

Hi Siddi,

I have tested the shared URL and found that the logo and the web page are on different domains. Please see the attachment. In this case, you need to download the complete web page with HttpWebRequest and pass the stream object to the Document constructor.

Hi Tahir,

Sorry for the trouble. Is there any update on this?

Thanks,

Siddi.

Hi Siddi,

I regret to share with you that the requested feature is not available in Aspose.Words at the moment. However, we have already logged this feature request in our issue tracking system. You will be notified via this forum thread once this feature is available.

Hi Tahir,

Thanks for your reply. If possible, can you please tell me the timeframe in which I can expect this feature? I also request that the Aspose team update me once this feature is fully supported.

Thanks,

Siddi.

Hi Siddi,

Thanks for your inquiry. Unfortunately, your issues are pending analysis. Once our developers analyze these issues, we will be able to provide you with an estimate. You will be notified as soon as they are fixed. Sorry for the inconvenience.

Hi Siddi,

Thanks for your request. I reinvestigated your issue and here is what I found:

  1. The problem with the logo:

a) The logo image disappears because in your HTML the ‘img’ element is a direct child of ‘table’. Aspose.Words simply ignores the image in such a case. I suppose you would agree that an image should not be a direct child of ‘table’. Here is a snippet of your HTML:

<table class="guardAgainstInvalidMarkup">
    <tbody class="guardAgainstInvalidMarkup">
        <tr class="guardAgainstInvalidMarkup">
            <td class="guardAgainstInvalidMarkup">
                <font class="FONTMedium">
                    <table style="background-color:            #ffffff;" class="backg">
                        <img src="https://sstagingjobs.brassring.com/img/images_25411_5501/images/DGlogo.jpg" style="align: left; position: absolute; left: 1px; top: 1px;">

                    </table>

                    <span class="helpLink"><span class="questionHelpText"></span></span>
                </font>
            </td>
        </tr>
    </tbody>
</table>

The highlighted tags (the inner ‘table’ element that wraps the ‘img’) are redundant. If you remove them, the logo image will be displayed.

b) The logo image in your HTML is absolutely positioned. Unfortunately, Aspose.Words does not yet support importing floating content from HTML. That is why the position of such an image might be incorrect.

I logged both of these issues in our defect database. We will let you know once they are resolved.

  2. The problem with CSS: Aspose.Words does not support inheriting formatting from parent elements. Currently, Aspose.Words expects font formatting to be set in <span>, <i>, <b> or <u> elements, paragraph formatting in <p> or <h1>…<h6> elements, and so on.

In your case all elements are deeply nested in divs and tables, which is why formatting that must be inherited from the parent elements is lost.

Best regards.
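
The workaround from item 1 a) above, removing the redundant table wrapper around the logo image, could be applied as a preprocessing step on the HTML string before it is loaded. The following is a minimal sketch under stated assumptions, not Aspose.Words functionality: the class name LogoMarkupFixer and the method name UnwrapImagesFromTables are hypothetical, and the regular expression only covers the simple case from the snippet, where the table contains nothing but a single ‘img’ tag and whitespace.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Words.
public static class LogoMarkupFixer
{
    // Matches a <table ...> whose only content is a single <img ...> tag
    // (plus optional whitespace), i.e. an image that is a direct child of a table.
    private static readonly Regex TableAroundImage = new Regex(
        @"<table\b[^>]*>\s*(?<img><img\b[^>]*>)\s*</table>",
        RegexOptions.IgnoreCase);

    // Replaces each such table with the bare <img> tag it wraps.
    public static string UnwrapImagesFromTables(string html)
    {
        return TableAroundImage.Replace(html, "${img}");
    }
}
```

Tables that contain rows and cells are left untouched, because the pattern requires the ‘img’ tag to follow the opening ‘table’ tag directly. This does not address the absolute positioning described in item 1 b).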

Hi Siddi,

Thanks for sharing helpful information. Our development team will look into these issues and you will be updated via this forum thread once these issues are resolved.

Hi Tahir,

Can I get an update on this issue?

Thanks,
Siddi.

Hi Siddi,

Thanks for your patience.

I am afraid your issues have been postponed to a later date because of other important issues and new features. We will inform you as soon as there are any further developments.

We apologize for the inconvenience.

Hi,

Thanks for your reply.

If possible, can you please also look into the thread below and update me on its status?
https://forum.aspose.com/t/93287

Thanks,
Siddi.

Hi Siddi,

Thanks for your query. The query at the following forum link is related to Aspose.PDF. My colleagues from the Aspose.PDF team will reply to you shortly.
https://forum.aspose.com/t/93287

Hi Aspose Team,

May I have an update on this defect?

Thanks,

Siddi.

Hi Siddi,

Thanks for your patience. I have checked the status of your issues in our issue tracking system and regret to share with you that these issues are still postponed. Our development team is busy with other important issues and new features. We will inform you as soon as there are any further developments.

We apologize for the inconvenience.

The issues you have found earlier (filed as WORDSNET-5873) have been fixed in this .NET update and this Java update.


This message was posted using Notification2Forum from Downloads module by aspose.notifier.

The issues you have found earlier (filed as WORDSNET-3163;WORDSNET-39) have been fixed in this .NET update and this Java update.

