Add styling to HTML before converting it to PDF in Java

Hi @alexey.noskov and Team,

I have the sample HTML below:

<!DOCTYPE html>
<html>
<head>
    <link rel="stylesheet" href="mystyle.css">
</head>
<body>

    <h1>This is a heading</h1>
    <p>This is a paragraph.</p>

</body>
</html>

and I have the external CSS file below:

body {
    background-color: lightblue;
}

h1 {
    color: navy;
    margin-left: 20px;
}

Now I want to convert this HTML (with the external CSS above applied) to PDF.

Could you please let me know how I can apply the styling and convert my HTML to PDF using Aspose.Words for Java?

I am using the code below to convert HTML:

Document doc = new Document();

// Create a document builder
DocumentBuilder builder = new DocumentBuilder(doc);

// Insert HTML
builder.insertHtml("<ul>\r\n" + 
    "<li>Item1</li>\r\n" + 
    "<li>Item2</li>\r\n" + 
    "</ul>");

// Save as DOCX
doc.save("html-string-to-word.docx", SaveFormat.DOCX);

Thanks
Ramesh

@ramesh676 Just put the CSS file next to the HTML file and use the following code for the conversion:

Document doc = new Document("C:\\Temp\\in.html");
doc.save("C:\\Temp\\out.pdf");

out.pdf (32.8 KB)

Hi @alexey.noskov,

My CSS file is in the resources folder of my Maven project; please see below.

image.png (3.5 KB)

Now I want to read this CSS file and use it while converting HTML to PDF. How can I include this CSS through the Aspose APIs? For context, I will get the HTML as a string from a third-party API and will convert that HTML string to PDF.

I can also get the test.css file as binary data from the third-party API. Is there any Aspose API that reads CSS in binary form and applies it during conversion?

Thanks
Ramesh

@ramesh676 There is no way to feed a CSS file to Aspose.Words before loading the HTML. However, you can implement IResourceLoadingCallback and supply the CSS file while the HTML document is being loaded.
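As a possible alternative that is not suggested in this thread: since the HTML arrives as a string and the CSS as bytes, the stylesheet could be inlined into the HTML before it is loaded. The sketch below is plain Java with no Aspose calls. The class name `CssInliner` is hypothetical, the CSS is assumed to be UTF-8, and the regex is a deliberately simple match for a `<link rel="stylesheet">` tag rather than a full HTML parser.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces the first stylesheet <link> tag with an inline <style> block.
final class CssInliner {
    // Simple match for <link ... rel="stylesheet" ...>; assumes the rel value is quoted.
    private static final Pattern STYLESHEET_LINK = Pattern.compile(
            "<link\\b[^>]*\\brel\\s*=\\s*[\"']stylesheet[\"'][^>]*>",
            Pattern.CASE_INSENSITIVE);

    static String inline(String html, byte[] cssBytes) {
        // Assumption: the CSS bytes are UTF-8 encoded.
        String css = new String(cssBytes, StandardCharsets.UTF_8);
        String styleBlock = "<style>\n" + css + "\n</style>";
        Matcher matcher = STYLESHEET_LINK.matcher(html);
        // quoteReplacement stops '$' and '\' in the CSS from being read as group references.
        // If no <link> tag is found, the HTML is returned unchanged.
        return matcher.replaceFirst(Matcher.quoteReplacement(styleBlock));
    }
}
```

The resulting string could then be handed to Aspose.Words as a stream, for example through `new Document(new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8)), loadOptions)`. Whether the inlined rules render as expected still depends on which CSS features Aspose.Words supports.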

Hi @alexey.noskov,

Could you give me an example code snippet showing how to load external CSS using IResourceLoadingCallback?

Thanks.

@ramesh676 Sure, please see the following code:

import com.aspose.words.*;
import java.nio.file.Files;
import java.nio.file.Paths;

// Load the HTML with a callback that supplies the external CSS.
LoadOptions opt = new LoadOptions();
opt.setLoadFormat(LoadFormat.HTML);
opt.setResourceLoadingCallback(new CssResourceLoadingCallback());
Document doc = new Document("C:\\Temp\\in.html", opt);
doc.save("C:\\Temp\\out.pdf");

private static class CssResourceLoadingCallback implements IResourceLoadingCallback
{
    @Override
    public int resourceLoading(ResourceLoadingArgs args) throws Exception {
        // URI of the resource exactly as it is written in the HTML.
        String url = args.getOriginalUri();
        // The URL below is a placeholder; compare against the href used in your HTML.
        if (args.getResourceType() == ResourceType.CSS_STYLE_SHEET &&
                url.equals("https://my.cool.css.url")) {
            // Load CSS from external resource.
            // For demonstration purposes load CSS from file.
            args.setData(Files.readAllBytes(Paths.get("C:\\Temp\\mystyle.css")));
            return ResourceLoadingAction.USER_PROVIDED;
        }

        // Let Aspose.Words load every other resource as usual.
        return ResourceLoadingAction.DEFAULT;
    }
}
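
Since the CSS in question lives in the Maven resources folder rather than on disk under `C:\Temp`, the bytes could be read from the classpath instead. The following is a minimal sketch in plain Java. The class name `ClasspathCss` is hypothetical, and it assumes the stylesheet (for example `test.css`) sits at the root of the resources folder so that it ends up at the root of the classpath.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: reads a classpath resource (e.g. a file from src/main/resources) into bytes.
final class ClasspathCss {
    static byte[] read(String resourceName) throws IOException {
        try (InputStream in = ClasspathCss.class.getClassLoader().getResourceAsStream(resourceName)) {
            if (in == null) {
                // getResourceAsStream returns null when the resource is not on the classpath.
                throw new IOException("Resource not found on classpath: " + resourceName);
            }
            return in.readAllBytes(); // requires Java 9 or later
        }
    }
}
```

Inside the callback above, `args.setData(ClasspathCss.read("test.css"))` could then take the place of the `Files.readAllBytes` call.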

Hi @alexey.noskov,

Does Aspose support SCSS? If yes, do I need to use the same call:

opt.setResourceLoadingCallback(new CssResourceLoadingCallback());

Thanks.

@ramesh676 If you need custom logic for loading an external resource, you must use IResourceLoadingCallback.

Hi @alexey.noskov,

I have implemented IResourceLoadingCallback and I am able to load the external CSS, but I have one observation. When I open the same CSS and HTML in a browser, the styling is applied properly, as below

But when I send the same HTML and CSS to Aspose and convert it to PDF, the styling in the generated PDF is broken, as below

Also, for the same CSS and HTML, the SVG icons load properly in the browser, as below

but when I convert my HTML and CSS to PDF using Aspose, my SVG icon is rendered differently, as below

Does Aspose support all CSS, such as CSS3, CSS2, or others like SCSS?

@ramesh676 Could you please attach the problematic input and output documents here for testing? We will check the issue and provide you more information.

Also, please note that Aspose.Words is designed primarily to work with MS Word documents. When an HTML document is loaded, it is converted to the Aspose.Words DOM, and because the object models of HTML documents and MS Word documents differ, it is not always possible to provide 100% fidelity after processing an HTML document. When loading an HTML document, Aspose.Words in most cases mimics MS Word behavior, not browser behavior.

Hi @alexey.noskov,

Please find the HTML with its styling below.

<html>
<head>
    <style>
        .footnodeslist {
            color: var(--primary-grey-500-confident-grey-2-e-2-e-38, #2E2E38);
            font-feature-settings: 'clig' off, 'liga' off;
            font-family: Ariel;
            font-style: normal;
            font-size: 16px;
            font-weight: 300;
            line-height: 22px;
            display: flex;
            align-items: baseline;
        }

            .footnodeslist [data-class="branch"] p:first-child {
                display: none;
            }

            .footnodeslist .footnote-num {
                color: #fff;
                background: #333;
                padding: 0 5px;
                margin: 2px 6px 0 0;
                width: 24px;
                height: 24px;
            }

            .footnodeslist .close {
                display: none;
            }
    </style>
</head>
<body>

    <div role="document" data-tag="footnote" class="footnote collapse in" id="fnsrc_1689005772849" aria-expanded="true">
        <div class="arrow"></div>
        <div class="well footnodeslist" data-tag="footnotetable">
            <div>
                <span data-tag="fntnum" class="footnote-num"> 64</span>
                <div data-tag="fntclose" class="atlas-icon close"></div>
            </div>
            <div data-class="branch" data-tag="fnt">
                <p></p>
                <div style="display: inline;">
                    <div class="p wrapper">
                        <div id="SL389239732_SL390127569" class="p">
                            <div style="display: inline;">
                                Number 64 and this para should be side by side.
                            </div>
                        </div>
                    </div>
                </div>
                <p></p>
            </div>
        </div>
    </div>
</body>
</html>

If I open the above HTML in Chrome, the number 64 and the paragraph text appear side by side. If I convert this HTML to PDF with Aspose.Words, I see the issue mentioned in my previous post: the number 64 appears on top and the paragraph text below it. I want the same look that we see in Chrome. Could you let me know why Aspose.Words is unable to apply the CSS correctly to the above HTML?

When I process the above HTML with Aspose.Words, my output PDF should look like the image below

@ramesh676 Aspose.Words in this case mimics MS Word behavior. If you convert the attached HTML document to DOCX or PDF using MS Word the result will be the same:
ms.docx (12.5 KB)
aspose.docx (7.8 KB)

So this is not a bug in Aspose.Words, it is expected behavior.

Hi @alexey.noskov,

If I use Aspose.PDF for Java to convert my HTML to PDF, will I get the same view as in the browser?

@ramesh676 You can try using Aspose.HTML to convert HTML to PDF. It is designed to work with HTML documents and should render HTML as it is rendered in the browser. However, it is best to test the behavior on your side with your own documents.

Hi @alexey.noskov,

Could you please verify the conversion of the above HTML to PDF using Aspose.HTML? I do not have Aspose.HTML code handy to process the document. Could you please process the above HTML with Aspose.HTML for Java and share the output file with me? I tried the online conversion and got an unknown error, so please help so that I can proceed to buy the proper license.

Thanks!

@ramesh676 I will move your request to the Aspose.Total category; my colleagues from the Aspose.PDF and Aspose.HTML teams will help you shortly.

Hi @alexey.noskov,

I will be waiting for a response from your colleagues!

Thanks!

Hi @alexey.noskov,

I want to pass a string into the args below while a resource is being loaded

```
if (args.getResourceType() == ResourceType.IMAGE) {
``` 

Is there any way I can pass my own string into these args?

@ramesh676

We tested the scenario using both Aspose.HTML and Aspose.PDF for Java in our environment with the latest versions available.

With Aspose.HTML for Java, we encountered a java.lang.RuntimeException, and this issue has been logged as HTMLJAVA-1595 in our issue tracking system for further analysis.

With Aspose.PDF for Java, we were able to generate the output attached below. However, the alignment was not correct in the generated PDF. Therefore, ticket PDFJAVA-43161 has been logged in order to rectify this issue.

test.pdf (81.9 KB)

We will look further into the details of the logged tickets and inform you as soon as they are resolved. Please be patient and allow us some time.

We apologize for the inconvenience.

@alexey.noskov will be getting back to you about it.

@ramesh676

No, there is no way to pass your own string into these args. The values are read from the HTML that is passed as input to Aspose.Words.
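
Because the callback only sees values taken from the HTML, one hypothetical workaround is to carry the extra string in the resource URL itself (for example `logo.png?tenant=acme`) and parse it back out of `args.getOriginalUri()`. The sketch below is plain Java and shows only the parsing step. The class name `UriTag` and the `tenant` parameter are illustrative assumptions, and it has not been verified here that Aspose.Words passes the query string through unchanged.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: extracts a query parameter from a resource URI string.
final class UriTag {
    /** Returns the decoded value of the named query parameter, or null if it is absent. */
    static String queryParam(String uri, String name) {
        int queryStart = uri.indexOf('?');
        if (queryStart < 0) {
            return null; // no query string at all
        }
        String query = uri.substring(queryStart + 1);
        int fragmentStart = query.indexOf('#');
        if (fragmentStart >= 0) {
            query = query.substring(0, fragmentStart); // drop any #fragment
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            if (key.equals(name)) {
                String raw = eq < 0 ? "" : pair.substring(eq + 1);
                // URLDecoder.decode(String, Charset) requires Java 10 or later.
                return URLDecoder.decode(raw, StandardCharsets.UTF_8);
            }
        }
        return null;
    }
}
```

In a callback this might be used as `UriTag.queryParam(args.getOriginalUri(), "tenant")`. Another option is to give the callback class a constructor parameter, since the callback is your own object and can hold any state you need.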