Aspose.Words for Java - RTF to HTML or PDF to HTML produces HTML with no content

Hi,

I am using aspose-words via Maven (see below for code snippet).

But the code produces boilerplate HTML without content.
I looked into the example code on GitHub.

What am I missing?

I don’t have a license. Is that the reason that my RTF/Word/PDF content is not converted to HTML?

How do I get a trial license?

If this works, I will buy a license.

    <dependency>
        <groupId>com.aspose</groupId>
        <artifactId>aspose-words</artifactId>
        <version>20.11</version>
        <type>pom</type>
    </dependency>
    <dependency>
        <groupId>com.aspose</groupId>
        <artifactId>aspose-words</artifactId>
        <version>20.11</version>
        <classifier>jdk17</classifier>
    </dependency>
    <dependency>
        <groupId>com.aspose</groupId>
        <artifactId>aspose-words</artifactId>
        <version>20.11</version>
        <classifier>javadoc</classifier>
    </dependency>

and calling it as:

public static void RtfToHtml() {
    InputStream in = new FileInputStream("C:\\MyRtfDoc.rtf");
    Document doc = new Document(in);
    RtfSaveOptions rtfSaveOptions = new RtfSaveOptions();
    rtfSaveOptions.setSaveImagesAsWmf(true);
    doc.save("MyRtfDoc.html", rtfSaveOptions);
}

@harshapr,

To convert an RTF document to an HTML file, please use the following Aspose.Words for Java code:

Document doc = new Document("C:\\temp\\MyRtfDoc.rtf");

HtmlSaveOptions htmlSaveOptions = new HtmlSaveOptions(SaveFormat.HTML);
htmlSaveOptions.setPrettyFormat(true);

doc.save("C:\\temp\\awjava-20.11.html", htmlSaveOptions); 

If you want to test ‘Aspose.Words for Java’ without the evaluation version limitations, then you can also request a 30-day Temporary License. Please refer to How to get a Temporary License?

For more information, see: Evaluate Aspose.Words for Java

Hi,

I just tried the above code, but I am not getting the expected output. Can you help me figure out which HtmlSaveOptions configuration would put every new tag on a new line? The image below shows the Aspose output vs. the expected output.

@Waadhaf Unfortunately, there is currently no way to get output that looks like your expected output. We will consider improving the behavior of the HtmlSaveOptions.PrettyFormat option. I have logged this issue as WORDSNET-23892. We will keep you informed regarding this improvement.


@Waadhaf We have completed the analysis and decided to close this issue as Won't Fix. The two HTML fragments are not equivalent in the general case, because an HTML parser turns whitespace (line breaks and indentation) between inline elements into space characters:

<html>
    <p><span>ABC</span><span style='font-weight:bold'>DEF</span><span>GHI</span></p>
    <p>
        <span>ABC</span>
        <span style='font-weight:bold'>DEF</span>
        <span>GHI</span>
    </p>
</html>

Renders as follows:

Hi @alexey.noskov, the picture you attached is not loading.

But the images I attached show exactly the same HTML created using Aspose. One is pretty-printed using a Notepad++ plugin, the other isn’t. I am not sure what you mean by them not being equivalent.

@Waadhaf The provided HTML is rendered as the following:

ABCDEFGHI
ABC DEF GHI

Note the spaces between the groups of letters.
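
For anyone who wants to reproduce the difference without a browser, here is a minimal sketch in plain Java (no Aspose API involved). It only approximates what a browser does: it strips the tags and collapses each run of whitespace into a single space, which is roughly how whitespace between inline elements is treated under the default CSS `white-space` setting. The class name `InlineWhitespaceDemo` and the helper `visibleText` are made up for this illustration.

```java
import java.util.regex.Pattern;

class InlineWhitespaceDemo {
    // Matches any tag such as <p>, </span> or <span style='...'>.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");
    // Matches a run of spaces, tabs and line breaks.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    /**
     * Rough approximation of the text a browser would show for one paragraph:
     * tags are removed, whitespace runs collapse to one space, and leading
     * and trailing whitespace is dropped. This is a simplification, not a
     * real HTML parser.
     */
    public static String visibleText(String paragraphHtml) {
        String withoutTags = TAG.matcher(paragraphHtml).replaceAll("");
        return WHITESPACE.matcher(withoutTags).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        // The paragraph as written on a single line.
        String compact =
            "<p><span>ABC</span><span style='font-weight:bold'>DEF</span><span>GHI</span></p>";

        // The same paragraph after a pretty-printer put each tag on its own line.
        String indented = String.join("\n",
            "<p>",
            "    <span>ABC</span>",
            "    <span style='font-weight:bold'>DEF</span>",
            "    <span>GHI</span>",
            "</p>");

        System.out.println(visibleText(compact));   // ABCDEFGHI
        System.out.println(visibleText(indented));  // ABC DEF GHI
    }
}
```

The line breaks and indentation added between the `<span>` elements survive as single spaces in the visible text, so reformatting the markup this way can change how the document reads.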