Document Formatting Problem With HTML text

I am creating a document with HTML text as given in the attachment (test.txt).

Requirements:
I need to retain the indentation of the text.
I need to have the text justified.
I need to remove the space after the text.
And I need to retain the bullet style.

Current code used:
String clauseText = clause.getClauseHTMLText();
clauseText = clauseText + "\r";
builder.getParagraphFormat().setStyle(listStyleLevel1);
// builder.getListFormat().setList(list);
builder.getListFormat().setListLevelNumber(2);
builder.getFont().setSize(12);
builder.getFont().setName("Calibri");
builder.getFont().setBold(false);
builder.getParagraphFormat().setAlignment(ParagraphAlignment.JUSTIFY);
// builder.getParagraphFormat().setLeftIndent(0);
// builder.getParagraphFormat().setRightIndent(0);
builder.getParagraphFormat().setSpaceAfter(0);
// builder.getParagraphFormat().setStyleIdentifier(StyleIdentifier.NORMAL);
clauseText = clauseText.replaceAll("font-family:", "");
clauseText = clauseText.replaceAll("font-size:", "");
clauseText = clauseText.replaceAll("text-align: center", "");
clauseText = clauseText.replaceAll("text-align: left", "");
clauseText = clauseText.replaceAll("text-align: justify", "");
clauseText = clauseText.replaceAll("text-justify: inter-ideograph", "");
clauseText = clauseText.replaceAll("text-indent: -", "text-indent: ");
clauseText = clauseText.replaceAll("

 

", "");

builder.insertHtml(clauseText);
builder.getFont().clearFormatting();

I iterate through this code and write to the document paragraph by paragraph.

Currently I am getting the document as in test.doc.

I am unable to get the formatting described in the requirements.

I am currently using Aspose version 4.0.2.
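
A note on the replaceAll calls above: replaceAll("font-family:", "") removes only the property name and leaves the value behind in the style attribute, which is then no longer valid CSS. The following is a minimal sketch, using only the Java standard library, of one way to drop whole declarations instead. The class name InlineStyleCleaner and the method clean are hypothetical, and the sketch assumes that style attributes are double-quoted and that values contain no semicolons. It is not an Aspose.Words API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Aspose.Words): cleans inline style attributes
// before the HTML is handed to builder.insertHtml(...).
class InlineStyleCleaner {
    // Assumption: style attributes are written with double quotes.
    private static final Pattern STYLE_ATTR =
            Pattern.compile("style\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Properties removed entirely, mirroring the replaceAll calls in the post.
    private static final List<String> DROPPED =
            Arrays.asList("font-family", "font-size", "text-align", "text-justify");

    static String clean(String html) {
        Matcher m = STYLE_ATTR.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            List<String> kept = new ArrayList<>();
            for (String decl : m.group(1).split(";")) {
                int colon = decl.indexOf(':');
                if (colon < 0) {
                    continue; // empty or malformed piece, skip it
                }
                String name = decl.substring(0, colon).trim().toLowerCase(Locale.ROOT);
                String value = decl.substring(colon + 1).trim();
                if (DROPPED.contains(name)) {
                    continue; // drop the whole declaration, name and value
                }
                // Same effect as replacing "text-indent: -" with "text-indent: ".
                if (name.equals("text-indent") && value.startsWith("-")) {
                    value = value.substring(1);
                }
                kept.add(name + ": " + value);
            }
            String replacement = "style=\"" + String.join("; ", kept) + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<p style=\"font-family: Arial; text-align: center; "
                + "margin-left: 36pt; text-indent: -18pt\">Clause</p>";
        // prints <p style="margin-left: 36pt; text-indent: 18pt">Clause</p>
        System.out.println(clean(html));
    }
}
```

In the code above, this would be called as clauseText = InlineStyleCleaner.clean(clauseText) in place of the individual style-related replaceAll calls, just before builder.insertHtml(clauseText). Because only the contents of style attributes are touched, body text that happens to contain a phrase such as "font-size:" is left alone.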

Hi,


Thanks for your inquiry.

Using the latest version of Aspose.Words, i.e. 11.0.0, I was unable to reproduce these issues on my side. The formatting applied in the HTML was preserved when converting to DOC format. I have attached the DOC file generated on my side for your reference. I would also suggest downloading and using the latest version of Aspose.Words from the following link:
http://www.aspose.com/community/files/72/java-components/aspose.words-for-java/default.aspx

I hope this helps.

Best Regards,

Hi,

I am also able to generate a document from the HTML text.

I need the generated document in a specific format.

e.g.

1. text should be justified

2. original text indentation should be maintained.

3. bullets should be maintained.

4. text should be of size 12 and font face Calibri

I need to implement these things for a document generated from HTML text.

Hi,


Thanks for your inquiry. Regarding points 2 & 3, please note that text indentation and bullets are preserved when converting from HTML to DOC format with Aspose.Words v11.0.0. You don’t need to adjust these manually.

Secondly, for text justification and for applying different font formatting, I suggest reading the following articles:
http://docs.aspose.com/display/wordsjava/ParagraphFormat (Check for ParagraphAlignment property)
http://docs.aspose.com/display/wordsjava/Specifying+Formatting

Please let us know if you need more information, we are always glad to help you.

Best Regards,

I am getting HTML in div format. I have read in some of the posts that Aspose does not apply styling set on div tags. Can you confirm whether that is the case?

Hi,


Thank you for your inquiry. It would be great if you could share your scenario here. For more details, please see the following threads:

http://www.aspose.com/community/forums/282387/css-file-where-to-start/showthread.aspx#282387

http://www.aspose.com/community/forums/354487/style-problems-when-converting-html-to-pdf/showthread.aspx#354487

I hope this helps.

Hi,


Thanks for your request. Yes, Aspose.Words does not support styling on DIV elements (e.g. the style attribute) when importing HTML into the DOM (Document Object Model). I have linked your request to the appropriate issue. You will be notified as soon as it is resolved. Sorry for the inconvenience.

Secondly, please note that Aspose.Words interprets DIVs as paragraphs, so you can try applying formatting to these paragraphs once the HTML document is loaded.

Please let me know if I can be of any further assistance.

Best Regards,

I am getting HTML text from FCK editor and trying to create a document from that HTML text. I have attached a sample of the HTML text in test2.txt.

I am able to generate the document attached as test.doc.

I wish to create a document with
font size 12
font face Calibri
the indentation retained as it is in the extracted HTML.
the inserted text should be justified.
and the bullets should be preserved.

Hi,


Thanks for your inquiry. Please note that when you insert content using InsertHtml, all formatting is taken from the HTML snippet. However, I think you can achieve the desired formatting by using the following code:

using System.Drawing;
using Aspose.Words;

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);

// Apply the font settings to every Run while the HTML is being inserted
doc.NodeChangingCallback = new HandleFontNodeChanging();
builder.InsertHtml("HTML FROM FCK EDITOR");
doc.NodeChangingCallback = null;

// To justify text
NodeCollection paras = doc.GetChildNodes(NodeType.Paragraph, true);
foreach (Paragraph p in paras)
{
    p.ParagraphFormat.Alignment = ParagraphAlignment.Justify;
}

doc.Save(@"c:\test\out.docx");

public class HandleFontNodeChanging : INodeChangingCallback
{
    void INodeChangingCallback.NodeInserted(NodeChangingArgs args)
    {
        if (args.Node.NodeType == NodeType.Run)
        {
            Run run = (Run)args.Node;

            // Set your desired font settings
            run.Font.Name = "Calibri";
            run.Font.Color = Color.Red;
            run.Font.Size = 12;
        }
    }

    void INodeChangingCallback.NodeInserting(NodeChangingArgs args)
    {
        // Do nothing
    }

    void INodeChangingCallback.NodeRemoved(NodeChangingArgs args)
    {
        // Do nothing
    }

    void INodeChangingCallback.NodeRemoving(NodeChangingArgs args)
    {
        // Do nothing
    }
}

I hope this helps.

Best Regards,

The issues you have found earlier (filed as WORDSNET-2021) have been fixed in this .NET update and this Java update.

