Query regarding default font

Hi Team,

We are using our existing MS Word template to write content. The template has its own styles (classes) based on tags such as Heading 1, Heading 2, Default, Body, etc.

We expect content written into the document to pick up the template's styles. However, when we add or update HTML content in the Word document, it uses the default font, “Times New Roman”.

Is there any way to take style and formatting from a source document?

Please share an API or code sample.

query for font-formatting.zip (171.3 KB)

Please find attached the template and its output.

Thanks
Purushottam Sadh

@purusadh

If you are inserting HTML into the document using DocumentBuilder.InsertHtml, please pass true as the second parameter (useBuilderFormatting), so that the formatting specified in the DocumentBuilder is used as the base formatting for the inserted HTML.

If the problem persists, please create a standalone console application (source code without compilation errors) that helps us reproduce the problem on our end, and attach it here for testing. We will investigate the issue and provide you with more information.

Hi Tahir,

Thanks for the update.

The above solution is not working for me.

Please find attached the sample code with input and output.

I expect the output to follow the template formatting, but in my case it uses the default font and formatting.

Thanks,
Purushottam Sadh
Query for template formatting.zip (258.4 KB)

@purusadh

Please use the following code example to get the desired output.

Document doc = new Document(MyDir + "category-1.docx");
DocumentBuilder builder = new DocumentBuilder(doc);
builder.moveToDocumentEnd();
builder.insertHtml(readAllText(MyDir + "sae_temp.html"), true);
doc.save(MyDir + "output.docx");

private static String readAllText(String filePath)
{
    String content = "";
    try
    {
        content = new String ( Files.readAllBytes( Paths.get(filePath) ) );
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }

    return content;
}
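
A side note on the readAllText helper above: it catches IOException and returns an empty string, and new String(byte[]) without a charset uses the platform default charset on Java versions before 18. The following is only a sketch of a stricter variant, assuming the HTML files are UTF-8 encoded (the thread does not say which encoding they use). The class name HtmlFileReader is hypothetical and is not part of Aspose.Words.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HtmlFileReader {

    /**
     * Reads a whole file as UTF-8 text.
     * Assumption: the HTML files are UTF-8 encoded; change the charset if they are not.
     */
    public static String readAllText(String filePath) {
        try {
            byte[] bytes = Files.readAllBytes(Paths.get(filePath));
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Fail loudly instead of returning "" and silently inserting nothing.
            throw new UncheckedIOException("Cannot read " + filePath, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary HTML file, then read it back.
        Path tmp = Files.createTempFile("sample", ".html");
        Files.write(tmp, "<p>caf\u00e9</p>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAllText(tmp.toString()));
        Files.delete(tmp);
    }
}
```

The returned string can be passed to builder.insertHtml(..., true) exactly as in the code above. Throwing an unchecked exception is a design choice: a missing file stops the run instead of producing a document with no inserted content.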

Hi Tahir,

I am not inserting HTML into the document using DocumentBuilder.InsertHtml.

In my actual scenario, I append Document objects using the code below, which I have already shared with you.

dstDoc.appendDocument(doc, ImportFormatMode.USE_DESTINATION_STYLES);

For example:
for (int i = 1; i < 10; i++) {
    // binaryArray is the byte[] loaded from the database for this iteration
    srcDoc = AsposeUtils.convertXMLTODocument(binaryArray);
    dstDoc.appendDocument(srcDoc, ImportFormatMode.USE_DESTINATION_STYLES);
}
dstDoc.save(Constants.SOURCE_DIR + "output_2.docx", SaveFormat.DOCX);


public static Document convertXMLTODocument(byte[] xmlData) {

	try {
		ByteArrayInputStream docInStream =  new ByteArrayInputStream(xmlData);		
		Document outDoc = new Document(docInStream);		
		return outDoc;
	} catch (Exception e) {
		// TODO Auto-generated catch block
		e.printStackTrace();
	}
	return null;
}

Note: the byte[] data is stored in a database.

Thanks
Purushottam

@purusadh

Please note that Aspose.Words mimics the behavior of MS Word. If you open the HTML document in MS Word and copy its content into “category-1.docx”, you will get the same output as generated by Aspose.Words. To get the desired output, you need to use DocumentBuilder.InsertHtml as shared in my previous post.

Thanks for the update.

The code below works for me, but the table of contents is not generated properly.

builder.insertHtml("html_content", true);

Please find attached the demo code I used for the TOC, along with its output.

Thanks
Purushottam

Query for generate document with table of contents.zip (270.0 KB)

@purusadh

Please make sure that you have moved the cursor to the end of the document. In your code, you insert the HTML after calling the Document.updateFields method, so the TOC is built before the inserted headings exist. Please call this method after inserting the HTML, just before saving the document.

Hi Tahir,

After applying the suggested changes, it is still not working. The alignment of the whole document is disturbed once the table of contents is added.

com.aspose.words.DocumentBuilder builder = new com.aspose.words.DocumentBuilder(dstDoc);
builder.moveToDocumentEnd();

// Need to add table of contents
ParagraphFormat paragraphFormat = builder.getParagraphFormat();
paragraphFormat.setAlignment(ParagraphAlignment.CENTER);

builder.writeln();
builder.writeln("TABLE OF CONTENTS");
builder.writeln();
builder.insertTableOfContents(" \\o \"1-3\" \\h \\z \\u ");

builder.insertBreak(BreakType.PAGE_BREAK);

// Below for loop for getting HTML content from database
for (int i = 1; i <= 3; i++) {
    File input = new File(Constants.SOURCE_DIR + docName + i + ".html");
    org.jsoup.nodes.Document doc1 = Jsoup.parse(input, "UTF-8", "http://example.com/");
    builder.insertHtml(doc1.html(), true);
}
dstDoc.updateFields();
dstDoc.save(Constants.SOURCE_DIR + "output.docx", SaveFormat.DOCX);

Please find attached output with java code.

Thanks
Purushottam Sadh

Query for generate document with table of contents.zip (270.4 KB)

@purusadh

Could you please share the following resources here for testing?

  • Please share the screenshots of problematic section of output document.
  • Please share the expected output document.
  • Please share the HTML returned by doc1.html().

As soon as you get these pieces of information ready, we will start investigation into your issue and provide you more information. Thanks for your cooperation.

PS: To attach these resources, please zip and upload them.

Hi Tahir,

I have attached all the information you requested.

-> Please share the screenshots of a problematic section of the output document.
If you look at the output document, you will see what is going wrong. I have also attached a screenshot.

-> Please share the expected output document.
The output should have proper formatting. Please find the expected result in the attachment.

-> Please share the HTML returned by doc1.html().
I have attached three input HTML files; you can use them in the program as the HTML content.
(sae_temp_1.html, sae_temp_2.html, sae_temp_3.html)

All the documents you mentioned in your post are included.

Thank you!!

Query for generate document with table of contents.zip (407.8 KB)

@purusadh

We have tested the scenario using the latest version of Aspose.Words for Java (19.10) with the following modified code and could not reproduce the issue, so please use Aspose.Words for Java 19.10. This code example clears the paragraph formatting after inserting the TOC field and the page break, so that the centered alignment set for the TOC title is not used as the base formatting for the inserted HTML. We have attached the output DOCX to this post for your reference. output.zip (82.4 KB)

com.aspose.words.Document dstDoc = new com.aspose.words.Document(MyDir + "category-3.docx");

com.aspose.words.DocumentBuilder builder = new com.aspose.words.DocumentBuilder(dstDoc);

builder.moveToDocumentEnd();

// Need to add table of contents
ParagraphFormat paragraphFormat = builder.getParagraphFormat();

paragraphFormat.setAlignment(ParagraphAlignment.CENTER);
builder.writeln();
builder.writeln("TABLE OF CONTENTS");
builder.writeln();
builder.insertTableOfContents(" \\o \"1-3\" \\h \\z \\u ");


builder.insertBreak(BreakType.PAGE_BREAK);

builder.getParagraphFormat().clearFormatting();
builder.writeln();
builder.getParagraphFormat().setAlignment(ParagraphAlignment.LEFT);

// Below for loop for getting HTML content from database
for (int i = 1; i <= 3; i++) {
    builder.insertHtml(readAllText(MyDir + "sae_temp_"+i+".html"), true);
}

dstDoc.updateFields();
dstDoc.save(MyDir + "output.docx");
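
For readers unfamiliar with the switch string passed to insertTableOfContents above, these are standard Word TOC field switches: \o "1-3" builds the table from the built-in Heading 1 to Heading 3 styles, \h makes the entries hyperlinks, \z hides tab leaders and page numbers in Web Layout view, and \u uses the applied paragraph outline level. The sketch below assembles such a string from named options. TocSwitches and its build method are hypothetical helpers for illustration only, not an Aspose.Words API, and the leading and trailing spaces of the string used in this thread are not reproduced.

```java
public class TocSwitches {

    /**
     * Builds a Word TOC field switch string, for example: \o "1-3" \h \z \u
     * Hypothetical helper; it only concatenates text and calls no Aspose.Words API.
     */
    public static String build(int firstLevel, int lastLevel,
                               boolean hyperlinks,
                               boolean hideInWebLayout,
                               boolean useOutlineLevels) {
        if (firstLevel < 1 || lastLevel > 9 || firstLevel > lastLevel) {
            throw new IllegalArgumentException("Heading levels must satisfy 1 <= first <= last <= 9");
        }
        StringBuilder sb = new StringBuilder();
        // \o "first-last": include paragraphs formatted with these heading levels
        sb.append("\\o \"").append(firstLevel).append('-').append(lastLevel).append('"');
        if (hyperlinks) {
            sb.append(" \\h");   // TOC entries become hyperlinks
        }
        if (hideInWebLayout) {
            sb.append(" \\z");   // hide tab leader and page numbers in Web Layout view
        }
        if (useOutlineLevels) {
            sb.append(" \\u");   // use the applied paragraph outline level
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints: \o "1-3" \h \z \u
        System.out.println(build(1, 3, true, true, true));
    }
}
```

The result of TocSwitches.build(1, 3, true, true, true) could be passed to builder.insertTableOfContents(...) in place of the hand-written literal.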

Thanks a lot, Tahir, for your support.

@purusadh

Thanks for your feedback. Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.