Word Documents Comparison - Preserve Drawing & Shape Content Layout - Convert DOCX to HTML - Java API

Hello,

Hope you are doing well!

I am trying to compare two documents in Internet Explorer. As soon as I click the “Compare” button, the comparison screen shows the results for only a second or two and then goes blank. I see this issue only in IE and Edge; it does not happen in Google Chrome. Below is the code I use to generate the “difference” file and display the comparison.

Document doc1 = new Document(path);
doc1.save(path+".docx", SaveFormat.DOCX);

Document doc2 = new Document(path);
doc2.save(path+".docx", SaveFormat.DOCX);

if(doc1.hasRevisions())
{
doc1.acceptAllRevisions();
}

if(doc2.hasRevisions())
{
doc2.acceptAllRevisions();
}

com.aspose.words.HtmlSaveOptions saveOptions = new com.aspose.words.HtmlSaveOptions(SaveFormat.HTML);
saveOptions.setPrettyFormat(true);
saveOptions.setExportImagesAsBase64(true);
saveOptions.setExportXhtmlTransitional(true);
saveOptions.setExportRoundtripInformation(false);

doc2.compare(doc1, "user", new java.util.Date());
doc2.getLayoutOptions().getRevisionOptions().setInsertedTextColor(RevisionColor.BRIGHT_GREEN);
doc2.getLayoutOptions().getRevisionOptions().setDeletedTextColor(RevisionColor.RED);
doc2.save(path, saveOptions);

com.aspose.words.Document temp_doc1 = new com.aspose.words.Document(temp_path+".docx");

com.aspose.words.Document temp_doc2 = new com.aspose.words.Document(temp_path+".docx");

//save temp_doc1 and temp_doc2 in html format…
temp_doc1.save(temp_path+".htm", SaveFormat.HTML_FIXED);
temp_doc2.save(temp_path+".htm", SaveFormat.HTML_FIXED);

Note: I am using aspose-words version 19.7

I am also attaching the documents that I am trying to compare for your reference. The upload did not accept the .docx format, so I have converted those files to .pdf. I’m looking forward to hearing from you.

Regards,
Amey

doc1.pdf (658.0 KB)
doc2.pdf (580.0 KB)
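
As posted, the snippet above saves both temporary documents to the same `temp_path+".htm"` target, so the second save would overwrite the first if the real code does the same (the paths were presumably simplified for the post). A minimal sketch of deriving a distinct `.htm` name per input file is shown below. `OutputPaths` and `htmlPathFor` are hypothetical names used only for illustration; they are not part of Aspose.Words, and only the Java standard library is used.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not an Aspose.Words API): derives one .htm path per input document.
final class OutputPaths {
    private OutputPaths() {}

    /** Swaps the extension of the input file name for ".htm", keeping the parent directory. */
    static Path htmlPathFor(Path input) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // A leading dot (hidden file) or no dot at all means there is no extension to strip.
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return input.resolveSibling(base + ".htm");
    }

    public static void main(String[] args) {
        // Two different inputs yield two different outputs, so neither save overwrites the other.
        System.out.println(htmlPathFor(Paths.get("docs", "doc1.docx")));
        System.out.println(htmlPathFor(Paths.get("docs", "doc2.docx")));
    }
}
```

The resulting path could then be passed as a string to the `save` calls in the snippet, for example `temp_doc1.save(OutputPaths.htmlPathFor(Paths.get(inputFile)).toString(), SaveFormat.HTML_FIXED)`, where `inputFile` stands for whatever path the application actually uses.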

@avaidya,

Thanks for your inquiry. Have you also tried the latest version of Aspose.Words for Java i.e. 19.11 on your end? In case the problem still remains, please ZIP and upload your input Word documents and Aspose.Words 19.11 generated PDF files causing the issue here for testing. We will then investigate the issue on our end and provide you more information.

How are you comparing documents inside the IE or Edge browsers? We see no issue when viewing your “doc1.pdf” and “doc2.pdf” files on our end. Please make sure you have enough memory (RAM) installed on your machine. Please also list the complete steps that we can perform inside the IE or Edge browsers to observe the same behavior. Thanks for your cooperation.

Hello Hafeez,

Thank you for your quick response.

As you suggested I tried version 19.11 of aspose-words and I am still able to reproduce the issue on my end. Also, I made sure that I have enough memory installed on my machine.

Steps: I open my application in IE, check two checkboxes, and click the Compare button (see the image below). After that, a new window is supposed to open showing the differences between the two documents.

compare docs.JPG (11.8 KB)

NOTE - As per my investigation this is a document-specific issue because I am able to compare other plain simple documents in IE without any problem.

I am attaching a ZIP containing original .docx files that I am trying to compare.

docs to compare.zip (429.5 KB)

Regards,
Amey

@avaidya,

Please check which of the following HTML_FIXED files go blank when viewed in the Internet Explorer or Microsoft Edge web browsers on your end. Which IE version are you seeing this problem with? Have you tried upgrading your IE and Edge browsers to the latest versions?

Aspose.Words for Java 19.11 code:

Document doc = new Document("E:\\Temp\\docs to compare\\doc1.docx");
doc.save("E:\\Temp\\docs to compare\\doc1.html", SaveFormat.HTML_FIXED);

Aspose.Words for Java 19.11 code:

Document doc = new Document("E:\\Temp\\docs to compare\\doc2.docx");
doc.save("E:\\Temp\\docs to compare\\doc2.html", SaveFormat.HTML_FIXED);

Aspose.Words for Java 19.11 code:

Document doc1 = new Document("E:\\Temp\\docs to compare\\doc1.docx");
Document doc2 = new Document("E:\\Temp\\docs to compare\\doc2.docx");

if (doc1.hasRevisions()) {
    doc1.acceptAllRevisions();
}

if (doc2.hasRevisions()) {
    doc2.acceptAllRevisions();
}

doc1.compare(doc2, "user", new java.util.Date());
doc1.getLayoutOptions().getRevisionOptions().setInsertedTextColor(RevisionColor.BRIGHT_GREEN);
doc1.getLayoutOptions().getRevisionOptions().setDeletedTextColor(RevisionColor.RED);

doc1.save("E:\\Temp\\docs to compare\\doc1-doc2-compare_HTML_FIXED.html", SaveFormat.HTML_FIXED);

Aspose.Words for Java 19.11 code:

Document doc1 = new Document("E:\\Temp\\docs to compare\\doc1.docx");
Document doc2 = new Document("E:\\Temp\\docs to compare\\doc2.docx");

if (doc1.hasRevisions()) {
    doc1.acceptAllRevisions();
}

if (doc2.hasRevisions()) {
    doc2.acceptAllRevisions();
}

doc2.compare(doc1, "user", new java.util.Date());
doc2.getLayoutOptions().getRevisionOptions().setInsertedTextColor(RevisionColor.BRIGHT_GREEN);
doc2.getLayoutOptions().getRevisionOptions().setDeletedTextColor(RevisionColor.RED);

doc2.save("E:\\Temp\\docs to compare\\doc2-doc1-compare_HTML_FIXED.html", SaveFormat.HTML_FIXED);

Can you please also ZIP and share the HTML_FIXED file that goes blank in the IE and Edge browsers here for further testing, along with the simplified Aspose.Words for Java source code you used to produce it? Thanks for your cooperation.

Hello Hafeez,

Appreciate your service.
I am using Internet Explorer version 11, and I checked all the HTML_FIXED files you provided. I am able to open all of them in IE successfully. However, I cannot use the “SaveFormat.HTML_FIXED” parameter in the save method as you suggested, because it was causing problems with some other documents. I am currently using the code that I mentioned in my first post (on 26th Nov). I am using the “SaveFormat.HTML” parameter in the save method, as below:

com.aspose.words.HtmlSaveOptions saveOptions = new com.aspose.words.HtmlSaveOptions(SaveFormat.HTML);
saveOptions.setPrettyFormat(true);
saveOptions.setExportImagesAsBase64(true);
saveOptions.setExportXhtmlTransitional(true);
saveOptions.setExportRoundtripInformation(false);

doc2.compare(doc1, "user", new java.util.Date());
doc2.getLayoutOptions().getRevisionOptions().setInsertedTextColor(RevisionColor.BRIGHT_GREEN);
doc2.getLayoutOptions().getRevisionOptions().setDeletedTextColor(RevisionColor.RED);
doc2.save(path, saveOptions);

You will find above code in my original post as well.

Important:
If I pass saveOptions as a parameter to the save method (as shown above), I do not get a blank screen in IE, and the comparison between the two documents (provided to you) shows up fine in IE except for one problem: some components in the document are not rendered correctly. For example, some data spills out of the table and some components overlap each other. Please refer to the images below to get a better idea.

img1.JPG (22.7 KB)

img2.JPG (66.4 KB)

I believe this is happening because the saveOptions object is created with the SaveFormat.HTML parameter. Can you suggest any method that I can apply to saveOptions to tackle the problems shown in the images above?

Regards,
Amey

@avaidya,

We are checking this scenario and will get back to you soon.

@avaidya,

Thanks for being patient. We tested the above scenarios and managed to reproduce the same problems on our end. We have logged the following issues in our issue tracking system so that they can be corrected.

WORDSNET-19701: Content inside Drawing overlaps when converting to HTML (img2.jpg)
WORDSNET-19702: Content goes beyond the boundaries of a Drawing Box in HTML (img1.jpg)

We will further look into the details of these issues and will keep you updated on the status of corrections. We apologize for your inconvenience.

Regarding the blank page issue in Internet Explorer, please check if the following HTML file goes blank when viewing with Internet Explorer or Microsoft Edge web browsers on your end?

Aspose.Words for Java 19.12 code used on our end to produce above HTML:

Document doc1 = new Document("E:\\Temp\\docs to compare\\doc1.docx");
Document doc2 = new Document("E:\\Temp\\docs to compare\\doc2.docx");

if (doc1.hasRevisions()) {
    doc1.acceptAllRevisions();
}

if (doc2.hasRevisions()) {
    doc2.acceptAllRevisions();
}

doc2.compare(doc1, "user", new java.util.Date());
doc2.getLayoutOptions().getRevisionOptions().setInsertedTextColor(RevisionColor.BRIGHT_GREEN);
doc2.getLayoutOptions().getRevisionOptions().setDeletedTextColor(RevisionColor.RED);

HtmlSaveOptions saveOptions = new HtmlSaveOptions(SaveFormat.HTML);
saveOptions.setPrettyFormat(true);
saveOptions.setExportImagesAsBase64(true);
saveOptions.setExportXhtmlTransitional(true);
saveOptions.setExportRoundtripInformation(false);

doc2.save("E:\\Temp\\docs to compare\\doc2.html", saveOptions);

Hello Awais,

Hope you are doing well! Thank you for creating tickets for those issues in your issue tracking system. Regarding the HTML file that you provided above, I am able to open it in both Internet Explorer and Microsoft Edge, and it does not go blank. However, it has those two issues: WORDSNET-19701 and WORDSNET-19702.

Do you have any update for me on tickets you created in your system or estimated time that could be required to fix those issues?

Thank you,
Amey

@avaidya,

I am afraid your issues (WORDSNET-19701 and WORDSNET-19702) are currently pending analysis and are in the queue. There are no estimates available at the moment. Once the analysis of these issues is complete, we may be able to calculate and share their ETAs with you. We apologize for any inconvenience.

@avaidya,

Regarding WORDSNET-19701, this issue occurs because the flow HTML format does not support multi-column layout. I am afraid we do not have a good solution to this issue so far, and it is not going to be fixed anytime soon.

You may try SaveFormat.HTML_FIXED instead.

Can you please provide more details about this issue, and ZIP and share the Word document you are getting this problem with? Please also ZIP and share the output HTML_FIXED file and a comparison screenshot highlighting the problematic area in it here for our reference. Thanks for your cooperation.
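
Since flow HTML reportedly cannot represent multi-column layout, while HTML_FIXED was said earlier in the thread to cause problems with other documents, one possible approach is to pick the output format per document. The sketch below shows only that decision step, using plain Java. `HtmlFormatChooser`, `Target`, and `choose` are hypothetical names, not Aspose.Words APIs. It assumes the caller has already collected a column count for each section; in Aspose.Words that count might be read via `section.getPageSetup().getTextColumns().getCount()`, which should be verified against the API reference for the version in use. It is also an assumption that section-level columns are what triggers the overlap; multi-column content inside a drawing would not be detected this way.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical decision helper (not an Aspose.Words API).
final class HtmlFormatChooser {
    /** The two output styles discussed in the thread. */
    enum Target { FLOW_HTML, FIXED_HTML }

    private HtmlFormatChooser() {}

    /**
     * Returns FIXED_HTML if any section uses more than one text column,
     * otherwise FLOW_HTML. The column counts are supplied by the caller.
     */
    static Target choose(List<Integer> columnsPerSection) {
        for (int columns : columnsPerSection) {
            if (columns > 1) {
                return Target.FIXED_HTML;
            }
        }
        return Target.FLOW_HTML;
    }

    public static void main(String[] args) {
        System.out.println(choose(Arrays.asList(1, 1))); // prints FLOW_HTML
        System.out.println(choose(Arrays.asList(1, 2))); // prints FIXED_HTML
    }
}
```

A caller could then map `FLOW_HTML` to the existing `HtmlSaveOptions` path and `FIXED_HTML` to `SaveFormat.HTML_FIXED`, so that only documents with multi-column sections take the fixed-layout route.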