Output Document as HTML "converts" a paragraph to an anchor!


#1

We have a very simple Word file that contains 2 paragraphs, each wrapped in its own bookmark. When we save the file to HTML, the first paragraph gets “converted” to an anchor within the 2nd paragraph and the bookmark end gets lost!

If I save the SAME Word file to HTML from within Word, it generates what I would expect (two <p> tags).

I have a stand-alone app that reproduces this, and I have tested it with the latest release of Aspose.Words.


#2

@bsant

Thanks for your inquiry. Please ZIP and attach your input document and code here for testing. We will investigate the issue on our side and provide you more information.


#3

I’ve attached the zipped project that contains the code and sample file. You’ll just need to grab the NuGet packages.

Thanks,
Andy

WordToHtml.zip (70.1 KB)


#4

@bsant

Thanks for sharing the document. We are working on your query and will get back to you soon.


#5

@bsant

You are facing the expected behavior of Aspose.Words. The HTML <a> and </a> tags exist in the output document. Please check the attached image for detail.


#6

Hi Tahir,
I agree that syntactically it’s right, but that is NOT the problem. That first tag should not be there at all. It should be a <p> tag that is a sibling to the <p> tag that is there. I’ve attached what Microsoft Word will generate and also what I expected Aspose to generate to show the difference.

Data.zip (30.8 KB)

Thanks,
Andy


#7

@bsant

Thanks for your inquiry. We have logged this problem in our issue tracking system as WORDSNET-17612. You will be notified via this forum thread once this issue is resolved.

We apologize for your inconvenience.


#8

Hi,
Is there any news on this issue? We have a very large customer who needs this fixed and it’s been 10 months since we reported the defect.

Thanks,
Andy


#9

@bsant

Unfortunately, there is no update available at the moment. We are investigating this issue. We will inform you via this forum thread once there is an update available on it.


#10

Is there any way to get visibility into how that investigation is going? Again, it’s been 10 months! We are getting pressure from our customer to get this fixed.


#11

@bsant

Please accept my apologies for your inconvenience.

Unfortunately, you cannot access our issue tracking system. However, you can ask for the update in Aspose.Words’ forum.

Please allow us some time on this issue. We will share an update as soon as possible.


#12

I’ve still not heard anything and our customer needs a reply. Can we get a status on this, please?


#13

@bsant

We regret to share with you that there is no update available on this issue at the moment. Please note that you reported this issue in the free support forum, so it is treated with normal priority. We will be sure to inform you once there is any update available on it.

We apologize for your inconvenience.


#14

@bsant

Please note that the issue you are facing is actually not a bug in Aspose.Words. So, we have closed this issue (WORDSNET-17612) as ‘Not a Bug’.

Please check the following analysis of this issue.

Bookmark with name “Q_bfc82640-643c-4146-9eeb-ba4679583415” starts in first paragraph and finishes in last paragraph.

The first paragraph is not exported to HTML because it is empty and contains only the beginning of the bookmark. An empty paragraph is ignored by the browser.

MS Word exports the first paragraph but it does not affect the display. MS Word has its own approach for bookmark export and it does not look correct.


#15

OK, I totally disagree with this. Let’s try a different file, with a more detailed explanation of what the problem is. The attached project just contains one Word file (BookmarksTest.docx) and the output file (BookmarksTest.html) generated by running the code using the latest Word API.

I’ve included a Word document (ProblemDetails.docx) in the ZIP file that shows exactly what the problems are and explains them too. I’ve attached the ZIP file:
WordToHtml.zip (343.6 KB)


#16

@bsant

Please unzip your document and check the document.xml. We have attached the screenshot for your kind reference.

In the screenshot, you can see that the bookmarkStart is in the first paragraph and the bookmarkEnd is in the second paragraph.

You can check this by opening the HTML in a browser.

The case is the same for your second document.
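For anyone who would rather not read document.xml by hand, the check described above can be sketched in Python using only the standard library. This is a rough illustration, not an Aspose.Words feature: the function name `bookmark_positions` is hypothetical, and the paragraph counting is simplified (every `w:p` element is counted in document order, so paragraphs nested in text boxes or tables are counted too, and a bookmark marker sitting between two paragraphs is attributed to the preceding one).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def bookmark_positions(docx):
    """Map each bookmark name to (start paragraph index, end paragraph index).

    `docx` is a path or a binary file-like object. Indexes are zero-based.
    The end index is None when no matching bookmarkEnd is found.
    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    para_index = -1
    names, starts, ends = {}, {}, {}
    # root.iter() walks elements in document order, so a bookmark marker
    # is seen after the <w:p> element that contains it.
    for el in root.iter():
        if el.tag == f"{{{W}}}p":
            para_index += 1
        elif el.tag == f"{{{W}}}bookmarkStart":
            bookmark_id = el.get(f"{{{W}}}id")
            names[bookmark_id] = el.get(f"{{{W}}}name")
            starts[bookmark_id] = para_index
        elif el.tag == f"{{{W}}}bookmarkEnd":
            ends[el.get(f"{{{W}}}id")] = para_index

    return {names[b]: (starts[b], ends.get(b)) for b in starts}
```

Calling `bookmark_positions("BookmarksTest.docx")` on the attached file would list, for each bookmark, the paragraph holding its start and the paragraph holding its end, which is the same information the screenshot shows.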


#17

Thanks for the reply, but that does not look like the same document I submitted as part of my last reply - that bookmark is not in the document.

Did you look at the last ZIP file and Word document detailing the problem that I added to my last reply? WordToHtml.zip.


#18

@bsant

Yes, the second document is not the same. However, the test case is the same for both documents. The BookmarkStart and BookmarkEnd are not in the same paragraph. Please unzip your document and check the document.xml for detail.


#19

They never are; that is how it works. The BookmarkEnd is in the next paragraph. The problem is that Aspose does not output the second paragraph at all, so it tries to adjust the bookmarks and fails. It has correctly moved the BookmarkEnd from the beginning of the 2nd paragraph to the beginning of the 3rd, but then it moves the BookmarkStart from the beginning of the excluded 2nd paragraph to the beginning of the 3rd paragraph (that would potentially be OK), and then it NEVER adds the BookmarkEnd for that bookmark. So now it looks like that 2nd bookmark runs from the beginning of the 3rd paragraph to the end of the document.
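One way to make this visible in the generated HTML is to list every named anchor together with the paragraph it sits in and the text it wraps. The sketch below uses only Python's standard library and assumes that bookmarks appear in the output as `<a name="...">` elements; that assumption, and the names `BookmarkAnchorFinder` and `named_anchors`, are illustrative and not taken from the Aspose.Words documentation.

```python
from html.parser import HTMLParser


class BookmarkAnchorFinder(HTMLParser):
    """Collect <a name="..."> anchors with their paragraph index and text.

    Assumption: exported bookmarks are written as named anchors.
    """

    def __init__(self):
        super().__init__()
        self.para_index = -1   # zero-based index of the current <p>
        self.stack = []        # open <a> elements (None for unnamed ones)
        self.anchors = []      # results, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.para_index += 1
        elif tag == "a":
            name = dict(attrs).get("name")
            if name is None:
                self.stack.append(None)
            else:
                entry = {"name": name, "paragraph": self.para_index, "text": ""}
                self.anchors.append(entry)
                self.stack.append(entry)

    def handle_endtag(self, tag):
        if tag == "a" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Text belongs to every named anchor that is still open.
        for entry in self.stack:
            if entry is not None:
                entry["text"] += data


def named_anchors(html):
    """Return a list of dicts describing each named anchor in `html`."""
    finder = BookmarkAnchorFinder()
    finder.feed(html)
    finder.close()
    return finder.anchors
```

Running `named_anchors` on the contents of BookmarksTest.html and on the HTML saved from Word would show, side by side, which paragraph each anchor landed in and how much text each one encloses.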


#20

@bsant

Thanks for your inquiry. Please note that this is not a bug. The analysis of this issue was already shared in this thread. Please check my following reply on this issue.