We are converting a PDF to DOCX and applying our template-based styles. We noticed that the hyperlinks in the output do not cover the exact text they cover in the PDF, so the links spill over onto neighbouring normal text as well.
While reading the paragraphs of the DOCX file, we tried to identify the text that contains the hyperlinks and set the URL on it.
After setting the URLs, the output DOCX has unwanted text (such as ‘HYPERLINK’) appended to the link text, and the complete URL is appended for each character of the text.
We tried the approaches below but were unable to achieve this. Could you please help us identify the hyperlinks in a DOCX file?
a. After reading the document, we tried to get all the hyperlinks in the document.
b. We tried to match the text which contains the hyperlink and set the URL.
c. We are unable to match the text with the text containing the actual hyperlinks.
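
One way to see where each hyperlink actually starts and ends is to read the DOCX package directly. The sketch below is a minimal, library-independent illustration in Python (standard library only), not the API of any particular conversion tool. It assumes the links are stored in one of the two usual WordprocessingML forms: a `w:hyperlink` element whose `r:id` points to an entry in `word/_rels/document.xml.rels`, or a field whose instruction text reads `HYPERLINK "url"`. The second form may be where the literal ‘HYPERLINK’ comes from, if field-code runs are being read as ordinary text. The function names `extract_hyperlinks` and `hyperlinks_in_docx` are hypothetical.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# Standard OOXML namespaces.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"


def _q(ns, tag):
    """Build a namespace-qualified tag name for ElementTree."""
    return "{%s}%s" % (ns, tag)


def extract_hyperlinks(document_xml, rels_xml):
    """Return (display_text, url) pairs in document order (hypothetical helper)."""
    # Map relationship ids (e.g. "rId5") to their target URLs.
    rels = {
        rel.get("Id"): rel.get("Target")
        for rel in ET.fromstring(rels_xml).iter(_q(PKG_REL, "Relationship"))
    }
    links = []
    state, instr, shown = None, [], []
    for node in ET.fromstring(document_xml).iter():
        if node.tag == _q(W, "hyperlink"):
            # Form 1: <w:hyperlink r:id="..."> wraps exactly the linked runs.
            text = "".join(t.text or "" for t in node.iter(_q(W, "t")))
            links.append((text, rels.get(node.get(_q(R, "id")))))
        elif node.tag == _q(W, "fldChar"):
            # Form 2: begin -> field code -> separate -> visible text -> end.
            kind = node.get(_q(W, "fldCharType"))
            if kind == "begin":
                state, instr, shown = "code", [], []
            elif kind == "separate":
                state = "result"
            elif kind == "end":
                match = re.search(r'HYPERLINK\s+"([^"]*)"', "".join(instr))
                if match:
                    links.append(("".join(shown), match.group(1)))
                state = None
        elif node.tag == _q(W, "instrText") and state == "code":
            instr.append(node.text or "")  # field code, not display text
        elif node.tag == _q(W, "t") and state == "result":
            shown.append(node.text or "")  # text the reader actually sees
    return links


def hyperlinks_in_docx(path):
    """Read the main document part and its relationships from a .docx file."""
    with zipfile.ZipFile(path) as docx:
        return extract_hyperlinks(
            docx.read("word/document.xml"),
            docx.read("word/_rels/document.xml.rels"),
        )
```

Calling `hyperlinks_in_docx("output.docx")` (the file name is assumed) returns the display text and URL of each link, which can be compared with the text that is linked in the PDF. Nested fields, bookmark links written with the `\l` switch, and links in headers or footers are not handled in this sketch.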