Free Support Forum - aspose.com

Hyperlink duplication while .doc to .docx conversion

While converting a simple Word 2003 .doc document containing a hyperlink to a Word 2007 .docx document, I noticed that an extra set of tags is produced. Although this has no impact on how the text renders in Word 2007, it gets in the way when reading the OOXML. These tags are not produced when the same document is converted in Word 2007.

The text below is what I entered in the .doc document.

This is a hyperlink

http://www.google.com

The above text is a hyperlink.

OOXML for the hyperlink when converting with Word 2007:

<w:p w:rsidR="001D69CA" w:rsidRDefault="001D69CA">
  <w:hyperlink r:id="rId4" w:history="1">
    <w:r w:rsidRPr="006E47C0">
      <w:rPr>
        <w:rStyle w:val="Hyperlink" />
      </w:rPr>
      <w:t>http://www.google.com</w:t>
    </w:r>
  </w:hyperlink>
</w:p>

OOXML for the hyperlink when converting with Aspose.Words (version 5.2.0.0):

<w:p w:rsidR="001D69CA">
  <w:r>
    <w:fldChar w:fldCharType="begin" />
  </w:r>
  <w:r>
    <w:instrText xml:space="preserve">HYPERLINK "http://www.google.com"</w:instrText>
  </w:r>
  <w:r>
    <w:fldChar w:fldCharType="separate" />
  </w:r>
  <w:r w:rsidRPr="006E47C0">
    <w:rPr>
      <w:rStyle w:val="Hyperlink" />
    </w:rPr>
    <w:t>http://www.google.com</w:t>
  </w:r>
  <w:r>
    <w:fldChar w:fldCharType="end" />
  </w:r>
</w:p>

Question: Why does the extra “w:instrText” tag appear alongside the “w:rStyle” tag? This is hindering our progress, because our code processes these tags as well and produces unexpected results.

Attached is the document I used before conversion.

Hi,

Thanks for your inquiry. In the DOC format, hyperlinks are represented as HYPERLINK fields. Every field consists of a FieldStart, a field code, a FieldSeparator, a field value, and a FieldEnd. In DOCX, a hyperlink can be represented in two different ways: as a field or as a w:hyperlink node. Both are correct.
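For code that reads the OOXML directly, one option is to check which of the two representations a paragraph uses before processing it. The following is only a minimal sketch using Python's standard library, not Aspose.Words code; the function name `hyperlink_forms` and the two sample strings are made up for illustration, and the samples add the namespace declarations that the excerpts in this thread omit.

```python
import xml.etree.ElementTree as ET

# Standard WordprocessingML and relationships namespaces.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"


def hyperlink_forms(paragraph):
    """Return the hyperlink representations found in a <w:p> element.

    Hypothetical helper: "element" means a w:hyperlink node,
    "field" means a HYPERLINK field code in w:instrText.
    """
    forms = set()
    if paragraph.find(f".//{{{W}}}hyperlink") is not None:
        forms.add("element")
    for instr in paragraph.iter(f"{{{W}}}instrText"):
        if (instr.text or "").strip().upper().startswith("HYPERLINK"):
            forms.add("field")
    return forms


# Sample paragraphs modelled on the two excerpts in this thread.
ELEMENT_FORM = f'''<w:p xmlns:w="{W}" xmlns:r="{R}">
  <w:hyperlink r:id="rId4" w:history="1">
    <w:r><w:rPr><w:rStyle w:val="Hyperlink"/></w:rPr>
      <w:t>http://www.google.com</w:t></w:r>
  </w:hyperlink>
</w:p>'''

FIELD_FORM = f'''<w:p xmlns:w="{W}">
  <w:r><w:fldChar w:fldCharType="begin"/></w:r>
  <w:r><w:instrText xml:space="preserve">HYPERLINK "http://www.google.com"</w:instrText></w:r>
  <w:r><w:fldChar w:fldCharType="separate"/></w:r>
  <w:r><w:rPr><w:rStyle w:val="Hyperlink"/></w:rPr>
    <w:t>http://www.google.com</w:t></w:r>
  <w:r><w:fldChar w:fldCharType="end"/></w:r>
</w:p>'''

print(hyperlink_forms(ET.fromstring(ELEMENT_FORM)))  # {'element'}
print(hyperlink_forms(ET.fromstring(FIELD_FORM)))    # {'field'}
```

A reader that branches on this result can handle both kinds of document instead of assuming every hyperlink is a w:hyperlink node.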

Press Alt+F9 to see the field code in your document. Aspose.Words does not change anything in the source document, so the hyperlink is still represented as a HYPERLINK field.

<w:p w:rsidR="001D69CA">
  <w:r>
    <w:fldChar w:fldCharType="begin" />
  </w:r>
  <w:r>
    <w:instrText xml:space="preserve">HYPERLINK "http://www.google.com"</w:instrText>
  </w:r>
  <w:r>
    <w:fldChar w:fldCharType="separate" />
  </w:r>
  <w:r w:rsidRPr="006E47C0">
    <w:rPr>
      <w:rStyle w:val="Hyperlink" />
    </w:rPr>
    <w:t>http://www.google.com</w:t>
  </w:r>
  <w:r>
    <w:fldChar w:fldCharType="end" />
  </w:r>
</w:p>

The five runs in this paragraph are, in order: Field Start, Field Code, Field Separator, Field Value, Field End.
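If your OOXML reader needs to tell the field code apart from the displayed text, the run order described here can be followed with a small state machine. This is a rough sketch in Python's standard library, not Aspose.Words code; `split_fields` and `SAMPLE` are hypothetical names, the sample adds the namespace declaration that the excerpt omits, and nested fields are deliberately not handled.

```python
import xml.etree.ElementTree as ET

# Standard WordprocessingML namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def split_fields(paragraph):
    """Return (field_code, field_value) pairs for the fields in a <w:p>.

    Hypothetical helper; assumes fields are not nested.
    """
    fields = []
    code, value = [], []
    state = None  # None = outside a field; "code" or "value" inside one
    for run in paragraph.iter(f"{{{W}}}r"):
        for node in run:
            if node.tag == f"{{{W}}}fldChar":
                kind = node.get(f"{{{W}}}fldCharType")
                if kind == "begin":            # Field Start
                    state, code, value = "code", [], []
                elif kind == "separate":       # Field Separator
                    state = "value"
                elif kind == "end" and state is not None:  # Field End
                    fields.append(("".join(code).strip(), "".join(value)))
                    state = None
            elif node.tag == f"{{{W}}}instrText" and state == "code":
                code.append(node.text or "")   # Field Code
            elif node.tag == f"{{{W}}}t" and state == "value":
                value.append(node.text or "")  # Field Value
    return fields


# Sample paragraph modelled on the field-based excerpt in this thread.
SAMPLE = f'''<w:p xmlns:w="{W}">
  <w:r><w:fldChar w:fldCharType="begin"/></w:r>
  <w:r><w:instrText xml:space="preserve">HYPERLINK "http://www.google.com"</w:instrText></w:r>
  <w:r><w:fldChar w:fldCharType="separate"/></w:r>
  <w:r><w:rPr><w:rStyle w:val="Hyperlink"/></w:rPr><w:t>http://www.google.com</w:t></w:r>
  <w:r><w:fldChar w:fldCharType="end"/></w:r>
</w:p>'''

print(split_fields(ET.fromstring(SAMPLE)))
# [('HYPERLINK "http://www.google.com"', 'http://www.google.com')]
```

Keeping the code and the value separate means the w:instrText content is never mistaken for visible text, which appears to be the source of the unexpected results described in the question.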

Best regards.