Hello there,
We use the Aspose.Words NuGet package to convert HTML to DOCX, and I am facing the following issue:
1. If the hyperlink stands alone, it is converted properly using this approach: HYPERLINK “https://dev.azure.com/” https://dev.azure.com/ (I can see this in the InnerXml content of the converted element).
2. If the hyperlink is in the middle of a sentence, it is handled differently, using <w:hyperlink> as below:
<w:r xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:t xml:space="preserve">Hi </w:t>
</w:r>
<w:hyperlink w:history="1" r:id="rId4" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:r>
<w:rPr>
<w:color w:val="0000EE" />
<w:u w:val="single" w:color="0000EE" />
</w:rPr>
<w:t>It is Me Jenifer</w:t>
</w:r>
</w:hyperlink>
<w:r xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:t>,hope you are good</w:t>
</w:r>
The second case doesn’t work: when I click the hyperlink, it doesn’t redirect to the expected link; instead it goes to: /word/settings.xml
Sample pseudo code:
// Requires: using System.IO; using System.Text;
private static void ConvertHtmlToWordML(string html, MemoryStream wordMLStream)
{
    // Wrap the fragment in a minimal HTML page (the head element is now closed with </head>, not <head/>).
    string page = "<!DOCTYPE html><html><head><meta charset=\"UTF-8\"></head><body>" + html + "</body></html>";

    // Build the stream from the encoded bytes so it starts at position 0; the using block disposes it.
    using (MemoryStream htmlStream = new MemoryStream(Encoding.UTF8.GetBytes(page)))
    {
        Aspose.Words.Document tempDoc = new Aspose.Words.Document(htmlStream, new Aspose.Words.LoadOptions { LoadFormat = Aspose.Words.LoadFormat.Html });
        tempDoc.Save(wordMLStream, Aspose.Words.SaveFormat.Docx);
    }
}
Any idea why it doesn’t work?
@JeniferR
Could you please ZIP and attach your input HTML, problematic output DOCX, and expected output document here for testing? We will investigate the issue and provide you information on it.
@tahir.manzoor, thanks for the reply!
DataToBeSent.zip (497 Bytes)
The above zip file contains the HTML file to be validated. My use case isn’t a direct DOCX conversion. Let me add a few points to make the query clearer:
- First, the HTML is converted to a WordML stream using the Aspose.Words API.
- With the WordML stream in hand, OpenXml processing alters or clones nodes as required.
- Finally, the altered/cloned OpenXml document, in DOCX format, is converted to HTML using the Aspose.Words API.
Note: When the HTML contains only a hyperlink without any surrounding text, such as “My Link”, it works. The hyperlink-only case and the hyperlink-with-text case are handled differently:
- A hyperlink alone is handled via HYPERLINK
- A hyperlink with text is handled via w:hyperlink
@JeniferR
We have tested the scenario using the latest version of Aspose.Words for .NET 21.3 with the following code example and have not found the shared issue.
Document doc = new Document(MyDir + "Sample-Html.html");
doc.Save(MyDir + "21.3.xml", SaveFormat.WordML);
Document doc2 = new Document(MyDir + "21.3.xml");
doc2.Save(MyDir + "21.3.docx");
Please create a standalone console application (source code without compilation errors) that helps us to reproduce your problem on our end and attach it here for testing. We will investigate the issue and provide you more information on it.
It is a bit difficult to create a standalone application since a lot of code is involved. Here is what happens:
- I have a .dotx template file whose content is replaced using OpenXml, specifically with HTML content that contains a hyperlink, for example: "Hi, myself jenifer how are you?"
- After the OpenXml replacement, the document.docx that is saved automatically at the intermediate stage has this content:
document.zip (82.3 KB)
- In its .rels file, I can see rId4 pointing to settings.xml, and rId4 is what the w:hyperlink node in document.xml refers to.
- I then use this code to save the file in DOCX format:
Aspose.Words.Document wordDocument1 = new Aspose.Words.Document(TemplateFile, new Aspose.Words.LoadOptions { LoadFormat = Aspose.Words.LoadFormat.Docx });
wordDocument1.Save(@"DirPath\document.docx", Aspose.Words.SaveFormat.Docx);
- The resulting document.docx is in the zip file below:
document.zip (78.1 KB)
Here I can see that w:hyperlink refers to rId5, with the External attribute set to “Yes” but the Target pointing to something else.
- This is the issue I am facing: somehow the hyperlink relationship Target is lost, although other attributes such as External are retained.
Can you please help me with this?
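The symptom described above (a w:hyperlink whose r:id resolves to settings.xml) can be checked by reading the package parts directly. The following is only a minimal sketch under stated assumptions, not Aspose.Words or Open XML SDK code: it uses System.IO.Compression and System.Xml.Linq, the names HyperlinkRelationshipCheck and FindBrokenHyperlinks are hypothetical, and it assumes the main part is word/document.xml with its relationships in word/_rels/document.xml.rels. It reports every w:hyperlink whose r:id is missing from the .rels part or resolves to a relationship that is not of the hyperlink type.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class HyperlinkRelationshipCheck
{
    static readonly XNamespace W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static readonly XNamespace R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    static readonly XNamespace Rel = "http://schemas.openxmlformats.org/package/2006/relationships";
    const string HyperlinkType = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink";

    // Returns one message per w:hyperlink whose r:id is missing from the .rels
    // part or resolves to a relationship that is not a hyperlink.
    public static List<string> FindBrokenHyperlinks(string documentXml, string relsXml)
    {
        // Index the relationships by their Id attribute.
        Dictionary<string, XElement> rels = XDocument.Parse(relsXml)
            .Descendants(Rel + "Relationship")
            .ToDictionary(e => (string)e.Attribute("Id"));

        var problems = new List<string>();
        foreach (XElement link in XDocument.Parse(documentXml).Descendants(W + "hyperlink"))
        {
            string id = (string)link.Attribute(R + "id");
            if (id == null) continue; // links without r:id (e.g. bookmark anchors) are skipped

            if (!rels.TryGetValue(id, out XElement rel))
                problems.Add(id + ": no relationship with this Id");
            else if ((string)rel.Attribute("Type") != HyperlinkType)
                problems.Add(id + ": points to " + (string)rel.Attribute("Target") + " (not a hyperlink relationship)");
        }
        return problems;
    }

    // Convenience overload that reads both parts from a .docx file on disk.
    public static List<string> FindBrokenHyperlinks(string docxPath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(docxPath))
            return FindBrokenHyperlinks(
                ReadPart(zip, "word/document.xml"),
                ReadPart(zip, "word/_rels/document.xml.rels"));
    }

    static string ReadPart(ZipArchive zip, string name)
    {
        ZipArchiveEntry entry = zip.GetEntry(name) ?? throw new FileNotFoundException(name);
        using (var reader = new StreamReader(entry.Open()))
            return reader.ReadToEnd();
    }
}
```

Running FindBrokenHyperlinks(@"DirPath\document.docx") against the intermediate file should list rId4 if it still resolves to settings.xml. If it does, one possible explanation (an assumption, not something confirmed in this thread) is that the w:hyperlink node was cloned into the template without a matching hyperlink relationship being created in the target part, so the copied r:id collides with an existing Id there.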
@JeniferR
To ensure a timely and accurate response, please attach the following resources here for testing:
- Your valid input document that can be opened by MS Word.
- Please attach the output file that shows the undesired behavior.
- Please share the screenshots of problematic sections of document.
- Please attach the expected output file that shows the desired behavior.
- We need sample code that generates incorrect output using Aspose.Words API.
As soon as you get these pieces of information ready, we will start investigation into your issue and provide you more information. Thanks for your cooperation.
PS: To attach these resources, please zip and upload them.