Image is lost after conversion from HTML to DOCX using .NET

Hi,

I am trying to convert an HTML file that contains the following image into a Word document:
<img src="https://upload.wikimedia.org/wikipedia/commons/6/6b/Taka_Shiba.jpg" alt="" width="195" height="146" data-max-width="501">

var doc = new Document(@".\CN_6835.htm");
doc.Save(@".\out.docx");

The resulting Word file is missing the image referenced above.

test.zip (79.0 KB)

I have attached both files (the HTML and the converted DOCX) above.

Please look into this issue.

Regards,
Shanmukh.

@ServerSide527

Please use HtmlLoadOptions.WebRequestTimeout as shown below to increase the web request timeout. Hope this helps you.

Aspose.Words.HtmlLoadOptions options = new Aspose.Words.HtmlLoadOptions();
options.WebRequestTimeout = 10000000;
Document doc = new Document(MyDir + "CN_6835.html", options);
doc.Save(MyDir + @"20.10_.docx");

Hi @tahir.manzoor,

I have tried your solution, but it doesn’t seem to work for me.
I made the change as follows:

HtmlLoadOptions options = new HtmlLoadOptions(){ WebRequestTimeout = 10000000  };
Document doc = new Document(@".\CN_5766.html", options);
doc.Save(@".\CN_5766.docx");

The image appears like this:
image.png (8.0 KB)

Please follow up on this.

Regards,
Shanmukh.

@ServerSide527

We have logged this problem in our issue tracking system as WORDSNET-21287. You will be notified via this forum thread once this issue is resolved.

We apologize for the inconvenience.

@ServerSide527

We have closed the issue WORDSNET-21287 as ‘Not a bug’. We tested the scenario using the latest version of Aspose.Words for .NET (20.11) and could not reproduce the issue you shared, so please use Aspose.Words for .NET 20.11. ‘Taka_Shiba.jpg’ is an image of a dog, and it is visible in the output DOCX. Please check the attached output document. 20.11.zip (637.0 KB)

The issues you have found earlier have been fixed in this Aspose.Words for .NET 20.12 update and this Aspose.Words for Java 20.12 update.

Hi,
I just upgraded Aspose.Words to 20.12 and tested again; I can still reproduce the problem.
Our application targets .NET Framework 4.5.2.
Note: We have not upgraded the framework because doing so would affect many of our clients.
Please look into this.

Regards,
Shanmukh.

@ServerSide527

We closed the issue WORDSNET-21287 as ‘Not a bug’. ‘Taka_Shiba.jpg’ is an image of a dog, and it is visible in the output DOCX.

Could you please elaborate on your query and share screenshots of the issue you are facing?

I have created a console application targeting .NET Framework 4.5.2, added a reference to Aspose.Words version 20.12.0, and ran the same code to convert an HTML file that includes an external image to DOCX.

var doc = new Document(@"..\..\CN_5817.htm");
doc.Save(@"..\..\CN_5817.docx");

When the same HTML is converted to DOCX, the image is still not saved in the DOCX, even with the upgraded Aspose.Words.

image.jpg (253.5 KB)

I have attached the same files to replicate the problem.
Image_Not_Saved.zip (27.1 KB)

@ServerSide527

We have tested the scenario and could not reproduce the issue you shared. Please check the attached output DOCX. 20.12.zip (575.4 KB)

Please get a 30-day temporary license and apply it before document conversion.
https://docs.aspose.com/words/net/licensing/

Hi @tahir.manzoor,

I am attaching the solution file and the project itself so you can replicate the problem.
IMAGE_NOT_SAVED.zip (8.0 KB)

Please build the project, run the solution, and check the CN_5817.doc file created by Aspose.Words.
You will see that the image is not saved.

Please let us know if anything is done incorrectly in the above solution project.

Regards,
Shanmukh.

@ServerSide527

We have tested the scenario and have managed to reproduce the same issue on our side. We have logged this problem in our issue tracking system as WORDSNET-21583 so that it can be corrected. You will be notified via this forum thread once this issue is resolved.

We apologize for the inconvenience.

@ServerSide527

Please note that the issue you are facing is not actually a bug in Aspose.Words, so we have closed this issue (WORDSNET-21583) as ‘Not a Bug’.

The source HTML references an image through a secure (HTTPS) URL:

<img src="https://upload.wikimedia.org/wikipedia/commons/6/6b/Taka_Shiba.jpg"/>

The server that hosts the image uses a newer version of TLS that is not supported by the .NET Framework version that is installed on your machine. The following article might help you to resolve the issue.

https://docs.microsoft.com/en-us/dotnet/framework/network-programming/tls
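
For applications that have to stay on .NET Framework 4.5.2, one commonly used workaround is to opt in to TLS 1.2 explicitly before the document is loaded. The following is only a minimal sketch, not a confirmed fix for this project. It assumes that Aspose.Words downloads the image through the framework's standard HTTP stack (which honors ServicePointManager) and that the operating system supports TLS 1.2. The EnableTls12 helper name is hypothetical; the file paths are the ones from the post above.

```csharp
using System.Net;
using Aspose.Words;

static class Program
{
    // Hypothetical helper: adds TLS 1.2 to the protocols that
    // ServicePointManager-based requests in this process may negotiate.
    // The setting is process-wide, so it affects all such requests.
    static void EnableTls12()
    {
        ServicePointManager.SecurityProtocol |= SecurityProtocolType.Tls12;
    }

    static void Main()
    {
        // Must run before the Document is constructed, because the
        // remote image is fetched while the HTML is being loaded.
        EnableTls12();

        var doc = new Document(@"..\..\CN_5817.htm");
        doc.Save(@"..\..\CN_5817.docx");
    }
}
```

The |= keeps whatever protocols were already enabled and only adds TLS 1.2. Microsoft's guidance generally prefers letting the operating system choose the protocol on newer framework versions, so treat this as a stopgap for projects that cannot be retargeted.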