Free Support Forum - aspose.com

Converting from HTML to RTF: how to link images instead of embedding them

Hi,
I am evaluating Aspose for converting HTML to RTF, using the latest Aspose version. I have two issues.
First, if I specify an HTTP base URL via the setBaseUri() method of the HtmlLoadOptions class and pass the options to the Document() constructor, the images are not embedded in the generated RTF. Only when I download all the images to my local disk and specify a local path to them do the images appear in the generated RTF file.
Second, I do not want to embed the images in the generated RTF; I want link-only images in the RTF. How can I achieve that using Aspose?
Please help me resolve these issues as soon as possible, as we need to decide whether we can buy Aspose for converting HTML to RTF.
Thanks

@parulagg,

Thanks for your inquiry. To ensure a timely and accurate response, please ZIP and attach the following resources here for testing:

  • Your simplified input HTML file
  • Aspose.Words 18.10 generated output RTF file showing the undesired behavior
  • Your expected RTF Word document showing the correct output. You can create the expected document using MS Word.
  • Please also create a standalone simple Java application (source code without compilation errors) that helps us to reproduce your current problem on our end and attach it here for testing. Please do not include Aspose.Words JAR files in it to reduce the file size.

As soon as you get these pieces of information ready, we will start further investigation into your issue and provide you more information. Thanks for your cooperation.

Hi
Thank you for the quick reply. I am attaching for_aspose.zip, which contains the code and the sample files.
It contains four files:
_Arabic_Content.html
arabic_aspose.rtf
kh-education-partner.jpg
TestAspose.java

_Arabic_Content.html and kh-education-partner.jpg are sample files that I have downloaded from our server.
The TestAspose.java file reads the HTML file from our server and tries to generate the RTF file; that is when the images go missing. The attached arabic_aspose.rtf shows the image missing from the RTF file. You will not be able to access our servers from your end, as you would need to connect to our network via VPN first.

If I run the class with the downloaded HTML file and set the base URI to a local directory, it works. To run this class against the local files, pass the input and output file names:
java TestAspose E:/for_aspose/_Arabic_Content.html E:/for_aspose/arabic_aspose.rtf

I am using aspose-words-18.10-jdk16.jar for testing.
I need a way to tell Aspose that the generated RTF should not embed images but should instead contain link-only images, that is, an INCLUDEPICTURE field pointing to an image in the same directory or in a subdirectory. I can download all the images referenced in the HTML to my local disk, and then I want Aspose to put links to these images in the RTF rather than embed them.

Please let me know if the information provided is sufficient, or if you need any other details.
Thanks for looking into this. I really appreciate the quick response.
for_aspose.zip (24.8 KB)

@parulagg,

We are checking this scenario and will get back to you soon.

Hi

Any update on this?

Thanks
Parul Aggarwal

@parulagg,

Thanks for being patient. We have locally hosted your HTML file at http://localhost/_Arabic_Content.html and the JPG image at http://localhost/images/kh-education-partner.jpg. When we executed the following code, the image was embedded in the RTF correctly. (See awjava-18.10.zip (21.5 KB))

HtmlLoadOptions options = new HtmlLoadOptions();
options.setLoadFormat(LoadFormat.HTML);
options.setBaseUri("http://localhost/images/");

Document doc = new Document("http://localhost/_Arabic_Content.html", options);
doc.save("D:\\temp\\for_aspose\\awjava-18.10.rtf");
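Editor's note: the base URI above ends in a slash, and that detail can decide where a relative image reference points. The sketch below uses only java.net.URI from the JDK to show standard relative-reference resolution; whether Aspose.Words resolves src values in exactly the same way is an assumption, and the class and method names are illustrative only.

```java
import java.net.URI;

final class BaseUriDemo {
    /** Resolves an img src value against a base URI using java.net.URI rules. */
    static String resolve(String baseUri, String src) {
        return URI.create(baseUri).resolve(src).toString();
    }

    public static void main(String[] args) {
        // Trailing slash: the src is appended under /images/.
        System.out.println(resolve("http://localhost/images/", "kh-education-partner.jpg"));
        // No trailing slash: "images" is treated as a file name and replaced.
        System.out.println(resolve("http://localhost/images", "kh-education-partner.jpg"));
        // A src starting with "/" ignores the base path entirely.
        System.out.println(resolve("http://localhost/images/", "/kh-education-partner.jpg"));
    }
}
```

If images go missing with a remote base URI, printing the resolved URL in this way is a cheap first check before looking at network access.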

Hi

What about the option to not embed images? I do not want the resulting RTF to have embedded images; they should only be links to the images on local disk.

Thanks
Parul Aggarwal

@parulagg,

Please also ZIP and attach your expected RTF Word document showing the desired output. We will investigate the structure of your expected document to see how you want the final output to be generated. You can create the expected document using MS Word.

Hi

What we are looking for is that the generated RTF should have images that refer to a local directory and use the INCLUDEPICTURE field, as shown below.

{\field{\*\fldinst{INCLUDEPICTURE "images/kh-education-partner.jpg" \\* MERGEFORMAT \\d}}}

I will send you a sample document as well. Right now we are using PD4ML, with which I am able to set the INCLUDEPICTURE field, but the Arabic content alignment is not working. With Aspose, the content alignment works perfectly, but I am not sure how to use the INCLUDEPICTURE field to include images in the generated RTF document.

Thanks
Parul Aggarwal
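Editor's note: for readers who want to see the shape of the field being requested, here is a minimal sketch that builds such a field as raw RTF text. It assumes forward-slash relative paths, plain ASCII file names, and that backslashes in field switches are doubled in raw RTF; the class and method names are hypothetical, and this is not how Aspose.Words produces the field.

```java
final class IncludePictureField {
    /**
     * Builds a raw RTF INCLUDEPICTURE field for a relative image path.
     * The \d switch asks Word not to store the graphic data in the document.
     */
    static String linkedPictureField(String relativePath) {
        if (relativePath.indexOf('"') >= 0) {
            throw new IllegalArgumentException("path must not contain a double quote");
        }
        // Assumption: normalise to forward slashes so no path escaping is needed.
        String path = relativePath.replace('\\', '/');
        // "\\\\" in Java source is two backslashes in the output,
        // which RTF reads back as one literal backslash.
        return "{\\field{\\*\\fldinst{INCLUDEPICTURE \"" + path
                + "\" \\\\* MERGEFORMAT \\\\d}}}";
    }

    public static void main(String[] args) {
        System.out.println(linkedPictureField("images/kh-education-partner.jpg"));
    }
}
```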

Please find attached a sample RTF file that I created manually in Word, using the "Link to File" option to link the image in the document to the local image on the drive.
Please let me know if we can achieve this using Aspose when converting HTML to RTF.
RTFDocumentWithLinkedOnlyImage.zip (54.6 KB)

@parulagg,

You can replace all images in the loaded document with INCLUDEPICTURE fields (FieldIncludePicture) by using the following code:

HtmlLoadOptions options = new HtmlLoadOptions();
options.setLoadFormat(LoadFormat.HTML);
options.setBaseUri("http://localhost/images/");

Document doc = new Document("http://localhost/_Arabic_Content.html", options);
DocumentBuilder builder = new DocumentBuilder(doc);

int i = 0;
ImageSaveOptions opts = new ImageSaveOptions(SaveFormat.JPEG);
for (Shape shape : (Iterable<Shape>) doc.getChildNodes(NodeType.SHAPE, true)) {
    if (shape.hasImage()) {
        builder.moveTo(shape);

        // The folder "D:\\temp\\for_aspose\\images" must already exist on disk; the linked images are stored there.
        String imagePath = "D:\\temp\\for_aspose\\images\\img_" + i + ".jpg";
        ShapeRenderer renderer = shape.getShapeRenderer();
        renderer.save(imagePath, opts);

        // Insert an INCLUDEPICTURE field that links to the saved file instead of embedding the image.
        FieldIncludePicture fieldIncludePicture = (FieldIncludePicture) builder.insertField(FieldType.FIELD_INCLUDE_PICTURE, false);
        fieldIncludePicture.setSourceFullName(imagePath);
        fieldIncludePicture.isLinked(true);

        shape.remove();
        i++; // advance the counter so that each image is saved to its own file
    }
}

doc.updateFields();

doc.save("D:\\temp\\for_aspose\\awjava-18.10.rtf");

Hope this helps.
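Editor's note: one way to confirm that a generated RTF file really is link-only is to scan its text for INCLUDEPICTURE fields and for embedded \pict groups. The sketch below is a rough heuristic using only the JDK, not an RTF parser; the class name is hypothetical. A linked field saved without the \d switch may still carry a cached picture, so a reported \pict group is a prompt to inspect the file rather than proof of a failure.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class RtfLinkCheck {
    /** Counts non-overlapping occurrences of token in text. */
    static int count(String text, String token) {
        int n = 0;
        for (int i = text.indexOf(token); i >= 0; i = text.indexOf(token, i + token.length())) {
            n++;
        }
        return n;
    }

    /** True when the RTF has at least one INCLUDEPICTURE field and no \pict group. */
    static boolean looksLinkOnly(String rtf) {
        return count(rtf, "INCLUDEPICTURE") > 0 && count(rtf, "\\pict") == 0;
    }

    public static void main(String[] args) throws IOException {
        // RTF is a 7-bit format, so a single-byte charset is enough for scanning.
        String rtf = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.ISO_8859_1);
        System.out.println("INCLUDEPICTURE fields: " + count(rtf, "INCLUDEPICTURE"));
        System.out.println("embedded \\pict groups: " + count(rtf, "\\pict"));
        System.out.println("looks link-only: " + looksLinkOnly(rtf));
    }
}
```

Run it with the path of the saved RTF file as the only argument.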

Hi

Thank you for the code. That does the trick and converts the images to link-only images. I really appreciate your help on this.

There are still two issues.
1: How can I read the value of the src attribute of an image? Our HTML has a src attribute in the <img> tag, from which I want to derive the image name to use when saving the image to local disk.
KidsHealth Educational Partner
I did not find a function in the Shape class that returns the value of the src attribute.

2: Aspose is still not reading and downloading images from our servers. If I call setBaseUri with our server address, the images are not downloaded.
options.setBaseUri("http://www01-dev:4503/");
Only if I download the image to my local disk and set the base URI to the local path do the images appear in the document.
Is there some issue with Aspose connecting to servers via VPN? All our servers are on an intranet, and we need to connect via VPN to reach them. I can download images using a normal HTTP URL connection, so I am not sure why Aspose is not able to get images from our server.

Thanks
Parul Aggarwal

@parulagg,

If the images are stored on a private network and require authentication in order to load, the HtmlLoadOptions.ResourceLoadingCallback property can be used to pass the needed credentials. A class implementing IResourceLoadingCallback controls how resources such as images or CSS are handled when they need to be downloaded from an external source, that is, a network or the internet. Inside the IResourceLoadingCallback.ResourceLoading method, you first need to connect through the proxy to fetch the image data and then pass the image data to the ResourceLoadingArgs.SetData method. Hope this helps.
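Editor's note: the fetching step described above can be done with the JDK alone. The sketch below downloads a resource into a byte array through an explicit java.net.Proxy; the class name, method name, and timeout values are assumptions chosen for illustration, and any authentication your network requires is not shown.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;

final class ImageFetcher {
    /** Downloads the resource at url through the given proxy and returns its bytes. */
    static byte[] fetch(URL url, Proxy proxy) throws IOException {
        URLConnection conn = url.openConnection(proxy);
        conn.setConnectTimeout(10_000); // milliseconds; illustrative values
        conn.setReadTimeout(30_000);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Proxy.NO_PROXY means a direct connection; substitute your own proxy if needed.
        byte[] data = fetch(new URL(args[0]), Proxy.NO_PROXY);
        System.out.println("fetched " + data.length + " bytes");
    }
}
```

Inside the Java callback's resourceLoading method, the returned bytes would then be handed to the setData method of ResourceLoadingArgs. To my understanding the callback should then return ResourceLoadingAction.USER_PROVIDED so that the supplied data is used, but please verify that against the API reference for your Aspose.Words version.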

Thanks. I will try this and let you know if I find any issues.
What about point no. 1, where I want to read the src attribute value of the image tag so that I can get the image name before saving to local disk?

@parulagg,

I think for this purpose, you can use getOriginalUri inside the class implementing IResourceLoadingCallback.

static class HandleResourceLoadingCallback implements IResourceLoadingCallback
{
    public int resourceLoading(ResourceLoadingArgs args)
    {
        if (args.getResourceType() == ResourceType.IMAGE){
            System.out.println(args.getOriginalUri());
        }

        return ResourceLoadingAction.DEFAULT;
    }
}
/////////////////////////////////////////////
HtmlLoadOptions options = new HtmlLoadOptions();
options.setLoadFormat(LoadFormat.HTML);
options.setResourceLoadingCallback(new HandleResourceLoadingCallback());

Document doc = new Document("D:\\temp\\imgtest.html", options);
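Editor's note: once the callback has printed or captured the original URI, the image file name can be derived from it with plain string handling. This is a minimal sketch with hypothetical names; it strips any query string or fragment, accepts both slash styles, and does not percent-decode the name. The path in the example URL is made up for illustration.

```java
final class ImageNames {
    /**
     * Returns the last path segment of an image URI or file path,
     * or fallback when the URI ends in a separator or is empty.
     */
    static String fileNameOf(String originalUri, String fallback) {
        String s = originalUri;
        int q = s.indexOf('?');
        if (q >= 0) s = s.substring(0, q);   // drop the query string
        int h = s.indexOf('#');
        if (h >= 0) s = s.substring(0, h);   // drop the fragment
        int sep = Math.max(s.lastIndexOf('/'), s.lastIndexOf('\\'));
        String name = s.substring(sep + 1);
        return name.isEmpty() ? fallback : name;
    }

    public static void main(String[] args) {
        // Hypothetical URL; only the host and port appear in this thread.
        System.out.println(fileNameOf("http://www01-dev:4503/images/kh-education-partner.jpg?v=2", "image.jpg"));
    }
}
```

Passing a fallback keeps the caller in control of what happens when a src has no usable file name, for example a URL that ends in a slash.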

Thank you for the help. I have found a workaround to fetch the image name when downloading the image using the resource loader. Our team is reviewing the resulting RTFs; if we find no issues, we will go ahead with the buying process.
Please let me know who the right person to contact in the sales department is.

Thanks
Parul Aggarwal

@parulagg,

You can contact our sales team via the Aspose.Purchase forum.

For any further technical queries, please use the Aspose.Words forum.

Hi

We got the license for Aspose.Words and I am using it in our project. Even with the licensed copy of Aspose.Words, I see an "(image not displayed)" message beside every image, and I am not sure why. Please see the attached screenshot.
image_message.jpg (363.4 KB)

Please let me know why this message is appearing beside every image in the document.

Thanks
Parul

@parulagg,

Thanks for your inquiry. To ensure a timely and accurate response, please ZIP and attach the following resources here for testing:

  • Your simplified input document
  • Aspose.Words 18.11 generated output document showing the undesired behavior
  • Your expected document showing the correct output. You can create expected document by using MS Word.
  • Please also create a standalone simple Java application (source code without compilation errors) that helps us to reproduce your current problem on our end and attach it here for testing. Please do not include Aspose.Words JAR files in it to reduce the file size.
  • Any additional steps that you think might be required to reproduce this issue on our end.

As soon as you get these pieces of information ready, we will start further investigation into your issue and provide you more information. Thanks for your cooperation.

It's the same program that I sent earlier. See comment no. 3, where I sent a zip file named for_aspose.zip.
I am attaching the RTF file (Arabic_Content.zip) that contains the message beside each image, along with the desired RTF file.

rtf.zip (17.4 KB)