We are facing an issue when the image path uses the (secure) https protocol.
Just an FYI: the same code works fine with the http protocol.
Here is part of the code:
doc.getMailMerge().execute(new String[] {"userImage"},
new Object[] {"https://a.imeetbeta.net/san/users/000/000/032/521/user_avatar.jpg"});
// Save the document in RTF format.
outputStream=new java.io.ByteArrayOutputStream();
doc.save(outputStream, com.aspose.words.SaveFormat.RTF);
end = System.currentTimeMillis();
We are getting the following error:
Error … Cannot load image from field ‘userImage’. The field contains data in unsupported format. sun.security.validator.ValidatorException: PKIX path validation failed: java.security.cert.CertPathValidatorException: basic constraints check failed: pathLenConstraint violated - this cert must be the last cert in the certification path
Please let me know if there is anything we need to set to support https and avoid this error.
Thanks for your request. I managed to reproduce the problem on my side. Your request has been linked to the appropriate issue, and you will be notified as soon as it is resolved. This issue does not occur in the .NET version of Aspose.Words. Once we finish synchronizing the Java and .NET versions of Aspose.Words, all functionality supported in the .NET version will be supported in the Java version, including the fix for this issue. Hopefully this will happen at the end of this month or the beginning of next month.
We are happy to inform you that the first auto-ported version of Aspose.Words for Java is ready. This version supports converting documents to PDF. You can get it from here.
I’m moving an app from Aspose.Words for Java v4.0.2 to v10.1.0. With v4.0.2 I was having a problem converting HTML documents with https image URLs to Word documents. Would this same fix likely apply to my problem?
Thanks for your request. Both issues you have reported earlier are resolved. So please try using the latest version of Aspose.Words for Java and let us know in case of any issues.
Sadly, that didn’t help. I have a simple HTML page with an https image reference and when Aspose builds a Word doc, the image is missing. Should I post stuff here or start a new post/thread?
Thank you for additional information. I suppose this is not an Aspose.Words issue but a permissions issue: Aspose.Words does not have the rights to download the images, and that is why the problem occurs. However, since Aspose.Words supports base64 as an image source, you can write your own method to download the images and replace the image path in each src attribute with the base64 representation of the image. Here is a very simple C# example that demonstrates the technique:
[Test]
public void Test001()
{
    // Get the HTML string.
    string html = File.ReadAllText(@"Test001\in.html");
    // Create a regular expression that finds image SRCs.
    Regex urlRegex = new Regex("src\\s*=\\s*[\"']+(http(s)?://([\\w-]+\\.)+[\\w-]+(/[\\w- ./?%&=]*)?)[\"']+");
    // Search for SRCs.
    MatchCollection matches = urlRegex.Matches(html);
    foreach (Match match in matches)
    {
        // Replace URLs with embedded base64 images.
        html = html.Replace(match.Groups[1].Value, GetBase64(match.Groups[1].Value));
    }
    // Now you can insert HTML into the document. All images are embedded into the HTML string.
    DocumentBuilder builder = new DocumentBuilder();
    builder.InsertHtml(html);
    builder.Document.Save(@"Test001\out.doc");
}
private string GetBase64(string imageUrl)
{
    try
    {
        // Prepare the request for the image.
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(imageUrl);
        request.Method = "GET";
        request.UserAgent = "Mozilla/4.0+(compatible;+MSIE+5.01;+Windows+NT+5.0)";
        // Execute the request and copy the response body into memory.
        // ContentLength can be -1 when the server omits the header, so it is not used here.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream resStream = response.GetResponseStream())
        using (MemoryStream buffer = new MemoryStream())
        {
            resStream.CopyTo(buffer);
            // Build the base64 data URI. The image is assumed to be a JPEG.
            return string.Format("data:image/jpeg;base64,{0}",
                Convert.ToBase64String(buffer.ToArray()));
        }
    }
    catch (WebException)
    {
        // The download failed: return the original URL so the src attribute is not blanked out.
        return imageUrl;
    }
}
I hope this approach helps you achieve what you need.
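Since the rest of this thread concerns Aspose.Words for Java, the same idea can be sketched in Java. This is only an illustration under stated assumptions, not Aspose code: the class ImageInliner, the method inlineImages and the fetcher parameter are hypothetical names, the regular expression is a simplified one, every image is assumed to share a single MIME type, and Java 8 or later is required. How the bytes are downloaded is left to the caller-supplied fetcher.

```java
import java.util.Base64;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImageInliner {
    // Matches src="http(s)://..." or src='http(s)://...'.
    // Group 1 is the quote character, group 2 is the URL.
    private static final Pattern IMG_SRC =
            Pattern.compile("src\\s*=\\s*([\"'])(https?://[^\"']+)\\1");

    /**
     * Replaces every http/https src URL in the HTML with a base64 data URI.
     * The fetcher is a hypothetical hook that returns the image bytes for a URL,
     * or null if the download failed (the URL is then left untouched).
     */
    public static String inlineImages(String html, String mimeType,
                                      Function<String, byte[]> fetcher) {
        Matcher m = IMG_SRC.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String quote = m.group(1);
            String url = m.group(2);
            byte[] data = fetcher.apply(url);
            String src = (data == null)
                    ? url
                    : "data:" + mimeType + ";base64,"
                        + Base64.getEncoder().encodeToString(data);
            // quoteReplacement keeps '$' and '\' in the text from being
            // interpreted as group references.
            m.appendReplacement(out,
                    Matcher.quoteReplacement("src=" + quote + src + quote));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The resulting string could then be passed to builder.insertHtml(...) in the same way as the HTML string in the C# sample above.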
Thanks for the suggestions. What do you mean about the permissions? To clarify, no authentication/authorization is necessary to access the images I’m referencing, just the https handshake. Also, although there’s just one image in my test file, the real files might have 200+ images. I don’t think it will be practical from a performance standpoint to base64-encode that many images.
Thank you for additional information. There must be something on your side. I can successfully insert an image from HTTPS location. For example, see the following code:
DocumentBuilder builder = new DocumentBuilder();
builder.insertHtml("<img src='https://encrypted.google.com/images/logos/ssl_logo_lg.gif' />");
builder.getDocument().save("C:\\Temp\\out.doc");
Could you please try running this code on your side? Do you see the image in the output document?
Interesting! I first replaced the image reference in my test page with the encrypted Google logo (sorry for not exactly following directions) and it worked! I then put the two images side-by-side and the Google logo continued to appear, but my image reference still doesn’t. Other than being a PNG rather than a GIF, the other difference is that the server my image is coming from is using a self-signed (non-CA) cert. Think that’s the problem?
Have you tried opening the image locally to see if it works? Perhaps the image itself cannot be viewed by the browser. It would be a good bet to try a different browser to see if this is the case as well.
Reasonable questions. Yup, it works fine if I just paste the image URL into a browser, even when it’s a separate one in which I’ve never logged into the webapp. I access our app in Firefox but tested the image URL in Chrome.
Thank you for additional information. Will I be able to view these images on my side? If so, could you please provide one or a few URLs to the images? I will check on my side and provide you with more information.
Unfortunately, the app isn’t hosted anywhere that’s publicly-accessible (it’s an intranet app). If I can think of a simple way to make some suitable image URLs (https, self-signed cert) available, I’ll do so, but it might be easier or faster for you to get something like that running in your own environment.
Thank you for additional information. Have you tried using code like the following to get the image from the URL?
// Put the URL of your image here.
String imageUrl = "https://encrypted.google.com/images/logos/ssl_logo_lg.gif";
String outFileName = "C:\\Temp\\out.png";
BufferedImage image = ImageIO.read(new URL(imageUrl));
ImageIO.write(image, "png", new File(outFileName));
Aha! As you may have expected, that approach didn’t work. It resulted in the following stack trace, which seems to suggest that it’s expecting a CA-signed cert rather than a self-signed one. I’ll see if I can make this work on my end.
–Matt
Exception in thread "main" javax.imageio.IIOException: Can't get input stream from URL!
at javax.imageio.ImageIO.read(ImageIO.java:1369)
at com.sl.surveyor.ui.TestOfficeExport.testHttpsImageExport(TestOfficeExport.java:161)
at com.sl.surveyor.ui.TestOfficeExport.main(TestOfficeExport.java:45)
Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at com.sun.net.ssl.internal.ssl.Alerts.getSSLException(Alerts.java:174)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1649)
at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:241)
at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:235)
at com.sun.net.ssl.internal.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1206)
at com.sun.net.ssl.internal.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:136)
at com.sun.net.ssl.internal.ssl.Handshaker.processLoop(Handshaker.java:593)
at com.sun.net.ssl.internal.ssl.Handshaker.process_record(Handshaker.java:529)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:893)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1138)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1165)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1149)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:434)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:166)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1172)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:234)
at java.net.URL.openStream(URL.java:1010)
at javax.imageio.ImageIO.read(ImageIO.java:1367)
... 2 more
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:323)
at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:217)
at sun.security.validator.Validator.validate(Validator.java:218)
at com.sun.net.ssl.internal.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:126)
at com.sun.net.ssl.internal.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:209)
at com.sun.net.ssl.internal.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:249)
at com.sun.net.ssl.internal.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1185)
... 15 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:174)
at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:238)
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:318)
... 21 more
Thank you for additional information. It is good that the code pointed you to the cause of the problem. Please feel free to ask in case of any issues; we are always glad to help you.
I did a Google search for “PKIX path building failed” and found out that it might work to add the self-signed cert used for HTTPS to the JVM’s trust store, but several people hadn’t had much luck with that approach. Fortunately, in my case, I was able to figure out an acceptable way to use HTTP instead of HTTPS, rather than addressing the underlying problem. In contrast, a different library we use to convert HTML to PDF is able to access the same images with no issues. Although I don’t need a fix in Aspose.Words at this time, it might be worth further investigation at some point.
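For anyone who wants to try the trust store route without modifying the JVM-wide cacerts file, here is a minimal sketch using only the standard JSSE classes. It assumes the self-signed certificate has already been imported into a separate key store file (for example with keytool); the class name CustomTrustStore and its method names are hypothetical. It changes the default for HttpsURLConnection, which is what the ImageIO.read(URL) call in the stack trace above goes through; whether Aspose.Words downloads images the same way is an assumption that has not been verified here.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

public class CustomTrustStore {

    /** Loads a key store file; the path and password are supplied by the caller. */
    public static KeyStore load(String path, char[] password) throws Exception {
        KeyStore store = KeyStore.getInstance(KeyStore.getDefaultType());
        try (InputStream in = new FileInputStream(path)) {
            store.load(in, password);
        }
        return store;
    }

    /** Builds an SSLContext that trusts only the certificates in the given store. */
    public static SSLContext contextFor(KeyStore trustStore) throws Exception {
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);
        SSLContext context = SSLContext.getInstance("TLS");
        // No client key managers; default source of randomness.
        context.init(null, tmf.getTrustManagers(), null);
        return context;
    }

    /** Makes the context the default for all later HttpsURLConnection instances. */
    public static void install(SSLContext context) {
        HttpsURLConnection.setDefaultSSLSocketFactory(context.getSocketFactory());
    }
}
```

Note that after install(...) only certificates in the custom store are trusted for HttpsURLConnection, so servers with ordinary CA-signed certificates would be rejected unless their CA certificates are added to the same store.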