We are trying to use Aspose.PDF and Aspose.Words in Azure Databricks from Python. The kernel crashes with the message below as soon as we try to set the license or call any function. Has anyone got this working successfully?
Fatal error: The Python kernel is unresponsive.
The Python process exited with exit code 134 (SIGABRT: Aborted).
The last 10 KB of the process’s stderr and stdout can be found below. See driver logs for full logs.
Last messages on stderr: int/futures/thread.py", line 89 in _worker
File "/usr/lib/python3.12/threading.py", line 1010 in run
File "/usr/lib/python3.12/threading.py", line 1073 in _bootstrap_inner
File "/usr/lib/python3.12/threading.py", line 1030 in _bootstrap
Cause

The issue you’re experiencing with Aspose libraries in Azure Databricks, resulting in the Python kernel becoming unresponsive, may be due to several factors, including compatibility issues, resource limitations, or incorrect configuration when setting the license or using the libraries.

Solution

Here are some steps to troubleshoot and resolve the issue:

1. Check Compatibility: Ensure that the versions of the Aspose libraries you are using are compatible with the Python version running in Azure Databricks. Refer to the official Aspose documentation for compatibility details.

2. Resource Allocation: Azure Databricks may have resource limitations. Ensure that your cluster has sufficient memory and CPU resources allocated. Increase the cluster size or use a different instance type if necessary.

3. License Configuration: When setting the license, ensure that the license file is correctly referenced and accessible. Use the following code snippet to set the license:

       import aspose.words as aw

       # Load the license
       license = aw.License()
       license.set_license("path/to/your/license.lic")

   Verify that the path to the license file is correct and that the file is accessible from your Databricks environment.

4. Error Handling: Implement error handling to catch exceptions that may arise when using Aspose functions. This can help identify specific issues causing the kernel to crash.

       try:
           pass  # Your Aspose code here
       except Exception as e:
           print(f"An error occurred: {e}")

5. Testing in Isolation: Test the Aspose functionality in a smaller, isolated notebook to determine if the issue persists. This can help identify whether the problem is related to the specific notebook or the overall environment.

Next Steps

- Review the logs for any specific error messages that might provide more insight into the issue.
- If the problem continues, reach out to Aspose support with detailed logs and the code you are using for further assistance.

Please implement these steps and let me know if you need any further help!
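A note on the error handling and isolation steps: exit code 134 means the process was killed by SIGABRT, and a Python try/except cannot catch that. One way to keep the notebook kernel alive while reproducing the crash is to run the failing call in a child interpreter. The following is only a sketch under that assumption; run_isolated and describe_exit are hypothetical helper names, not part of any Aspose or Databricks API.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 120.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a child interpreter and capture its output.

    A native abort (SIGABRT) kills only the child process, so the calling
    kernel survives and the child's stderr can still be inspected.
    """
    return subprocess.run(
        # -X faulthandler makes the child dump a Python traceback on fatal signals.
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def describe_exit(returncode: int) -> str:
    """Translate a subprocess return code into a readable string."""
    if returncode < 0:
        # On POSIX, a negative return code means "killed by signal -returncode".
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with code {returncode}"
```

For example, result = run_isolated("import aspose.pdf as ap; ap.License()") followed by print(describe_exit(result.returncode), result.stderr[-2000:]) would show whether the import or the License constructor alone is enough to abort the process, and the captured stderr can be attached to a support request.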
@baludharmarajan Can you share the code snippet where the error occurs? Also, please specify which versions of Aspose.PDF for Python and Aspose.Words for Python you are using. Once we have these details, we will start investigating and try to reproduce the issue.
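In case it helps with collecting those details, here is a small sketch that prints the interpreter, platform, and installed package versions from inside a notebook cell. The distribution names in PACKAGES are assumptions (the usual PyPI names); adjust them to whatever was actually installed on the cluster. environment_report is a hypothetical helper name.

```python
import platform
from importlib import metadata

# Assumed PyPI distribution names; change these if your cluster uses others.
PACKAGES = ("aspose-words", "aspose-pdf", "aspose.slides")


def environment_report(packages=PACKAGES) -> dict:
    """Collect Python, platform, and package version details as a dict."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "machine": platform.machine(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Reported explicitly so a missing package is visible in the output.
            report[name] = "not installed"
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Reading the versions from package metadata avoids importing the Aspose modules themselves, so the report can be produced even when the import is what crashes the kernel.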
We are using the latest releases of the Aspose Python packages. Databricks is running Python 3.12. We are only trying Aspose in a notebook. The driver and the executor are both set to 28 GB RAM (Standard_DS4_v2). The Databricks runtime is 16.3. Please find the code below:
from typing import List
import logging, json, time, uuid, io, base64, functools
import os  # needed for os.path below; missing from the posted snippet
from datetime import datetime
from delta.tables import DeltaTable
from pyspark.sql import functions as F, types as T

# Aspose
import aspose.words as aw
import aspose.pdf as ap
import aspose.slides as sl


class AsposeConverter:
    def __init__(self):
        lic_xml = ""
        # Accept the licence either as base64 text or as raw XML text
        stream = io.BytesIO(base64.b64decode(lic_xml) if lic_xml.strip().startswith("PGx") else lic_xml.encode())
        for lib in [aw, ap, sl]:
            try:
                lic = lib.License()
                stream.seek(0)
                lic.set_license(stream)
            except Exception as e:
                pass  # some libs may not match licence type

    def _output_folder(self, doc_path: str) -> str:
        name = os.path.splitext(os.path.basename(doc_path))[0]
        # CFG is not defined in the posted snippet
        return os.path.join(CFG.IMAGE_ROOT, name)

    def convert(self, doc_path: str) -> List[str]:
        """
        Returns list of image paths written (one png per page)
        """
        ext = os.path.splitext(doc_path.lower())[1]
        out_dir = self._output_folder(doc_path)
        page_images = []
        if ext in (".pdf",):
            pdf = ap.Document(doc_path)
            for i in range(pdf.pages.count):
                pb = ap.devices.PngDevice()
                id_path = f"{out_dir}/page_{i+1:04}.png"
                page_stream = io.BytesIO()
                # Aspose.PDF page collections are 1-based
                pb.process(pdf.pages[i + 1], page_stream)
                dbutils.fs.put(id_path, page_stream.getvalue(), overwrite=True)
                page_images.append(id_path)
        elif ext in (".ppt", ".pptx"):
            prs = sl.Presentation(doc_path)
            for i, slide in enumerate(prs.slides):
                id_path = f"{out_dir}/slide_{i+1:04}.png"
                img = slide.get_thumbnail(2, 2)
                bio = io.BytesIO()
                img.save(bio, "PNG")
                dbutils.fs.put(id_path, bio.getvalue(), overwrite=True)
                page_images.append(id_path)
        elif ext in (".docx",):
            doc = aw.Document(doc_path)
            save_opts = aw.saving.ImageSaveOptions(aw.SaveFormat.PNG)
            for i in range(doc.page_count):
                save_opts.page_set = aw.saving.PageSet(i)
                id_path = f"{out_dir}/page_{i+1:04}.png"
                doc.save(id_path, save_opts)
                page_images.append(id_path)
        else:
            raise ValueError(f"Unsupported extension {ext}")
        return page_images
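One thing that may be worth ruling out on the Python side: in the posted code lic_xml is an empty string (presumably redacted), and every exception from set_license is silently swallowed. The sketch below is one possible way to validate the license text before it is handed to the native library. It is an assumption about what a stricter check could look like, not a confirmed cause of the crash, and license_stream is a hypothetical helper name.

```python
import base64
import binascii
import io


def license_stream(lic_text: str) -> io.BytesIO:
    """Return the license as a byte stream, failing early on bad input.

    Accepts either raw XML text or base64-encoded XML.
    """
    text = lic_text.strip()
    if not text:
        raise ValueError("license text is empty")
    if text.startswith("<"):
        # Looks like raw XML already.
        data = text.encode("utf-8")
    else:
        try:
            # Drop line breaks that wrapped base64 text often contains.
            data = base64.b64decode("".join(text.split()), validate=True)
        except binascii.Error as exc:
            raise ValueError("license text is neither XML nor valid base64") from exc
        if not data.lstrip().startswith(b"<"):
            raise ValueError("decoded license does not look like XML")
    return io.BytesIO(data)
```

With a helper like this, the constructor could call stream = license_stream(lic_xml) and log the exception in the except branch instead of using pass, so a bad or missing license shows up as a normal Python error.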
@baludharmarajan
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): PDFPYTHON-419
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.