Aspose.Words: abnormal memory use when converting PDF to Markdown

I use the following code for the conversion:

import traceback

import aspose.words as aw

try:
    # Load the PDF and save it as Markdown.
    document1 = aw.Document("自訂格式(電商匯入中心)11.pdf")
    save_options = aw.saving.MarkdownSaveOptions()
    save_options.image_resolution = 150
    document1.save("自訂格式(電商匯入中心)11.md", save_options)
except Exception:
    traceback.print_exc()

File:
自訂格式(電商匯入中心)11.pdf (228.2 KB)

Memory use:

@David_Matin

When converting a PDF to Markdown with Aspose.Words, memory usage can be high, especially with large documents. To reduce it, enable the memory_optimization option on MarkdownSaveOptions. It lowers memory consumption during saving, at the cost of a longer save time.

Here’s how you can modify your code to enable memory optimization:

import traceback

import aspose.words as aw

try:
    document1 = aw.Document("自訂格式(電商匯入中心)11.pdf")
    save_options = aw.saving.MarkdownSaveOptions()
    save_options.image_resolution = 150
    save_options.memory_optimization = True  # Enable memory optimization
    document1.save("自訂格式(電商匯入中心)11.md", save_options)
except Exception:
    traceback.print_exc()

Setting save_options.memory_optimization to True can reduce the memory footprint of the save step. The option takes effect only when document1.save is called, so it does not change memory use while the PDF is being loaded.

For more details on the MarkdownSaveOptions class and its properties, see the class reference [1]; for memory optimization specifics, see the SaveOptions.MemoryOptimization page [2].

If you continue to experience issues, consider checking the size and complexity of the PDF document you are converting, as these factors can also impact memory usage.
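To see which step drives memory use, it can help to record the process's peak resident memory around each call. The following is a minimal sketch, not part of Aspose.Words: peak_rss_mib and run_and_report are hypothetical helper names, and the standard-library resource module they rely on is available on Linux and macOS only, not on Windows.

```python
import resource  # Unix-only standard-library module
import sys


def peak_rss_mib():
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def run_and_report(label, func):
    """Call func(), print the peak RSS before and after, and return func's result."""
    before = peak_rss_mib()
    result = func()
    after = peak_rss_mib()
    print(f"{label}: peak RSS {before:.1f} -> {after:.1f} MiB")
    return result


# Possible usage with the code from this thread (assumes `import aspose.words as aw`):
# doc = run_and_report("load", lambda: aw.Document("自訂格式(電商匯入中心)11.pdf"))
# run_and_report("save", lambda: doc.save("自訂格式(電商匯入中心)11.md", save_options))
```

Because the peak value never decreases, a jump during "load" rather than "save" would suggest that save options such as memory_optimization are not the relevant knob.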

Sources:
[1]: MarkdownSaveOptions Class | Aspose.Words for .NET
[2]: SaveOptions.MemoryOptimization | Aspose.Words for .NET

@David_Matin It looks like the problem occurs because the image cannot be loaded properly. On my side, Aspose.Words throws an exception upon loading your PDF document.
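To confirm on your side whether the failure happens while loading or while saving, the two steps can be wrapped separately. This is only a sketch: convert_in_stages is a hypothetical helper, not an Aspose.Words API, and it takes the load and save steps as callables so that each stage is reported on its own.

```python
import traceback


def convert_in_stages(load, save):
    """Run load(), then save(doc), and report which stage failed.

    Returns "ok", "load failed", or "save failed".
    """
    try:
        doc = load()
    except Exception:
        traceback.print_exc()
        return "load failed"
    try:
        save(doc)
    except Exception:
        traceback.print_exc()
        return "save failed"
    return "ok"


# Possible usage with the code from this thread (assumes `import aspose.words as aw`):
# status = convert_in_stages(
#     lambda: aw.Document("自訂格式(電商匯入中心)11.pdf"),
#     lambda doc: doc.save("自訂格式(電商匯入中心)11.md", aw.saving.MarkdownSaveOptions()),
# )
# print(status)
```

A "load failed" result would match the behavior described above, where the exception is raised before any save options are applied.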

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSNET-27991

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.