Issue with Arabic Text Conversion from PDF to Word using Aspose PDF for C#

Hi team,
We are facing an issue while converting a PDF that contains Arabic text to Word format. The document is converted and the formatting is maintained, but some of the text is broken and some of it overlaps.
Can you please help with this?
We are using Aspose.PDF for C#.

@asemalsaif

When converting a PDF containing Arabic text to Word format using Aspose.PDF for C#, issues such as text breaking and overlapping can occur. To address these problems, you can try the following approach:

  1. Use the Correct Save Options: Ensure that you are using the DocSaveOptions with the appropriate recognition mode. Specifically, set the mode to DocSaveOptions.RecognitionMode.Flow or DocSaveOptions.RecognitionMode.EnhancedFlow. This helps maintain the layout and flow of the text during conversion.

Here is a sample code snippet that demonstrates how to implement this:

using Aspose.Pdf;

class Program
{
    static void Main()
    {
        // Load the PDF document
        Document pdfDocument = new Document("path/to/your/input.pdf");

        // Create DocSaveOptions and set the format and mode
        DocSaveOptions saveOptions = new DocSaveOptions();
        saveOptions.Format = DocSaveOptions.DocFormat.DocX;
        saveOptions.Mode = DocSaveOptions.RecognitionMode.Flow; // or EnhancedFlow

        // Save the document to DOCX format
        pdfDocument.Save("path/to/your/output.docx", saveOptions);
    }
}
  2. Check for Anomalies: After conversion, if you still notice any issues, it would be helpful to share screenshots or the specific files that exhibit these problems. This allows for better troubleshooting and assistance.
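
Before sharing files, it may help to see which recognition mode handles the Arabic text best. The following is a minimal sketch, not an official recommendation: it converts the same PDF once per recognition mode so the outputs can be compared side by side. The input path and the output file names are placeholders, and it assumes your Aspose.PDF version includes the EnhancedFlow mode (older releases may only offer Textbox and Flow).

```csharp
using System;
using Aspose.Pdf;

class CompareRecognitionModes
{
    static void Main()
    {
        // Placeholder path: point this at the PDF that shows the problem.
        string inputPath = "path/to/your/input.pdf";

        // Modes to compare. EnhancedFlow is assumed to be available in your version.
        var modes = new[]
        {
            DocSaveOptions.RecognitionMode.Textbox,
            DocSaveOptions.RecognitionMode.Flow,
            DocSaveOptions.RecognitionMode.EnhancedFlow
        };

        foreach (var mode in modes)
        {
            // Reload the PDF for each run so every conversion starts from the same state.
            using (Document pdfDocument = new Document(inputPath))
            {
                var saveOptions = new DocSaveOptions
                {
                    Format = DocSaveOptions.DocFormat.DocX,
                    Mode = mode
                };

                // Illustrative output name, e.g. output_Flow.docx
                string outputPath = $"output_{mode}.docx";
                pdfDocument.Save(outputPath, saveOptions);
                Console.WriteLine($"Saved {outputPath}");
            }
        }
    }
}
```

If one of the outputs still shows broken or overlapping Arabic text, attaching that DOCX together with the source PDF and the mode used makes the problem easier to reproduce.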

By following these steps, you should be able to improve the conversion quality of Arabic text from PDF to Word. If issues persist, consider reaching out with specific examples for further support.
