Dear Aspose team,
I am converting a PDF file to DOCX (both files are attached in a zip archive). My problem is that in some cases the content of bullet lists is not represented in the XML markup as normal text runs containing plain text; instead, each character is converted into a w:sym element.
Can you tell me what makes this text special, and what prevents it from being converted into plain w:r/w:t elements?
Thanks in advance,
Best regards,
Gergely Vándor
Hi Gergely,
Thanks for your inquiry. I have tested the conversion using Aspose.Pdf for .NET 12.0.0 with the following code snippet and was unable to find any w:sym markup in the resulting DOCX file. Please download and try the latest version of Aspose.Pdf for .NET with the following code and share the results.
// Open the source PDF document
Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(@"orig.pdf");
// Create a DocSaveOptions object to control the conversion
Aspose.Pdf.DocSaveOptions saveOptions = new Aspose.Pdf.DocSaveOptions();
saveOptions.Format = Aspose.Pdf.DocSaveOptions.DocFormat.DocX;
// Set the recognition mode as Flow
saveOptions.Mode = Aspose.Pdf.DocSaveOptions.RecognitionMode.Flow;
// Set the Horizontal proximity as 2.5
saveOptions.RelativeHorizontalProximity = 2.5f;
// Enable bullet recognition during the conversion process
saveOptions.RecognizeBullets = true;
// Save the resulting DOCX file
pdfDocument.Save(@"saveOptionsOutput_out_.docx", saveOptions);
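To compare results between versions, it may help to count the w:sym and w:t elements in the output instead of inspecting the markup by hand. The following is only a minimal sketch using the .NET base class library (System.IO.Compression and System.Xml.Linq), not any Aspose API. The class name DocxSymCheck is hypothetical, and the sketch assumes the text of interest is in the main document part, word/document.xml.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (not part of Aspose.Pdf): counts w:sym and w:t
// elements in the main document part of a DOCX package.
public static class DocxSymCheck
{
    // WordprocessingML main namespace, conventionally bound to the "w" prefix.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static (int SymCount, int TextCount) Count(Stream docxStream)
    {
        // A DOCX file is a zip archive; the body lives in word/document.xml.
        using (var zip = new ZipArchive(docxStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found in package");

            using (Stream part = entry.Open())
            {
                XDocument xml = XDocument.Load(part);
                int symCount = xml.Descendants(W + "sym").Count();
                int textCount = xml.Descendants(W + "t").Count();
                return (symCount, textCount);
            }
        }
    }

    public static (int SymCount, int TextCount) Count(string path)
    {
        using (FileStream file = File.OpenRead(path))
        {
            return Count(file);
        }
    }
}
```

Calling DocxSymCheck.Count(@"saveOptionsOutput_out_.docx") on the file produced above and on the originally attached DOCX would show whether the newer version still emits w:sym elements for the bullet list text.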
Best Regards,