We have observed a huge difference in results between converting PDF to Word with the online tool and with our code. The Word file converted online retains the header and footer, and its fonts and alignment are good. The Word file converted by code, however, has missing fonts, and the alignment and header are not maintained. We have tried different combinations of settings and save methods, but nothing matches the online conversion. Kindly let us know which settings or conversion method the online tool uses.
// pdfBytes holds the source PDF; wordBytes receives the converted DOCX.
// Wrapping pdfBytes directly leaves the stream positioned at the start.
using (MemoryStream inputStream = new MemoryStream(pdfBytes))
using (MemoryStream outputStream = new MemoryStream())
{
    // Open the PDF document
    Document pdfDocument = new Document(inputStream);
    //pdfDocument.Save(@"c:\code\TestAspose.docx", SaveFormat.DocX);
    var docSaveOptions = new DocSaveOptions()
    {
        Format = DocSaveOptions.DocFormat.DocX,
        Mode = DocSaveOptions.RecognitionMode.Flow,
        RecognizeBullets = true,
        RelativeHorizontalProximity = 2.5f
    };
    // Save the output as DOCX
    pdfDocument.Save(outputStream, docSaveOptions);
    wordBytes = outputStream.ToArray();
}
Could you please ZIP and attach your input PDF and problematic output DOCX here for testing? We will investigate the issue and provide you more information on it.
The online PDF to Word converter uses Aspose.Words for .NET. The fonts in the output DOCX are incorrect because the fonts used in your PDF are not installed on the system.
With Aspose.PDF for .NET, there is no issue with the fonts and alignment of text in the DOCX. However, the PDF header is not exported as the header of the Word document. We have logged this problem in our issue tracking system as PDFNET-51473. You will be notified via this forum thread once the issue is resolved.
Thanks for logging this problem in the issue tracker. It's not only the header and footer; the whole flow of the document is different. Please try switching to Web Layout in Word for both documents (online and code) to see the difference. The PDF uses simple fonts (Times and Calibri), and they are not converted to Word properly.
You mentioned that the online tool uses Aspose.Words for .NET. How does that work? Do you load the PDF into an Aspose.Words document? Could you please share a snippet for this? Thanks.
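For reference, loading a PDF into an Aspose.Words document might look roughly like the sketch below. It is only an illustration, not the online tool's actual code: it assumes an Aspose.Words for .NET version that supports PDF import, and the class and method names (PdfToWordSketch, Convert) are hypothetical. The types are fully qualified to avoid a clash with Aspose.Pdf.Document.

```csharp
using System.IO;

public static class PdfToWordSketch
{
    // Hypothetical helper: takes PDF bytes and returns DOCX bytes.
    public static byte[] Convert(byte[] pdfBytes)
    {
        using (var input = new MemoryStream(pdfBytes))
        using (var output = new MemoryStream())
        {
            // Assumption: this Aspose.Words version detects and imports PDF input.
            var doc = new Aspose.Words.Document(input);

            // Save the loaded document as DOCX into the output stream.
            doc.Save(output, Aspose.Words.SaveFormat.Docx);
            return output.ToArray();
        }
    }
}
```

If the fonts used in the PDF are not installed on the machine running this code, the output may still show substituted fonts, as noted above.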
The input PDF uses a different font for the body text. Please check the attached image. image.png (358.0 KB)
Could you please ZIP and attach your expected output Word document here for our reference? We will investigate this issue further and provide you more information on it.