Converting PDF to Word, then combining the Word files


#1

Part1.pdf (56.4 KB)
Part2.pdf (74.9 KB)

I have a slight issue when combining Word files using the code below. My application has combined Word files for a long time with no problems. Users are now uploading PDFs to my app, which I convert into Word files and then append together. Unfortunately, the combined file does not render correctly. I can guess the reason: when Aspose.Pdf converts a PDF to Word the result looks perfect, but all the text and images are floating, so when I combine it with other Word files the floating content overlaps. Is there a way of getting the output to look correct?

For example, I have two PDF files, Part1.pdf and Part2.pdf. These PDFs are converted to Word files, which I then combine into a file called Combined.docx. In the final file some text overlaps; for example, I would expect the header "Company 2" to be further down.

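    // Convert Part1.pdf to Word.
    Aspose.Pdf.Document FirstDoc = new Aspose.Pdf.Document("Part1.pdf");
    Aspose.Pdf.Document FirstDocWord = new Aspose.Pdf.Document();

    // Copy every page into the new document (the Pages collection is 1-based).
    int createdFilePageCount = FirstDoc.Pages.Count;
    for (int s = 1; s <= createdFilePageCount; s++)
    {
        FirstDocWord.Pages.Add(FirstDoc.Pages[s]);
    }

    // Note: this saves FirstDoc itself, not the copy built above.
    FirstDoc.Save("Part1.docx", Aspose.Pdf.SaveFormat.DocX);

    // Convert Part2.pdf to Word in the same way.
    Aspose.Pdf.Document SecondDoc = new Aspose.Pdf.Document("Part2.pdf");
    Aspose.Pdf.Document SecondDocWord = new Aspose.Pdf.Document();

    int createdFilePageCountTwo = SecondDoc.Pages.Count;
    for (int s = 1; s <= createdFilePageCountTwo; s++)
    {
        SecondDocWord.Pages.Add(SecondDoc.Pages[s]);
    }

    // Note: this saves SecondDoc itself, not the copy built above.
    SecondDoc.Save("Part2.docx", Aspose.Pdf.SaveFormat.DocX);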

    // Apply the Aspose.Words license before working with documents.
    Aspose.Words.License license = new Aspose.Words.License();
    license.SetLicense("Aspose.Words.lic");

    // Create the destination document and clear its default content.
    Aspose.Words.Document doc = new Aspose.Words.Document();
    doc.RemoveAllChildren();

    Aspose.Words.Document srcDocOne = new Aspose.Words.Document("Part1.docx");
    Aspose.Words.Document srcDocTwo = new Aspose.Words.Document("Part2.docx");

    // Make the second document continue on the same page as the first.
    srcDocTwo.FirstSection.PageSetup.SectionStart = Aspose.Words.SectionStart.Continuous;

    // Append each source document at the end of the destination document.
    doc.AppendDocument(srcDocOne, Aspose.Words.ImportFormatMode.UseDestinationStyles);
    doc.AppendDocument(srcDocTwo, Aspose.Words.ImportFormatMode.UseDestinationStyles);

    doc.Save("Combined.docx");

#2

@Matt_b,

Thank you for sharing the details and sample files. We used your code snippet to convert the PDFs to Word format, and the following code snippet to combine the two Word files. The resulting file does not show any overlapping. A sample output (Combined.docx) is attached for your reference. Please use the latest version of the API with the given code snippet and let us know your feedback.

CODE:

    Aspose.Words.Document doc = new Aspose.Words.Document();
    doc.RemoveAllChildren();

    Aspose.Words.Document srcDocOne = new Aspose.Words.Document(@"Part1_18.5.docx");
    Aspose.Words.Document srcDocTwo = new Aspose.Words.Document(@"Part2_18.5.docx");
    doc.AppendDocument(srcDocOne, Aspose.Words.ImportFormatMode.UseDestinationStyles);
    doc.AppendDocument(srcDocTwo, Aspose.Words.ImportFormatMode.UseDestinationStyles);

    doc.Save(@"Combined.docx");

Combined.zip (83.6 KB)
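
If floating content still overlaps after upgrading, one option that may be worth trying is to change how the PDF is converted. This is a suggestion only and was not confirmed in this thread. Aspose.Pdf's `DocSaveOptions` has a `Mode` setting, and `RecognitionMode.Flow` asks the converter to produce ordinary flowing paragraphs rather than positioned text boxes. Flow mode can change the visual layout compared with the default, so check the output against the original PDF. A minimal sketch, with file names taken from the question and the class name chosen only for illustration:

```csharp
using Aspose.Pdf;

// Hypothetical wrapper class; only the DocSaveOptions usage matters here.
class PdfToFlowDocx
{
    static void Main()
    {
        // Load the source PDF (file name taken from the question).
        Document pdf = new Document("Part1.pdf");

        // Assumption: flow mode reduces floating text boxes in the output,
        // which may avoid overlap when the result is appended to other documents.
        DocSaveOptions options = new DocSaveOptions
        {
            Format = DocSaveOptions.DocFormat.DocX,
            Mode = DocSaveOptions.RecognitionMode.Flow
        };

        // Save as DOCX using the options above instead of SaveFormat.DocX.
        pdf.Save("Part1.docx", options);
    }
}
```

The same options object could be reused for Part2.pdf before the two Word files are appended with Aspose.Words.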