Convert XFA Based PDF to DOCX and Concatenate using Aspose.PDF for .NET - data is lost in output

mpoole · July 12, 2017, 4:42pm

Aspose.Pdf for .NET, version 17.4.
Code is in C#.

I have five PDFs: four were created from MS Word (saved as PDF), and one is an XFA-based PDF.
When I concatenate all five into one PDF, the data from the XFA PDF is missing from the final PDF.

Here is the code I’m using to concatenate the five PDFs into one.

_streamList is a List holding the five PDFs as streams.

PdfFileEditor editor = new PdfFileEditor();
using (FileStream stream = File.Create(path))
{
    editor.Concatenate(_streamList.ToArray(), stream);
}

Any help is much appreciated!
-Mark.

asad.ali · July 12, 2017, 7:56pm

@mpoole

Thanks for contacting support.

Please note that we always recommend using the latest version of the API, currently Aspose.Pdf for .NET 17.7, because it includes the latest fixes and enhancements. You may also concatenate the PDF files with the DOM (Document Object Model) approach, which we recommend over the older Aspose.Pdf.Facades API.

You can find more information about PDF concatenation via DOM in the “Concatenate PDF Files” article in the API documentation. If you still face any issue, we would appreciate it if you could share sample input documents so that we can try to replicate the issue on our end and address it accordingly.
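
For illustration, here is a minimal sketch of what DOM-based concatenation can look like. It assumes the inputs are already available as streams (as in the original post); the class name PdfMerger and the method name ConcatenateViaDom are hypothetical, and nothing in this thread confirms that this approach preserves XFA form data.

```csharp
using System.Collections.Generic;
using System.IO;
using Aspose.Pdf;

public static class PdfMerger
{
    // Hypothetical helper: appends the pages of every input PDF to one new document.
    public static void ConcatenateViaDom(IEnumerable<Stream> inputs, string outputPath)
    {
        var sources = new List<Document>();
        try
        {
            using (var merged = new Document())
            {
                foreach (Stream input in inputs)
                {
                    // Keep each source document open until the merged file is saved.
                    var source = new Document(input);
                    sources.Add(source);
                    merged.Pages.Add(source.Pages);
                }

                merged.Save(outputPath);
            }
        }
        finally
        {
            // Release the source documents only after saving.
            foreach (Document source in sources)
            {
                source.Dispose();
            }
        }
    }
}
```

With the variables from the original post, the call would be PdfMerger.ConcatenateViaDom(_streamList, path).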

Best Regards,
Asad Ali

mpoole · October 2, 2017, 4:34pm

Hello,
Using Aspose.Pdf 17.9, I’m not able to concatenate the following three PDFs and keep all of their data:

AcroForms.1.pdf (146.3 KB)
AcroForms.2.pdf (131.0 KB)
Original.XfaBased.pdf (464.4 KB)

The data from the XFA-based PDF is not included in the final concatenated PDF.
I’m using this call as the core of my concatenation method:
pdfEditor.Concatenate(listStream.ToArray(), stream);

Additionally, I’m not able to convert the XFA-based PDF to an AcroForm document. I’ve tried the following methods:

var convertOptions = new PdfFormatConversionOptions(PdfFormat.PDF_X_1A);
doc.Convert(convertOptions);

doc.RemovePdfaCompliance();

Converting the PDF to Word (DOC or DOCX); no data is preserved in the conversion.

Any help you can offer would be much appreciated.

-Mark

asad.ali · October 2, 2017, 9:30pm

@mpoole

Thanks for your inquiry.

Since you have also reported these issues in another forum thread, we have already logged them as PDFNET-43439 and PDFNET-43440 in our issue tracking system and shared the relevant details with you in that thread.

We have also observed that the field values are not preserved when the XFA-based PDF is converted to DOCX, so we have logged a separate ticket, PDFNET-43441, for this issue in our issue tracking system. As soon as we have definite updates on the resolution of the logged tickets, we will inform you. Please be patient and allow us a little time.
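
For reference, the API's documented route for turning a dynamic XFA form into a standard AcroForm is to set the form type on the document, rather than converting to a PDF/X format. The sketch below shows that call in its simplest form; the class name XfaConverter and the method name ConvertXfaToAcroForm are hypothetical, and it has not been verified against the attached Original.XfaBased.pdf, so whether the field data survives for that file remains an open question.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Forms;

public static class XfaConverter
{
    // Hypothetical helper: saves a copy of an XFA-based PDF as a standard AcroForm PDF.
    public static void ConvertXfaToAcroForm(string inputPath, string outputPath)
    {
        using (var doc = new Document(inputPath))
        {
            // Switch the form from dynamic XFA to a standard AcroForm.
            doc.Form.Type = FormType.Standard;
            doc.Save(outputPath);
        }
    }
}
```

If the conversion keeps the data, the converted file could then be concatenated with the other PDFs in place of the original XFA document.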

We are sorry for the inconvenience.