Issues with converting PDF with forms to JPEG and other formats

We are using Aspose.PDF to convert PDFs to JPEG format. This works fine for PDFs without form data, but when we try to process PDFs with form data, we often get unexpected or unsatisfactory results.

One of the issues that occurs is that the PDF form entries/values are sometimes absent from the converted JPEG. When I open the PDF in Adobe Acrobat, the form elements are present and act normally. I’ve attached a sample PDF with this issue as well as the converted JPEG (issue #1).

Another issue that often happens when we convert a PDF with forms to JPEG is that the form values are present but duplicated, with the duplicates slightly offset from the original value, resulting in unreadable text. The really strange thing about this issue is that, within the same document, some elements are duplicated and others are not. I am unable to attach a PDF with this issue as they contain sensitive information, but I’ve attached a small screenshot of some of the affected text from the converted JPEG (issue #2).

The C# code we use for the conversion is as follows:

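using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Devices;
using Aspose.Pdf.Facades;

using (var pdfDocument = new Document(dataStream))
using (var info = new PdfFileInfo(pdfDocument))
using (var outputMemoryStream = new MemoryStream())
{
    ...

    // Flatten the form fields into the page content before rendering.
    pdfDocument.Form.Flatten();

    // Render at 72 DPI; pageSize and settings.JpgQuality are defined elsewhere.
    var resolution = new Resolution(72, 72);
    JpegDevice jpegDevice = new JpegDevice(pageSize, resolution, settings.JpgQuality);

    // Pages is 1-based, hence the + 1.
    jpegDevice.Process(pdfDocument.Pages[pageNum + 1], outputMemoryStream);

    ...
}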

We have tried both with and without the pdfDocument.Form.Flatten() call, and it does not seem to make a difference.

The only “fix” we have found for either issue is to save the problematic file in Adobe Acrobat (without making any changes) and then process it with Aspose. This always yields the correct result, but obviously it is not a usable solution.

Issue 1 PDF.pdf (64.2 KB)

Issue 1 JPEG.jpeg (76.8 KB)

Issue 2.jpg (69.2 KB)

@Jacob20
We are looking into it and will be sharing our feedback with you shortly.

@Jacob20

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-56807

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.