PDF to PDF/A conversion issue


#1

Hello,

I have an issue converting a PDF to PDF/A-1b when using com.aspose.pdf.Document.save().

I have this method:

private void convertPDFToASPOSEPDFDocument(com.aspose.pdf.Document inputPDFDocument, int pdfformat) throws IOException, ConvertionToPDFException {
		if (LOGGER.isDebugEnabled()) {
			LOGGER.debug("convertPDFToASPOSEPDFDocument PDFDocument to PDFFormat " + pdfformat);
		}
		
		//create a temporary file to log conversion
		File conversionLog = File.createTempFile("ConversionLog", ".xml");
		
		if (!this.validatePDF(inputPDFDocument, pdfformat) &&
				!inputPDFDocument.convert(new FileOutputStream(conversionLog), pdfformat, com.aspose.pdf.ConvertErrorAction.Delete)) {	
				//Write the conversion log to the Windchill LOGGER
				List<String> convertionError = Files.readAllLines(conversionLog.toPath(), StandardCharsets.UTF_8);
				WTMessage message = new WTMessage(RESOURCE, ASPOSEMessages.CONVERTION_PDF_TO_PDF_FAILURE,	new Object[] {pdfformat,  convertionError});
				throw new ConvertionToPDFException(message.getLocalizedMessage());
		}
		
		inputPDFDocument.save(); // <-- this is the call that does not produce a compliant file
		
		if (LOGGER.isDebugEnabled()) {
			LOGGER.debug("The content of convertion log \n:"  );
			LOGGER.debug(Files.readAllLines(conversionLog.toPath(), StandardCharsets.UTF_8));
		}
		
		//delete the temporary validation log
		Files.delete(conversionLog.toPath());		
		
	}

If inputPDFDocument is instantiated from a File, the resulting file is not PDF/A-1b compliant.
It only works if I save the document to another location or stream, not when I call com.aspose.pdf.Document.save() with no arguments.

Kind regards,

Tony.


#2

@tsuaudeau

Thank you for contacting support.

Would you please share a PDF document that reproduces the problem, as well as one that works fine, so that we can reproduce and investigate the issue in our environment? Before sharing the requested data, please make sure you are using Aspose.PDF for Java 19.9.


#3

Hello @Farhan.Raza,
Thank you for your reply. Yes, we are using version 19.9.

You can reproduce the issue with any non-PDF/A document, but I have attached an example: dummy.pdf (13.0 KB)

With the following code, docpdf.save() does not work: the second validation reports that the PDF is still not PDF/A-1b compliant.

              /** NOT WORKING */
	com.aspose.pdf.Document docpdf = new com.aspose.pdf.Document("/opt/ptc/aspose/validation/dummy.pdf");		
	//create a file to log PDF validation
	File validationLog = File.createTempFile("validationLogPDF", ".xml");				
			
	boolean isValid1 = docpdf.validate(new FileOutputStream(validationLog), com.aspose.pdf.PdfFormat.PDF_A_1B);
	
	System.out.println("1 is valid =" + isValid1);
	//false
	if (!isValid1) {
		//create a temporary file to log conversion
		File conversionLog = File.createTempFile("ConversionLog", ".xml");
		boolean conversionok = docpdf.convert(new FileOutputStream(conversionLog), com.aspose.pdf.PdfFormat.PDF_A_1B, com.aspose.pdf.ConvertErrorAction.Delete);
		if (!conversionok) {
			System.out.println("Conversion KO");
		}else {				
			docpdf.save();
			docpdf.close();				
		}			
	}
	
	//We re-check the previous converted document
	com.aspose.pdf.Document docpdfrecheck = new com.aspose.pdf.Document("/opt/ptc/aspose/validation/dummy.pdf");
	//create a file to log PDF validation
	File validationLog2 = File.createTempFile("validationLogPDF2", ".xml");			
	boolean isValid2 = docpdfrecheck.validate(new FileOutputStream(validationLog2), com.aspose.pdf.PdfFormat.PDF_A_1B);
	docpdfrecheck.close();
	
	System.out.println("Previous converted document is valid =" + isValid2);
        //false should be true!!!!

The only way to make it work is to save the Document to an intermediate OutputStream:

           /**WORKING */
	com.aspose.pdf.Document docpdf = new com.aspose.pdf.Document("/opt/ptc/aspose/validation/dummy.pdf");
	
	//create a file to log PDF validation
	File validationLog = File.createTempFile("validationLogPDF", ".xml");				
			
	boolean isValid1 = docpdf.validate(new FileOutputStream(validationLog), com.aspose.pdf.PdfFormat.PDF_A_1B);
	
	System.out.println("1 is valid =" + isValid1);
	//false
	if (!isValid1) {
		//create a temporary file to log conversion
		File conversionLog = File.createTempFile("ConversionLog", ".xml");
		boolean conversionok = docpdf.convert(new FileOutputStream(conversionLog), com.aspose.pdf.PdfFormat.PDF_A_1B, com.aspose.pdf.ConvertErrorAction.Delete);
		if (!conversionok) {
			System.out.println("Conversion KO");
		}else {
                            //Would like to avoid this copy				
			ByteArrayOutputStream output = new ByteArrayOutputStream();
			docpdf.save(output);
			docpdf.close();
			
			com.aspose.pdf.Document docpdf2 = new com.aspose.pdf.Document(new ByteArrayInputStream(output.toByteArray()));
			docpdf2.save("/opt/ptc/aspose/validation/dummy.pdf");
			docpdf2.close();
		}			
	}
	
	//We re-check the previous converted document
	com.aspose.pdf.Document docpdfrecheck = new com.aspose.pdf.Document("/opt/ptc/aspose/validation/dummy.pdf");
	//create a file to log PDF validation
	File validationLog2 = File.createTempFile("validationLogPDF2", ".xml");			
	boolean isValid2 = docpdfrecheck.validate(new FileOutputStream(validationLog2), com.aspose.pdf.PdfFormat.PDF_A_1B);
	docpdfrecheck.close();
	
	System.out.println("Previous converted document is valid =" + isValid2);
            //true

I would like to avoid the intermediate OutputStream and just call the save() method after the conversion.

Kind regards,


#4

@tsuaudeau

Thank you for elaborating further.

We have logged a ticket with ID PDFJAVA-38931 in our issue management system for further investigations. The ticket ID has been linked with this thread so that you will receive notification as soon as the ticket is resolved.

We are sorry for the inconvenience.


#5

Hello,

The ticket PDFJAVA-38931 is marked as resolved.
Do you know in which version?

Kind regards,
Tony.


#6

@tsuaudeau

Thank you for getting back to us.

After investigation, we have determined that this is expected behavior and not a bug. The save() method without any parameters saves the document incrementally: the document’s existing data is not changed, and new information is appended to the end of the document’s file. PDF/A validation does not accept the result, because the forbidden features are still present in the document.
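
To illustrate why an incremental save leaves the old content in place, here is a minimal plain-Java analogy. It does not use Aspose.PDF and is not how the library is implemented; the class and method names (IncrementalSaveAnalogy, saveIncrementally, saveFully) are hypothetical and only mimic "append to the end" versus "rewrite the whole file".

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical analogy only: plain file I/O, not the Aspose.PDF implementation.
public class IncrementalSaveAnalogy {

    /** Appends an update to the end of the file; the original bytes stay in place. */
    public static void saveIncrementally(Path file, String update) throws IOException {
        Files.write(file, update.getBytes(StandardCharsets.UTF_8), StandardOpenOption.APPEND);
    }

    /** Rewrites the whole file, so nothing from the previous version survives. */
    public static void saveFully(Path file, String content) throws IOException {
        // Without explicit options, Files.write creates or truncates the file.
        Files.write(file, content.getBytes(StandardCharsets.UTF_8));
    }
}
```

After saveIncrementally, the file still begins with whatever was there before, which is the situation a PDF/A validator rejects. After saveFully, only the new content remains.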

So there are two options we can advise:

  1. Read the document’s data into a temporary ByteArrayInputStream, then save the converted document to the source file:
...
 byte[] fileContent = Files.readAllBytes(new File(documentPath).toPath());
 com.aspose.pdf.Document docpdf = new com.aspose.pdf.Document(new ByteArrayInputStream(fileContent));
...
//docpdf.save();
docpdf.save(documentPath);
...
  2. Save the converted document into a temporary ByteArrayOutputStream, then write this stream to the source file:
...
ByteArrayOutputStream output = new ByteArrayOutputStream();
//docpdf.save();
docpdf.save(output);
Files.write(new File(documentPath).toPath(),output.toByteArray());
...
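
The last step of the second option (writing the converted bytes back over the source file) could be sketched as a small helper like the one below. This is only a sketch under assumptions: the class and method names (ConvertedPdfWriter, replaceFile) are hypothetical, and writing to a sibling temporary file before moving it over the original is a design choice not taken from the thread, made so that a failed write does not leave a half-written PDF behind.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Aspose.PDF.
public class ConvertedPdfWriter {

    /**
     * Replaces the file at target with convertedBytes by first writing a
     * temporary file in the same directory and then moving it over the target.
     */
    public static void replaceFile(Path target, byte[] convertedBytes) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "converted-", ".tmp");
        try {
            Files.write(tmp, convertedBytes);
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up if the write or move failed.
            Files.deleteIfExists(tmp);
        }
    }
}
```

With the code from this thread, it would be called after docpdf.save(output) and docpdf.close(), for example as ConvertedPdfWriter.replaceFile(new File(documentPath).toPath(), output.toByteArray()).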