PDF properties not updated in PDF created with older Aspose version

Hi,
We are trying to update the properties of PDF files that were created with an older version of Aspose.
A copy of the file is made, but the properties themselves aren’t changed.
We are using Aspose.PDF for Java 20.6.
See the settings in the original PDF:
setting2.png (51.6 KB)
pdfsetting1.png (52.3 KB)

The PDFs were made with Aspose .NET 17.6.

Some code snippets:

			Document pdfDocument = null;
			try {
				pdfDocument = new Document(importfile);

				com.aspose.pdf.DocumentInfo docInfo = pdfDocument.getInfo();
				// set Author information
				docInfo.setAuthor("Backbook Migration");
				docInfo.setCreationDate(new java.util.Date());
				docInfo.setKeywords(referenceRate);
				docInfo.setModDate(new java.util.Date());
				docInfo.setSubject("Proposal");
				docInfo.setTitle("Company Contract 1234-3333");
				System.out.println("Adding metadata");
			} catch (Exception e)
			{
				// include the cause so a failure here is not silently hidden
				logger.error("Error adding information to PDF. Error " + e.getMessage());
			}
			try {
				logger.debug("Repairing PDF");
				pdfDocument.repair();
				logger.debug("Finished Repairing PDF");
			} catch (Exception e)
			{
				logger.error("Error repairing file.. Error "+ e.getMessage());
			}
			try {
				System.out.println("Saving PDF");
				
				pdfDocument.save(exportdir+"/company"+companyname()+".pdf");
				System.out.println("Finished saving pdf");
			} catch (Exception e)
			{
				logger.error("Error saving file.. Error "+ e.getMessage());
				
			}

@kpboerema

Would you please share your sample source PDF document with us. We will test the scenario in our environment and address it accordingly.

samples170d3f4f-3510-4373-98e1-efca065f83c6.pdf (10.9 KB)
extracted_.pdf (10.6 KB)

Hi,

I had to strip them because they contained confidential information. But the same happens with the extracted one: the result (samples…) has no updated data, and the extracted file doesn’t contain the updated metadata after updating either.
Hopefully you can explain.

@kpboerema

We have tested the scenario using the same code snippet with Aspose.PDF for Java 20.6 and could not reproduce the issue. Both PDF outputs were fine and showed the updated properties set by the code.

output.pdf (11.0 KB)
output1.pdf (11.0 KB)

It seems that some other part of your program is causing the issue at your end. Would you kindly test the scenario in a simple Console Application and, if the issue still persists, share that application with us? We will test the scenario in our environment again and address it accordingly.

Hello,

Maybe it is related to another issue: some parts of the PDF are missing a font.
test3.pdf (19.1 KB)

If we try to extract text from the PDF, the extraction fails with the error “Error processing and grabbing text from PDF. Invalid font name”. Could that affect the saving of the file?

Is there a way to fix this? I tried the repair method, but that didn’t fix the missing font.
Is there a way to replace missing fonts with a default font, so we can extract the text and save the file?
Below is the code that fails when extracting the text.
Maybe the problem with saving is related, as the font is still missing?
I will write a small program that only updates the properties and see what happens.

Document pdfDocument = new Document((isdirectory ? importdir + "/" : "") + pdffiles[i]);
try {
	pdfDocument.repair();
} catch (Exception e)
{
	System.err.println("Hmm, repairing failed...");
}
System.out.println("Processing file " + (isdirectory ? importdir + "/" : "") + pdffiles[i] + " number of pages: " + pdfDocument.getPages().size() + " number of images:" + nrOfImages(pdfDocument));

// prevent out of memory by not reading everything at once...
if (pdfDocument.getPages().size() < 100)
{
	TextAbsorber textAbsorber = new TextAbsorber();
	pdfDocument.getPages().accept(textAbsorber);
	extractedText = textAbsorber.getText();
}

Thanks,
Klaas Pieter.

@kpboerema

You can set the default font name for the PDF document. We tried that, but the API still threw the Invalid font name exception. Therefore, we have logged an issue as PDFJAVA-39535 in our issue tracking system for investigation. We will check this in detail and keep you posted on the status of its rectification. Please give us some time.

We used the following code snippet to set the default font and then extract the text:

Document doc = new Document(dataDir + "test3.pdf");
com.aspose.pdf.PdfSaveOptions ops = new com.aspose.pdf.PdfSaveOptions();
// Set default font name
ops.setDefaultFontName("Arial");
// Save PDF file
doc.save(dataDir + "output_out.pdf", ops);
doc = new Document(dataDir + "output_out.pdf");
TextAbsorber textAbsorber = new TextAbsorber();
doc.getPages().accept(textAbsorber);
String extractedText =  textAbsorber.getText();

Okay.

Another reason why the metadata is lost is doing a repair before saving.
I was setting the properties and then did a repair before saving.
The property information then seems to be lost and is not saved.
If I remove the repair, the metadata is saved.
Should repair only be used before changing anything?

		com.aspose.pdf.DocumentInfo docInfo = pdfDocument.getInfo();
		// set Author information
		docInfo.setAuthor("R4 Backbook Migration");
		docInfo.setCreationDate(new java.util.Date());
		docInfo.setKeywords("LIBOR");
		docInfo.setModDate(new java.util.Date());
		//docInfo.setSubject("PDF Information");
		//docInfo.setTitle("Setting PDF Document Information");
		System.out.println("Adding metadata");
		try {
			System.out.println("Repairing PDF");
			pdfDocument.repair();
			System.out.println("Finished Repairing PDF");
		} catch (Exception e)
		{
			System.err.println("Error repairing file.. Error "+ e.getMessage());
			//e.printStackTrace();
		}

It is the failed text grabbing that prevents the metadata from being updated.
If I remove the text-grabbing part, the metadata is updated.
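
For anyone hitting the same thing: one way to keep an optional stage such as text extraction from blocking the save is to run every stage in its own guarded step. The sketch below is plain Java and does not call Aspose at all. The class name GuardedSteps, the Step interface and the stage names are made up for illustration, and it assumes the failure surfaces as a normal exception. It will not help if the failed extraction leaves the document object itself in a bad state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Aspose: runs pipeline stages independently.
public class GuardedSteps {

    /** A stage that may throw, e.g. repair, text extraction, or save. */
    @FunctionalInterface
    public interface Step {
        void run() throws Exception;
    }

    private final List<String> failures = new ArrayList<>();

    /** Runs one stage and records a failure instead of aborting later stages. */
    public boolean attempt(String name, Step step) {
        try {
            step.run();
            return true;
        } catch (Exception e) {
            failures.add(name + ": " + e.getMessage());
            return false;
        }
    }

    /** Failures collected so far, in the order they occurred. */
    public List<String> failures() {
        return failures;
    }

    public static void main(String[] args) {
        GuardedSteps steps = new GuardedSteps();
        StringBuilder log = new StringBuilder();

        // Stand-ins for the real calls (setting DocumentInfo, TextAbsorber, save).
        steps.attempt("set metadata", () -> log.append("metadata set"));
        steps.attempt("extract text", () -> {
            throw new IllegalStateException("Invalid font name");
        });
        steps.attempt("save", () -> log.append("; saved"));

        System.out.println(log);              // the save stage still ran
        System.out.println(steps.failures()); // only the extraction failure is listed
    }
}
```

In a real program the lambdas would wrap the metadata, extraction and save calls from the snippets above, with the save stage placed so it does not depend on the extraction result.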

@kpboerema

We have already logged the ticket for the exception that occurs during text extraction. As soon as it is resolved, we will inform you. Meanwhile, please check your other PDF documents and share them with us if the same issue occurs.

The issues you have found earlier (filed as PDFJAVA-39535) have been fixed in Aspose.PDF for Java 20.10.