Hi Team,
We are using the logic below to convert an OCR-processed PDF to PDF/A compliance. However, after the conversion we noticed that the PDF is no longer searchable. We tried the Aspose online converter tool and were able to convert the file to PDF/A without any issue. Please advise whether we need to set any additional parameters in the code below, or please share the code used in the online converter application. Thanks.
Aspose-pdf-20.8.jar
Java 8
com.aspose.pdf.Document doc = new com.aspose.pdf.Document(new ByteArrayInputStream(FileUtils.readFileToByteArray(new File("sample.pdf"))));
while (!doc.validate(new PdfFormatConversionOptions(PdfFormat.PDF_A_2U, ConvertErrorAction.Delete))) {
    doc.convert(new PdfFormatConversionOptions(PdfFormat.PDF_A_2U, ConvertErrorAction.Delete));
}
doc.save("sample-pdf-a.pdf");
The shared PDF (which you obtained from the online app) is PDF/A-2a compliant, whereas your code snippet converts the document to PDF/A-2u. Nevertheless, we were able to reproduce the issue, i.e. the output document was no longer searchable, and have logged it as PDFJAVA-39766 in our issue tracking system. We will look further into its details and keep you informed about the status of its rectification. Please be patient and give us some time.
We regret to inform you that the issue has not yet been resolved because of other issues that were logged before it. However, we have recorded your concerns and will surely consider them during the investigation. We will inform you as soon as we have updates regarding its rectification. Please give us some time.