PDF Detection

drolle · July 8, 2020, 6:46pm

We are using the following code to detect the file type. We expect it to return 40 (PDF), but it returns 70 (TEXT).

import com.aspose.words.FileFormatInfo
import com.aspose.words.FileFormatUtil
import com.aspose.words.SaveFormat
…
FileFormatInfo fileFormatInfo = FileFormatUtil.detectFileFormat(inputStream)
int saveFormat = FileFormatUtil.loadFormatToSaveFormat(fileFormatInfo.loadFormat)
return Optional.of(saveFormat)

The files are PDFs that appear to have been created with Microsoft’s Print to PDF functionality. I can provide sample docs as needed.
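
For anyone hitting the same result, a quick check that does not depend on Aspose.Words can confirm whether the bytes reaching the detector really begin with the PDF signature `%PDF-`. This is only a minimal sketch: the class name `PdfSignatureCheck` and the helper `looksLikePdf` are hypothetical, not part of any library, and the check looks only at offset 0 of the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Aspose.Words.
public class PdfSignatureCheck {

    // A PDF file starts with the ASCII bytes "%PDF-".
    private static final byte[] SIGNATURE = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the stream begins with the PDF signature.
    // Note: this consumes the first bytes of the stream, so reopen it
    // (or use mark/reset where supported) before passing it elsewhere.
    static boolean looksLikePdf(InputStream in) throws IOException {
        byte[] head = new byte[SIGNATURE.length];
        int read = 0;
        // read() may return fewer bytes than requested, so loop until full or EOF.
        while (read < head.length) {
            int n = in.read(head, read, head.length - read);
            if (n < 0) {
                break; // end of stream before the signature was complete
            }
            read += n;
        }
        return read == head.length && Arrays.equals(head, SIGNATURE);
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = "%PDF-1.7\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikePdf(new ByteArrayInputStream(sample)));
    }
}
```

If this returns false for a stream that should hold a PDF, the stream may already have been partially read before detection, which is worth ruling out first.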

tahir.manzoor · July 8, 2020, 7:28pm

@drolle

Please use the latest version, Aspose.Words for Java 20.6. If you still face the problem, please ZIP and attach your input PDF file here for testing. We will investigate the issue and provide you with more information.

drolle · July 8, 2020, 7:43pm

@tahir.manzoor

Please see zip for examples.

Archive.zip (558.5 KB)

tahir.manzoor · July 9, 2020, 8:20am

@drolle

We have tested the scenario using the latest version, Aspose.Words for Java 20.6, with the following code example and could not reproduce the issue. So, please use Aspose.Words for Java 20.6.

FileFormatInfo info = FileFormatUtil.detectFileFormat("C:\\Users\\temp\\19-608.pdf");
System.out.println(info.getLoadFormat());

The returned load format value is 64. Please check the load format values here: