Convert Tiff to Searchable PDF

We have purchased Aspose for Java, and I am trying to convert a TIFF to a searchable PDF. I followed the instructions to install jai-1_1_3-lib-windows-i586.exe, but the generated PDF is 0 bytes.

Source Code:
Pdf pdf1 = new Pdf();
Section sec1 = pdf1.getSections().add();
aspose.pdf.Image image = new aspose.pdf.Image(sec1);
java.net.URL url = new java.net.URL("file:///D:/TIF.tif");
BufferedImage bufferImage = ImageIO.read(url);
image.getImageInfo().setSystemImage(bufferImage);
image.getImageInfo().setTiffFrame(1);
pdf1.save("D:/Coverted.pdf");

The generated PDF is 0 bytes. I have double-checked that the following JARs are imported:
jai_windows-i586.jar
aspose-pdf-jdk16.jar

Can anyone help me out? Thanks!

Hi Zhang,


We are sorry for the inconvenience caused. Please share your source TIFF image; we will test the scenario at our end and provide more information accordingly.

Best Regards,

As requested, please see the attached files. FYI, I noticed that the bufferImage object is null when I print it. Not sure if this is the cause.
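
A note on the null value: ImageIO.read returns null, without throwing, when no registered image reader can decode the input, so a null bufferImage usually points at a missing TIFF reader rather than at Aspose. The following is a minimal diagnostic sketch using only the Java standard library; the class name TiffReadCheck and the method diagnose are hypothetical names chosen for illustration and are not part of Aspose or of the original post.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import javax.imageio.ImageIO;

// Hypothetical helper for illustration; not part of any Aspose API.
final class TiffReadCheck {

    /** Returns a short description of the problem, or null if the file decodes. */
    static String diagnose(File file) throws IOException {
        if (!file.isFile()) {
            return "file not found: " + file.getAbsolutePath();
        }
        // ImageIO.read returns null when no registered reader accepts the data.
        BufferedImage img = ImageIO.read(file);
        if (img == null) {
            return "no ImageIO reader for this file; registered formats: "
                    + Arrays.toString(ImageIO.getReaderFormatNames());
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // The default path is the one from the question; pass another path as args[0].
        String problem = diagnose(new File(args.length > 0 ? args[0] : "D:/TIF.tif"));
        System.out.println(problem == null ? "image decoded" : problem);
    }
}
```

If the list of registered formats does not include a TIFF entry, the JAI/ImageIO TIFF plugin is not visible on the classpath of the running JVM, which would be consistent with the null value reported above.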

Hi Zhang,


Thanks for sharing the resource files and sorry for the delayed response.

I have tested the scenario and observed that loading the source TIFF image with new java.net.URL("file:///D:/pdftest/TIF.tif"); raises javax.imageio.IIOException: Can't get input stream from URL!. However, loading the TIFF image with image.getImageInfo().setFile("D:/pdftest/TIF.tif"); raises no such exception. Please take a look at the following code snippet; I have also attached the resultant PDF generated at my end.

Correct approach

[Java]

aspose.pdf.Pdf pdf1 = new aspose.pdf.Pdf();
aspose.pdf.Section sec1 = pdf1.getSections().add();
aspose.pdf.Image image = new aspose.pdf.Image(sec1);
// java.net.URL url = new java.net.URL("file:///D:/pdftest/TIF.tif");
// BufferedImage bufferImage = ImageIO.read(url);
// load the TIFF directly from its file path instead of through ImageIO
image.getImageInfo().setFile("D:/pdftest/TIF.tif");
image.getImageInfo().setTiffFrame(-1);
image.getImageInfo().setImageFileType(aspose.pdf.ImageFileType.Tiff);
// add the image to the section (this call was absent from the original code)
sec1.getParagraphs().add(image);
pdf1.save("c:/pdftest/TIFF_Coverted.pdf");


You may also consider using the new Document Object Model (DOM) approach of com.aspose.pdf package.

DOM approach
[Java]

// instantiate Document object
com.aspose.pdf.Document doc = new com.aspose.pdf.Document();
// add page to PDF file
doc.getPages().add();
// create Image object
com.aspose.pdf.Image img = new com.aspose.pdf.Image();
// load TIFF image
img.setFile("c:/pdftest/TIF.tif");
// add image to paragraphs collection of first page
doc.getPages().get_Item(1).getParagraphs().add(img);
// save resultant PDF
doc.save("c:/pdftest/DOM_Approach.pdf");

Great, it works for me. Thanks a lot for your help.

Hi Zhang,


We are glad to hear that your problem is resolved. Please continue using our API, and if you have any further queries, please feel free to contact us.

The DOM approach handles multipage TIFF to multipage PDF. However, my 1bpp TIFF file, in which all pages use CCITT4 compression, is being converted to a PDF with color. Can I control that by querying the tags within the TIFF file and setting save options?

Hi Mike,


Thanks for your inquiry. I have tested your shared document with Aspose.Pdf for Java 10.0.0 and was unable to reproduce the issue, i.e. colored output. However, the output file is large, so I optimized the file size as follows.

// instantiate Document object
com.aspose.pdf.Document doc = new com.aspose.pdf.Document();
// add page to PDF file
doc.getPages().add();
// create Image object
com.aspose.pdf.Image img = new com.aspose.pdf.Image();
// load TIFF image
img.setFile(myDir + "3pagetiff.tif");
// add image to paragraphs collection of first page
doc.getPages().get_Item(1).getParagraphs().add(img);
// save resultant PDF to an in-memory stream
java.io.ByteArrayOutputStream outputStream = new java.io.ByteArrayOutputStream();
doc.save(outputStream);
// reload the saved PDF from memory so that it can be optimized
doc = new com.aspose.pdf.Document(new java.io.ByteArrayInputStream(outputStream.toByteArray()));
com.aspose.pdf.Document.OptimizationOptions opt = new com.aspose.pdf.Document.OptimizationOptions();
opt.setRemoveUnusedObjects(false);
opt.setLinkDuplcateStreams(false);
opt.setRemoveUnusedStreams(false);
// Enable image compression
opt.setCompressImages(true);
// Set the quality of images in PDF file
opt.setImageQuality(10);
doc.optimizeResources(opt);
doc.save(myDir + "DOM_Approach_opt.pdf");


Please feel free to contact us for any further assistance.


Best Regards,

Is there a way to query the tags in the TIFF image? The original 3pagetiff.tif file uses 1 bit per pixel. After the above approach completes the conversion, the PDF uses 24 bits per pixel. I would like the resulting PDF to still use 1 bit per pixel. As it stands, the resulting PDF is still about 9 times larger than it should be.

So 2 questions:

1. Can I query the TIFF tags?

2. Can I use those results when saving the PDF?

Is querying TIFF tags only available using Aspose.Imaging? The goal is to convert TIFF to PDF, but if the TIFF is 1bpp with CCITT4 compression, then the PDF should also use only 1bpp.

Hi Mike,

As you stated above, the Aspose.Imaging API deals specifically with image formats. For your requirement, you can first parse the TIFF image with the Aspose.Imaging API and read its compression information. However, when a TIFF image is converted to PDF, Aspose.Pdf applies its own default compression.
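
For the tag query on its own, the two relevant values can also be read straight from the TIFF header without any imaging library. The sketch below is not the Aspose.Imaging API; it is a plain-Java illustration that assumes a classic (non-BigTIFF) file and inspects only the first image file directory (IFD), so it describes the first page only. The class name TiffTagProbe and its methods are hypothetical. Tag 258 is BitsPerSample and tag 259 is Compression, where the value 4 denotes CCITT Group 4.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of Aspose.Imaging or Aspose.Pdf.
final class TiffTagProbe {
    static final int TAG_BITS_PER_SAMPLE = 258;
    static final int TAG_COMPRESSION = 259;
    static final int COMPRESSION_CCITT_G4 = 4;
    static final int TYPE_SHORT = 3;

    /** Returns the first SHORT value of a tag in the first IFD, or defaultValue if absent. */
    static int readShortTag(byte[] tiff, int wantedTag, int defaultValue) {
        ByteBuffer buf = ByteBuffer.wrap(tiff);
        if (tiff[0] == 'I' && tiff[1] == 'I') {
            buf.order(ByteOrder.LITTLE_ENDIAN);
        } else if (tiff[0] == 'M' && tiff[1] == 'M') {
            buf.order(ByteOrder.BIG_ENDIAN);
        } else {
            throw new IllegalArgumentException("not a TIFF byte-order mark");
        }
        if (buf.getShort(2) != 42) {
            throw new IllegalArgumentException("not a classic TIFF file");
        }
        int ifd = buf.getInt(4);                  // offset of the first IFD
        int entries = buf.getShort(ifd) & 0xFFFF; // number of 12-byte entries
        for (int i = 0; i < entries; i++) {
            int entry = ifd + 2 + i * 12;
            int tag = buf.getShort(entry) & 0xFFFF;
            int type = buf.getShort(entry + 2) & 0xFFFF;
            if (tag == wantedTag && type == TYPE_SHORT) {
                int count = buf.getInt(entry + 4);
                // Up to two SHORTs are stored inline; more are stored at an offset.
                int pos = count > 2 ? buf.getInt(entry + 8) : entry + 8;
                return buf.getShort(pos) & 0xFFFF;
            }
        }
        return defaultValue;
    }

    /** True if the first page is 1 bit per sample with CCITT Group 4 compression. */
    static boolean isBilevelG4(byte[] tiff) {
        // The TIFF 6.0 defaults are BitsPerSample = 1 and Compression = 1 (none).
        return readShortTag(tiff, TAG_BITS_PER_SAMPLE, 1) == 1
                && readShortTag(tiff, TAG_COMPRESSION, 1) == COMPRESSION_CCITT_G4;
    }

    public static void main(String[] args) throws IOException {
        byte[] tiff = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("1bpp CCITT G4: " + isBilevelG4(tiff));
    }
}
```

This answers only the first question (whether the tags can be queried). Whether the result can influence how the PDF is saved depends on the options Aspose.Pdf exposes, which this sketch does not address.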