Generating Tiffs controlling format and base tags

We are currently reviewing Java.Aspose as a potential solution for converting multiple input file types (word/pdf etc) to Tiff.

However, we require more control over the Tiff generation, specifically over the Header and Base Tiff tags, than seems to be available via ImageSaveOptions. Is it possible to specify the Tiff header format (Big/Little Endian, i.e. the first two bytes of the TIFF file set to "II" for "Intel" or "MM" for "Motorola") and the Base Tiff tags directly via the provided API? If so, can you point me to some info or examples?
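For reference, the byte order of any generated file can be checked independently of the library that wrote it. The following is only a minimal sketch in plain Java, not Aspose API: the class name TiffByteOrder is made up for illustration, and it assumes the standard TIFF header layout (a two-byte order mark followed by the magic number 42 stored in that byte order).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper: reads the byte-order mark from a TIFF header. */
final class TiffByteOrder {
    private TiffByteOrder() {}

    /**
     * Returns the byte order declared in the first four bytes of a TIFF file.
     * "II" means little-endian ("Intel"), "MM" means big-endian ("Motorola").
     */
    static ByteOrder detect(byte[] header) {
        if (header == null || header.length < 4) {
            throw new IllegalArgumentException("Need at least 4 header bytes");
        }
        ByteOrder order;
        if (header[0] == 'I' && header[1] == 'I') {
            order = ByteOrder.LITTLE_ENDIAN;
        } else if (header[0] == 'M' && header[1] == 'M') {
            order = ByteOrder.BIG_ENDIAN;
        } else {
            throw new IllegalArgumentException("Not a TIFF byte-order mark");
        }
        // Bytes 2-3 hold the magic number 42, encoded in the declared byte order.
        int magic = ByteBuffer.wrap(header, 2, 2).order(order).getShort();
        if (magic != 42) {
            throw new IllegalArgumentException("Missing TIFF magic number 42");
        }
        return order;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: TiffByteOrder <file.tif>");
            return;
        }
        byte[] header = new byte[4];
        try (DataInputStream in =
                 new DataInputStream(Files.newInputStream(Paths.get(args[0])))) {
            in.readFully(header); // fails with EOFException on files shorter than 4 bytes
        }
        System.out.println(detect(header));
    }
}
```

Running it against an output file shows which of the two header formats a given converter actually produced.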

Appreciate any assistance you can provide.

Regards John


Hi John,


Thank you for your interest in Aspose products.

First of all, please note that Aspose.Words for Java is a class library that enables your applications to perform a great range of document processing tasks. Aspose.Words supports DOC, DOCX, RTF, HTML, OpenDocument, PDF, XPS, TIFF, EPUB and other formats. With Aspose.Words you can generate, modify, convert, render and print Word documents without utilizing Microsoft Word®. For more information, please read the following link:
http://www.aspose.com/docs/display/wordsjava/Introducing+Aspose.Words+for+Java

Secondly, you can specify various ImageSaveOptions when saving a Word document as a multi-page TIFF. I would suggest reading the following API page:
http://www.aspose.com/docs/display/wordsjava/How+to++Save+Document+as+a+Multipage+TIFF

Moreover, regarding converting PDF documents to TIFF format and manipulating the TIFF file itself, I will move your thread to the Aspose.Total forum. My colleagues from other Aspose products will answer you shortly.

Please let me know if I can be of any further assistance.

Best Regards,

Hi John,

Thanks for your interest in our products.

I am a representative from Aspose.Pdf team. Please note that we have a product named Aspose.Pdf.Kit for Java which provides the capability to convert PDF files into TIFF format. I am afraid currently it does not support the feature to specify the ImageSaveOptions. However for the sake of implementation, I have logged this requirement as PDFKITJAVA-33204 in our issue tracking system. We will further look into the details of this requirement and will keep you updated on the status of correction. We are sorry for this inconvenience.

Please visit the following link for further details on Convert the PDF Document to Specified Images

Thanks for the timely response. It's very much appreciated.

I had reviewed these resources, hence the concern that this feature was not fully supported. Presumably, if this is considered as a future enhancement to the PDF kit, it would still require a two-step transformation: Word -> PDF -> Tiff.

Would this also require purchasing the PDF kit module as well as Aspose.Words?

We could consider Aspose.Words for Word to PDF conversion and then look at other options for PDF -> Tiff. However, we would obviously prefer a complete solution rather than having to consider a possible later migration.

Hi John,

Please note that Aspose.Words for Java can also read a Word document and save it directly in TIFF format (when saving the output file, specify the SaveFormat as TIFF). So you don’t need two products to convert Word files to TIFF. If you have any further queries, please feel free to contact us.

Hi everybody,

I’m with you, having the same problem. We have to convert several input formats (text, html, graphics, doc/x, xls/x, …) to both PDF and TIFF (CCITT G4) for a customer DMS.

I’m still testing your stuff. :wink: Some comments from my side:

- While Aspose.words generates great rendered output, the TIFF files are most probably written with “photometric interpretation = MinIsBlack”, which means that nearly every picture viewer will show the TIFF as white letters on a black background (only IrfanView showed it correctly). As far as I remember, the .NET versions are able to change this and other base properties. Not so the Java version. 8-/

- So then I had a look at your Aspose.pdf.kit… Since we have to create both PDF and TIFF files anyway, this could have been an alternative. In our conversion routine some PDFs are created using iText, so we would not bother with the TIFF output from Aspose.words, but rather use Aspose.pdf.kit for this conversion step. But the PdfConverter offers even fewer options for changing properties. Only (huge) multicolour TIFFs can be created. There is no way to produce small G4 fax TIFFs.

Will there be a feature update in the near future, that will allow finer TIFF-tuning? If not, we will have to find another solution for this step. Best idea Aspose could do would probably be to take the TIFF-conversion routines from Aspose.words, add the missing flag operations, and put the whole stuff into Aspose.pdf.kit. :wink:

Anyway, since this is a very urgent task: if there are any features or other possibilities to create G4-TIFFs from PDFs yet unknown to me, I’d appreciate your help very much!

BTW, there is some other issue with Aspose.words: if I import a DOCX-file and export it as PDF, then convert this to TIFF, the contained images in TIFF are black blocks (+ some minor other problems). The PDF itself looks perfect. I didn’t find this problem with other import formats yet. As a workaround I loaded the DOCX, saved it to stream as DOC and created a new document via this stream. TIFF conversion was all right then. The problem could partly lie in some other libraries because converting the PDF via PdfBox into PNG files shows the same image problem with DOCX based PDFs.

This might probably also be a problem with Aspose.cells. I haven’t tested that yet, but at least images seem to be ignored also (with both xls and xlsx).

Thanks in advance and best regards,
Andreas

Hi Andreas,

Arrows:

This might probably also be a problem with Aspose.cells. I haven't tested that yet, but at least images seem to be ignored also (with both xls and xlsx).

I am from the Aspose.Cells team and would like to evaluate your issue regarding the XLS/XLSX file formats. Please attach your input XLS/XLSX files and the output TIFFs or PDFs here. Also, paste the sample code that you are using to generate the PDFs or TIFFs. We will check your issue soon.

By the way, we recommend trying our latest version/fix (.NET/JAVA), Aspose.Cells for .NET v7.3.0.1 or Aspose.Cells for JAVA v7.3.0, to see if it makes any difference.

Thank you.
Arrows:
I'm with you, having the same problem. We have to convert several input formats (text, html, graphics, doc/x, xls/x, ...) to both PDF and TIFF (CCITT G4) for a customer DMS.

- So then I had a look at your Aspose.pdf.kit... Since we have to create both PDF and TIFF files anyway, this could have been an alternative. In our conversion routine some PDF are created using iText, so we would not bother the TIFF-output from Aspose.words, but rather use Aspose.pdf.kit for this conversion step. But the possibility to change properties within the PdfConverter are even less. Only (huge) multicolour TIFFs can be created. No way to produce any small G4-fax TIFFs.

Hi Andreas,

Thanks for using our products.

I am a representative from the Aspose.Pdf.Kit for Java team and, as per your observations, I am afraid the PdfConverter class does not support any option/feature to convert PDF files into (CCITT G4) TIFF images. However, for the sake of implementation, I have logged this requirement as PDFKITJAVA-33211 in our issue tracking system. We will look further into the details of this requirement, and as soon as we have made significant progress towards its implementation, we will be more than happy to update you on its status. Please be patient and allow us a little time. We are sorry for this inconvenience.

Arrows:
BTW, there is some other issue with Aspose.words: if I import a DOCX-file and export it as PDF, then convert this to TIFF, the contained images in TIFF are black blocks (+ some minor other problems). The PDF itself looks perfect. I didn't find this problem with other import formats yet. As a workaround I loaded the DOCX, saved it to stream as DOC and created a new document via this stream. TIFF conversion was all right then. The problem could partly lie in some other libraries because converting the PDF via PdfBox into PNG files shows the same image problem with DOCX based PDFs.

I think you can load the DOCX file using Aspose.Words for Java and save the output directly in TIFF format, rather than first saving it as PDF and then transforming the PDF file into a TIFF image. I hope my colleague from the Aspose.Words team will be in a better position to comment further on issues related to Aspose.Words. Thanks for contacting support.

Hi Andreas,

Thanks for your inquiry.

Andreas:

- While Aspose.words generates great rendered output, the TIFF files are most probably written with "photometric interpretation = MinIsBlack", which means that nearly every picture viewer will show the TIFF as white letters on a black background (only IrfanView showed it correctly). As far as I remember, the .NET versions are able to change this and other base properties. Not so the Java version. 8-/

Firstly, I would suggest reading the following article on saving a Word document as a multipage TIFF:
http://www.aspose.com/docs/display/wordsjava/How+to++Save+Document+as+a+Multipage+TIFF

Secondly, could you please attach here, for testing, a Word document that shows this problem? I will investigate the issue on my side and provide you with more information.

Andreas:

BTW, there is some other issue with Aspose.words: if I import a DOCX-file and export it as PDF, then convert this to TIFF, the contained images in TIFF are black blocks (+ some minor other problems). The PDF itself looks perfect. I didn't find this problem with other import formats yet. As a workaround I loaded the DOCX, saved it to stream as DOC and created a new document via this stream. TIFF conversion was all right then. The problem could partly lie in some other libraries because converting the PDF via PdfBox into PNG files shows the same image problem with DOCX based PDFs.

Please share the DOCX file here as well for testing. I will investigate the issue on my side and provide you with more information on this.

Best Regards,

We have the same problem converting to Tiff directly (white letter/black background when using CCITT_3 or CCITT_4 compression). As the API provided limited control over other aspects of the Tiff format we required, we didn't investigate this approach much further, but if you come up with a solution we would be interested. The recommended threads don't cover this to this level of detail as far as I can see.
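To confirm what a given output file actually declares, the PhotometricInterpretation tag can be read straight from the file. This is only a rough sketch in plain Java, not Aspose API: the class name TiffPhotometric is invented for illustration, and it assumes a classic (non-BigTIFF) file whose first IFD stores tag 262 as a single SHORT, as the TIFF 6.0 layout describes. A value of 0 means WhiteIsZero (MinIsWhite) and 1 means BlackIsZero (MinIsBlack).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper: reads PhotometricInterpretation from the first IFD. */
final class TiffPhotometric {
    static final int TAG_PHOTOMETRIC = 262;

    private TiffPhotometric() {}

    /** Returns the value of tag 262 in the first IFD, or -1 if the tag is absent. */
    static int read(byte[] tiff) {
        if (tiff == null || tiff.length < 8) {
            throw new IllegalArgumentException("Too short to be a TIFF file");
        }
        ByteOrder order;
        if (tiff[0] == 'I' && tiff[1] == 'I') {
            order = ByteOrder.LITTLE_ENDIAN;
        } else if (tiff[0] == 'M' && tiff[1] == 'M') {
            order = ByteOrder.BIG_ENDIAN;
        } else {
            throw new IllegalArgumentException("Not a TIFF byte-order mark");
        }
        ByteBuffer buf = ByteBuffer.wrap(tiff).order(order);
        if (buf.getShort(2) != 42) {
            throw new IllegalArgumentException("Missing TIFF magic number 42");
        }
        int ifd = buf.getInt(4);                    // offset of the first IFD
        int entries = buf.getShort(ifd) & 0xFFFF;   // number of 12-byte entries
        for (int i = 0; i < entries; i++) {
            int entry = ifd + 2 + 12 * i;
            int tag = buf.getShort(entry) & 0xFFFF;
            if (tag == TAG_PHOTOMETRIC) {
                // A single SHORT value sits left-justified in the 4-byte value field.
                return buf.getShort(entry + 8) & 0xFFFF;
            }
        }
        return -1;
    }

    /** Maps the two bilevel values to their common names. */
    static String describe(int value) {
        switch (value) {
            case 0:  return "WhiteIsZero (MinIsWhite)";
            case 1:  return "BlackIsZero (MinIsBlack)";
            default: return "other (" + value + ")";
        }
    }
}
```

Comparing this value between a file that displays correctly and one that shows inverted may help narrow down whether the tag or the pixel data is at fault.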

Also, encountered problems going from Word to PDF using Aspose.Word and then PDF to Tiff using alternative solutions. While the Aspose generated PDF looked ok, it did have an impact on subsequent conversion to Tiff. May not be related to Aspose but we didn't see this with normal PDF to Tiff conversion.

Finally, we briefly attempted to wrap the Aspose BufferedImage with a basic JAI implementation to convert it to Tiff with the Tiff format/tag configuration we require. While this worked, unfortunately the quality of embedded images was impacted. This was just a quick concept test, and we are not sure if this was a problem with our basic implementation or a known weakness of JAI.
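For comparison, here is a rough sketch of the same idea (taking a rendered BufferedImage and re-encoding it as a G4 TIFF) using the TIFF writer bundled with javax.imageio in Java 9 and later. It is neither the JAI code from the test above nor an Aspose API; the class and method names are made up for illustration. The colour page is reduced to 1 bit per pixel by simply drawing it onto a TYPE_BYTE_BINARY image, which applies no dithering, so photographs and gradients may degrade noticeably; a similar reduction could be one explanation for the quality loss, but that is an assumption.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import javax.imageio.stream.MemoryCacheImageOutputStream;

/** Hypothetical sketch: encode one rendered page as a CCITT T.6 (G4) TIFF. */
final class G4TiffSketch {
    private G4TiffSketch() {}

    /** Reduces any image to 1 bit per pixel (plain colour mapping, no dithering). */
    static BufferedImage toBilevel(BufferedImage src) {
        BufferedImage bw = new BufferedImage(
            src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
        Graphics2D g = bw.createGraphics();
        try {
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return bw;
    }

    /** Returns the bytes of a single-page TIFF compressed with CCITT T.6. */
    static byte[] writeG4(BufferedImage page) {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("tiff");
        if (!writers.hasNext()) {
            throw new IllegalStateException("No TIFF writer found (needs Java 9+)");
        }
        ImageWriter writer = writers.next();
        ImageWriteParam param = writer.getDefaultWriteParam();
        param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        param.setCompressionType("CCITT T.6"); // G4 fax compression, bilevel only
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ImageOutputStream ios = new MemoryCacheImageOutputStream(out)) {
            writer.setOutput(ios);
            writer.write(null, new IIOImage(toBilevel(page), null, null), param);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            writer.dispose();
        }
        return out.toByteArray();
    }
}
```

This sketch leaves the header byte order and the individual base tags at the writer's defaults, so it does not by itself address the tag-level control asked about earlier in the thread.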

Interested to hear if you make any progress.

Regards

John

jmaguire54:

Also, encountered problems going from Word to PDF using Aspose.Word and then PDF to Tiff using alternative solutions. While the Aspose generated PDF looked ok, it did have an impact on subsequent conversion to Tiff. May not be related to Aspose but we didn't see this with normal PDF to Tiff conversion.

Hi John,

Thanks for contacting support.

As far as I understand from the problem description above, you are using Aspose.Words to convert a Word file into PDF format and then using a basic JAI implementation to convert the PDF file into TIFF with your required format/tag configuration. As I understand it, you are not using Aspose.Pdf.Kit for Java for the PDF to TIFF conversion because it currently does not support specifying the TIFF header format or generating a tagged TIFF image. Please correct me if I have not properly understood your problem.

In case you are facing an issue while converting PDF file into TIFF format, please share the source PDF file so that we can test the scenario at our end. We are sorry for your inconvenience.

Hi John,

Thanks for your inquiry.

John:

We have the same problem converting to Tiff directly (white letter/black background when using CCITT_3 or CCITT_4 compression). As the API provided limited control over other aspects of the Tiff format we required, we didn’t investigate this approach much further, but if you come up with a solution we would be interested. The recommended threads don’t cover this to this level of detail as far as I can see.

Firstly, I would suggest you please read the following articles on specifying additional options when rendering document pages or shapes to images:

It would be great if you could attach here, for testing, a Word document that shows this problem when rendered to TIFF. I will investigate the issue on my side and provide you with more information.

Best Regards,