BitsPerSample is incorrect after conversion from DOCX to TIFF using Java

A while ago we received a fix in 20.08 because we could not convert DOCX to TIFF when the document had more than one page (filed as WORDSJAVA-2428). What we find now is that the generated TIFF causes trouble when the file has more than one page. The Imaging for Windows OCX is pretty picky about TIFF tags, and we think we know what the problem is: the BitsPerSample tag is missing on page 1 and has a strange value on page 2. Normally it should be something like BitsPerSample = 1.
If you look at the tags from page 1 you see:
TAGS from Page1:

ImageWidth (1 Short): 2481
ImageLength (1 Short): 3508
Compression (1 Short): Group 4 Fax (aka CCITT FAX4)
Photometric (1 Short): MinIsWhite
StripOffsets (135 Long): 1232, 1239, 1246, 1253, 1260, 1267, 1274,…
SamplesPerPixel (1 Short): 1
RowsPerStrip (1 Short): 26
StripByteCounts (135 Long): 7, 7, 7, 7, 7, 7, 1828, 2021, 2188, 1306,…
XResolution (1 Rational): 300
YResolution (1 Rational): 300

TAGS from Page2:

ImageWidth (1 Short): 2481
ImageLength (1 Short): 3508
BitsPerSample (4 Short): 1, 0, 1, 23048
Compression (1 Short): Group 4 Fax (aka CCITT FAX4)
Photometric (1 Short): MinIsWhite
StripOffsets (135 Long): 88584, 88591, 88598, 88605, 88612, 88619,…
SamplesPerPixel (1 Short): 1
RowsPerStrip (1 Short): 26
StripByteCounts (135 Long): 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,…
XResolution (1 Rational): 300
YResolution (1 Rational): 300

We tried this with Aspose.Words 20.08 and 20.10, with the same results.

The error message from the OCX is attached, along with the Aspose-generated TIFF that has the suspicious tags:
Imaging4WinOCXError.jpg (22.7 KB)
100021064-1.zip (138.0 KB)
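
For anyone who wants to reproduce the tag check above without an external viewer, here is a minimal Java sketch that walks the IFD chain of a TIFF file and reports BitsPerSample (tag 258) for each page. It is not part of Aspose.Words. The class and method names are made up for illustration, and it assumes a classic (non-BigTIFF) file that fits in memory. The "ok" rule in describe() assumes bilevel, single-sample pages like the Group 4 pages in this thread.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: lists BitsPerSample for every page (IFD) of a TIFF.
class TiffBitsPerSampleCheck {
    static final int TAG_BITS_PER_SAMPLE = 258;
    static final int TYPE_SHORT = 3;

    /** Returns one entry per IFD: the BitsPerSample values, or null if the tag is absent. */
    static List<int[]> readBitsPerSample(byte[] tiff) {
        ByteBuffer buf = ByteBuffer.wrap(tiff);
        // Bytes 0-1 select the byte order: "II" = little-endian, "MM" = big-endian.
        if (tiff[0] == 'I' && tiff[1] == 'I') buf.order(ByteOrder.LITTLE_ENDIAN);
        else if (tiff[0] == 'M' && tiff[1] == 'M') buf.order(ByteOrder.BIG_ENDIAN);
        else throw new IllegalArgumentException("not a TIFF file");
        if (buf.getShort(2) != 42) throw new IllegalArgumentException("bad TIFF magic");

        List<int[]> pages = new ArrayList<>();
        int ifd = buf.getInt(4); // offset of the first IFD
        // The page cap guards against a corrupt file whose IFD chain loops.
        while (ifd != 0 && pages.size() < 10000) {
            int entries = buf.getShort(ifd) & 0xFFFF;
            int[] bits = null;
            for (int i = 0; i < entries; i++) {
                int e = ifd + 2 + 12 * i; // each IFD entry is 12 bytes
                int tag = buf.getShort(e) & 0xFFFF;
                int type = buf.getShort(e + 2) & 0xFFFF;
                int count = buf.getInt(e + 4);
                if (tag == TAG_BITS_PER_SAMPLE && type == TYPE_SHORT) {
                    // Up to two SHORTs fit inline; otherwise the field holds an offset.
                    int dataPos = (count <= 2) ? e + 8 : buf.getInt(e + 8);
                    bits = new int[count];
                    for (int k = 0; k < count; k++) {
                        bits[k] = buf.getShort(dataPos + 2 * k) & 0xFFFF;
                    }
                }
            }
            pages.add(bits);
            ifd = buf.getInt(ifd + 2 + 12 * entries); // offset of the next IFD, 0 = last
        }
        return pages;
    }

    /** Assumption: a bilevel page is expected to carry exactly one value, 1. */
    static String describe(int[] bits) {
        if (bits == null) return "missing";
        if (bits.length == 1 && bits[0] == 1) return "ok (1)";
        return "suspicious " + Arrays.toString(bits);
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java TiffBitsPerSampleCheck.java <file.tif>");
            return;
        }
        List<int[]> pages = readBitsPerSample(Files.readAllBytes(Paths.get(args[0])));
        for (int i = 0; i < pages.size(); i++) {
            System.out.println("Page" + (i + 1) + " BitsPerSample: " + describe(pages.get(i)));
        }
    }
}
```

As far as I read the TIFF 6.0 specification, BitsPerSample should carry one value per sample (that is, SamplesPerPixel values), so four values on a page with SamplesPerPixel = 1 look inconsistent, which would match what the viewer complains about.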

@pkvogt

Please ZIP and attach the input Word document here for testing. Please also share the steps that you are using to get the shared values.

Please share how you are getting this error message. We will investigate the issue and provide you more information on it.

Hi Tahir,

I think it’s a general problem and any Word document with more than one page will do. I attached the one from our tests.

100aAsugefüllt.zip (46.4 KB)

Sharing how we get the error message is not possible: it comes from an “Imaging for Windows” OCX embedded in our legacy application. I got the tags via AsTiffTagViewer.exe.

And here is the code we use to generate the TIFFs from the DOCX:

private static JsonConversionResponse convertDocToTiff(String sourceFile) throws Exception {

    logger.info("DOC to TIFF Conversion Started");

    JsonConversionResponse ret = new JsonConversionResponse();

    String destFile = replaceFileExtensionWith(sourceFile, "tiff");

    Document doc = new Document(sourceFile);

    // Render at 300 dpi.
    ImageSaveOptions opts = new ImageSaveOptions(SaveFormat.TIFF);
    opts.setResolution(300);

    // Write all pages into one multi-page TIFF, Group 4 fax compressed.
    opts.setPageCount(doc.getPageCount());
    opts.setTiffCompression(TiffCompression.CCITT_4);
    opts.setImageColorMode(ImageColorMode.GRAYSCALE);
    // Binarize with Floyd-Steinberg dithering.
    opts.setTiffBinarizationMethod(ImageBinarizationMethod.FLOYD_STEINBERG_DITHERING);
    opts.setThresholdForFloydSteinbergDithering((byte) 128);
    doc.save(destFile, opts);

    ret.setSourceFile(sourceFile);
    ret.setDestFile(destFile);
    ret.setPageCount(doc.getBuiltInDocumentProperties().getPages());

    logger.info("Conversion Completed");

    return ret;
}

@pkvogt

We have converted the shared document to TIFF and it renders correctly in an image viewer.

Windows Defender does not allow us to run this .exe file. Could you please share why you need the values of these tags? Please also share another way to read them.

Hi Tahir, I feel this will be another hard one :slight_smile:

You can also use the tool exiftool from https://exiftool.org/

With this tool, too, you won’t find Bits Per Sample for the first page. It is only reported after X Resolution/Y Resolution, with the same odd values. As I said, this is just a finding that might point to the solution.

CU Patrick
------------------ EXIFTOOL output ---------------
Here is what is reported by that tool:
ExifTool Version Number : 12.08
File Name : 252910 [2].tif
Directory : .
File Size : 117 kB
File Modification Date/Time : 2020:10:26 17:56:50+01:00
File Access Date/Time : 2020:10:26 18:00:32+01:00
File Creation Date/Time : 2020:10:26 18:00:32+01:00
File Permissions : rw-rw-rw-
File Type : TIFF
File Type Extension : tif
MIME Type : image/tiff
Exif Byte Order : Big-endian (Motorola, MM)
Image Width : 2481
Image Height : 3508
Compression : T6/Group 4 Fax
Photometric Interpretation : WhiteIsZero
Samples Per Pixel : 1
Rows Per Strip : 26
X Resolution : 300
Y Resolution : 300
Bits Per Sample : 1 0 1 22192
Strip Offsets : (Binary data 916 bytes, use -b option to extract)
Strip Byte Counts : (Binary data 374 bytes, use -b option to extract)
Image Size : 2481x3508
Megapixels : 8.7

I also want to provide the TAGs we get if we first go from DOCX to PDF and then to TIFF:
Page1:
ImageWidth (1 Short): 2480
ImageLength (1 Short): 3507
BitsPerSample (1 Short): 1
Compression (1 Short): Group 4 Fax (aka CCITT FAX4)
Photometric (1 Short): MinIsWhite
StripOffsets (135 Long): 170, 174, 178, 182, 186, 332, 714, 1618,…
SamplesPerPixel (1 Short): 1
RowsPerStrip (1 Short): 26
StripByteCounts (135 Short): 4, 4, 4, 4, 146, 382, 904, 1387, 1403, 1527,…
XResolution (1 Rational): 300
YResolution (1 Rational): 300
ResolutionUnit (1 Short): Inch
Predictor (1 Short): 1

Page2:
ImageWidth (1 Short): 2480
ImageLength (1 Short): 3507
BitsPerSample (1 Short): 1
Compression (1 Short): Group 4 Fax (aka CCITT FAX4)
Photometric (1 Short): MinIsWhite
StripOffsets (135 Long): 115546, 115550, 115554, 115558, 115562,…
SamplesPerPixel (1 Short): 1
RowsPerStrip (1 Short): 26
StripByteCounts (135 Short): 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,…
XResolution (1 Rational): 300
YResolution (1 Rational): 300
ResolutionUnit (1 Short): Inch
Predictor (1 Short): 1

Here you can see that the tags are present on both pages and, in particular, BitsPerSample is set to 1. This TIFF file is also compatible with the OCX image viewer. Hope this helps.
Patrick

@pkvogt

We have logged this problem in our issue tracking system as WORDSJAVA-2481. You will be notified via this forum thread once this issue is resolved.

We apologize for your inconvenience.

@tahir.manzoor I have an important update for you:
The tool exiftool can also be used to change tags. What I did:

This command shows the BitsPerSample tag for page 2 (group IFD1):
exiftool -IFD1:BitsPerSample 100021064-1-error.tif
Output: Bits Per Sample : 1 0 1 32140

This command shows the BitsPerSample tag for page 1 (group IFD0):
exiftool -IFD0:BitsPerSample 100021064-1-error.tif
--> as expected

Now I update BitsPerSample for page 2 (group IFD1):
exiftool -IFD1:BitsPerSample=1 100021064-1-error.tif

Now I add BitsPerSample for page 1 (group IFD0):
exiftool -IFD0:BitsPerSample=1 100021064-1-error.tif

Then I opened the file again in the viewer and the error was gone.

Hope that helps :slight_smile:
Patrick
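
Until a fix is available, the exiftool steps above could be scripted from Java. The following is only a sketch of that workaround: it assumes exiftool is on the PATH, that page N of the TIFF maps to group IFD(N-1) beyond the IFD0 and IFD1 groups shown in the post, and that all tag assignments can be passed in a single exiftool call. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that sets BitsPerSample=1 on every page via exiftool.
class ExifToolBitsPerSampleFix {

    /** Builds: exiftool -IFD0:BitsPerSample=1 -IFD1:BitsPerSample=1 ... <file> */
    static List<String> buildCommand(String tiffFile, int pageCount) {
        List<String> cmd = new ArrayList<>();
        cmd.add("exiftool");
        for (int page = 0; page < pageCount; page++) {
            // Assumption: pages are addressed as IFD0, IFD1, IFD2, ...
            cmd.add("-IFD" + page + ":BitsPerSample=1");
        }
        cmd.add(tiffFile);
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: java ExifToolBitsPerSampleFix.java <file.tif> <pageCount>");
            return;
        }
        List<String> cmd = buildCommand(args[0], Integer.parseInt(args[1]));
        // Run exiftool and forward its console output to this process.
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        System.exit(p.waitFor());
    }
}
```

The page count could be taken from doc.getPageCount() in the conversion code. Note that exiftool normally keeps a backup copy of the edited file with an _original suffix.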

@pkvogt

We have logged this detail in our issue tracking system. We will inform you via this forum thread once there is an update available on this issue.

Hi @tahir.manzoor , any info you can share regarding WORDSJAVA-2481? We would like to plan our upcoming release. THX
Patrick

@pkvogt

We try our best to deal with every customer request in a timely fashion; unfortunately, we cannot guarantee a delivery date for every customer issue. We work on issues on a first come, first served basis. We feel this is the fairest and most appropriate way to satisfy the needs of the majority of our customers.

Currently, your issue is pending analysis and is in the queue. Once we complete the analysis, we will be able to provide you with an estimate.

@tahir.manzoor - It has now been more than a quarter of a year and nothing is moving here. What can we do to push this?

@pkvogt

Unfortunately, there is no update available on this issue at the moment. You reported this issue in the free support forum, so it is treated with normal priority. To speed up its resolution, we suggest you check our paid support policies at the following link.
Paid Support Policies

The issues you found earlier (filed as WORDSJAVA-2481) have been fixed in the Aspose.Words for .NET 21.3 update and the Aspose.Words for Java 21.3 update.