Hi Aspose team!
I need to extract individual frames from a multi-page TIFF image. The only library I could find that does not seem to require `RenderedImage`s is the Aspose Imaging library. I have the following test code, which serves as a rough microbenchmark:
final Stopwatch stopwatch = createStarted();
final File inputFile = new File("../input.tiff");
try ( final InputStream inputStream = new FileInputStream(inputFile);
      final TiffImage sourceTiffImage = (TiffImage) Image.load(inputStream) ) {
    final TiffFrame[] frames = sourceTiffImage.getFrames();
    System.out.printf("%sms: %d frames, %d bytes\n", stopwatch.elapsed(MILLISECONDS), frames.length, inputFile.length());
    for ( int i = 0; i < frames.length; i++ ) {
        final File outputFile = new File("../output." + i + ".tiff");
        try ( final OutputStream outputStream = new FileOutputStream(outputFile) ) {
            final TiffFrame sourceTiffFrame = frames[i];
            final TiffFrame destinationTiffFrame = copyFrame(sourceTiffFrame);
            try ( final TiffImage destinationTiffImage = new TiffImage(destinationTiffFrame) ) {
                destinationTiffImage.save(outputStream, sourceTiffFrame.getFrameOptions());
            }
            System.out.printf("%sms: frame #%d, %d bytes\n", stopwatch.elapsed(MILLISECONDS), i + 1, outputFile.length());
        }
    }
}
The report output:
1176ms: 32 frames, 85797844 bytes
3391ms: frame #1, 3096072 bytes
5959ms: frame #2, 2308852 bytes
7825ms: frame #3, 1867124 bytes
9933ms: frame #4, 5288200 bytes
12022ms: frame #5, 2135232 bytes
13928ms: frame #6, 2043956 bytes
15842ms: frame #7, 4862936 bytes
17804ms: frame #8, 4395816 bytes
19702ms: frame #9, 3376140 bytes
21646ms: frame #10, 4763260 bytes
23551ms: frame #11, 2102252 bytes
25404ms: frame #12, 2179348 bytes
27409ms: frame #13, 4900184 bytes
29333ms: frame #14, 2070264 bytes
31172ms: frame #15, 2168304 bytes
33024ms: frame #16, 2171856 bytes
35128ms: frame #17, 4005108 bytes
37123ms: frame #18, 5468384 bytes
39356ms: frame #19, 3244668 bytes
41187ms: frame #20, 1974792 bytes
43028ms: frame #21, 1877584 bytes
44910ms: frame #22, 1938668 bytes
46838ms: frame #23, 1924736 bytes
49204ms: frame #24, 1994724 bytes
51366ms: frame #25, 1916320 bytes
53313ms: frame #26, 1966052 bytes
55174ms: frame #27, 1867436 bytes
57107ms: frame #28, 1790396 bytes
58909ms: frame #29, 644328 bytes
60842ms: frame #30, 2189824 bytes
63636ms: frame #31, 2479336 bytes
65500ms: frame #32, 608292 bytes
As you can see, the input document contains 32 pages totalling about 82 MB. I extract the frames page by page, and it takes about 2 seconds to extract and write each page. Parsing the multi-page TIFF file takes a little over 1 second at the very start, which is perfectly fine with me.
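For reference, the timestamps in the report are cumulative, so the per-frame cost is the difference between consecutive lines; over the whole run that averages (65500 - 1176) / 32, roughly 2010 ms per frame. A minimal sketch of that arithmetic follows. The class and method names (`FrameTimings`, `perFrameMillis`) are made up for illustration, and it assumes the exact log format printed above:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library: turns the cumulative
// timestamps from the report into per-frame durations.
final class FrameTimings {

    // Matches lines such as "3391ms: frame #1, 3096072 bytes".
    private static final Pattern LINE =
            Pattern.compile("^(\\d+)ms: frame #(\\d+), (\\d+) bytes$");

    /** Returns the time spent on each frame, given the timestamp logged right after loading. */
    static long[] perFrameMillis(long loadedAtMillis, List<String> frameLines) {
        long[] durations = new long[frameLines.size()];
        long previous = loadedAtMillis;
        for (int i = 0; i < frameLines.size(); i++) {
            Matcher matcher = LINE.matcher(frameLines.get(i).trim());
            if (!matcher.matches()) {
                throw new IllegalArgumentException("Unexpected line: " + frameLines.get(i));
            }
            long now = Long.parseLong(matcher.group(1));
            durations[i] = now - previous; // time between two consecutive log lines
            previous = now;
        }
        return durations;
    }

    public static void main(String[] args) {
        // First three frame lines of the report; 1176 ms is the "32 frames" line.
        List<String> lines = List.of(
                "3391ms: frame #1, 3096072 bytes",
                "5959ms: frame #2, 2308852 bytes",
                "7825ms: frame #3, 1867124 bytes");
        System.out.println(Arrays.toString(perFrameMillis(1176, lines)));
        // prints [2215, 2568, 1866]
    }
}
```

Note that each of these durations covers the frame copy, the `TiffImage` construction and the save together, not the save alone.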
The main performance killer here is the `destinationTiffImage.save(...)` invocation, and I guess it takes so long because of re-encoding (am I understanding correctly what the method does behind the scenes?). Would it be possible to write the source frames directly to single-page TIFF files, without the re-encoding or whatever other heavy work is happening during the `destinationTiffImage.save(...)` invocation?
Am I doing something wrong, and is there any way to speed things up? Or is the `save` method designed to re-encode, so that I have to use another method just to redirect the frames? Any help would be greatly appreciated. Thank you!
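In case it helps to narrow this down: the benchmark above only logs one timestamp per frame, so it does not by itself separate the frame copy from the save. A rough way to split the two would be a small accumulator like the one below. This is only a sketch in plain Java; `StepTimer` is a made-up name, and the two steps in `main` are stand-ins (sleeps) for where the `copyFrame(...)` and `destinationTiffImage.save(...)` calls would go:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper: accumulates wall-clock time per named step.
final class StepTimer {

    private final Map<String, Long> nanosByStep = new LinkedHashMap<>();

    /** Runs the step, adds its duration to the total for the label, and returns the step's result. */
    <T> T time(String label, Supplier<T> step) {
        long start = System.nanoTime();
        try {
            return step.get();
        } finally {
            nanosByStep.merge(label, System.nanoTime() - start, Long::sum);
        }
    }

    /** Accumulated time per label in whole milliseconds, in first-use order. */
    Map<String, Long> totalsMillis() {
        Map<String, Long> totals = new LinkedHashMap<>();
        nanosByStep.forEach((label, nanos) -> totals.put(label, nanos / 1_000_000));
        return totals;
    }

    private static String pretendWork(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "done";
    }

    public static void main(String[] args) {
        StepTimer timer = new StepTimer();
        for (int i = 0; i < 3; i++) {
            // Stand-in for: copyFrame(sourceTiffFrame)
            timer.time("copyFrame", () -> pretendWork(5));
            // Stand-in for: destinationTiffImage.save(outputStream, options)
            timer.time("save", () -> pretendWork(20));
        }
        System.out.println(timer.totalsMillis());
    }
}
```

A step that returns nothing can be wrapped as `() -> { doIt(); return null; }`. The totals are wall-clock times from a single run, so they are indicative rather than precise.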
(P.S. Another idea of mine was to extract the source TIFF image input streams directly using the `DataStreamSupporter.getDataStreamContainer` method and then wrap each one so that it becomes a single-image TIFF file (I don't know yet whether that is even possible), but the method always returns null for my files. I'm also afraid it might always return raw data, with no way of carrying the original metadata over to the destination files.)
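To illustrate what I mean by working on the container directly: in a classic TIFF file each page is described by an image file directory (IFD), and the IFDs form a linked list starting from the 8-byte header. The sketch below uses plain Java and no Aspose API; `TiffIfdWalker` is a name I made up. It only locates the IFDs. It assumes a classic TIFF (magic number 42, not BigTIFF) that fits into a byte array, and it does not copy any pixel data:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: walks the IFD chain of a classic TIFF file.
final class TiffIfdWalker {

    /** Returns the file offset of every IFD (one per page), in chain order. */
    static List<Long> ifdOffsets(byte[] tiff) {
        if (tiff.length < 8) {
            throw new IllegalArgumentException("Too short for a TIFF header");
        }
        ByteBuffer buffer = ByteBuffer.wrap(tiff);
        // Bytes 0-1 select the byte order: "II" = little-endian, "MM" = big-endian.
        if (tiff[0] == 'I' && tiff[1] == 'I') {
            buffer.order(ByteOrder.LITTLE_ENDIAN);
        } else if (tiff[0] == 'M' && tiff[1] == 'M') {
            buffer.order(ByteOrder.BIG_ENDIAN);
        } else {
            throw new IllegalArgumentException("Missing TIFF byte-order mark");
        }
        // Bytes 2-3 hold the magic number 42 (BigTIFF uses 43 and is not handled here).
        if (buffer.getShort(2) != 42) {
            throw new IllegalArgumentException("Not a classic TIFF file");
        }
        List<Long> offsets = new ArrayList<>();
        // Bytes 4-7 hold the unsigned offset of the first IFD.
        long offset = buffer.getInt(4) & 0xFFFFFFFFL;
        while (offset != 0) {
            if (offsets.contains(offset)) {
                throw new IllegalArgumentException("IFD chain contains a loop");
            }
            if (offset + 2 > tiff.length) {
                throw new IllegalArgumentException("IFD offset out of bounds: " + offset);
            }
            offsets.add(offset);
            // An IFD is: 2-byte entry count, 12 bytes per entry, 4-byte offset of the next IFD.
            int entryCount = buffer.getShort((int) offset) & 0xFFFF;
            long nextPointer = offset + 2 + 12L * entryCount;
            if (nextPointer + 4 > tiff.length) {
                throw new IllegalArgumentException("Truncated IFD at offset " + offset);
            }
            offset = buffer.getInt((int) nextPointer) & 0xFFFFFFFFL; // 0 ends the chain
        }
        return offsets;
    }
}
```

Actually writing one page out as its own file would go further than this: the strip or tile data and any out-of-line tag values referenced from that IFD would have to be copied, and their offsets rewritten for the new file. I have not tried that part.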