Converting TIF to PDF Causes a Massive File Size Increase

I’m having an issue converting multipage TIF images to multipage PDFs: the converted files are far too large. We have two conversion methods. The old one converted images fine but couldn’t preserve whether each individual page was portrait or landscape, which we need. So we updated our logic (following a solution suggested by Aspose) to convert page by page, which does preserve the per-page orientation. The downside is that the resulting files are too big to be usable. The code for both the old and the new method is below, under the test results.

We also ran into a separate issue with other files (not in this set) that wouldn’t convert at all, so we had to upgrade to a newer Aspose.PDF version, which did allow the conversion. The Aspose.PDF version used is noted for each test below.

Below are the aggregate results from converting 3942 TIF files with each method.

TIF Source Info
Total Source TIF Files: 3942
Total Source TIF File Size: 2.20 GB

Old Conversion (v10)
Converted Total File Size: 2.75 GB
Total Difference from Source: 0.55 GB
Average File Size Increase: 24.94%
Max File Size Increase: 398.54%

New Conversion (v10)
Converted Total File Size: 12.450 GB
Total Difference from Source: 10.24 GB
Average File Size Increase: 464.21%
Max File Size Increase: 3878.59%

New Conversion (v18)
Converted Total File Size: 12.449 GB
Total Difference from Source: 10.24 GB
Average File Size Increase: 464.20%
Max File Size Increase: 3878.53%

--------------- Code ---------------
Old Logic

using (FileStream fs = new FileStream(imagePath, FileMode.Open, FileAccess.Read))
{
	byte[] tmpBytes = new byte[fs.Length];
	fs.Read(tmpBytes, 0, Convert.ToInt32(fs.Length));

	using (MemoryStream memoryStream = new MemoryStream(tmpBytes))
	{
		Bitmap bitmap = new Bitmap(memoryStream);

		// Instantiate a Pdf object
		Aspose.Pdf.Generator.Pdf pdfGenerator = new Aspose.Pdf.Generator.Pdf();

		// Create a new section in the Pdf document
		Aspose.Pdf.Generator.Section pdfSection = new Aspose.Pdf.Generator.Section(pdfGenerator);

		// Set margins so image will fit, etc.
		pdfSection.PageInfo.Margin.Top = 5;
		pdfSection.PageInfo.Margin.Bottom = 5;
		pdfSection.PageInfo.Margin.Left = 5;
		pdfSection.PageInfo.Margin.Right = 5;

		pdfSection.PageInfo.PageWidth = (bitmap.Width / bitmap.HorizontalResolution) * 72;
		pdfSection.PageInfo.PageHeight = (bitmap.Height / bitmap.VerticalResolution) * 72;

		// Add the section in the sections collection of the Pdf document
		pdfGenerator.Sections.Add(pdfSection);

		// Create an image object
		Aspose.Pdf.Generator.Image pdfImage = new Aspose.Pdf.Generator.Image(pdfSection);

		// Add the image into paragraphs collection of the section
		pdfSection.Paragraphs.Add(pdfImage);
		pdfImage.ImageInfo.ImageFileType = Aspose.Pdf.Generator.ImageFileType.Tiff;

		// Set the ImageStream to a MemoryStream object
		pdfImage.ImageInfo.ImageStream = memoryStream;

		// Save the converted file
		pdfGenerator.Save(outputfileName);
	}
}

New Logic

using (var pdfDocument = new Document())
{
	using (var tifFileStream = new FileStream(imagePath, FileMode.Open, FileAccess.Read))
	{
		var bitmap = new Bitmap(tifFileStream);

		// Get the number of pages in the TIF
		var frameDimension = new FrameDimension(bitmap.FrameDimensionsList[0]);
		int frameCount = bitmap.GetFrameCount(frameDimension);

		// Loop through each page and add it to the new PDF
		for (int pageNum = 0; pageNum < frameCount; pageNum++)
		{
			// Add a new page to the PDF document
			Page pdfPage = pdfDocument.Pages.Add();

			// Get the current TIF page
			bitmap.SelectActiveFrame(frameDimension, pageNum);

			using (var currentStream = new MemoryStream())
			{
				// Store the current page for use
				bitmap.Save(currentStream, ImageFormat.Tiff);

				// Set portrait or landscape
				pdfPage.PageInfo.IsLandscape = bitmap.Width > bitmap.Height;

				// Set the margins
				pdfPage.PageInfo.Margin = new MarginInfo(20, 20, 20, 20);
				
				// Set the size
				pdfPage.PageInfo.Width = (bitmap.Width / bitmap.HorizontalResolution) * 72;
				pdfPage.PageInfo.Height = (bitmap.Height / bitmap.VerticalResolution) * 72;

				// Add the image as a paragraph in the new PDF page
				var pdfImage = new Aspose.Pdf.Image
				{
					ImageStream = currentStream
				};
				pdfPage.Paragraphs.Add(pdfImage);

				// Save the new page to the final PDF
				pdfDocument.Save(outputfileName);
			}
		}
	}
}
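
For reference, the page-size lines in both versions convert pixel dimensions to PDF points (1 point = 1/72 inch). A minimal sketch of that arithmetic, where `PageSize` and `PixelsToPoints` are hypothetical names that are not part of Aspose.PDF or of the code above:

```csharp
using System;

static class PageSize
{
    // Hypothetical helper: converts a pixel length at a given resolution (DPI)
    // to PDF points, where 1 point = 1/72 inch.
    public static double PixelsToPoints(int pixels, double dpi)
    {
        if (dpi <= 0)
            throw new ArgumentOutOfRangeException(nameof(dpi), "Resolution must be positive.");

        // pixels / dpi gives inches; multiplying by 72 gives points.
        return pixels / dpi * 72.0;
    }
}
```

For example, a 2550-pixel-wide scan at 300 DPI is 8.5 inches wide, so `PageSize.PixelsToPoints(2550, 300)` gives 612 points.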

Please Note: I’ve tried like 50 different ways/combinations to get the code to post and format properly, but the markup tool doesn’t want to work properly, so I gave up.

@bscharf

Thank you for contacting support.

Would you please share a sample TIF and the resulting PDF file? If the files are too large to upload to the forum, you may share them via Google Drive or Dropbox. We will try to reproduce and investigate the issue in our environment to help you out. Moreover, you may indent your code snippet with four spaces on the left (in Visual Studio or any other editor) so that it is formatted properly.

The code is formatted correctly in Visual Studio; the forum editor just doesn’t seem to let me format it properly. Is there a possibility of sending you a file directly instead of publicly, for the sake of client privacy?

@bscharf

We have edited your post for proper formatting. If the forum editor does not format the code properly, as an alternative you may press the backtick key (the one shared with the tilde) three times and paste the code inside that block. For instance:

```
//TODO Code Here
```

About the files: forum attachments are accessible only to the thread owner and Aspose staff, so you may share smaller files here. For bigger files, you may share the download link with us by clicking on my username and sending a message. Once sent, please mention it here for reference.

@bscharf

You can upload any file or a zipped archive by using the upload button in the header of the post editor. Please see the screenshot HowToUpload.jpg for reference. Only you and our staff will be able to access the shared data.

I have attached a sample source and converted file in a zip.

AsposeTestFiles.zip (8.4 MB)

@bscharf

Thank you for sharing the requested data.

We have worked with the data you shared and were able to observe the increased file size. A ticket with ID PDFNET-46121 has been logged in our issue management system for further investigation and resolution. The ticket ID has been linked with this thread so that you will receive a notification as soon as the ticket is resolved.

We are sorry for the inconvenience.

Any update on this? It’s been over 2 months with no response, so I just wanted to see if there was any news.

@bscharf

Thank you for getting back to us.

Please note that the issue has been logged under the free support model and will be investigated on a first come, first served basis. Therefore, it may take a few more months to resolve. As soon as we have some definite updates regarding ticket resolution, we will let you know.

Can you fill me in on why this is under the free support model? We have an active license with support for 12 months. We just tried v19.5 and the file size is still growing massively.

@bscharf

Please note that customers who have a paid support subscription report their issues at the Paid Support Helpdesk. You may create a post in the Purchase Forum, sharing your 12-digit order ID, for further information about your subscriptions.

Moreover, we have investigated your scenario a little more and found that the file size decreases when the IsBlackWhite property of the Image class is set to true. We have also attached the PDF document generated with the code snippet below.

PDF_19.5.pdf

using (var pdfDocument = new Document())
{
    using (var tifFileStream = new FileStream(dataDir + "SOURCE1.tif", FileMode.Open, FileAccess.Read))
    {
        var bitmap = new Bitmap(tifFileStream);

        // Get the number of pages in the TIF
        var frameDimension = new FrameDimension(bitmap.FrameDimensionsList[0]);
        int frameCount = bitmap.GetFrameCount(frameDimension);

        // Loop through each page and add it to the new PDF
        for (int pageNum = 0; pageNum < frameCount; pageNum++)
        {
            // Add a new page to the PDF document
            Page pdfPage = pdfDocument.Pages.Add();

            // Get the current TIF page
            bitmap.SelectActiveFrame(frameDimension, pageNum);

            using (var currentStream = new MemoryStream())
            {
                // Store the current page for use
                bitmap.Save(currentStream, System.Drawing.Imaging.ImageFormat.Tiff);

                // Set portrait or landscape
                pdfPage.PageInfo.IsLandscape = bitmap.Width > bitmap.Height;

                // Set the margins
                pdfPage.PageInfo.Margin = new MarginInfo(20, 20, 20, 20);

                // Set the size
                pdfPage.PageInfo.Width = (bitmap.Width / bitmap.HorizontalResolution) * 72;
                pdfPage.PageInfo.Height = (bitmap.Height / bitmap.VerticalResolution) * 72;

                // Add the image as a paragraph in the new PDF page
                var pdfImage = new Aspose.Pdf.Image
                {
                    ImageStream = currentStream
                };
                pdfImage.IsBlackWhite = true;
                pdfPage.Paragraphs.Add(pdfImage);

                // Save the new page to the final PDF
                pdfDocument.Save(dataDir + "PDF_19.5.pdf");
            }
        }
    }
}
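
One thing that may be worth testing alongside IsBlackWhite, offered only as a sketch: `bitmap.Save(currentStream, ImageFormat.Tiff)` re-encodes every frame with the GDI+ TIFF encoder's default settings, so whatever compression the source TIF used (for scanned documents often CCITT Group 4) is not necessarily carried over. The helper below asks GDI+ for CCITT Group 4 on 1-bit frames and LZW otherwise. `TiffFrameWriter` and `SaveActiveFrame` are hypothetical names, the assumption that `PixelFormat` reflects the currently selected frame should be verified, and whether Aspose.PDF keeps that compression when it embeds the image is not confirmed anywhere in this thread.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Linq;

static class TiffFrameWriter
{
    // Hypothetical helper: writes the currently selected frame of 'bitmap'
    // to 'target' as a single-page TIFF with an explicit compression scheme.
    public static void SaveActiveFrame(Bitmap bitmap, Stream target)
    {
        // Look up the built-in GDI+ TIFF encoder.
        ImageCodecInfo tiffCodec = ImageCodecInfo.GetImageEncoders()
            .First(codec => codec.FormatID == ImageFormat.Tiff.Guid);

        // CCITT Group 4 only applies to 1-bit (bilevel) images;
        // fall back to LZW for grayscale or color frames.
        bool isBilevel = bitmap.PixelFormat == PixelFormat.Format1bppIndexed;
        EncoderValue compression = isBilevel
            ? EncoderValue.CompressionCCITT4
            : EncoderValue.CompressionLZW;

        using (var parameters = new EncoderParameters(1))
        {
            parameters.Param[0] = new EncoderParameter(
                System.Drawing.Imaging.Encoder.Compression, (long)compression);
            bitmap.Save(target, tiffCodec, parameters);
        }
    }
}
```

In the loop above, this would stand in for the `bitmap.Save(currentStream, System.Drawing.Imaging.ImageFormat.Tiff)` call, for example `TiffFrameWriter.SaveActiveFrame(bitmap, currentStream);`. Comparing the length of `currentStream` before and after the change shows whether the per-frame re-encoding contributes to the size increase.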

Please feel free to contact us if you need any further assistance.

So we recently purchased a new license allowing us to use the newest version. I made the code changes you supplied, and on v18.6 they vastly reduce the file size; however, the text in our images became nearly unreadable.

I then upgraded to v19.5, hoping the text readability would improve, and ran the same set of images through it. However, I am now getting a number of different errors (listed below) that I didn’t get before:
“Image export failed.”
“A generic error occurred in GDI+”
“Exception of type ‘System.OutOfMemoryException’ was thrown” (note: in all other versions we had no memory issues)
“Object reference not set to an instance of an object” (thrown from Aspose.PDF)
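
A possible contributor to the GDI+ and out-of-memory errors, stated as an assumption rather than a diagnosis: none of the snippets in this thread dispose the `Bitmap`, and when thousands of files are processed in one run that can leave a lot of unmanaged GDI+ memory waiting for finalization. A minimal sketch of the frame loop with deterministic disposal, where `TiffFrames` and `ForEachFrame` are hypothetical names:

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

static class TiffFrames
{
    // Hypothetical helper: selects each frame of a multipage TIF in turn and
    // passes the bitmap plus the zero-based frame index to 'handleFrame'.
    // The file stream and the bitmap are both disposed when the loop ends,
    // including when 'handleFrame' throws.
    public static void ForEachFrame(string imagePath, Action<Bitmap, int> handleFrame)
    {
        using (var stream = new FileStream(imagePath, FileMode.Open, FileAccess.Read))
        using (var bitmap = new Bitmap(stream))
        {
            var dimension = new FrameDimension(bitmap.FrameDimensionsList[0]);
            int frameCount = bitmap.GetFrameCount(dimension);

            for (int frameIndex = 0; frameIndex < frameCount; frameIndex++)
            {
                bitmap.SelectActiveFrame(dimension, frameIndex);
                handleFrame(bitmap, frameIndex);
            }
        }
    }
}
```

The per-page Aspose.PDF work (adding the page, saving the frame to a stream, adding the image) would go in the `handleFrame` callback.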

@bscharf

Thank you for elaborating further.

Would you please create a separate topic for each problem you are facing, sharing the files and code snippet, so that we may address each scenario and assist you efficiently?