Issue with File Size Increase after Conversion with latest Version of Aspose.PDF - Seeking Assistance

Dear Aspose Team,

When converting text, PNG, and JPG files to PDF with the latest version of Aspose.PDF, we noticed a sudden and significant increase in output file size compared with the older version (21.7.0). This unexpected behavior concerns us because it affects the efficiency of our processes and the overall storage requirements for our files.

Could you kindly investigate this matter further? We would appreciate it if you could provide us with some insights into why this might be happening and if there are any recommended solutions or workarounds to mitigate the file size increase during conversions.

Sample_JPG.jpg (55.2 KB)
Sample_PNG.png (24.5 KB)

Name            SizeAfterConversion(21.7.0)  SizeAfterConversion(22.5.0/23.7.0)  MethodName
Sample_TXT.txt  36312                        84823                               ConvertTextToPdf
Sample_PNG.png  65403                        113929                              ConvertPngToPdf
Sample_JPG.jpg  53086                        101611                              ConvertJpgToPdf
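
For reference, the figures in the table can be reduced to absolute and relative increases. The sketch below is only an illustration: the class and method names are made up for this example, and the units are assumed to be bytes. The absolute increase comes out nearly identical for all three files (48511, 48526, and 48525), which may point to a roughly fixed per-document overhead rather than growth proportional to the input. That reading is an inference from the numbers, not a confirmed cause.

```csharp
using System;

public static class SizeComparison
{
    // Sizes copied from the table above (units as reported there, assumed to be bytes).
    private static readonly (string Name, long Old, long New)[] Rows =
    {
        ("Sample_TXT.txt", 36312, 84823),
        ("Sample_PNG.png", 65403, 113929),
        ("Sample_JPG.jpg", 53086, 101611),
    };

    // Absolute growth between the two versions.
    public static long Delta(long oldSize, long newSize) => newSize - oldSize;

    // Relative growth: new size divided by old size.
    public static double Ratio(long oldSize, long newSize) => (double)newSize / oldSize;

    public static void Main()
    {
        foreach (var (name, oldSize, newSize) in Rows)
        {
            Console.WriteLine($"{name}: +{Delta(oldSize, newSize)} ({Ratio(oldSize, newSize):F2}x)");
        }
    }
}
```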

private void ConvertPngToPdf(string inputPath, string outputPDF, ref bool bReturn)
{
    try
    {
        using (Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document())
        {
            // Read the pixel dimensions, then dispose the image so its file handle is released
            int h;
            int w;
            using (System.Drawing.Image srcImage = System.Drawing.Image.FromFile(inputPath))
            {
                h = srcImage.Height;
                w = srcImage.Width;
            }

            Aspose.Pdf.Page page = pdfDocument.Pages.Add();
            Aspose.Pdf.Image image = new Aspose.Pdf.Image();
            image.File = inputPath;

            // Size the page to the image and remove all margins
            page.PageInfo.Height = h;
            page.PageInfo.Width = w;
            page.PageInfo.Margin.Bottom = 0;
            page.PageInfo.Margin.Top = 0;
            page.PageInfo.Margin.Right = 0;
            page.PageInfo.Margin.Left = 0;
            page.Paragraphs.Add(image);

            pdfDocument.Save(outputPDF);
        }
        bReturn = true;
    }
    catch (Exception)
    {
        bReturn = false;
    }
}

private void ConvertJpgToPdf(string inputPath, string outputPDF, ref bool bReturn)
{
    try
    {
        using (Aspose.Pdf.Document doc = new Aspose.Pdf.Document())
        {
            Aspose.Pdf.Page page = doc.Pages.Add();
            Aspose.Pdf.Image image = new Aspose.Pdf.Image();
            image.File = inputPath;
            page.Paragraphs.Add(image);
            doc.Save(outputPDF);
        }
        bReturn = true;
    }
    catch (Exception)
    {
        bReturn = false;
    }
}

private void ConvertTextToPdf(string inputPath, string outputPDF, ref bool bReturn)
{
    try
    {
        // Read the source text file
        using (TextReader tr = new StreamReader(inputPath))
        {
            // Instantiate a Document object by calling its empty constructor
            using (Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document())
            {
                // Add a new page to the Pages collection of the Document
                Aspose.Pdf.Page page = pdfDocument.Pages.Add();

                // Keep the same font and font size as the ActivePDF conversion
                page.PageInfo.DefaultTextState.Font = FontRepository.FindFont("Arial");
                page.PageInfo.DefaultTextState.FontSize = 8;
                page.PageInfo.Width = Aspose.Pdf.PageSize.PageLetter.Width;
                page.PageInfo.Height = Aspose.Pdf.PageSize.PageLetter.Height;
                page.PageInfo.Margin.Left = 70;
                page.PageInfo.Margin.Right = 70;
                page.PageInfo.DefaultTextState.LineSpacing = 1;

                // Create a TextFragment and pass the text from the reader to its constructor
                Aspose.Pdf.Text.TextFragment text = new Aspose.Pdf.Text.TextFragment(tr.ReadToEnd());
                page.Paragraphs.Add(text);

                pdfDocument.Save(outputPDF);
            }
        }

        bReturn = true;
    }
    catch (Exception)
    {
        bReturn = false;
    }
}

Thanks,
Saurabh

@sranjan50

Please try optimizing the PDF document after converting it from an image or text file, as in the code snippet below:

using (Aspose.Pdf.Document doc = new Aspose.Pdf.Document())
{
    Aspose.Pdf.Page page = doc.Pages.Add();
    Aspose.Pdf.Image image = new Aspose.Pdf.Image();
    image.File = (dataDir + "Sample_JPG.jpg");
    page.Paragraphs.Add(image);
    doc.Save(dataDir + "FromJPG.pdf");
}

var oo = new Aspose.Pdf.Optimization.OptimizationOptions();
oo.ImageCompressionOptions.ImageQuality = 50;
oo.ImageCompressionOptions.MaxResolution = 300;
oo.ImageCompressionOptions.ResizeImages = true;
oo.ImageCompressionOptions.CompressImages = true;
oo.ImageCompressionOptions.Encoding = Optimization.ImageEncoding.Jpeg;
oo.ImageCompressionOptions.Version = Aspose.Pdf.Optimization.ImageCompressionVersion.Standard;
oo.AllowReusePageContent = true;
oo.RemoveUnusedObjects = true;
oo.RemoveUnusedStreams = true;
oo.LinkDuplcateStreams = true; // property name is spelled this way in the API
oo.SubsetFonts = true;

using (Document document = new Document(dataDir + "FromJPG.pdf"))
{
    //document.OptimizeSize = true;
    document.Flatten();
    document.OptimizeResources(oo);
    document.Save(dataDir + "FromJPG_optimized.pdf");
}

FromJPG.pdf (17.2 KB)
FromJPG_optimized.pdf (8.4 KB)

Hi @asad.ali

Even after optimization, there is no improvement. Please find the comparison data from the two versions below. This is critical for us. Kindly help.
V1 = 21.7.0.0
V2 = 22.5.0.0

image.png (7.4 KB)

Note: We are in the process of obtaining a new license. Until then, we are using version 22.5.0, because the subscription included in our current license allows free upgrades only until 27 May 2022.

@sranjan50

These results look different from what we obtained in our environment using version 23.8 of the API. You can still try version 23.8 to see if it resolves your issue: a free 30-day temporary license is available for evaluating the latest version. If you still notice an issue with the latest version of the API, please let us know.

Hi @asad.ali

Following your recommendation, we obtained the necessary license and updated our codebase to use the latest version of the Aspose DLL (23.8.0). However, after a thorough comparison between the previous version and the new version, we observed significant differences in both file processing time and resulting file sizes. For a full overview of the differences, please refer to the attached comparison chart.

The areas that have been most notably affected are as follows:

  1. Convert to PDF: We have identified a substantial increase in file size and in the time taken to perform PDF conversions with the updated library. The conversion process is considerably slower than in the previous version, which is impacting our workflow efficiency.
    Doc : ConvertDocToPdf (Higher processing time)
    Html : ConvertHtmlToPdf (Higher processing time)
    Jpg : ConvertJpgToPdf (File size issue & Higher processing time)
    Jpeg : ConvertJpgToPdf (Higher processing time)
    Msg : ConvertMsgToPdf (Higher processing time)
    Png : ConvertPngToPdf (File size issue & Higher processing time)
    Text/Tmp : ConvertTextToPdf (File size issue & Higher processing time)
    Tiff : ConvertTiffToPdf (Higher processing time)
  2. Append PDF File: Our tests show that appending PDF files produces output of double the size with the new version. This operation is a critical part of our application, and the regression is hindering our ability to deliver results promptly.
    Method : AppendPdfFile (File size issue)
  3. PDF Page Count: We have noticed that calculating PDF page counts has become more time-consuming with the updated library. This is particularly concerning because it affects our application’s reporting features.
    Method : GetPdfPageCount (Higher processing time)

To assist you in diagnosing and addressing this issue, we have attached the comparison chart along with all relevant code snippets that we have used for testing and verification. We believe that resolving these performance discrepancies is crucial to maintaining the efficiency and reliability of our application.

We kindly request your prompt assistance in investigating and resolving this matter. Our team is eager to continue using the latest version of Aspose, but these performance concerns are currently preventing us from doing so effectively. If possible, we would appreciate any insights or guidance you can provide to expedite the resolution process.

ComparisionChart.PNG (33.6 KB)

AsposeMethods.docx (20.9 KB)

Thanks,
Saurabh

@sranjan50

We are checking it and will get back to you shortly.

@sranjan50

We already have your sample JPG and PNG files. If possible, could you kindly share the same source files that you used for testing on your side? We will continue our investigation and share our feedback with you.

SampleDocuments.docx (608.5 KB)

All the files are attached in the DOCX file.

@sranjan50

We are checking it and will get back to you shortly.

Hi @asad.ali

Is there any update on this? This is critical for us. Based on the outcome, we need to decide whether or not we can proceed with the latest versions.

Thanks,
Saurabh

@sranjan50

We are in the process of testing all your documents and will respond shortly.

@sranjan50

We have checked your shared files and document and performed initial testing of all the cases related to Aspose.PDF. The results we are getting are quite different from those you shared. The file size does increase to some extent for some source files but, after optimization, the resulting PDF is reduced in size. Below are our test environment and the output files we obtained:

Windows 11
Visual Studio 2022
x64 Debug Mode
16GB RAM

after compression_optimized_TXT.pdf (2.6 KB)
FromTXT.pdf (73.3 KB)
after compression_optimized_HTML.pdf (7.0 KB)
FromHTML.pdf (190.6 KB)
after compression_optimized_PNG.pdf (7.3 KB)
FromPNG.pdf (29.9 KB)
after compression_optimized_JPG.pdf (8.4 KB)
FromJPG.pdf (17.2 KB)

Furthermore, we noticed that you are also using Aspose.Words and Aspose.Imaging in your code for the DOC to PDF and TIFF to PDF conversions. Please create a topic in the respective forum category to get proper assistance with those.

Please note that processing time should be measured on subsequent runs, because on the first run the API loads the required resources into memory. The time is therefore lower on subsequent runs. If you are still seeing issues on your end, please share a console application for our reference so that we can try to reproduce them in our environment and address them accordingly.
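
As an illustration of measuring on subsequent runs, the sketch below discards a few warm-up calls before timing. It is only a rough outline and not part of the Aspose API: the `WarmTiming` class, the `AverageMilliseconds` helper, and the placeholder workload are all made up for this example. In practice the action would wrap one of the conversion methods, such as ConvertJpgToPdf.

```csharp
using System;
using System.Diagnostics;

public static class WarmTiming
{
    // Runs the action warmupRuns times without timing, then returns the
    // average wall-clock time in milliseconds over measuredRuns timed calls.
    public static double AverageMilliseconds(Action action, int warmupRuns, int measuredRuns)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (measuredRuns <= 0) throw new ArgumentOutOfRangeException(nameof(measuredRuns));

        for (int i = 0; i < warmupRuns; i++)
        {
            action(); // early calls pay for JIT compilation and resource loading
        }

        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < measuredRuns; i++)
        {
            action();
        }
        stopwatch.Stop();

        return stopwatch.Elapsed.TotalMilliseconds / measuredRuns;
    }

    public static void Main()
    {
        // Placeholder workload (hypothetical); replace with a real conversion call.
        double average = AverageMilliseconds(() => Math.Sqrt(12345.678), warmupRuns: 3, measuredRuns: 10);
        Console.WriteLine($"Average per run: {average:F3} ms");
    }
}
```

Running the same harness against both library versions keeps first-run loading costs out of the comparison.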

Hi @asad.ali,

As you suggested, please find attached two console applications. They contain the same code; the only difference is that they reference two different versions of the Aspose.PDF DLL.

The file size after conversion simply doubles with the latest version of Aspose.PDF. Please see the attached comparison screenshot.

This is a blocker for us and needs to be fixed. Otherwise we will not be able to move to the latest version of the Aspose DLLs.

Thanks,
Saurabh

AsposeConversionWithVersion_23.8.0.zip (853.3 KB)
AsposeConversionWithVersion_21.7.0.zip (822.7 KB)

Comparision.PNG (101.8 KB)
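
A comparison like the one in the screenshot can also be produced programmatically. The sketch below is a hypothetical helper, not code from the attached applications: the class name and the default directory names `output_21.7.0` and `output_23.8.0` are assumptions. It pairs up PDFs with the same file name in two output folders and prints both sizes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class OutputSizeDiff
{
    // Returns (file name, size in oldDir, size in newDir) for every PDF
    // that exists under the same name in both directories.
    public static List<(string Name, long OldSize, long NewSize)> Compare(string oldDir, string newDir)
    {
        var rows = new List<(string Name, long OldSize, long NewSize)>();
        foreach (string oldPath in Directory.GetFiles(oldDir, "*.pdf"))
        {
            string name = Path.GetFileName(oldPath);
            string newPath = Path.Combine(newDir, name);
            if (File.Exists(newPath))
            {
                rows.Add((name, new FileInfo(oldPath).Length, new FileInfo(newPath).Length));
            }
        }
        return rows;
    }

    public static void Main(string[] args)
    {
        // Hypothetical default folder names; pass real paths as arguments.
        string oldDir = args.Length > 0 ? args[0] : "output_21.7.0";
        string newDir = args.Length > 1 ? args[1] : "output_23.8.0";
        if (!Directory.Exists(oldDir) || !Directory.Exists(newDir))
        {
            Console.WriteLine("Both output directories must exist.");
            return;
        }

        foreach (var (name, oldSize, newSize) in Compare(oldDir, newDir))
        {
            Console.WriteLine($"{name}: {oldSize} -> {newSize} bytes (+{newSize - oldSize})");
        }
    }
}
```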

@sranjan50

We are checking it and will get back to you shortly.

Hi @asad.ali,

Are you able to replicate this issue?

Thanks,
Saurabh

@sranjan50

Yes, we were able to reproduce the issue using your shared console applications. We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-55497

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.