Error when opening PDF in Reader

I am evaluating Aspose.Pdf (using the evaluation version) for converting scanned images into PDF files. The scanned images come in three flavors: color JPEG, grayscale JPEG, or bitonal TIFF. The images are acquired from the scanner using a web service and passed to my code as streams. I am using the following code to write those streams into a PDF.

This works fine for grayscale JPEG and bitonal TIFF. However, when I open a PDF generated from a color JPEG, Acrobat Reader does not display the images, and if I try to zoom, it displays a message indicating the image is corrupt. A sample color JPEG image is attached to this post.

Aspose.Pdf.Pdf p = new Aspose.Pdf.Pdf();
Aspose.Pdf.Section section = p.Sections.Add();

section.PageInfo.Margin.Bottom = 0;
section.PageInfo.Margin.Left = 0;
section.PageInfo.Margin.Right = 0;
section.PageInfo.Margin.Top = 0;

// figure out size of image and set the pagesize of the pdf.
// assumes all pages are the same size so we don't have to load every page into memory.
using (System.Drawing.Bitmap bmp = new System.Drawing.Bitmap(webViewerController.SavedState.ImageStreams[0]))
{
    section.PageInfo.PageHeight = (bmp.PhysicalDimension.Height / bmp.VerticalResolution) * 72;
    section.PageInfo.PageWidth = (bmp.PhysicalDimension.Width / bmp.HorizontalResolution) * 72;
}

// add images to pdf
for (int i = 0; i < webViewerController.SavedState.ImageStreams.Length; i++)
{
    Aspose.Pdf.Image img = new Aspose.Pdf.Image(section);
    if (webViewerController.SavedState.MimeType.Equals("image/jpeg"))
        img.ImageInfo.ImageFileType = Aspose.Pdf.ImageFileType.Jpeg;
    else
        img.ImageInfo.ImageFileType = Aspose.Pdf.ImageFileType.Tiff;
    img.ImageInfo.ImageStream = webViewerController.SavedState.ImageStreams[i];
    section.Paragraphs.Add(img);
}

// save pdf
p.Save(tmpFile);
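
The page-size arithmetic in the code above converts a pixel extent to PDF points (pixels divided by dots per inch, times 72). As a rough illustration only, here is a minimal sketch of that conversion. The helper name PixelsToPoints and the 300 dpi US Letter scan are assumptions made for the example, and it assumes PhysicalDimension reports pixels for a bitmap.

```csharp
using System;

static class PageSizeSketch
{
    // PDF page dimensions are expressed in points; 72 points make one inch.
    const double PointsPerInch = 72.0;

    // Hypothetical helper: converts a pixel extent at a given resolution (dpi) to points.
    public static double PixelsToPoints(double pixels, double dpi)
    {
        if (dpi <= 0)
            throw new ArgumentOutOfRangeException("dpi", "Resolution must be positive.");
        return pixels / dpi * PointsPerInch;
    }

    static void Main()
    {
        // Assumed example: a US Letter page scanned at 300 dpi is 2550 x 3300 pixels.
        double width = PixelsToPoints(2550, 300);
        double height = PixelsToPoints(3300, 300);
        Console.WriteLine("{0} x {1} pt", width, height); // prints "612 x 792 pt"
    }
}
```

The guard on dpi is there because an image with missing or zero resolution metadata would otherwise produce an infinite or undefined page size.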

Anyone have a chance to look at this?

I changed the code to use Direct-To-PDF and it still fails. I assume there is something about the image that causes Aspose.PDF to write the image into the PDF incorrectly.

I've attached an example of the generated PDF to this post.

Thanks,
MAC

Hi,

I've tested the scenario by adding the image file to a PDF document, and in my case the resultant PDF is generated correctly, without any issue.

I used the following code snippet for the test. The resultant PDF is attached; please take a look.

[C#]

Pdf p = new Aspose.Pdf.Pdf();
Aspose.Pdf.Section section = p.Sections.Add();

section.PageInfo.Margin.Bottom = 0;
section.PageInfo.Margin.Left = 0;
section.PageInfo.Margin.Right = 0;
section.PageInfo.Margin.Top = 0;

// read the whole file into memory; File.ReadAllBytes also closes the file handle
byte[] data = File.ReadAllBytes(@"d:\pdftest\scanned.jpg");
MemoryStream ms = new MemoryStream(data);

// figure out size of image and set the pagesize of the pdf.
// assumes all pages are the same size so we don't have to load every page into memory.
using (System.Drawing.Bitmap bmp = new System.Drawing.Bitmap(ms))
{
    section.PageInfo.PageHeight = (bmp.PhysicalDimension.Height / bmp.VerticalResolution) * 72;
    section.PageInfo.PageWidth = (bmp.PhysicalDimension.Width / bmp.HorizontalResolution) * 72;
    Aspose.Pdf.Image img = new Aspose.Pdf.Image(section);
    img.ImageInfo.ImageFileType = Aspose.Pdf.ImageFileType.Bmp;
    img.ImageInfo.ImageStream = ms;
    section.Paragraphs.Add(img);
}

// save pdf
p.Save(@"d:/pdftest/BitmapConversion.pdf");

It appears you've set ImageInfo.ImageFileType to Bmp???

This works for me, too. However, I'm not sure what I should do here, since the image is clearly a JPEG image. I could use this as a workaround, but the correct setting for ImageInfo.ImageFileType would be Jpeg.

Is this a bug in Aspose.PDF?

Thanks,
MAC

Hi,

Sorry for replying late.

You can set img.ImageInfo.ImageFileType to ImageFileType.Jpeg. I’ve retested the code after changing the value from Bmp to Jpeg, and the resultant PDF is generated without any issue.

I’ve also attached the resultant PDF for your reference; please take a look. If you have any further queries, feel free to contact us.

Hummm... It makes a difference for me.

If I use ImageFileType.Jpeg, the resulting PDF is corrupt. It opens without error but the images do not display and if you try to zoom you get an error indicating the image "has insufficient data".

If I use ImageFileType.Bmp, the resulting PDF seems fine. The images display and you can zoom without error.

Is this possibly a problem with the evaluation version watermarking the pages?

My current code is:

string tmpFile = Utilities.UtilityTools.getTempFilenameWithExtension(".pdf");
using (System.IO.FileStream fs = new System.IO.FileStream(tmpFile, System.IO.FileMode.Create))
{
    //Aspose.Pdf.License lic = new Aspose.Pdf.License();
    //lic.SetLicense("Aspose.Pdf.lic");

    Aspose.Pdf.Pdf p = new Aspose.Pdf.Pdf(fs);
    Aspose.Pdf.Section section = p.Sections.Add();

    section.PageInfo.Margin.Bottom = 0;
    section.PageInfo.Margin.Left = 0;
    section.PageInfo.Margin.Right = 0;
    section.PageInfo.Margin.Top = 0;

    // figure out size of image and set the pagesize of the pdf.
    // assumes all pages are the same size so we don't have to load every page into memory.
    using (System.Drawing.Bitmap bmp = new System.Drawing.Bitmap(webViewerController.SavedState.ImageStreams[0]))
    {
        section.PageInfo.PageHeight = (bmp.PhysicalDimension.Height / bmp.VerticalResolution) * 72;
        section.PageInfo.PageWidth = (bmp.PhysicalDimension.Width / bmp.HorizontalResolution) * 72;
    }

    // reset the stream. The bitmap load moved it to the end.
    webViewerController.SavedState.ImageStreams[0].Position = 0;
    webViewerController.SavedState.ImageStreams[0].Flush();

    // add images to pdf
    for (int i = 0; i < webViewerController.SavedState.ImageStreams.Length; i++)
    {
        Aspose.Pdf.Image img = new Aspose.Pdf.Image(section);

        if (webViewerController.SavedState.MimeType.Equals("image/jpeg"))
            img.ImageInfo.ImageFileType = Aspose.Pdf.ImageFileType.Jpeg;
        else
            img.ImageInfo.ImageFileType = Aspose.Pdf.ImageFileType.Tiff;

        img.ImageInfo.ImageStream = webViewerController.SavedState.ImageStreams[i];
        section.AddParagraph(img);
    }

    // save pdf
    p.Close();
}

Hi,

Sorry for replying late.

I’ve tested the scenario again using the code snippet that you shared, with slight changes. When creating the Bitmap object, please try the method System.Drawing.Bitmap.FromStream(FileStream) instead of the constructor System.Drawing.Bitmap(FileStream). I tested with Aspose.Pdf for .NET 4.1.0.0.

In my case, I generated the resultant PDF with Image.ImageInfo.ImageFileType set to ImageFileType.Jpeg. I can view the document without any problem and can even zoom into the file up to 1200%. The resultant PDF is attached; please take a look.

[C#]

string tmpFile = @"d:/pdftest/JPegRenderingTest.pdf";
using (System.IO.FileStream fs = new System.IO.FileStream(tmpFile, System.IO.FileMode.Create))
{
    //Aspose.Pdf.License lic = new Aspose.Pdf.License();
    //lic.SetLicense("Aspose.Pdf.lic");

    Aspose.Pdf.Pdf p = new Aspose.Pdf.Pdf(fs);
    Aspose.Pdf.Section section = p.Sections.Add();

    section.PageInfo.Margin.Bottom = 0;
    section.PageInfo.Margin.Left = 0;
    section.PageInfo.Margin.Right = 0;
    section.PageInfo.Margin.Top = 0;

    // figure out size of image and set the pagesize of the pdf.
    // assumes all pages are the same size so we don't have to load every page into memory.
    FileStream bmpstream = new FileStream(@"d:/pdftest/scanned.jpg", FileMode.Open);
    byte[] data = new byte[bmpstream.Length];
    bmpstream.Read(data, 0, data.Length);

    using (System.Drawing.Bitmap bmp = (Bitmap)System.Drawing.Bitmap.FromStream(bmpstream))
    {
        section.PageInfo.PageHeight = (bmp.PhysicalDimension.Height / bmp.VerticalResolution) * 72;
        section.PageInfo.PageWidth = (bmp.PhysicalDimension.Width / bmp.HorizontalResolution) * 72;
    }

    // add images to pdf
    {
        Aspose.Pdf.Image img = new Aspose.Pdf.Image(section);
        //if (bmpstream. webViewerController.SavedState.MimeType.Equals("image/jpeg"))
            img.ImageInfo.ImageFileType = Aspose.Pdf.ImageFileType.Jpeg;
        //else
            // img.ImageInfo.ImageFileType = Aspose.Pdf.ImageFileType.Tiff;
        img.ImageInfo.ImageStream = bmpstream;
        // webViewerController.SavedState.ImageStreams[i];
        section.AddParagraph(img);
    }
    // save pdf
    p.Close();
    // reset the stream. The bitmap load moved it to the end.
    bmpstream.Close();
    fs.Close();
}

If you still face any problem or have any further queries, please feel free to contact us.

This is strange. I don't see how you're really doing anything different from what I am doing, yet it works for you and not for me. Your PDF looks fine.

The bitmap part isn't altering the image in the stream. It is only being used to figure out the image size so I can set the page size appropriately. The image written to the PDF comes from the stream, not the bitmap. The stream is not altered by loading it into the bitmap. Consequently, I don't see how the bitmap part would have any effect on the resulting PDF.
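
One caveat to the reasoning above: even when a reader leaves the bytes untouched, it can leave the stream's Position at the end, as a code comment elsewhere in this thread notes for the bitmap load. The following minimal sketch uses only System.IO to show that effect. The ReadToEnd helper and the six sample bytes are made up for illustration; it does not call System.Drawing or Aspose.Pdf and is not a diagnosis of the corrupt PDF.

```csharp
using System;
using System.IO;

static class StreamRewindSketch
{
    // Hypothetical stand-in for any consumer that reads a stream to its end.
    // Returns the number of bytes it was able to read.
    public static int ReadToEnd(Stream s)
    {
        byte[] buffer = new byte[4096];
        int total = 0;
        int n;
        while ((n = s.Read(buffer, 0, buffer.Length)) > 0)
            total += n;
        return total;
    }

    static void Main()
    {
        // Six arbitrary bytes standing in for image data.
        MemoryStream image = new MemoryStream(new byte[] { 0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10 });

        int first = ReadToEnd(image);   // reads all 6 bytes, Position is now at the end
        int second = ReadToEnd(image);  // reads 0 bytes because nothing is left
        image.Position = 0;             // rewind
        int third = ReadToEnd(image);   // reads all 6 bytes again

        Console.WriteLine("{0} {1} {2}", first, second, third); // prints "6 0 6"
    }
}
```

If every stream handed to the PDF writer is rewound first, the position left behind by an earlier reader stops being a variable in the comparison.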

Also, I get the image directly from the scanner as a MemoryStream. I don't use a FileStream at all (as far as the image is concerned). In order to use a FileStream, I would have to take the MemoryStream and write it to a file, then open a FileStream from the file. This seems very wasteful and unnecessary.

Why are you reading the FileStream into a byte[]? I don't see where you use the array.

I see from your PDF you are not using the evaluation version. What happens for you if you don't register a license?

Thanks,

MAC

Hi,

I used the FileStream to load an image from my system. In your case the image comes from a scanner, so you may try first saving the image in a MemoryStream object and then creating a Bitmap from it.

MemoryStream s = new MemoryStream(webViewerController.SavedState.ImageStreams[0], 0, webViewerController.SavedState.ImageStreams[0].Length);

System.Drawing.Bitmap bmp = (Bitmap)System.Drawing.Bitmap.FromStream(s);

Moreover, I’ve also tested the scenario without using the license file, and I could not reproduce the problem.

The image is already a MemoryStream when I get it from the scanner.

I tried changing to use Bitmap.FromStream - it had no effect.

What is the implication of not setting ImageInfo.ImageFileType? If I just leave it at Unknown everything seems to work OK. I can scan all three different image types (bitonal, grayscale, and color) and the resulting PDFs seem to be fine.

The following code seems fine.

string tmpFile = Utilities.UtilityTools.getTempFilenameWithExtension(".pdf");
using (System.IO.FileStream fs = new System.IO.FileStream(tmpFile, System.IO.FileMode.Create))
{
    //Aspose.Pdf.License lic = new Aspose.Pdf.License();
    //lic.SetLicense("Aspose.Pdf.lic");

    Aspose.Pdf.Pdf p = new Aspose.Pdf.Pdf(fs);
    Aspose.Pdf.Section section = p.Sections.Add();

    section.PageInfo.Margin.Bottom = 0;
    section.PageInfo.Margin.Left = 0;
    section.PageInfo.Margin.Right = 0;
    section.PageInfo.Margin.Top = 0;

    // figure out size of image and set the pagesize of the pdf.
    // assumes all pages are the same size so we don't have to load every page into memory.
    using (System.Drawing.Bitmap bmp = (System.Drawing.Bitmap)System.Drawing.Bitmap.FromStream(webViewerController.SavedState.ImageStreams[0]))
    {
        section.PageInfo.PageHeight = (bmp.PhysicalDimension.Height / bmp.VerticalResolution) * 72;
        section.PageInfo.PageWidth = (bmp.PhysicalDimension.Width / bmp.HorizontalResolution) * 72;
    }

    // reset the stream. The bitmap load moved it to the end.
    webViewerController.SavedState.ImageStreams[0].Position = 0;
    webViewerController.SavedState.ImageStreams[0].Flush();

    // add images to pdf
    for (int i = 0; i < webViewerController.SavedState.ImageStreams.Length; i++)
    {
        Aspose.Pdf.Image img = new Aspose.Pdf.Image(section);
        img.ImageInfo.ImageStream = webViewerController.SavedState.ImageStreams[i];
        section.AddParagraph(img);
    }

    // save pdf
    p.Close();
}

Is there going to be a problem if I don't set ImageFileType?

Thanks,

MAC

Hi,

During our tests we've noticed that when ImageFileType is set to Unknown, or when the step that sets ImageFileType is omitted, everything seems to work correctly. Since you've mentioned that everything is working correctly at your end as well, I don't think it will cause any problem.

Please try this approach, and if you face any problem, please let us know.
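
For readers who would still rather set the file type explicitly than leave it at Unknown, one option is to inspect the first bytes of each stream instead of relying on a single MIME type. The sketch below is only an illustration: the enum SniffedImageKind and the helper Sniff are hypothetical names, not part of Aspose.Pdf, and mapping the result to Aspose.Pdf.ImageFileType is left to the caller. It relies on the well-known file signatures for JPEG (FF D8 FF), TIFF ("II*\0" or "MM\0*") and BMP ("BM").

```csharp
using System;
using System.IO;

// Hypothetical result type for this sketch; not an Aspose.Pdf type.
enum SniffedImageKind { Unknown, Jpeg, Tiff, Bmp }

static class ImageSniffSketch
{
    // Peeks at up to four leading bytes of a seekable stream and restores its position.
    public static SniffedImageKind Sniff(Stream s)
    {
        if (!s.CanSeek)
            throw new NotSupportedException("This sketch needs a seekable stream.");

        long start = s.Position;
        byte[] h = new byte[4];
        int read = 0;
        while (read < h.Length)
        {
            int n = s.Read(h, read, h.Length - read);
            if (n == 0) break; // end of stream
            read += n;
        }
        s.Position = start; // leave the stream where we found it

        // JPEG files start with FF D8 FF.
        if (read >= 3 && h[0] == 0xFF && h[1] == 0xD8 && h[2] == 0xFF)
            return SniffedImageKind.Jpeg;

        // TIFF files start with "II*\0" (little-endian) or "MM\0*" (big-endian).
        if (read >= 4 &&
            ((h[0] == 0x49 && h[1] == 0x49 && h[2] == 0x2A && h[3] == 0x00) ||
             (h[0] == 0x4D && h[1] == 0x4D && h[2] == 0x00 && h[3] == 0x2A)))
            return SniffedImageKind.Tiff;

        // BMP files start with "BM".
        if (read >= 2 && h[0] == 0x42 && h[1] == 0x4D)
            return SniffedImageKind.Bmp;

        return SniffedImageKind.Unknown;
    }
}
```

A caller could invoke Sniff on each stream just before adding it to the section. Because the helper restores Position, the stream is handed on in the same state it arrived in.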