PDF to image quality

How can I convert a PDF to an image at 1:1? I tried setting the DPI, but the image quality is poor.

result image:
pdf_backgroubd_f1ab0af45d5c4cb7914ec4f2b34c2167_1.jpg (161.4 KB)

pdf:
00-90-0000-28-107.pdf (598.5 KB)

The text in the red box is no longer clear

image.png (22.8 KB)

Hi, @dalazi

Here are examples of converting PDF to JPEG using Aspose.PDF.Drawing Library 24.1 and Aspose.PDF JPEG Converter:

        public static void ExampleLibrary()
        {
            //var licensePath = @"Aspose.PDF.NET.lic";
            //var license = new License();

            //license.SetLicense(licensePath);
            Resolution resolution = new(300);
            var filePathPdf = "00-90-0000-28-107.pdf";

            var document = new Document(filePathPdf);
            JpegDevice jpegDevice = new(resolution);
            for (int pageCount = 1; pageCount <= document.Pages.Count; pageCount++)
            {
                jpegDevice.Process(document.Pages[pageCount],
                        $"C:\\Sample\\Results\\image{pageCount}_out.jpeg");
            }
        }
        public static void ExamplePlugin()
        {
            //var licensePath = @"Aspose.PDF.NET.lic";
            //var license = new License();

            //license.SetLicense(licensePath);
            var filePathPdf = "00-90-0000-28-107.pdf";


            // Create JpegOptions.
            var convertorOptions = new JpegOptions();
            convertorOptions.AddInput(new FileDataSource(filePathPdf));
            convertorOptions.AddOutput(new FileDataSource(@"C:\Sample\Results\"));
            convertorOptions.OutputResolution = 300;

            // Create a new instance of Jpeg.
            var converter = new Jpeg();

            // Process the PDF to JPEG conversion.
            ResultContainer resultContainer = converter.Process(convertorOptions);

            // Print the paths of the converted JPEG images.
            foreach (FileResult operationResult in resultContainer.ResultCollection.Cast<FileResult>())
            {
                Console.WriteLine(operationResult.Data.ToString());
            }
        }

You can also see the results of conversion in Results.zip (5.8 MB).

If you use other libraries or versions, please let us know.

I want to export the PDF to an image at 1:1. At 72 DPI the exported image has the same width and height as the PDF page's Rect, but it is not clear enough. Is there any way to set the DPI when opening the PDF?

The reason for the poor quality of that part of your document is that the company name is neither text nor a single image; it is a set of small black images.
In this case, we can suggest a workaround: convert the PDF to JPEG at a high resolution and then resize the result.

        public static void Example()
        {
            //var licensePath = @"c:\keys\Aspose.PDF.NET.lic";
            //var license = new License();

            //license.SetLicense(licensePath);
            Resolution resolution = new(1200);
            var filePathPdf = "00-90-0000-28-107.pdf";

            var document = new Document(filePathPdf);
            JpegDevice jpegDevice = new(resolution);
            jpegDevice.RenderingOptions.InterpolationHighQuality = true;
            for (int pageCount = 1; pageCount <= document.Pages.Count; pageCount++)
            {

                // Render the page into memory, then resize it before saving.
                using var memoryStream = new MemoryStream();
                jpegDevice.Process(document.Pages[pageCount], memoryStream);
                memoryStream.Seek(0, SeekOrigin.Begin);

                using System.Drawing.Image originalImage = System.Drawing.Image.FromStream(memoryStream);

                ImageResizer resizer = new();

                using Bitmap resizedImage = resizer.ResizeImage(originalImage, 4960, 3508);
                resizedImage.Save($"C:\\Samples\\Results\\image{pageCount}_out.jpeg", ImageFormat.Jpeg);
            }
        }

Here is the code of the helper class:

public class ImageResizer
{
    /// <summary>
    /// Resizes the input image to the specified width and height.
    /// </summary>
    /// <param name="image">The image to be resized.</param>
    /// <param name="width">The desired width of the resized image.</param>
    /// <param name="height">The desired height of the resized image.</param>
    /// <returns>A Bitmap object representing the resized image.</returns>
    public Bitmap ResizeImage(Image image, int width, int height)
    {
        Bitmap resizedImage = new Bitmap(width, height);

        using (Graphics graphics = Graphics.FromImage(resizedImage))
        {
            graphics.DrawImage(image, 0, 0, width, height);
        }
        resizedImage.SetResolution(72, 72);
        return resizedImage;
    }

    /// <summary>
    /// Adjusts the DPI (Dots Per Inch) value of the input image to the specified DPI.
    /// </summary>
    /// <param name="image">The image whose DPI needs to be adjusted.</param>
    /// <param name="dpi">The desired DPI value.</param>
    /// <returns>A Bitmap object representing the image with the adjusted DPI.</returns>
    public Bitmap AdjustDPI(Image image, int dpi)
    {
        Bitmap adjustedImage = new Bitmap(image.Width, image.Height);

        adjustedImage.SetResolution(dpi, dpi);

        using (Graphics graphics = Graphics.FromImage(adjustedImage))
        {
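            // Pass the pixel size explicitly; DrawImage(image, 0, 0) uses the physical size and would rescale when the DPI differs.
            graphics.DrawImage(image, 0, 0, image.Width, image.Height);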
        }

        return adjustedImage;
    }
}
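
The target size passed to ResizeImage above is hard-coded. As a rough sketch (not part of the Aspose.PDF API), the pixel size a page occupies at a given DPI follows from the fact that one PDF point is 1/72 inch, which is also why a 72 DPI export has the same numeric width and height as the page rectangle. The PageSizeCalculator class and the A4 dimensions in the usage note are illustrative assumptions.

```csharp
using System;

public static class PageSizeCalculator
{
    // Hypothetical helper: one PDF point is 1/72 inch, so pixels = points * dpi / 72.
    public static (int WidthPx, int HeightPx) ToPixels(double widthPt, double heightPt, int dpi)
    {
        if (dpi <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(dpi), "DPI must be positive.");
        }

        int widthPx = (int)Math.Round(widthPt * dpi / 72.0);
        int heightPx = (int)Math.Round(heightPt * dpi / 72.0);
        return (widthPx, heightPx);
    }
}
```

For example, PageSizeCalculator.ToPixels(595, 842, 300) gives 2479 x 3508 pixels for an A4-sized page (595 x 842 pt), while the same call with 72 DPI returns 595 x 842.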

How can I export a PDF to an image so that the coordinates map 1:1?
I find that a text element's x,y coordinates are not equal to its x,y in the exported image.

For example, this text: GTG Substation
location info: {( 139.00001292399975, 2170.3199913187195 )}
but this text is in the upper-left corner.
image.png (57.9 KB)
image.png (95.8 KB)

This image is the original PDF exported at 72 DPI.
These are the export parameters.
This image has been compressed.
image.png (24.7 KB)
image1_out_72.jpeg (202.6 KB)

Are the Rectangle's LLX, LLY and URX, URY the two corner points?

Sorry, but could you explain exactly what the problem is?
The text GTG Substation has location info {( 139.00001292399975, 2170.3199913187195 )} because PDF has its own coordinate system, and the point (0,0) is in the bottom-left corner.
Please note that the first page is rotated 270 degrees. You can check the Page.Rotate property.

Yes, because PDF has its own coordinate system and the point (0,0) is in the bottom-left corner.

But when I set Rotate to None, the point (0,0) is not in the top-left corner.

@dalazi Sorry for the delay; we need more time to check the issue and propose a better solution. We will answer you within the next 8h.

Thank you very much
Waiting for your message!

If you set page.Rotate = Rotation.None, the point (0,0) will be in the bottom-left corner.

Unfortunately, we don’t have a simple tool (class or method) for determining the image coordinates at which a particular object will be placed.

Therefore, we can recommend using another approach - mark the desired object in PDF and then render a JPEG image.

Below, you can see an example of how to use annotations to mark some text.
You can also see two arrows indicating the point 0,0 and direction.
Please let us know if you find this helpful.

We noticed some incorrect text detection and will create a task after your response.

            //var license = new License();

            // license.SetLicense(licensePath);
            int dpi = 72; // Standard DPI value

            Resolution resolution = new(dpi);
            var filePathPdf = "00-90-0000-28-107.pdf";

            var document = new Document(filePathPdf);
            JpegDevice jpegDevice = new(resolution);
            jpegDevice.RenderingOptions.InterpolationHighQuality = true;

            var page = document.Pages[1];
            page.Rotate = Rotation.None;
            var xAxis = new LineAnnotation(page,
                new Aspose.Pdf.Rectangle(0, 0, 100, 5),
                new Aspose.Pdf.Point(3, 3),
                new Aspose.Pdf.Point(95, 3))
            {
                Title = "Aspose.PDF",
                Color = Aspose.Pdf.Color.Red,
                Width = 3,
                EndingStyle = LineEnding.OpenArrow,
                Popup = new PopupAnnotation(document.Pages[1], new Aspose.Pdf.Rectangle(842, 124, 1021, 266))
            };
            page.Annotations.Add(xAxis);

            var yAxis = new LineAnnotation(page,
                new Aspose.Pdf.Rectangle(0, 0, 5, 100),
                new Aspose.Pdf.Point(3, 3),
                new Aspose.Pdf.Point(3, 95))
            {
                Title = "Aspose.PDF",
                Color = Aspose.Pdf.Color.Blue,
                Width = 3,
                EndingStyle = LineEnding.OpenArrow,
                Popup = new PopupAnnotation(document.Pages[1], new Aspose.Pdf.Rectangle(80, 10, 100, 200))
            };
            page.Annotations.Add(yAxis);

            var textFragmentAbsorber = new TextFragmentAbsorber("RS485");
            textFragmentAbsorber.Visit(page);
            foreach (var tf in textFragmentAbsorber.TextFragments)
            {
                var annotation = new SquareAnnotation(page, tf.Rectangle)
                {
                    Title = "Aspose.PDF",
                    Subject = "Text detection",
                    Color = Aspose.Pdf.Color.DarkRed,
                };
                page.Annotations.Add(annotation);
            }

            page.Flatten();

            document.Save($"C:\\Samples\\Results\\00-90-0000-28-107-rot.pdf");
            jpegDevice.Process(page, $"C:\\Samples\\Results\\test-image_rot.jpeg");

I need to extract the text elements from the PDF and then redraw them on the image, but the PDF coordinate system has its origin in the bottom-left corner. Is there any way to convert a text element's coordinates, width, height, and other position information to a coordinate system with the origin in the top-left corner?

@dalazi

We are looking into the comments you shared and will get back to you shortly.

OK! Waiting for your reply.

As mentioned before, we don’t have a built-in solution for translating coordinates.

We can offer to use a custom translator. Something like this:

    /// <summary>
    /// Represents a utility class for handling rectangle conversions.
    /// The class provides a method to convert a rectangle with coordinates in points (pt) to pixels (px).
    /// </summary>
    public class RectangleConverter
    {
        /// <summary>
        /// Converts a rectangle with coordinates in points (pt) to pixels (px).
        /// </summary>
        /// <param name="xPt">The x-coordinate of the rectangle in points.</param>
        /// <param name="yPt">The y-coordinate of the rectangle in points.</param>
        /// <param name="widthPt">The width of the rectangle in points.</param>
        /// <param name="heightPt">The height of the rectangle in points.</param>
        /// <param name="dpi">Dots Per Inch (DPI) value for the conversion.</param>
        /// <returns>A tuple containing the rectangle's x-coordinate, y-coordinate, width, and height in pixels.</returns>
        public static (int xPx, int yPx, int widthPx, int heightPx) ConvertPtToPx(double xPt, double yPt, double widthPt, double heightPt, int dpi)
        {
            // Calculate the conversion factor from points to inches.
            double inchPerPt = 1.0 / 72.0;

            // Convert points to inches.
            double xInch = xPt * inchPerPt;
            double yInch = yPt * inchPerPt;
            double widthInch = widthPt * inchPerPt;
            double heightInch = heightPt * inchPerPt;

            // Convert inches to pixels based on DPI.
            int xPx = (int)(xInch * dpi);
            int yPx = (int)(yInch * dpi);
            int widthPx = (int)(widthInch * dpi);
            int heightPx = (int)(heightInch * dpi);

            return (xPx, yPx, widthPx, heightPx);
        }
    }

Below, you can see how to use it.

            var license = new License();

            license.SetLicense(licensePath);
            int dpi = 72; // Standard DPI value

            Resolution resolution = new(dpi);
            var filePathPdf = "00-90-0000-28-107.pdf";

            var document = new Document(filePathPdf);
            JpegDevice jpegDevice = new(resolution);
            jpegDevice.RenderingOptions.InterpolationHighQuality = true;

            var page = document.Pages[1];
            page.Rotate = Rotation.None;

            var textFragmentAbsorber = new TextFragmentAbsorber("RS485");
            textFragmentAbsorber.Visit(page);

            string OutputImageFileName = $"C:\\Samples\\Results\\test-image_rot.jpeg";
            string OutputMarkedFileName = $"C:\\Samples\\Results\\test-image_drw.jpeg";
            jpegDevice.Process(page, OutputImageFileName);

            using var image = System.Drawing.Image.FromFile(OutputImageFileName);
            using (Graphics g = Graphics.FromImage(image))
            using (var pen = new Pen(System.Drawing.Color.Red, 3))
            {
                foreach (var tf in textFragmentAbsorber.TextFragments)
                {
                    // Pass URY (the top edge) so that image.Height - yPx is the top-left Y in image space.
                    var (xPx, yPx, widthPx, heightPx) =
                        RectangleConverter.ConvertPtToPx(
                            tf.Rectangle.LLX,
                            tf.Rectangle.URY,
                            tf.Rectangle.Width,
                            tf.Rectangle.Height,
                            dpi);
                    g.DrawRectangle(pen, new System.Drawing.Rectangle(xPx, image.Height - yPx, widthPx, heightPx));
                }
            }
            image.Save(OutputMarkedFileName);
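
For the general question of moving a rectangle from PDF space (origin in the bottom-left corner, units in points) to image space (origin in the top-left corner, units in pixels), the scaling and the Y flip used above can be folded into one helper. This is only a sketch under assumptions, not an Aspose.PDF feature: the PdfToImageCoordinates class and its parameter names are hypothetical, the page is assumed to be unrotated with its lower-left corner at (0,0), and the image is assumed to have been rendered at the same dpi.

```csharp
using System;

public static class PdfToImageCoordinates
{
    // Converts a PDF rectangle (LLX, LLY, URX, URY in points, origin bottom-left)
    // into an image rectangle (x, y, width, height in pixels, origin top-left).
    public static (int X, int Y, int Width, int Height) ToTopLeftPixels(
        double llx, double lly, double urx, double ury, double pageHeightPt, int dpi)
    {
        double scale = dpi / 72.0; // pixels per point

        // Flip the Y axis: distances are measured down from the top of the page.
        double left = llx * scale;
        double top = (pageHeightPt - ury) * scale;
        double right = urx * scale;
        double bottom = (pageHeightPt - lly) * scale;

        // Round outward so the pixel box fully covers the original rectangle.
        int x = (int)Math.Floor(left);
        int y = (int)Math.Floor(top);
        int width = (int)Math.Ceiling(right) - x;
        int height = (int)Math.Ceiling(bottom) - y;
        return (x, y, width, height);
    }
}
```

As an illustration with made-up numbers, a rectangle (100, 700, 200, 720) on an 842 pt high page maps to x = 100, y = 122, width = 100, height = 20 at 72 DPI, and to x = 200, y = 244, width = 200, height = 40 at 144 DPI. In the example above, the inputs would presumably come from tf.Rectangle and the page height in points.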
