Invalid document model upon processing PDF

I get the same exception when trying to create an image from the first page of the attached document. The document itself was created from an XLSX file with Aspose.Cells 24.3.0, and Aspose.Words 24.3.0 was used to read and convert it.
budget.pdf (122.3 KB)

@bitterlich Could you please provide code that will allow us to reproduce the problem? Do you use the attached PDF document as an input document for Aspose.Words? I have tried this scenario using the latest 24.3 version of Aspose.Words and the problem is not reproducible.

This code is used under .NET Framework 4.7.2 with the attached document:

using (var fs = File.OpenRead(source))
{
    var doc = new Document(fs);

    var pageSize = doc.GetPageInfo(0).GetSizeInPixels(1f, 72f);

    for (var page = 0; page < doc.PageCount; page++)
    {

        var output = source + $"{page}.jpg";

        var pi = doc.GetPageInfo(page);
        var size = pi.GetSizeInPixels(1f, (float)300);

        using (var bmp = new Bitmap(size.Width, size.Height))
        {
            using (var g = Graphics.FromImage(bmp))
            {
                g.Clear(Color.White);
                g.SmoothingMode = SmoothingMode.HighQuality;
                g.TextRenderingHint = TextRenderingHint.ClearTypeGridFit;

                doc.RenderToSize(page, g, 0, 0, pageSize.Width, pageSize.Height);
            }

            bmp.Save(output, ImageFormat.Jpeg);
        }
    }
}

@bitterlich Thank you for additional information. Unfortunately, I still cannot reproduce the problem on my side using the latest 24.3 version of Aspose.Words.

Also, please note that Aspose.Words is designed to work with MS Word documents. MS Word documents are flow documents, and they have a structure very similar to the Aspose.Words Document Object Model. PDF documents, on the other hand, are fixed page format documents. While loading a PDF document, Aspose.Words converts the Fixed Page Document structure into the Flow Document Object Model. Unfortunately, such a conversion does not guarantee 100% fidelity.

However, you can use the following code to convert a PDF document to images without loading it into the Aspose.Words DOM:

Aspose.Words.Pdf2Word.FixedFormats.PdfFixedRenderer pdfRenderer = new Aspose.Words.Pdf2Word.FixedFormats.PdfFixedRenderer();
using (FileStream pdfStream = File.OpenRead(@"C:\Temp\in.pdf"))
{
    Aspose.Words.Pdf2Word.FixedFormats.PdfFixedOptions opt = new Aspose.Words.Pdf2Word.FixedFormats.PdfFixedOptions();
    opt.ImageFormat = Aspose.Words.Pdf2Word.FixedFormats.FixedImageFormat.Png;
    IReadOnlyList<Stream> images = pdfRenderer.SavePdfAsImages(pdfStream, opt);
    for (int frameIdx = 0; frameIdx < images.Count; frameIdx++)
    {
        using (Stream imgStream = images[frameIdx])
        using (FileStream imgFile = File.Create(string.Format(@"C:\Temp\out_{0}.png", frameIdx)))
        {
            imgStream.CopyTo(imgFile);
        }
    }
}

Thanks for the reply. I will try your proposed solution.

I would actually prefer to use Aspose.PDF directly, but unfortunately this library still does not work under Linux because it requires System.Drawing.Common, which, as you know, is only available under Windows. For Aspose.Slides you are already using your own implementation, but not for Aspose.PDF.

I would also like to ask: is there a way to specify a font folder, as with the Document in Aspose.Words?

@bitterlich

Aspose.PDF provides the Aspose.PDF.Drawing package, which is a variation of Aspose.PDF that does not require System.Drawing.Common:
https://docs.aspose.com/pdf/net/drawing/

Regarding the font folder, it is not quite clear what you mean. In Aspose.Words you can specify the font location using font settings:
https://docs.aspose.com/words/net/specifying-truetype-fonts-location/
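
For illustration, a minimal sketch of setting a font folder through font settings on a single document might look like the code below. The paths are hypothetical placeholders, and the class name is made up for the example. Note that, as far as I understand it, `SetFontsFolder` replaces the default font sources for that document, so system fonts are no longer searched unless they are added back as a separate source.

```csharp
using Aspose.Words;
using Aspose.Words.Fonts;

public static class FontFolderExample
{
    public static void Main()
    {
        // Hypothetical paths, used only for illustration.
        string fontsFolder = @"C:\MyFonts";
        string inputPath = @"C:\Temp\in.docx";
        string outputPath = @"C:\Temp\out.pdf";

        Document doc = new Document(inputPath);

        // Per-document font settings; the second argument asks
        // Aspose.Words to scan subfolders of the font folder as well.
        FontSettings fontSettings = new FontSettings();
        fontSettings.SetFontsFolder(fontsFolder, true);
        doc.FontSettings = fontSettings;

        // Rendering and fixed-page export use the fonts configured above.
        doc.Save(outputPath);
    }
}
```

This applies to documents loaded into the Aspose.Words DOM only.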

Or do you mean specifying the fonts used by Pdf2Word while rendering a PDF document? If so, there is no way to set the font location in this case.

Wow, this is great news. I wasn’t aware of it.

Is there also a version like this for Aspose.Cells?

@bitterlich

It would be better to ask in the Aspose.Cells support forum. My colleagues from the Aspose.Cells team will answer you shortly.