Performance of single page conversion dependent on document size?

Hello,

I am saving a single page of a multipage document to TIFF (or any other image format). It seems that the performance of the Save method depends on the document size/page count, although I am saving only a single page. For the attached ~1000-page document it takes about 7 sec per page on my PC, compared to 3.5 sec for the ~500-page document. The PageCount property shows the same behaviour.

Here is my code:

var saveOptions = new ImageSaveOptions(SaveFormat.Tiff)
{
    Resolution = dpi,
    PageIndex = pageIndex,
    PageCount = 1,
};
new Document(fileName).Save(outputFile, saveOptions);

Any thoughts on this? Is there any way to get performance that is independent of document size/page count?

Thanks and best regards,

Peter Wolf

Hi Peter,

Thanks for your inquiry. Please note that the performance depends on the nature of the content on your first page. The process of building the layout model is not linear, i.e. it may take a minute to render one page and only a few seconds to render 100 pages to a fixed-page format such as TIFF.

Please let me know if I can be of any further assistance.

Best Regards,

What surprises me is that I get nearly the same duration for all pages of that document. Any page that I render, be it the first, the 100th or the 1000th, takes about 7 sec, whereas it takes about 3.5 sec in the 500-page document. As you can see, the 500-page document consists of the first 500 pages of the 1000-page document, so the content on the pages is the same. The documents differ only in overall size and page count.

One more thing. I always throw everything away in between conversions of pages.

I create the document object, convert one page, throw everything away, and then do the same thing for the next page. This is obviously a bad idea in general, but I have to do it in my special case.

But now I have realised that only the first page conversion on a new document object depends on the page count of the document. All other conversions on the same document object take about 100 ms.
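
For cases where the document object does not have to be discarded, the observation above suggests loading the document once and reusing it for every page. The following is only a minimal sketch under that assumption: the class PageRenderer, the method RenderAllPages, the outputDir parameter and the output file naming are hypothetical, and the PageIndex/PageCount properties are the ones used in the code at the top of this thread.

```csharp
using System.IO;
using Aspose.Words;
using Aspose.Words.Saving;

public static class PageRenderer
{
    // Hypothetical helper: renders every page of one document to its own TIFF file.
    public static void RenderAllPages(string fileName, string outputDir, float dpi)
    {
        // Load once; the same Document object is reused for every page.
        Document doc = new Document(fileName);

        // Per the timings reported above, the size-dependent cost is paid on the
        // first layout-dependent call on a new Document object (here, PageCount).
        int pageCount = doc.PageCount;

        for (int pageIndex = 0; pageIndex < pageCount; pageIndex++)
        {
            var saveOptions = new ImageSaveOptions(SaveFormat.Tiff)
            {
                Resolution = dpi,
                PageIndex = pageIndex, // zero-based index of the page to render
                PageCount = 1,         // render exactly one page per file
            };

            string outputFile = Path.Combine(outputDir, "page_" + (pageIndex + 1) + ".tiff");
            doc.Save(outputFile, saveOptions);
        }
    }
}
```

With this shape, only the first layout-dependent call should take the 3.5 to 7 sec measured above, and each following Save should be in the range of the roughly 100 ms seen for repeated conversions on the same object.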

Hi Peter,

Thanks for the additional information. I think you should use code like the following to extract the first page of the document, so that it can be rendered more quickly without needing to render the entire document.

Document doc = new Document("input.docx");
Document docPreview = GetFirstPageOfDocument(doc);
docPreview.Save("output.pdf");
/// <summary>
/// Extracts the first page of a document based on section breaks, page breaks or a set number of block-level nodes.
/// </summary>
public static Document GetFirstPageOfDocument(Document doc)
{
    // Number of paragraphs or tables in the document body to extract before stopping if we do not encounter any page or section breaks.
    const int maxNumberOfBlockLevelNodes = 50;
    int currentCount = 0;
    Document previewDoc = (Document)doc.Clone(false);
    NodeImporter importer = new NodeImporter(doc, previewDoc, ImportFormatMode.UseDestinationStyles);
    foreach (Section section in doc.Sections)
    {
        // If this section starts on a new page then we know we have the first page.
        if (section != doc.FirstSection)
        {
            SectionStart sectionType = section.PageSetup.SectionStart;
            if (sectionType == SectionStart.EvenPage || sectionType == SectionStart.NewPage || sectionType == SectionStart.OddPage)
                break;
        }
        // Add the section to the document.
        previewDoc.AppendChild(importer.ImportNode(section, true));
        previewDoc.LastSection.Body.RemoveAllChildren();
        foreach (CompositeNode composite in section.Body.ChildNodes)
        {
            // Copy the node to the empty document.
            previewDoc.LastSection.Body.AppendChild(importer.ImportNode(composite, true));
            currentCount++;
            // If the max number of nodes we predict are on the first page is reached or if the current paragraph contains a page break
            // then we know we have the first page so return the document as is.

            if (currentCount > maxNumberOfBlockLevelNodes || (composite != section.Body.LastParagraph && composite.Range.Text.Contains(ControlChar.PageBreak)))
                return previewDoc;
        }
    }
    return previewDoc;
}

I hope this helps.

Best Regards,