Spacing on a page in a PDF

I was wondering if someone could advise on the best way to determine when a new page is required.

We currently count how many words are on the page to determine when to make a new page, but this is not ideal in every situation.

Example below:

Agenda_(59) (1).pdf (93.0 KB)

@cbparl

Could you please share how you are adding content/text to the PDF document? It would help if you could share a sample code snippet that generates a PDF like the one you shared. We will then try to create a sample that determines when to add a new page in your scenario and share our feedback with you.

Here is a sample project that reproduces the issue.

Each item in the list should stay on one page rather than being split by a page break, which is what is happening to item 5 in the sample project.

@cbparl

We tested the scenario by commenting out the following part of the code and obtained the attached document:

if (totalWords > 200)
{
     agendaTextHtml.IsInNewPage = true;
     itemCounter = 0;
     totalWords = 0;
}

testingDocument.pdf (88.3 KB)

We could not see item 5 breaking across pages, either in the middle or anywhere else in its content. Could you please share a screenshot illustrating the issue so that we can proceed accordingly?

Here is an example of item 5 being split by a page break. You can see that the page break has separated the item into two parts. This is reproduced when the code I sent over is left unmodified.

image.png (93.8 KB)

My requirement is to ensure that the items are evenly spaced out and that, when an item would be split by a page break, the item starts on a new page. I know how to add a new page; however, I need a way to do this dynamically, without having to add the page manually.

Ideally, the end result would look like this:

image.png (54.5 KB)

@cbparl

We also tried to replicate the issue by commenting out the respective lines of code, but the output PDF is generated differently from the one shown in your image. CodeChanges.png (20.3 KB)

Also, please note that it is hard to control the text content during HTML-to-PDF conversion. You can split the content across pages by adding a new page (as you are already doing), or you can specify the CSS property page-break-before: always; in the HTML, which will cause a page break in the PDF document during conversion.
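As a rough sketch of the CSS option, the helper below wraps one item's HTML in a div and adds page-break-before: always; only when the item should start on a new page. The class name AgendaHtml, the method WrapItem, and the startOnNewPage flag are hypothetical names used for illustration and are not part of Aspose.PDF. Deciding when to set the flag is still up to the caller.

```csharp
using System;

public static class AgendaHtml
{
    // Hypothetical helper (not an Aspose.PDF API): wraps one item's HTML in a <div>.
    // When startOnNewPage is true, the div carries the CSS property
    // page-break-before: always; so the item begins on a fresh page during conversion.
    public static string WrapItem(string itemHtml, bool startOnNewPage)
    {
        if (itemHtml == null)
        {
            throw new ArgumentNullException(nameof(itemHtml));
        }

        string style = startOnNewPage ? " style=\"page-break-before: always;\"" : "";
        return "<div" + style + ">" + itemHtml + "</div>";
    }
}
```

For example, AgendaHtml.WrapItem("<p>Item 5</p>", true) returns the item wrapped in a div with the page-break style, and the resulting string can then be added to the document the same way the item HTML is added today.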

The output is going to be different if you change the code.

I sent the code over without that section commented out. If you don't modify the code, you will replicate the problem I have described. Here is an example of the PDF created without the code being changed.

testingDocument.pdf (94.5 KB)

Please download the project again.

The problem is that we have no way of identifying when a page is full. This stops us from knowing when to “split the content” across pages by adding a new page or adding the CSS property. I am asking for advice on the best way to detect when a page is full. Once I know that, I will be able to determine when I need to make a new page to split the content.

@cbparl

It is quite hard to determine whether the content has reached the bottom boundary of the page, so that a new page can be added, while creating the PDF on the fly. Please note that the final structure of the PDF is decided at the final stage of saving it. Until then the content remains in memory, and correct positions and dimensions cannot be obtained unless you have specified them explicitly.

For example, please check the code snippet below, where the Document.ProcessParagraphs() method is called in each iteration to determine the location of the TextFragment on the page. Once it is known that the text has reached the very bottom of the page, a new page is added:

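using Aspose.Pdf;
using Aspose.Pdf.Text;

Document doc = new Document();
Page page = doc.Pages.Add();

for (int i = 1; i <= 200; i++)
{
    TextFragment tf = new TextFragment("Test Text" + i);
    page.Paragraphs.Add(tf);
    // Lay out the paragraphs added so far so that tf.Rectangle is populated
    doc.ProcessParagraphs();
    var rect = tf.Rectangle;
    // Start a new page once the fragment reaches the boundary checked here
    // (this is an exact equality check on floating-point values)
    if (rect.URY == (page.PageInfo.Height - page.PageInfo.Margin.Bottom))
    {
        page = doc.Pages.Add();
    }
}
// dataDir is the output folder path, defined elsewhere
doc.Save(dataDir + "test.pdf");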

Please note that the above method would increase the processing cost as it is processing the PDF structure in each iteration. Please let us know in case you need further information.

Hi @asad.ali

I understand that it is hard to dictate where content is placed in the PDF.

I have tried the code snippet above and it may be a possible solution (though not as exact as we need). However, it uses the doc.ProcessParagraphs() method, which we were advised not to use because it was causing the landscape issue we were having.

I have also tried letting Aspose create the pages dynamically and decide when a new page is required. However, when you then count how many pages have been created using doc.Pages, the dynamically created pages are ignored as if they were not there.

@cbparl

As shared earlier in this thread, you can also try an incremental approach by calling the Document.Save() method without any arguments.

Please try calling the Document.Save() method before getting the page count and let us know if the issue still persists.

I saved the document before counting the pages and 2 pages are still missing. The 2 missing pages are the ones that were created dynamically (the 2nd and 3rd pages).

image.png (14.5 KB)

testingDocument.pdf (94.5 KB)

@cbparl

Can this project, which you shared in the quoted post, be used to replicate the issue?

@asad.ali

Yes. If you add the page counts on lines 613 and 626, as in the image below, you will see that 2 pages are missing from the count.

image.png (36.1 KB)

@cbparl

We are checking the scenario and will get back to you soon.

Hi @asad.ali

Just checking to see if there are any updates on this?

@cbparl

Thanks for your patience.

We were able to reproduce the issue at our end by testing the scenario with Aspose.PDF for .NET 21.2. We noticed the page count issue when calling the Document.Save() method and the page orientation issue when calling the Document.ProcessParagraphs() method.

Therefore, an issue has been logged in our issue tracking system as PDFNET-49565 for the sake of correction. We will look into its details and keep you posted with the status of its correction. Please be patient and spare us some time.

Furthermore, as a workaround, you can use the code snippet below instead of the Save() or ProcessParagraphs() method to avoid the issue.

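// Save the document to an in-memory stream instead of a file
System.IO.MemoryStream ms = new System.IO.MemoryStream();
document.Save(ms);
// Rewind the stream and reload the document from it
ms.Seek(0, System.IO.SeekOrigin.Begin);
document = new Document(ms);
// Read the page count from the reloaded document
var pageCount = document.Pages.Count;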

We apologize for the inconvenience.